Clarify invariants and fix constructors and is_standard_layout #543
1b5d312 to
f3c4441
Compare
    let dim_size = dim
        .size_checked()
        .ok_or_else(|| d.error("overflow calculating array size"))?;
    if len != dim_size {
Nice! This reminds me, it's more than time to drop rustc-serialize I think :)
Yeah, I was thinking the same thing. :)
    /// or `self.fortran_strides()`, then the invariants are met to construct
    /// an owned `Array` from the `Vec`, `dim`, and `strides`. (See the docs of
    /// `can_index_slice_not_custom` for a slightly more general case.)
    fn size_checked(&self) -> Option<usize> {
I don't really see why this is needed, yet
Ok, this seems good.
Isn't there an option to let this method just work with the overflow check and then check the isize::MAX requirement in constructors?
Oh, whoops. I didn't realize that size_checked was a public method.
Initially, I didn't do this because I thought size_checked was private, and I noticed that all places but one where .size_checked() was called needed to perform the isize::MAX check too. (The only exception is this line in indices_iter_f.)
I'll change it back to the original behavior and add a separate checking function for internal use.
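A minimal sketch of what such a separate internal check could look like, written against a plain `&[usize]` shape rather than ndarray's `Dimension` trait. The name `size_of_shape_checked_sketch` is hypothetical and is not taken from the PR; the logic only follows the rule discussed here (the product of the non-zero axis lengths must not overflow and must not exceed `isize::MAX`).

```rust
/// Hypothetical helper, not ndarray's API: returns the number of elements
/// described by `dim`, or `None` if the product of the non-zero axis lengths
/// overflows `usize` or exceeds `isize::MAX`.
fn size_of_shape_checked_sketch(dim: &[usize]) -> Option<usize> {
    // Multiply only the non-zero lengths, so that a zero-length axis cannot
    // hide an oversized axis elsewhere in the shape.
    let nonzero_product = dim
        .iter()
        .filter(|&&len| len != 0)
        .try_fold(1usize, |acc, &len| acc.checked_mul(len))?;
    if nonzero_product > isize::MAX as usize {
        return None;
    }
    // The full product is either 0 or equal to `nonzero_product`, so it
    // cannot overflow at this point.
    Some(dim.iter().product())
}
```

With this sketch, a shape such as `[isize::MAX as usize + 1, 0]` is rejected even though its total element count is zero, while `[0, 5]` is accepted with size 0.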
I agree with your thoughts about the tradeoffs. We don't need to prioritize zero size elements much at all, and I'll look if we can do this without making docs and code so much more complicated.
            _ => true,
        }
    });
    debug_assert!(dimension::can_index_slice(&v, &dim, &strides).is_ok());
Is this now correct? There were some negative stride issues we couldn't check properly before (we'd have a false positive, which is fine, it was a "missing feature", but we can't debug assert on that).
See my comment below. If we want to allow negative strides, we need to provide some way for the user to specify the ptr since it's always different from vec.as_mut_ptr() when there are negative strides for axes with length > 1.
    ///
    /// 2. The product of non-zero axis lengths must not exceed `isize::MAX`.
    ///
    /// 3. For axes with length > 1, the stride must be nonnegative.
Not sure about why we require this
Shouldn't this be unchecked? We don't implement the proper debug check for negative strides, but if the strides are correct, they should be fine to be used here.
If any axes with length > 1 have negative strides, moving along those axes would result in offsetting the array's pointer backwards outside the Vec. (from_shape_vec_unchecked is implemented in terms of from_vec_dim_stride_unchecked, which creates the array's pointer from v.as_mut_ptr().) We could allow negative strides if we allow the user to specify the array's ptr (or, equivalently, the offset of ptr relative to v.as_mut_ptr()).
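To make the argument concrete, here is a small sketch that computes the range of element offsets an array can reach from its base pointer. The helper `offset_range` is hypothetical (it is not an ndarray function) and works on plain slices; a negative lower bound means some element would sit before the base pointer, which is the problem when that pointer comes from `v.as_mut_ptr()`.

```rust
use std::convert::TryFrom;

/// Hypothetical helper, not ndarray's API: returns the lowest and highest
/// element offsets (relative to the base pointer) reachable by indexing an
/// array with the given axis lengths and strides. Returns `None` on overflow.
fn offset_range(dim: &[usize], strides: &[isize]) -> Option<(isize, isize)> {
    let mut lowest = 0isize;
    let mut highest = 0isize;
    for (&len, &stride) in dim.iter().zip(strides) {
        // An axis of length 0 or 1 never moves the pointer, whatever its stride.
        let steps = isize::try_from(len.saturating_sub(1)).ok()?;
        let extent = steps.checked_mul(stride)?;
        if extent < 0 {
            lowest = lowest.checked_add(extent)?;
        } else {
            highest = highest.checked_add(extent)?;
        }
    }
    Some((lowest, highest))
}
```

For `dim = [3]` and `strides = [-1]` this gives `(-2, 0)`: the base pointer would have to point at index 2 of the `Vec`, not at its start. For `dim = [1]` the stride is irrelevant and the range is `(0, 0)`, which is why the restriction only applies to axes with length > 1.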
Oh great point, I just never considered that about these constructors. Thanks for the explanation.
    - /// A pointer into the buffer held by data, may point anywhere
    - /// in its range.
    + /// A non-null and aligned pointer into the buffer held by `data`; may
    + /// point anywhere in its range.
This is good
Very nice work on this PR 😄
    // bounds or one byte past the end of a single allocation with element
    // type `A`. The only exceptions are if the array is empty or the element
    // type is zero-sized. In these cases, `ptr` may be dangling, but it must
    // still be safe to [`.offset()`] the pointer along the axes.
Ok, I guess we must accept a dangling pointer, as in the one we get from an empty Vec
    // methods/functions on the array while it violates the constraints.
    //
    // Users of the `ndarray` crate cannot rely on these constraints because they
    // may change in the future.
Whole block is super
|
By the way, one thing that surprised me was that before this PR, |
447f654 to
6adb1e4
Compare
@jturner314 OK about can_index_slice. So the argument for allowing more general strides for zero-length axes would be that they can come up in normal operation somehow? Otherwise it seems good to ban them.
Here's an example demonstrating that arbitrary strides can occur for zero-length axes in normal operation:

    extern crate ndarray;
    use ndarray::{array, s, Axis};

    fn main() {
        let mut a = array![[1, 2, 3], [4, 5, 6]];
        a.invert_axis(Axis(0));
        println!("a =\n{:?}", a);
        let s = a.slice(s![0..0, ..]);
        println!("s =\n{:?}", s);
    }

It prints:

The stride of axis 0 is

We could still ban them in constructors, but I don't see a good reason to if they can occur in normal operation.
@jturner314 sounds good
8df9cfb to
8850395
Compare
|
I think I've resolved all of the comments, and it finally passes CI. :)
Making the assumption that slices contain at most |
|
Monumental effort, it's great! I'll try my hand at adding a comment more |
8850395 to
9435c8f
Compare
Ready to merge when you think so.
9435c8f to
4bcbce5
Compare
4bcbce5 to
da99962
Compare
da99962 to
8727b8a
Compare
I fixed some minor issues (comments/docs) and merged the PR. 🎉
Various method and iterator implementations in `ndarray` implicitly made some assumptions that were not enforced. For example:

In a number of places, methods/iterators violated the safety constraints on `.offset()`. For example, `AxisIterCore` offsets the pointer along the axis of iteration even if one of the other axes is zero-length. As a result, before this PR, this program would have undefined behavior:

The stride of axis 1 is 1, so `AxisIter` calls `ptr.offset(0 * 1)`, `ptr.offset(1 * 1)`, and `ptr.offset(2 * 1)`. The first and second offsets are fine because they're "in bounds or one byte past the end of the same allocated object", but the last offset is undefined behavior.

This PR states the invariant that offsets along all axes must be safe, even if there are zero-length axes, and it checks this property in array constructors.
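As an illustration of the kind of constructor check this implies, the following is a rough sketch only, not the code in this PR. It assumes nonnegative strides, counts offsets in elements rather than bytes, and omits the separate `isize::MAX` check; the names `max_offset` and `offsets_are_safe` are hypothetical.

```rust
/// Hypothetical helper: largest element offset reached by moving to the last
/// index of every axis. Zero-length axes contribute nothing themselves, but
/// they do not cancel the contribution of the other axes.
fn max_offset(dim: &[usize], strides: &[usize]) -> Option<usize> {
    let mut total = 0usize;
    for (&len, &stride) in dim.iter().zip(strides) {
        let steps = len.saturating_sub(1);
        total = total.checked_add(steps.checked_mul(stride)?)?;
    }
    Some(total)
}

/// Hypothetical check: can every per-axis offset be taken safely within a
/// buffer of `data_len` elements?
fn offsets_are_safe(data_len: usize, dim: &[usize], strides: &[usize]) -> bool {
    let is_empty = dim.iter().any(|&len| len == 0);
    match max_offset(dim, strides) {
        // Empty array: nothing is read, so pointing one past the end is fine.
        Some(max) if is_empty => max <= data_len,
        // Non-empty array: the element at the largest offset must exist.
        Some(max) => max < data_len,
        None => false,
    }
}
```

Under these assumptions, a shape of `[0, 3]` with strides `[3, 1]` over an empty buffer is rejected, because stepping along axis 1 would move the pointer two elements past an allocation of length zero, while the same shape with strides `[0, 0]` is accepted.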
In some places, implementations made the assumption that the axis lengths would always be less than or equal to `isize::MAX`, such as in the `stride_offset` function in the `dimension` module. However, it is possible to create a `Vec`/slice with length greater than `isize::MAX` of zero-size elements, and the implementation of `size_checked` allowed creation of arrays with axis lengths greater than `isize::MAX` (e.g. `ArrayView2::<i32>::from_shape((isize::MAX as usize + 1, 0), &[]).unwrap()`). I have trouble finding a place where this would cause undefined behavior, but these cases could cause unexpected panics in debug mode due to overflow, and the fact that I've seen code making this invalid assumption makes me uncomfortable that there may be a soundness issue I haven't found.

This PR changes the constructors to enforce that the product of non-zero axis lengths must not exceed `isize::MAX`.

This PR explicitly states the invariants that `ArrayBase` must uphold and methods can depend on. It also adds the necessary checks to enforce these invariants when arrays are created. The invariants are carefully designed so that slices/subviews/reshapes/etc. preserve them, so checks are needed only in the constructors and not in method/iterator implementations.

This PR does make some changes to the behavior of constructors, the default strides for empty arrays, and `is_standard_layout` in some edge cases. However, I would still consider it a backwards-compatible change because it's enforcing assumptions that were made in previous versions that should have been checked. In other words, it's a bug fix.

A few notes about possible alternatives:

- `AxisIter` and `.subview_inplace()`. These checks would be easy to forget, which is problematic because missing a check could result in undefined behavior. I much prefer performing the necessary checks in constructors so that all the other method/iterator implementations can be simple.
- `isize`. This would be slightly more flexible because, for example, it would be possible to create a 1-D array of `usize::MAX` zero-size elements. However, this additional flexibility doesn't have practical applications and would require us to explicitly check for overflow in a lot more places.
- `ArrayPtr`/`ArrayPtrMut` in Add raw array pointer types #496 more flexible. However, method implementations already make this assumption (e.g. `.as_slice()`), allowing null pointers isn't significantly more useful than allowing dangling pointers, and forbidding null pointers will allow us to switch `ArrayBase.ptr` to use `NonNull` in the future (which should remove the need for "reborrow" methods (Add .reborrow() methods to array views #412)).