Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hierarchy of Sized traits #3729

Open
wants to merge 73 commits into
base: master
Choose a base branch
from

Conversation

davidtwco
Copy link
Member

@davidtwco davidtwco commented Nov 15, 2024

All of Rust's types are either sized, which implement the Sized trait and have a statically known size during compilation, or unsized, which do not implement the Sized trait and are assumed to have a size which can be computed at runtime. However, this dichotomy misses two categories of type - types whose size is unknown during compilation but is a runtime constant, and types whose size can never be known. Supporting the former is a prerequisite to stable scalable vector types and supporting the latter is a prerequisite to unblocking extern types. This RFC proposes a hierarchy of Sized traits in order to be able to support these use cases.

This RFC relies on experimental, yet-to-be-RFC'd const traits, so this is blocked on that. I haven't squashed any of the previous revisions but can do so if/when this is approved. Already discussed in the 2024-11-13 t-lang design meeting with feedback incorporated.

See this comment for the most recent summary of changes to this RFC since it was opened.

Rendered

@davidtwco davidtwco added the T-lang Relevant to the language team, which will review and decide on the RFC. label Nov 15, 2024
Co-authored-by: León Orell Valerian Liehr <[email protected]>
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
text/3729-sized-hierarchy.md Outdated Show resolved Hide resolved
@tmandry tmandry added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Nov 15, 2024
This was referenced Jan 2, 2025
@workingjubilee workingjubilee added the A-dst Proposals re. DSTs label Jan 2, 2025
@Ericson2314
Copy link
Contributor

Ericson2314 commented Jan 2, 2025

Wow, a funny (but good!) feeling to think that this might finally get resolved after a decade. :)

Instead of introducing a new marker trait, std::ptr::Pointee could be re-used if there were some mechanism to indicate that associated types or methods could only be referred to with fully-qualified syntax. Alternatively, it would be possible to introduce forward-compatibility lints in current edition, the new traits were introduced in the next edition and the edition migration previously described in the next next edition.

I think std::ptr::Pointee should definitely be reused. I am not sure which alternative is, but it seems to me like the right thing to do is:

  • Use std::ptr::Pointee
  • Do the hack to avoid new ambiguity in the current addition
  • Remove the hack in the next addition

Also, I agree with @RalfJung than the ? avoidance is quite confusing, and we are mixing up syntax and semantics. The bottom line is that semantically, the trait hierarchy is the free lattice over the trait partial order. Decoding the complex math speak: that means the minimum element (or maxiumum, whatever way you want the arrows to point) is not a trait but the empty set of traits. So we need to specify what that means.

After we do that, we we can debate syntax.

In particular, the "Why have Pointee?" is a bit misleading in that semantically there currently is indeed no reason, it is just there for syntactic purposes.

However, if we reuse std::ptr::Pointee then there is a semantic reason, namely the associated type. ({} / no traits has no associated types or any other trait members, because it has no traits!) So here is an extra incentive to reuse std::ptr::Pointee --- one can side-step what @RalfJung and I are saying :).

At the point I still think it is good education to talk about the empty trait set as:

  • You can't use it behind a pointer
  • You can't use it as a value
  • You could use it in PhantomData, but probably this is left as future work
  • Maybe you can't quantify over it all (as opposed to you can, but then you can't use the variable anywhere, and then its an unused type parameter error)

Oh and a final thing on semantics, I might also also present the back compat / editions story this way:

  • Old semantic hierarchy
  • New semantic hierarchy
  • mapping from old to new semantics, (old {} (no traits) is mapped not to new {} but to const MetaSized) --- in particular, do not speak of ?Size when describing this semantic mapping.
  • mapping from new syntax in next edition maps to new semantics.

In particular, the existing old syntax maps to the new semantics via the old semantics; we can cleanly factor that into those two steps, and I think this brings conceptual clarity.

@programmerjake
Copy link
Member

  • Use std::ptr::Pointee
  • Do the hack to avoid new ambiguity in the current addition
  • Remove the hack in the next addition

imo having an associated type on practically every type will be annoying, so I think it might be useful to require writing <T as Pointee>::Metadata instead of T::Metadata on all editions.

@workingjubilee
Copy link
Member

the trait's unstable, if it's just a matter of bikeshedding we can rename the type PtrMetadata or sth

davidtwco referenced this pull request in oli-obk/rfcs Jan 8, 2025
@davidtwco
Copy link
Member Author

I think std::ptr::Pointee should definitely be reused. I am not sure which alternative is, but it seems to me like the right thing to do is:

  • Use std::ptr::Pointee
  • Do the hack to avoid new ambiguity in the current addition
  • Remove the hack in the next addition

I've added this as one of the unresolved questions.

In particular, the "Why have Pointee?" is a bit misleading in that semantically there currently is indeed no reason, it is just there for syntactic purposes.

I've rewritten this section, it should have been updated earlier with the addition of the alternatives which keep the ?Sized syntax and clarified that it's just a syntactic difference.

At the point I still think it is good education to talk about the empty trait set as:

  • You can't use it behind a pointer
  • You can't use it as a value
  • You could use it in PhantomData, but probably this is left as future work
  • Maybe you can't quantify over it all (as opposed to you can, but then you can't use the variable anywhere, and then its an unused type parameter error)

Oh and a final thing on semantics, I might also also present the back compat / editions story this way:

  • Old semantic hierarchy
  • New semantic hierarchy
  • mapping from old to new semantics, (old {} (no traits) is mapped not to new {} but to const MetaSized) --- in particular, do not speak of ?Size when describing this semantic mapping.
  • mapping from new syntax in next edition maps to new semantics.

In particular, the existing old syntax maps to the new semantics via the old semantics; we can cleanly factor that into those two steps, and I think this brings conceptual clarity.

I haven't made these changes at the moment - I've had good feedback about the explanations in the RFC and would prefer not to make major changes to that until the proposal itself needs changing.

@davidtwco
Copy link
Member Author

I've updated the previous summary comment so it is still accurate.

@davidtwco
Copy link
Member Author

I've pushed one small change with a reference to the now-upstream const traits RFC in #3762.

@traviscross traviscross added I-lang-radar Items that are on lang's radar and will need eventual work or consideration. and removed I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. labels Jan 26, 2025
@safinaskar
Copy link

Linux has at its syscall boundary struct statmount. See here for details: https://brauner.io/2024/12/16/list-all-mounts.html . Its first field is size of whole struct itself. And it is u32 on all platforms! So, yes, I think we should support such structs. Note: this is not size of DST in its end, it is size of whole struct.

There is even more exotic case: struct dirent ( https://www.man7.org/linux/man-pages/man3/readdir.3.html ). (Yes, I already talked about it in previous comment, but I want to add more.) It embeds its own size, too. But not as a first field, but as a third. :) And that field (d_reclen) is not usize, it is u16! And again, this is not size of last DST field, this is size of struct itself.

Note: manpage says that size of last field is 256. This is a lie. In fact, this is DST, and it can be both bigger and smaller than 256.

Also, the string in the last field is always proper null-terminated C string, so we have two different ways of determining struct size: by calling strlen on d_name (with necessary calculations) and by reading d_reclen. So, technically d_reclen field is redundant. And to properly support this struct in Rust, Rust should have at least one of the following two features:

  • ThinCStr we all dream about. It should be allowed to be located in the end of a struct (this will support this d_name field)
  • Embedding size of a struct (of a whole struct!) inside struct itself. Not necessary as a first field. And not necessary as a usize. u16 should be supported

Not supporting both features will lead to horrible hacks, such as this: https://github.com/rust-lang/rust/blob/c1132470a6986b12503e8000e322e9164c1f03ac/library/std/src/sys/pal/unix/fs.rs#L730-L768 (this is from standard library!)

And again: to support statmount we need possibility to embed size of struct inside struct itself. As a first field of size u32 (even on 64-bit platforms). In this case calling strlen on last field will not help, because it is allowed to contain many embedded null bytes

@RalfJung
Copy link
Member

RalfJung commented Feb 3, 2025

Not supporting both features will lead to horrible hacks, such as this:

In the latest version, this hack is mostly gone. What remains is not so bad: https://github.com/rust-lang/rust/blob/58d6301cad7ac20407e708cd9d1ec499d1804015/library/std/src/sys/pal/unix/fs.rs#L726-L753
(rust-lang/rust#136479 cleans this up further by removing the macro.)

IMO it is not worth contorting the language to natively express esoteric types such as those, as long as we have some way of interfacing with C code using such types (and we do).

@davidtwco
Copy link
Member Author

davidtwco commented Feb 6, 2025

Another summary comment!

Here are the previous update summaries copied below so you don't need to uncollapse GitHub comments to find it:

2024-11-25

Summary comment: #3729 (comment)

For those following along or catching up, these are the notable the changes to the RFC since this was posted:

  • Clarify proposed behaviour for ?Trait syntax for non-Sized, which is currently accepted
  • Stop re-using std::ptr::Pointee and make Pointee its own new marker trait to avoid backwards incompatibility
  • Clarify backwards compatibility implications of ?Sized syntax and add alternatives to removing the default bound using positive bounds which continue to use ?Sized
  • Add that relaxing existing bounds in trait methods would be backwards incompatible
  • Elaborate on necessity of implicit const ValueSized bound on Self type of traits
  • Add MetaSized alternative to ValueSized which would resolve interactions with mutexes
  • Clarified that bounds on return types can never be relaxed.

And these are all the other smaller changes that don't materially impact what is being proposed:

  • Fixed some minor wording errors where supertrait/subtrait were used backwards
  • Removed HackMD's rust= syntax from codeblocks
  • Fixed referring to the introduction of a const Sized trait, but rather adding a const modifier to the existing Sized trait
  • Added some background/context on dynamic stack allocation
  • Use current experimental const trait syntax
  • Corrected incorrect syntax for traits
  • Listed all alternate bounds (adding ~const ValueSized to a list of bounds that it was missing from)
  • Fixed bound in description of size_of_val changes
  • Corrected description of current size_of_val and align_of_val behaviour
  • Corrected description of extern type usage in structs
  • Mention Rust for Linux's interest in extern type
  • Weaken language in the externref future possibility to make it clear this proposal would not be sufficient on its own to support these
  • Re-write Aligned future possibility so that it is clear Aligned couldn't be added to the proposed hierarchy

At the moment, I prefer the following alternatives to the primary proposal of the RFC, and may re-write to incorporate these as the primary proposal:

2024-12-04

Summary comment: #3729 (comment)

These are the major changes since the last summary:

  • Change from ValueSized to MetaSized
  • Add unresolved question about whether re-using std::ptr::Pointee is important.

These are the minor changes since the last summary:

  • Clarify that bounds on parameters used as return types can never be relaxed (return types still need to implement Sized)
  • Clarify that non-const Sized is not nameable in the current edition due to the backwards compatibility migration, only in the next edition
  • Clarify which Pointee (marker trait Pointee or std::ptr::Pointee) is referenced in the Custom DSTs future possibility
  • Improve clarity of the table describing alternative syntax proposals for relaxing bounds
  • Strengthen wording describing use of runtime-sized types in a const context as unsound
  • Expand on motivation for MetaSized
  • Specify that methods of the new traits would be backwards-incompatible too, not just associated types
  • Use the same indentation in code blocks throughout
  • Fix typo in list of when each trait gets implemented
  • Add a concrete example of code that would break if a supertrait of Clone were relaxed
  • Added note about potential reasons why delaying relaxation of size_of's bound might be desirable
  • Fixed typo, replacing "alignment" with "size"
  • Rewrote "Why have a Pointee trait?" section to clarify that the trait is only necessary due to the proposed removal of ?Sized syntax
  • Updated reference to const traits now that it has an open RFC
  • Fixed a broken link
  • Fixed broken indentation on a bullet list

These are the alternatives described in the RFC that I think are worth consideration as the primary proposal:


We had a design meeting with the language team yesterday (Feb 5th) which led to the changes described below, a rough summary:

  • No significant concerns were raised with the proposal itself (i.e. the specific traits proposed, the edition migration proposed, etc)
  • There was a clear consensus on some of the unresolved questions and no strong opinion on others (see 5806017)
  • t-lang would be okay with a experiment proceeding with this, but the preference of the const traits implementors is that no experiment take place while const traits' syntax is undecided, as this would increase the work they have to do when making any potential syntax changes requested by t-lang
  • It is still unclear how to proceed with regards to the const traits dependency
    • Further experimentation following the meeting (see ) has reinforced the necessity of the const traits dependency
    • There is disagreement within the language team whether the not-yet-accepted const traits dependency blocks this from being accepted (without a possibility stabilisation w/out const traits stabilisation)

I've started an implementation of this - I've more-or-less finished implementing the non-const parts of the proposal, and will proceed with the const parts. It won't land upstream until const traits implementors are okay with it.

These are links to the compare with previous versions of the document (switch to the "Files changed" tab and then the rich diff):

These are the major changes since the last summary:

  • Add note describing a niche but unavoidable backwards incompatibility
  • Note that const traits isn't a strict dependency
  • Describe the perspective of the language team following our meeting on the unresolved questions (leaving them unresolved but expressing a bias in a direction)

These are the minor changes since the last summary:

  • Describe another alternative which would enable re-using std::ptr::Pointer
  • Note that these traits can be stabilised independently of each other

Copy link

@Skepfyr Skepfyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the summary comments they're incredibly useful. I've re-read this and looked through the updates and this proposal feels closer, however I still have a few comments...

Positive trait bounds are growing on me though, I like that they stop us having to worry about what "the empty set of traits" means, however I'm not convinced that's possible, because of supertrait bounds (as described in a comment).


This property trivially falls out of the previous proposals in this RFC:
runtime-sized types are `Sized` (but not `const Sized`) and therefore can
implement `Clone` and `Copy`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this RFC is implicitly dependent on unsized locals here, the only way I can see the Clone or Copy traits supporting Sized types is with it. Consider <svint8_t as Clone>::clone that returns an svint8_t but that doesn't have a stack size known at compile time, so you need a bunch of the unsized locals machinery to support it right?

```

Once this interaction with constness was realised, this RFC's dependency on const
traits was introduced. Due to const traits (at the time of writing this RFC) being
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like surprisingly weak motivation for all the const stuff that's making this hierarchy complicated. I wonder if we can replace Clone with:

trait Clone {
    fn clone(&self) -> Self;
    fn clone_from(&mut self, other: &Self) where Self: Sized { ... }
}

and replace Clone bounds with Clone + Sized, admittedly this is more edition shenanigans that feels a bit dirty (and I haven't completely thought it through).

I do agree that having const trait stuff would give you the ability to do some stuff: size_of on non-const Sized types, and have arrays&slices of non-const Sized types (although this would require some complier work). I'm just not clear how useful those are.

which is now equivalent to a `MetaSized` bound in non-`const fn`s and
`~const MetaSized` in `const fn`s and will be deprecated in the next edition.

Traits now have an implicit default bound on `Self` of `const MetaSized`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still very nervous about this, I believe this is done solely for backwards compatibility reasons but I want to make sure it's clear this is sacrificing forwards compatibility.

If traits gain (debatably keep) an implicit const MetaSized bound then:

  • Everything keeps working in a nice way
  • People writing libraries will just write trait Foo {} rather than trait Foo: Pointee {} unless they're specifically thinking about unsized types for some reason making them a second class citizen in the ecosystem (and relaxing the bund when asked is a breaking change)
  • If we ever add another layer below Pointee (eg externref) then no existing traits will work with them and it would be breaking to relax them. I think this means no std traits could ever support them.

If traits have no implicit bound (also debatably keep, depending on how you imagine this) then:

  • You'd have to do this over an edition, and add in : const MetaSized everywhere so that default method implementations keep working. (removing that bound is safe for std, and dubious for crates).
  • You'd need to write : Pointee etc if you need to write default method impls, but I think that's a good thing.
  • This does force us to work out what you can do with "the empty set of traits", because that what these types have in default trait impls (with no supertraits) (this makes me less favourable towards positive bounds)

| `Pointee` | `T: Pointee` | `T: ?(const Sized)` |

In other words, `?(const Sized)` fully opts out of all default bounds and then one has to
explicitly opt back in.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was straw-manned slightly in the design meeting, I disagree with the phrasing:

'I don't want to think of the "next most precise thing"'

and

"one level below" in the hierarchy

Neither proposal makes you think like that (ignoring the ?MetaSized below), the ?Sized variant requires you to write the positive bound always (and I'd argue that the T: Pointee case should be written T: ?(const Sized) + Pointee) so the only negative thinking you have to do is "do I need to disable the default const Sized bound?".

However, I agree this discussion can be punted until stabilisation given it has essentially no impact on the semantics of what's going on.

P.S The idea of negative trait aliases (trait Foo = ?Sized;) terrifies me, it feels like it'd be incredibly confusing at the use-site.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dst Proposals re. DSTs I-lang-radar Items that are on lang's radar and will need eventual work or consideration. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.