https://stdrs.dev/nightly/x86_64-unknown-linux-gnu/src/std/s...
1. An object that’s really a Rust Vec under the hood is awkward when you try to pass ownership to C. ISTM you end up solving it by creating what is, in effect, a handle, and freeing that using a custom free function. Is this really a problem with Rust FFI? This style of API is very common in native C code, and it’s a lot more flexible and future-proof than having a library user call free() directly.
2. You’re consuming the API in Rust and being sad that there’s no native defer statement. But Rust can do RAII — could you build a little wrapper struct that implements Drop?
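For what it's worth, here is roughly what I mean by a handle plus a Drop wrapper. This is only a sketch: `widget_new`, `widget_value` and `widget_free` are hypothetical names, and the "C side" is faked in Rust so the snippet stands alone. In real code those three would be declarations for functions exported by the C library.

```rust
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for a C library. It is written in Rust here only so
// the sketch is self-contained; the point is that the library that allocates
// the object is also the one that frees it.
#[repr(C)]
pub struct RawWidget {
    value: i32,
}

// Counts live objects so the cleanup can be observed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

extern "C" fn widget_new(value: i32) -> *mut RawWidget {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawWidget { value }))
}

extern "C" fn widget_value(w: *const RawWidget) -> i32 {
    // The caller must pass a pointer obtained from widget_new.
    unsafe { (*w).value }
}

extern "C" fn widget_free(w: *mut RawWidget) {
    if !w.is_null() {
        // Release with the same allocator that widget_new used.
        unsafe { drop(Box::from_raw(w)) };
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Safe handle: owns the raw pointer and calls the library's own free
/// function when it goes out of scope.
pub struct Widget {
    raw: NonNull<RawWidget>,
}

impl Widget {
    pub fn new(value: i32) -> Option<Self> {
        NonNull::new(widget_new(value)).map(|raw| Widget { raw })
    }

    pub fn value(&self) -> i32 {
        widget_value(self.raw.as_ptr())
    }
}

impl Drop for Widget {
    fn drop(&mut self) {
        widget_free(self.raw.as_ptr());
    }
}

/// Number of widgets that have been created but not yet freed.
pub fn live_widgets() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

With this shape the caller never sees `free()` at all, and the wrapper plays the role that a defer statement would play in other languages.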
Valgrind is good to run as a sanity check, but I prefer a month under a regression test hammer to validate something isn't going to slowly frag itself before scheduled maintenance. =3
Apparently they did away with jemalloc in favor of the system allocator that same year but nonetheless when I came back to it years later I was very happy to learn of its removal.
[1]: https://github.com/rust-lang/rust/pull/55238
[2]: https://blog.rust-lang.org/2019/01/17/Rust-1.32.0/
At the time in 2018, however, I was using Linux but now use Windows. I just now learned that Rust used the system allocator in Windows longer than for Linux [2] where it was still using jemalloc for the majority of 2018.
Why?
It also does not give a good first impression at all to newcomers to see their hello world project built in release mode take up almost 2 MiB of space. Today it's a much more (subjectively) tolerable 136 KiB on Windows (considering that Rust std is statically linked).
Right, but, so what? I can't imagine a practical reason this matters for the vast majority of situations. So either you're doing something pretty niche, or it's a purely aesthetic complaint.
Those are certainly valid. There are entire Linux distros whose mere existence is as the answer to what is only an aesthetic complaint.
If nothing else doing otherwise is likely to cause compatibility issues for someone at some point. For example see all the problems Go had with DIY syscalls pretty much everywhere except for Linux.
There's a legitimate question of whether the kernel ABI or the libc API qualifies as the system provided facilities on a linux box. But that uncertainty only furthers my latter point about compatibility.
This means that e.g. a growable container type doesn't have to guess that your allocator probably loves powers of 2 and so it should try growing to 256 bytes not 185 bytes, it can ask for 185 bytes, get 256 and then pass that on to the user. Significant performance is left on the table when everybody is guessing and can't pass on what they know due to ABI limitations.
Rust containers such as Vec are already prepared to do this - for example Vec::reserve_exact does not promise you're getting exactly the capacity you asked for, it won't do the exponential growth trick because (unlike Vec::reserve) you've promised you don't want that, but it would be able to take advantage of a larger capacity provided by the allocator.
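To make the contract concrete, here is a small sketch of code that only relies on what the Vec docs promise: after `reserve_exact(n)` the capacity is at least `len + n`, not necessarily equal to it. The helper names `reserve_exactly` and `has_room_for` are made up for illustration.

```rust
/// Reserves space for `requested` 40-byte elements in an empty Vec and
/// returns (requested, actual capacity).
pub fn reserve_exactly(requested: usize) -> (usize, usize) {
    let mut v: Vec<[u8; 40]> = Vec::new();
    v.reserve_exact(requested);
    // The docs only guarantee capacity >= len + requested.
    (requested, v.capacity())
}

/// True if `additional` more elements fit without reallocating.
/// Written with >= so it stays correct if the capacity is larger than asked.
pub fn has_room_for<T>(v: &Vec<T>, additional: usize) -> bool {
    v.capacity() - v.len() >= additional
}
```

Code written this way keeps working whether the allocator hands back exactly what was asked for or something bigger.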
I don't really care if the memory region is past capacity * size_of::<T>(), but I do want to be able to check if .len() == .capacity() and not be surprised.
The documentation for Vec already explains exactly what it's offering you, but let's explore: what exactly is the problem? You've said this is "commonly done" so doubtless you can point at examples for reference.
Suppose a Goose is 40 bytes in size, and we aim to store, say, 4 of them. For some reason we decide to call Vec::new() and then Vec::reserve_exact(..., 4) rather than more naturally (but with the same effect) asking for Vec::with_capacity(4). Alas, the allocator underpinning our system has only 128 or 256 byte blocks to give: 4 x 40 = 160, too big for 128, so a 256 byte block is allocated and (a hypothetical future) Vec sets capacity to 6 anyway.
Now, what disaster awaits in the common code you're talking about? Capacity is 6 and... there's capacity for 6 entries instead of 4
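The arithmetic in the Goose example can be written down as a toy model. To be clear, this is not how any real allocator or today's Vec works: `block_for` and `usable_capacity` are hypothetical helpers, and the 128/256 size classes are just the ones assumed above.

```rust
use std::mem::size_of;

/// A 40-byte element, as in the example.
#[allow(dead_code)]
pub struct Goose([u8; 40]);

/// Toy size-class allocator: the smallest block that fits `bytes`, if any.
pub fn block_for(bytes: usize, classes: &[usize]) -> Option<usize> {
    classes.iter().copied().filter(|&c| c >= bytes).min()
}

/// Capacity a hypothetical allocator-aware Vec could report after asking
/// for `requested` elements of type T from that toy allocator.
pub fn usable_capacity<T>(requested: usize, classes: &[usize]) -> Option<usize> {
    let elem = size_of::<T>();
    if elem == 0 {
        return None; // zero-sized types never allocate
    }
    let bytes = requested.checked_mul(elem)?;
    block_for(bytes, classes).map(|block| block / elem)
}
```

For 4 geese the request is 160 bytes, the 256 byte block is chosen, and 256 / 40 rounds down to 6 elements.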
> Now, what disaster awaits in the common code you're talking about? Capacity is 6 and... there's capacity for 6 entries instead of 4
The capacity was expected to be 4 and not 6, which may be a logical error in code that requires it to be. If this wasn't a problem the docs wouldn't call it out as a potential problem.
The resize + shrink_to_fit incantation sounds to me a lot like one of those "Sprinkle the volatile keyword until it works" ritualistic C++ practices not based in any facts.
As someone who primarily writes C++ I would not expect that to work. I mean it's great if it does I guess (I don't really see the point?) but that would honestly surprise me.
I would _always_ expect to use >= for capacity comparisons and I don't understand what the downside would be. The entire point of these data structures is that they manage the memory for you. If you need precise control over memory layout then these are the wrong tools for the job.
Yes and no. Even within libstd, some things require A=Global, e.g. `std::io::Read::read_to_end(&mut Vec<u8>)` will only accept Vec<u8, Global>. It cannot be changed to work with Vec<u8, A> because that change would make it not dyn-compatible (née "object-safe").
And as you said it will cut you off from much of the third-party crates ecosystem that also assumes A=Global.
But if the subset of libstd you need supports A != Global then yes it's helpful.
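A sketch of the dyn-compatibility constraint, using only stable Rust. `slurp` shows that `read_to_end` is callable through `&mut dyn Read` today. The `Source` trait, `Fixed` and `fill_dyn` are hypothetical and just mimic the shape of the problem: a method generic over its buffer type has to be fenced off with `where Self: Sized`, which means it cannot be called on the trait object.

```rust
use std::io::{self, Read};

/// Works on any reader behind a trait object because `read_to_end`
/// takes a concrete `&mut Vec<u8>` (default allocator).
pub fn slurp(reader: &mut dyn Read) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}

/// Hypothetical trait with the same shape as the problem.
pub trait Source {
    /// Concrete buffer type: callable on `dyn Source`.
    fn fill(&mut self, out: &mut Vec<u8>) -> usize;

    /// Generic over the buffer: needs `Self: Sized` for the trait to remain
    /// usable as `dyn Source`, so it is unavailable on the trait object.
    fn fill_into<E: Extend<u8>>(&mut self, out: &mut E) -> usize
    where
        Self: Sized;
}

pub struct Fixed(pub Vec<u8>);

impl Source for Fixed {
    fn fill(&mut self, out: &mut Vec<u8>) -> usize {
        out.extend_from_slice(&self.0);
        self.0.len()
    }

    fn fill_into<E: Extend<u8>>(&mut self, out: &mut E) -> usize {
        out.extend(self.0.iter().copied());
        self.0.len()
    }
}

/// Only the non-generic method is reachable through the trait object.
pub fn fill_dyn(src: &mut dyn Source) -> Vec<u8> {
    let mut out = Vec::new();
    src.fill(&mut out);
    out
}
```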
Not sure if that is a breaking change though, it probably is because of a small detail, I'm not a rustc dev.
Second, no a `&mut Vec<u8, A>` is not convertible to `&mut Vec<u8, dyn Allocator>`. This kind of unsized coercion cannot work because it'll require a whole different Vec to be constructed, one which has an `allocator: dyn Allocator` field (which is unsized, and thus makes the Vec unsized) instead of an `allocator: A` field. The unsized coercion you're thinking of is for converting trait object references to unsized trait object references; here we're talking about a field behind a reference.
And no, it is possible. Here is an example that does it with BufReader, which has T: ?Sized and uses it as a field: https://play.rust-lang.org/?version=stable&mode=debug&editio...
Though it comes with a caveat that you can't take self by value, which is perfectly fine for this use case & is what a normal allocator-aware language does anyway.
That said, other than the problem of this being a breaking API change for Read::read_to_end, another problem is that Vec's layout is { RawVec, len } and the allocator is inside RawVec, so the allocator is not the last field of Vec, which is required for structs to contain unsized fields. It would require reordering Vec's fields to { len, RawVec } which may not be something libstd wants to do (since it'll make data ptr access have an offset from Vec ptr), or to inline RawVec into Vec as { ptr, cap, len, allocator }.
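The last-field requirement can be seen with a toy struct on stable Rust. `MiniVec`, `Namer` and `System` are hypothetical stand-ins (this is not Vec's real layout or the real Allocator trait); the only point is that the `?Sized` parameter sits in the final field, which is what allows the reference to coerce.

```rust
/// Hypothetical stand-in for an allocator trait.
pub trait Namer {
    fn name(&self) -> &'static str;
}

pub struct System;

impl Namer for System {
    fn name(&self) -> &'static str {
        "system"
    }
}

/// Toy container: the possibly-unsized parameter is the LAST field.
/// Moving `alloc` above `len` would make `MiniVec<dyn Namer>` ill-formed.
pub struct MiniVec<A: ?Sized> {
    pub len: usize,
    pub alloc: A,
}

/// Takes the container by reference with the allocator type erased.
pub fn describe(v: &mut MiniVec<dyn Namer>) -> String {
    v.len += 1;
    format!("{} x{}", v.alloc.name(), v.len)
}
```

A `&mut MiniVec<System>` passed to `describe` is coerced to `&mut MiniVec<dyn Namer>` at the call site, and, as noted above, the unsized form can only be used behind a reference, never by value.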
It’s also very possible I’m missing something!
Yeah no this is basically all I'm saying. I'm excited for this.
Here's what I got when I asked ChatGPT to rewrite your first comment to be as constructive as possible:
"Totally agree — relying on C and Rust to interoperate at the allocator level is risky at best. Allocating in C and freeing in Rust (or vice versa) assumes a level of compatibility that just isn’t guaranteed. Even within a single ecosystem, allocator behavior can change across versions — whether it’s different versions of libc or updates to Rust’s std. So expecting consistent behavior across language boundaries is, at the very least, unreliable. If you need to free a C-allocated pointer, the safest and cleanest approach is to call back into C to do it."
That's not a drop-in substitute for your original comment and "totally agree" is over-the-top cloying in the usual ChatGPT obsequious way, but I still think it's helpful in suggesting alternative framings.
However, not all C constructs have idiomatic Rust equivalents. For example, bitfields don't exist in Rust, and unlike Rust enums, C enums can have any value of the underlying type. And for ABI reasons, it's very common in C APIs to use a pointer to an opaque type paired with what are effectively accessor functions and methods, so mapping them to accessors and methods on a "Handle" type in Rust often is the most idiomatic Rust representation of the C interface.
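As a rough illustration of the usual workarounds: a C enum is often modeled as an integer newtype with named constants, so unknown values stay representable, and a bitfield as manual masking. `Color` and `Flags` are made-up types, and the bit positions are an assumption, since the layout of real C bitfields is implementation-defined.

```rust
/// C-style enum as a transparent integer wrapper: any u32 is a valid value.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Color(pub u32);

impl Color {
    pub const RED: Color = Color(0);
    pub const GREEN: Color = Color(1);

    /// True only for the values this binding knows a name for.
    pub fn known(self) -> bool {
        matches!(self, Color::RED | Color::GREEN)
    }
}

/// Hand-packed flags: bits 0..=2 hold a mode, bit 3 is an "enabled" flag.
pub struct Flags(pub u8);

impl Flags {
    const MODE_MASK: u8 = 0b0000_0111;
    const ENABLED: u8 = 0b0000_1000;

    pub fn mode(&self) -> u8 {
        self.0 & Self::MODE_MASK
    }

    /// Replaces the mode bits and leaves the other bits untouched.
    pub fn set_mode(&mut self, mode: u8) {
        self.0 = (self.0 & !Self::MODE_MASK) | (mode & Self::MODE_MASK);
    }

    pub fn enabled(&self) -> bool {
        (self.0 & Self::ENABLED) != 0
    }
}
```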
`#[repr(C)]` instructs the compiler to lay the struct out exactly according to C’s rules: order, alignment, padding, size, etc. Without this, the compiler is allowed a lot more freedom when laying out a struct.
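A quick way to see the difference is to compare sizes. This sketch assumes a mainstream target where `u32` is 4-byte aligned and `u16` is 2-byte aligned; the struct names are made up.

```rust
use std::mem::{align_of, size_of};

/// Laid out in declaration order with C padding rules:
/// a at offset 0, 3 bytes of padding, b at 4, c at 8, 2 bytes of tail padding.
#[repr(C)]
#[allow(dead_code)]
pub struct CLayout {
    a: u8,
    b: u32,
    c: u16,
}

/// Default representation: the compiler may reorder the fields.
#[allow(dead_code)]
pub struct RustLayout {
    a: u8,
    b: u32,
    c: u16,
}

/// Returns (size with repr(C), size with the default representation).
pub fn sizes() -> (usize, usize) {
    (size_of::<CLayout>(), size_of::<RustLayout>())
}

/// Alignment of the repr(C) struct, which follows its most-aligned field.
pub fn c_align() -> usize {
    align_of::<CLayout>()
}
```

Under those assumptions the `repr(C)` struct takes 12 bytes. The default layout is unspecified, so the only thing worth relying on is that it will not be made larger for no reason; in practice current compilers tend to pack it more tightly.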
I didn't understand what was being said here. Was he suggesting that you call libc free using FFI, which would be fine? I understand the interviewer asked about using Rust dealloc, though. I think the FFI bit is confusing me.
> Interviewer: “What happens if you allocate memory with C’s malloc and try to free it with Rust’s dealloc, if you get a pointer to the memory from C?”
> Me: “If we do it via FFI then there’s a possibility the program may continue working (because the underlying structs share the same memory layout? right? …right?)”
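One detail that may help with the quoted question: Rust's `dealloc` is not shaped like C's `free`. It takes the `Layout` that was used for the allocation, and the pointer must have come from the same allocator. A minimal sketch of a correctly paired allocation (the function name `roundtrip` is made up):

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates `n` bytes with Rust's global allocator, writes one byte,
/// reads it back, and frees the block with the SAME layout.
pub fn roundtrip(n: usize) -> u8 {
    assert!(n > 0, "zero-sized allocations are not allowed with alloc()");
    let layout = Layout::array::<u8>(n).expect("size overflow");
    unsafe {
        let p = alloc(layout);
        if p.is_null() {
            handle_alloc_error(layout);
        }
        p.write(42);
        let value = p.read();
        // Must be paired with alloc() above and given the identical layout.
        // Handing a pointer from C's malloc to dealloc (or a pointer from
        // alloc to C's free) is undefined behavior.
        dealloc(p, layout);
        value
    }
}
```

So a pointer handed over from C should go back to C's `free` (or the library's own free function) through FFI, which is the "which would be fine" case.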
I don't know if the author used AI to write this, but if they didn't, this is the person AI agents decided to copy the writing style of.
Edit: Reddit thread somewhere in the comments here to a post from the author pretty much confirmed my suspicions, this article is heavily AI generated and plain wrong in several cases. A good reminder not to use AI slop to learn new topics, because LLMs bullshit half the time and you need to know what you're doing to spot the lies.
> “Memory oppresses me.” - Severian, The Book of the New Sun
That sort of artistic/humorous flourish isn't in character for an LLM.
“It’s not just _________. It’s _________________.”
This was in almost every response, doubling down on the user's ideas and blowing things out of proportion. Stuff like…
“It’s not just a good idea. It’s a ground up rewriting of modern day physics.”
- Your program continues running with a corrupted heap - a time bomb that will explode unpredictably later.
- You’re not just getting 64 bytes of memory. You’re entering into a complex contract with a specific allocator implementation.
- The Metadata Mismatch
- If it finds glibc’s metadata instead, the best case is an immediate crash. The worst case? Silent corruption that manifests as mysterious bugs hours later.
- Virtual Memory: The Grand Illusion
- CPU Cache Architecture: The Hidden Performance Layer
- Spoiler: it’s even messier than you might think.
He's not the only one to point out these things that LLMs (currently) tend to output, but this is one of the shorter overviews of the tells you can spot.
https://www.reddit.com/r/rust/comments/1mh7q73/comment/n6uan...
The reply to that comment is also a good explainer of why the post has such a strong LLM smell for many.
BTW that Reddit post also has replies confirming my suspicions that the technical content wasn't trustworthy, if anyone felt like I was just being snobby about the LLM writing: https://www.reddit.com/r/rust/comments/1mh7q73/comment/n6ubr...
In addition to the emoji, things that jumped out at me were the pervasive use of bullet lists with bold labels and some specific text choices like
> Note: The bash scripts in tools/ dynamically generate Rust code for specialized analysis. This keeps the main codebase clean while allowing complex experiments.
But I did just edit my post to walk it back slightly.
As a non-native English speaker, 90% of my vocabulary comes from technical books and SF and fantasy novels. And due to an education done in French, I tend to prefer slightly complicated sentence forms.
If someone uses an LLM to give their posts clarity or for spellchecking, I would applaud them. What I don’t agree with, LLM use or no, is meandering and inconsistency.
It's not a binary thing, of course, but it's definitely an LLM smell, IMO.
Seriously though: I already knew the “don’t mix allocators” rule, but I really enjoyed seeing such a careful and hands-on exploration of why it’s dangerous. Thanks for sharing it.