FilterHN

https://stdrs.dev/nightly/x86_64-unknown-linux-gnu/src/std/s...

veber-alex

26 days ago

The reason you are not seeing crashes when allocating with Rust and freeing with C (or vice versa) is that by default Rust also uses the libc allocator.

broken_broken_

26 days ago

Miri and Valgrind will usually catch this kind of issue. I did lots of work mixing C and Rust and that tripped me as well at the beginning. I wrote about it if someone is interested: https://gaultier.github.io/blog/perhaps_rust_needs_defer.htm...

amluto

26 days ago

I’m a bit confused by this writeup. It seems to me that you’re experiencing two issues:

1. An object that’s really a Rust Vec under the hood is awkward when you try to pass ownership to C. ISTM you end up solving it by creating what is, in effect, a handle, and freeing that using a custom free function. Is this really a problem with Rust FFI? This style of API is very common in native C code, and it’s a lot more flexible and future-proof than having a library user call free() directly.

2. You’re consuming the API in Rust and being sad that there’s no native defer statement. But Rust can do RAII — could you build a little wrapper struct that implements Drop?
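
A minimal sketch of the wrapper idea in point 2, under stated assumptions: `thing_create` and `thing_destroy` are hypothetical stand-ins for a C library's handle API, written in Rust here only so the example runs on its own. In real code they would be declared in an `extern "C"` block and implemented by the C library.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-in for a C library's handle API (hypothetical names). In real code these
// two functions would be declared in an `extern "C"` block and implemented in C.
pub struct RawThing {
    value: i32,
}

// Counts handles that were created but not yet destroyed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

extern "C" fn thing_create(value: i32) -> *mut RawThing {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawThing { value }))
}

extern "C" fn thing_destroy(ptr: *mut RawThing) {
    if !ptr.is_null() {
        LIVE.fetch_sub(1, Ordering::SeqCst);
        // SAFETY: ptr came from thing_create and is destroyed only once.
        drop(unsafe { Box::from_raw(ptr) });
    }
}

/// RAII wrapper: the library's own destroy function runs when this goes out of scope.
pub struct Thing {
    raw: *mut RawThing,
}

impl Thing {
    pub fn new(value: i32) -> Thing {
        Thing { raw: thing_create(value) }
    }

    pub fn value(&self) -> i32 {
        // SAFETY: raw stays valid for as long as the wrapper is alive.
        unsafe { (*self.raw).value }
    }
}

impl Drop for Thing {
    fn drop(&mut self) {
        thing_destroy(self.raw);
    }
}

pub fn live_handles() -> usize {
    LIVE.load(Ordering::SeqCst)
}

fn main() {
    {
        let t = Thing::new(7);
        println!("value = {}, live = {}", t.value(), live_handles());
    } // `t` is dropped here, which calls thing_destroy
    println!("live after scope = {}", live_handles());
}
```

The wrapper never calls Rust's `dealloc` or libc's `free` on the handle itself; it only ever hands the pointer back to the function that the owning library provides.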

Joel_Mckay

26 days ago

Valgrind sometimes has problems with slow, anonymous memory leaks inside encapsulated C libraries. For example, the session caching in modern libcurl is obfuscated fairly well from novices.

Valgrind is good to run as a sanity check, but I prefer a month under a regression test hammer to validate something isn't going to slowly frag itself before scheduled maintenance. =3

CupricTea

26 days ago

It's funny. When I first tried Rust in 2018 they were still statically linking jemalloc into every binary rustc compiled by default, and that alone very much put me off of the language for a while.

Apparently they did away with jemalloc in favor of the system allocator that same year but nonetheless when I came back to it years later I was very happy to learn of its removal.

saati

26 days ago

By 2018 it was the system malloc by default, so that does not add up.

CupricTea

26 days ago

Jemalloc was removed in November 2018 [1], and the change didn't land in stable, where the default became the system allocator, until Rust 1.32 was released in January 2019 [2].

At the time in 2018, however, I was using Linux, but I now use Windows. I just now learned that Rust used the system allocator on Windows for longer than on Linux [2], where it was still using jemalloc for the majority of 2018.

[1]: https://github.com/rust-lang/rust/pull/55238

[2]: https://blog.rust-lang.org/2019/01/17/Rust-1.32.0/

26 days ago

> that alone very much put me off of the language for a while

Why?

CupricTea

26 days ago

Jemalloc added over a megabyte to every project for only questionable gains, and it was awkward and unwieldy to remove it. While there are good reasons to use a different allocator depending on the project, Rust defaulting to this type of behavior failed a certain personal litmus test on what it wanted to be as a language in that it felt like it was fighting the system rather than integrating with it.

It also does not give a good first impression at all to newcomers to see their hello world project built in release mode take up almost 2 MiB of space. Today it's a much more (subjectively) tolerable 136 KiB on Windows (considering that Rust std is statically linked).

26 days ago

> Jemalloc added over a megabyte to every project

Right, but, so what? I can't imagine a practical reason this matters for the vast majority of situations. So either you're doing something pretty niche, or it's a purely aesthetic complaint.

mpyne

26 days ago

> or it's a purely aesthetic complaint

Those are certainly valid. There are entire Linux distros whose mere existence is as the answer to what is only an aesthetic complaint.

bigstrat2003

26 days ago

So it's wasted space if you don't need it. Being a small amount of space to waste doesn't make it reasonable to do that.

cozzyd

26 days ago

Imagine you had a bunch of rust binaries on your system instead of one. And you're running on a 2 GB emmc.

25 days ago

Okay, that's a niche scenario. I agree Rust during the statically linked jemalloc era might not be ideal in that scenario. But just because you can concoct some possible scenario in which it's not the best language to use doesn't mean you should write it off completely.

cozzyd

25 days ago

I would think Embedded Linux would be one of the most desired use cases for rust (certainly it's a place where higher level languages are often not an option due to resource usage).

estebank

24 days ago

It's a matter of what use cases are catered to by default, and what use cases are possible. For example, having debug symbols on by default makes it easier to start using the language and debug your applications, at the cost of more disk space and having to configure your project if your needs are different. I think the discussion would have more merit if the defaults weren't configurable. But of course, the choice of defaults is one that will be contentious no matter what. It should still be done following the Hippocratic Oath of "first, do no harm". And some of us need to accept that our usecases are more niche than others, including mine.

fc417fc802

26 days ago

Seems like overstepping bounds. I expect a language runtime to make use of system provided facilities wherever feasible. I also expect it to provide hooks for me to link in alternatives.

If nothing else doing otherwise is likely to cause compatibility issues for someone at some point. For example see all the problems Go had with DIY syscalls pretty much everywhere except for Linux.

There's a legitimate question of whether the kernel ABI or the libc API qualifies as the system provided facilities on a linux box. But that uncertainty only furthers my latter point about compatibility.
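
For what it's worth, stable Rust does expose such a hook today through `#[global_allocator]`. A minimal sketch, assuming a made-up `Counting` wrapper that simply forwards to `std::alloc::System` (the platform allocator) and counts calls; the names `Counting`, `ALLOCATIONS` and `allocation_count` are illustrative, not from any library:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper that forwards to the system allocator and counts calls.
pub struct Counting;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Forward to the platform allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Memory is returned to the same allocator that produced it.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every Box, Vec, String, etc. in this program now goes through `Counting`.
#[global_allocator]
static GLOBAL: Counting = Counting;

pub fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}

fn main() {
    let before = allocation_count();
    // black_box keeps the optimizer from eliding the unused allocation.
    let v = std::hint::black_box(Vec::<u64>::with_capacity(32));
    println!("allocations during with_capacity: {}", allocation_count() - before);
    drop(v);
}
```

Swapping in a different allocator crate works the same way: the static just names another type that implements `GlobalAlloc`.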

PhilipRoman

25 days ago

There are embedded Linux systems with a total image size of ~10MB (not image as in docker "image" but an actual disk image). It would be quite hard to justify the additional dependency.

25 days ago

Okay, then don't use a language with statically-linked jemalloc for those. Why is that a reason to write off the entire language for all purposes?

ainiriand

26 days ago

Sorry, but why would that be a downside in 2018?

mwkaufma

26 days ago

Lots of detail, little substance, and misleading section headers. GPT-generated red flags.

sjmulder

24 days ago

The interjected bullet point sections seem to be entirely LLM written and don't add anything, just meaningless interruption

eatonphil

26 days ago

One of the areas I wonder about this a lot is when integrating Rust code into Postgres which has its own allocator system. Mostly right now when we need to have complex data structures (non-Postgres data structures) that must live outside of the lexical scope we put them somewhere global and return a handle to the C code to reference the object. But with the upcoming support for passing an allocator to any data structure (in the Rust standard library anyway) I think this gets a lot easier?

tialaramex

26 days ago

For me the most interesting thing in Allocator is that it's allowed to say OK, you wanted 185 bytes but I only have a 256 byte allocation here, so, here is 256 bytes.

This means that e.g. a growable container type doesn't have to guess that your allocator probably loves powers of 2 and so it should try growing to 256 bytes not 185 bytes, it can ask for 185 bytes, get 256 and then pass that on to the user. Significant performance is left on the table when everybody is guessing and can't pass on what they know due to ABI limitations.

Rust containers such as Vec are already prepared to do this - for example Vec::reserve_exact does not promise you're getting exactly the capacity you asked for, it won't do the exponential growth trick because (unlike Vec::reserve) you've promised you don't want that, but it would be able to take advantage of a larger capacity provided by the allocator.
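
To make the capacity point concrete, here is a small sketch that relies only on the documented guarantee of `Vec::reserve_exact` (capacity ends up at least `len + additional`) and therefore compares with `>=` rather than `==`. The helper names `has_room_for` and `make_buffer` are made up for illustration.

```rust
/// Returns true if `v` can take `extra` more elements without reallocating.
pub fn has_room_for<T>(v: &Vec<T>, extra: usize) -> bool {
    // capacity() is never smaller than len(), so this cannot underflow.
    v.capacity() - v.len() >= extra
}

/// Builds an empty buffer with room for at least `n` bytes.
pub fn make_buffer(n: usize) -> Vec<u8> {
    let mut v = Vec::new();
    v.reserve_exact(n);
    v
}

fn main() {
    let v = make_buffer(185);
    // Documented guarantee: capacity >= len + additional. Equality is not promised.
    assert!(v.capacity() >= 185);
    println!("asked for 185, got capacity {}", v.capacity());
}
```

Code written this way keeps working whether the allocator hands back exactly 185 bytes or rounds up to a larger block.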

IshKebab

26 days ago

There's so much more information that code could give allocators but doesn't due to being stuck with ancient C APIs. Is the allocation likely to be short lived? Is speed or efficiency more important? Is it going to be accessed by multiple threads? Is it likely to grow in future?

duped

26 days ago

That seems suspect to me. If I call reserve_exact I do actually mean reserve_exact and I want .capacity() to return with the argument I passed to reserve_exact(). This is commonly done when using Vec as a fixed capacity buffer and you don't want to add another field to whatever owns it that's semantically equivalent to .capacity().

I don't really care if the memory region is past capacity * size_of::<T>(), but I do want to be able to check if .len() == .capacity() and not be surprised.

tialaramex

26 days ago

> This is commonly done when using Vec as a fixed capacity buffer and you don't want to add another field to whatever owns it that's semantically equivalent to .capacity().

The documentation for Vec already explains exactly what it's offering you, but let's explore: what exactly is the problem? You've said this is "commonly done", so doubtless you can point at examples for reference.

Suppose a Goose is 40 bytes in size, and we aim to store, say, 4 of them. For some reason we decide to Vec::new() and then Vec::reserve_exact(..., 4) rather than more naturally (but with the same effect) asking Vec::with_capacity(4). But alas, the allocator underpinning our system has 128 or 256 byte blocks to give; 4x40 = 160 is too big for 128, so a 256 byte block is allocated and (a hypothetical future) Vec sets capacity to 6 anyway.

Now, what disaster awaits in the common code you're talking about? Capacity is 6 and... there's capacity for 6 entries instead of 4

duped

26 days ago

The condescension isn't appropriate here. I'm talking about using `Vec` as a convenient temporary storage without additional bookkeeping on top if the capacity() is meaningful. Like you said, Rust doesn't guarantee that because `reserve_exact` is not `reserve_exact`. In C++, the pattern is to resize() and shrink_to_fit(), which is implementation defined but when it's defined to do what it says, you can rely on it.

> Now, what disaster awaits in the common code you're talking about? Capacity is 6 and... there's capacity for 6 entries instead of 4

The capacity was expected to be 4 and not 6, which may be a logical error in code that requires it to be. If this wasn't a problem the docs wouldn't call it out as a potential problem.

tialaramex

26 days ago

The condescension you've detected is because I doubt your main premise - that what you've described is "common" and so the defined behaviour will have a significant negative outcome. It's no surprise to me that you can offer no evidence for that premise whatsoever and instead just retreat to insisting you were correct anyway.

The resize + shrink_to_fit incantation sounds to me a lot like one of those "Sprinkle the volatile keyword until it works" ritualistic C++ practices not based in any facts.

fc417fc802

25 days ago

> the pattern is to resize() and shrink_to_fit(),

As someone who primarily writes C++ I would not expect that to work. I mean it's great if it does I guess (I don't really see the point?) but that would honestly surprise me.

I would _always_ expect to use >= for capacity comparisons and I don't understand what the downside would be. The entire point of these data structures is that they manage the memory for you. If you need precise control over memory layout then these are the wrong tools for the job.

26 days ago

>But with the upcoming support for passing an allocator to any data structure (in the Rust standard library anyway) I think this gets a lot easier?

Yes and no. Even within libstd, some things require A=Global, e.g. `std::io::Read::read_to_end(&mut Vec<u8>)` will only accept `Vec<u8, Global>`. It cannot be changed to work with `Vec<u8, A>` because that change would make it not dyn-compatible (née "object-safe").

And as you said, it will cut you off from much of the third-party crates ecosystem that also assumes A=Global.

But if the subset of libstd you need supports A != Global, then yes, it's helpful.
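
A short sketch of why the dyn-compatibility constraint matters in practice: today `read_to_end` can be called through a `dyn Read` trait object, and that only works because the method is not generic. The function name `slurp` is made up for illustration; everything else is plain stable std.

```rust
use std::io::{self, Cursor, Read};

/// Takes a trait object, so every method called here must be dyn-compatible.
pub fn slurp(reader: &mut dyn Read) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new(); // Vec<u8> with the default allocator
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    // Cursor is just a convenient in-memory reader for the demo.
    let mut source = Cursor::new(b"hello".to_vec());
    let bytes = slurp(&mut source)?;
    println!("{} bytes read", bytes.len());
    Ok(())
}
```

A version of `read_to_end` with a generic allocator parameter could not be dispatched through the vtable like this, which is the obstacle described above.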

RGBCube

26 days ago

If the `A` generic parameters were changed to be ?Sized, it would still be possible to make `read_to_end` support custom allocators by changing the signature to `read_to_end(&mut dyn Vec<u8, Allocator>)`

Not sure if that is a breaking change though, it probably is because of a small detail, I'm not a rustc dev.

26 days ago

First of all, `dyn Vec` is impossible. Vec is a concrete type, not a trait. I assume you meant `Vec<u8, dyn Allocator>`.

Second, no a `&mut Vec<u8, A>` is not convertible to `&mut Vec<u8, dyn Allocator>`. This kind of unsized coercion cannot work because it'll require a whole different Vec to be constructed, one which has an `allocator: dyn Allocator` field (which is unsized, and thus makes the Vec unsized) instead of an `allocator: A` field. The unsized coercion you're thinking of is for converting trait object references to unsized trait object references; here we're talking about a field behind a reference.

RGBCube

25 days ago

Sorry, I meant `&Vec<T, dyn Allocator>`.

And no, it is possible. Here is an example that does it with BufReader, which has T: ?Sized and uses it as a field: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Though it comes with a caveat that you can't take self by value, which is perfectly fine for this use case & is what a normal allocator-aware language does anyway.

20 days ago

I stand corrected. I didn't know rustc supported such a coercion automatically. Now I see it is documented in CoerceUnsized + Unsize.

That said, other than the problem of this being a breaking API change for Read::read_to_end, another problem is that Vec's layout is { RawVec, len } and the allocator is inside RawVec, so the allocator is not the last field of Vec, which is required for structs to contain unsized fields. It would require reordering Vec's fields to { len, RawVec } which may not be something libstd wants to do (since it'll make data ptr access have an offset from Vec ptr), or to inline RawVec into Vec as { ptr, cap, len, allocator }.
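
A toy illustration of the coercion and the last-field requirement, on stable Rust. The trait `Alloc`, the type `Bump` and the container `Buf` are all hypothetical stand-ins (the real `Allocator` trait is unstable, and this is not how `Vec` is laid out); the point is only that `&Buf<Bump>` coerces to `&Buf<dyn Alloc>` when the possibly unsized field comes last.

```rust
/// Hypothetical stand-in for an allocator trait (the real `Allocator` is unstable).
pub trait Alloc {
    fn name(&self) -> &'static str;
}

pub struct Bump;

impl Alloc for Bump {
    fn name(&self) -> &'static str {
        "bump"
    }
}

/// Toy container: the possibly unsized field has to come last.
pub struct Buf<A: ?Sized> {
    pub len: usize,
    pub alloc: A,
}

/// Accepts any `Buf` through a reference, whatever its concrete allocator type.
pub fn describe(buf: &Buf<dyn Alloc>) -> String {
    format!("{} elements via {}", buf.len, buf.alloc.name())
}

fn main() {
    let concrete = Buf { len: 3, alloc: Bump };
    // Unsized coercion: &Buf<Bump> -> &Buf<dyn Alloc>
    println!("{}", describe(&concrete));
}
```

Moving `alloc` above `len` in `Buf` makes the struct definition itself fail to compile, which mirrors the field-ordering concern raised for `Vec` and `RawVec`.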

steveklabnik

26 days ago

I’m not sure what those two things have to do with each other, though I did just wake up. The only thing the new allocator stuff would give you is the ability to allocate a standard library data structure with the Postgres allocator. Scoping and handles and such wouldn’t change, and using your own data structures wouldn’t change.

It’s also very possible I’m missing something!

eatonphil

26 days ago

> The only thing the new allocator stuff would give you is the ability to allocate a standard library data structure with the Postgres allocator.

Yeah no this is basically all I'm saying. I'm excited for this.

steveklabnik

26 days ago

Ah yeah, well it's gonna be a good feature for sure when it ships!

26 days ago

Allocating memory with C and freeing it with Rust is silly. If you want to free a C-allocated pointer in Rust, just have Rust call back in to C. Expecting that allocators work identically in both runtimes is unreasonable and borderline insane. Heck, I wouldn't expect allocators to work the same even across releases of libc from the same vendor (or across releases of Rust's std).

rectang

26 days ago

I don't agree with your contemptuous framing. It's incorrect, and per the post's author, "dangerous" — but depending on your background it's not "silly" or "borderline insane". It's just naive, and writing a slab allocator as an exercise or making honest explorations like in this blog post will help cure the naivete.

25 days ago

It’s undefined behavior. It will never be stable. Investigating every permutation of zero-utility undefined behavior in the universe is borderline insane. Will the author next investigate exactly how a 2002 Fiat becomes inoperable after a head on collision with a 2025 Volkswagen? These are all deep dives into infinite chaos.

rectang

25 days ago

I agree that we shouldn't mix allocators, and so does the post's author. Can you put your technical arguments into terms which aren't so dismissive of people's honest efforts to learn? We ought to be all on the same page. You could affirm and refine the post's conclusions and bring us together, rather than ridicule someone for ever entertaining a notion you consider "insane".

Here's what I got when I asked ChatGPT to rewrite your first comment to be as constructive as possible:

"Totally agree — relying on C and Rust to interoperate at the allocator level is risky at best. Allocating in C and freeing in Rust (or vice versa) assumes a level of compatibility that just isn’t guaranteed. Even within a single ecosystem, allocator behavior can change across versions — whether it’s different versions of libc or updates to Rust’s std. So expecting consistent behavior across language boundaries is, at the very least, unreliable. If you need to free a C-allocated pointer, the safest and cleanest approach is to call back into C to do it."

That's not a drop-in substitute for your original comment and "totally agree" is over-the-top cloying in the usual ChatGPT obsequious way, but I still think it's helpful in suggesting alternative framings.

benmmurphy

26 days ago

usually when interfacing with a library written in C, the library will export functions for object destruction. it makes sense for that to be part of the interface instead of using the system allocator, because it also gives the library freedom to do extra work during object destruction. if you have simple objects then it's possible to just use the system allocator, but if you have graphs or trees of objects then it's necessary to have a custom destroy function, and there is always some risk that in the future you will need to allocate more complex data structures that require multiple allocations.

26 days ago

The article is about how and why mixing allocators fails, not if it fails or how to fix the problem.

phkahler

26 days ago

Something I'd like to know for mixing Rust and C. I know it's possible to access a struct from both C and Rust code and have seen examples. But those all use accessor functions on the Rust side rather than accessing the members directly. Is it possible to define a structure in one of the languages and then via some wrapper or definitions be able to access it idiomatically in the other language? Can you point to some blog or documentation explaining how?

GrantMoyer

26 days ago

Rust bindgen[1] will automatically generate native Rust structs (and unions) from C headers where possible. Note that c_int, c_char, etc. are just aliases for the corresponding native Rust types.

However, not all C constructs have idiomatic Rust equivalents. For example, bitfields don't exist in Rust, and unlike Rust enums, C enums can have any value of the underlying type. And for ABI reasons, it's very common in C APIs to use a pointer to an opaque type paired with what are effectively accessor functions and methods, so mapping them to accessors and methods on a "Handle" type in Rust is often the most idiomatic Rust representation of the C interface.

[1]: https://github.com/rust-lang/rust-bindgen
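
As a small sketch of the enum mismatch, assuming a hypothetical C declaration `enum color { RED = 0, GREEN = 1, BLUE = 2 };` (this is hand-written, not bindgen output): the value arrives as a plain `c_int` and is validated before it becomes a Rust enum, since a Rust enum must never hold a value outside its variants.

```rust
use std::convert::TryFrom;
use std::ffi::c_int;

/// Rust-side view of the hypothetical C enum described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl TryFrom<c_int> for Color {
    /// The unrecognized raw value is handed back to the caller.
    type Error = c_int;

    fn try_from(raw: c_int) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            other => Err(other),
        }
    }
}

fn main() {
    let from_c: c_int = 7; // pretend this value came back from a C call
    match Color::try_from(from_c) {
        Ok(c) => println!("valid color: {:?}", c),
        Err(raw) => println!("C returned an unknown value: {}", raw),
    }
}
```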

oconnor663

26 days ago

Here's one of my recorded talks going through an example of using a `#[repr(C)]` struct (in this case one that's auto-generated by Bindgen): https://youtu.be/LLAUzghhNHg?t=2168

26 days ago

I don't know what examples you've been seeing. The interop structs are just regular Rust structs with the `#[repr(C)]` attribute applied to them, to ensure that the Rust compiler lays the struct out exactly as the C compiler for that target ABI would. Rust code can access their fields just fine. There's no strict need for accessor functions.

stouset

26 days ago

And vice versa. Rust code and C code can both operate on each other’s structs natively.

`#[repr(C)]` instructs the compiler to lay the struct out exactly according to C’s rules: order, alignment, padding, size, etc. Without this, the compiler is allowed a lot more freedom when laying out a struct.

https://doc.rust-lang.org/nomicon/other-reprs.html
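
A minimal sketch of direct field access, assuming a hypothetical C declaration `struct header { uint8_t tag; uint32_t length; };`. The names `Header` and `header_grow` are made up for illustration, and the size and alignment noted in the comments hold on mainstream targets where `uint32_t` is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

/// Laid out like the hypothetical C struct: 1 byte tag, 3 bytes padding, 4 bytes length.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Header {
    pub tag: u8,
    pub length: u32,
}

/// Has the C calling convention, so it could be handed to C code; it touches
/// the fields directly, with no accessor functions involved.
///
/// # Safety
/// `h` must be null or a valid, exclusive pointer to a `Header`.
pub unsafe extern "C" fn header_grow(h: *mut Header, extra: u32) {
    if !h.is_null() {
        unsafe { (*h).length += extra };
    }
}

fn main() {
    let mut h = Header { tag: 1, length: 10 };
    h.length += 1; // plain field access on the Rust side
    unsafe { header_grow(&mut h, 5) };
    println!(
        "{:?}, size = {}, align = {}",
        h,
        size_of::<Header>(),
        align_of::<Header>()
    );
}
```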

pmalynin

26 days ago

Like, repr(C)?

broken_broken_

26 days ago

I wrote a blog article about exactly this, using a C++ class from C++, C and Rust: https://gaultier.github.io/blog/rust_c++_interop_trick.html

IshKebab

26 days ago

It's idiomatic to access struct fields directly in both languages. What more do you want?

commandersaki

26 days ago

Me: “If we do it via FFI then there’s a possibility the program may continue working (because the underlying structs share the same memory layout? right? …right?)”

I didn't understand what was being said here. Was he suggesting that you call libc free using FFI, which would be fine? I understand the interviewer asked about using Rust dealloc, though. I think the FFI bit is confusing me.

tracker1

26 days ago

Interesting read... and definitely a good base of knowledge to have, especially if you're working in transitional or mixed codebases.

sesm

26 days ago

Section named "The Interview Question That Started Everything" doesn't contain the interview question.

hyperbrainer

26 days ago

That's the first thing on the page.

> Interviewer: “What happens if you allocate memory with C’s malloc and try to free it with Rust’s dealloc, if you get a pointer to the memory from C?”

> Me: “If we do it via FFI then there’s a possibility the program may continue working (because the underlying structs share the same memory layout? right? …right?)”

sesm

26 days ago

That's fair. Personally, I've skipped that entire pre-section thinking it's a long quote from some book.

PoignardAzur

26 days ago

It is, but yeah, the entire article's formatting is pretty weird.

jeroenhd

26 days ago

The entire blog post feels formatted like AI output to me. Repeated checklists with restated points, tables and full blocks of code spread across the page in a very specific way.

I don't know if the author used AI to write this, but if they didn't, this is the person AI agents decided to copy the writing style of.

Edit: A Reddit thread linked somewhere in the comments here, with a post from the author, pretty much confirmed my suspicions: this article is heavily AI generated and plain wrong in several cases. A good reminder not to use AI slop to learn new topics, because LLMs bullshit half the time and you need to know what you're doing to spot the lies.

ryanf

26 days ago

This article looked interesting, but I bounced off it because the author appears to have made heavy use of an LLM to generate the text. How can I trust that the content is worth reading if a person didn't care enough to write it themselves?

rectang

26 days ago

I find it hard to believe that an LLM would have come up with this quote to start the article:

> “Memory oppresses me.” - Severian, The Book of the New Sun

That sort of artistic/humourous flourish isn't in character for an LLM.

jml7c5

26 days ago

It looks like a mix of LLM and human-written content. The (human) author would have been the one who chose to put that quote there.

gblargg

26 days ago

Which is even worse. It's like mixing broken glass with food. Why even waste food if it's going to be inedible anyway?

eviks

26 days ago

But it's easy to believe that this is one of the few things the author added. Doesn't have to be 0% or 100%

zem

26 days ago

it sounds nothing like AI to me! or AI has advanced to the point where it is hard to tell - e.g. I wouldn't expect a sentence like "You’re not just getting 64 bytes of memory. You’re entering into a complex contract with a specific allocator implementation." from one.

pests

26 days ago

While I usually hate all the accusations of writings being LLM generated, I find your example a bit odd, as that phrasing is very typical of ChatGPT, especially when it was glazing everyone after that one update they had to revert.

“It’s not just _________. It’s _________________.”

This was in almost every response, doubling down on the user's ideas and blowing things out of proportion. Stuff like…

“It’s not just a good idea. It’s a ground up rewriting of modern day physics.”

rocky_raccoon

26 days ago

I picked up on it very quickly as well. Here are some more phrases that match that same LLM pattern. Sure, you could argue that someone actually writes like this, but after a while, it becomes excessive.

- Your program continues running with a corrupted heap - a time bomb that will explode unpredictably later.

- You’re not just getting 64 bytes of memory. You’re entering into a complex contract with a specific allocator implementation.

- The Metadata Mismatch

- If it finds glibc’s metadata instead, the best case is an immediate crash. The worst case? Silent corruption that manifests as mysterious bugs hours later.

- Virtual Memory: The Grand Illusion

- CPU Cache Architecture: The Hidden Performance Layer

- Spoiler: it’s even messier than you might think.

zem

26 days ago

huh, interesting, I guess I haven't read enough of it to pick up on the patterns

ziml77

26 days ago

Atomic Shrimp has an aside in a recent video about how to identify AI writing. It's worth a look https://youtu.be/VeD9dUUFl-E?t=668

He's not the only one to point out these things that LLMs (currently) tend to output, but this is one of the shorter overviews of the tells you can spot.

stevenhuang

26 days ago

E.g. the "not x, but y" slop leaderboard

TechDebtDevin

26 days ago

Do you see emojis in tables/code now and assume the person is using an LLM? I don't really see it.

https://www.reddit.com/r/rust/comments/1mh7q73/comment/n6uan...

shiftingleft

26 days ago

The author admits to it.

The reply to that comment is also a good explainer of why the post has such a strong LLM smell for many.

ryanf

26 days ago

Yeah, I completely agree with that reply, thanks for the link.

BTW that Reddit post also has replies confirming my suspicions that the technical content wasn't trustworthy, if anyone felt like I was just being snobby about the LLM writing: https://www.reddit.com/r/rust/comments/1mh7q73/comment/n6ubr...

ryanf

26 days ago

Maybe I'm too paranoid! If it's not LLM then I don't think it's a very well-organized post though.

In addition to the emoji, things that jumped out at me were the pervasive use of bullet lists with bold labels and some specific text choices like

> Note: The bash scripts in tools/ dynamically generate Rust code for specialized analysis. This keeps the main codebase clean while allowing complex experiments.

But I did just edit my post to walk it back slightly.

skydhash

26 days ago

Not TFA’s author

As a non-native English speaker, 90% of my vocabulary comes from technical books and SF and fantasy novels. And due to an education done in French, I tend to prefer slightly complicated sentence forms.

If someone uses an LLM to give their posts clarity or for spellchecking, I would applaud them. What I don’t agree with, LLM use or no, is meandering and inconsistency.

OmarAssadi

26 days ago

Personally, it is one of the flags, yeah. It's been a while since I've tried ChatGPT or some of the others, but the structure and particular usage felt a lot like what I'd have gotten out of deepseek.

It's not a binary thing, of course, but it's definitely an LLM smell, IMO.

mvieira38

26 days ago

I mean, are we supposed not to? This doesn't read like a blog at all, it even has the dreaded "Key Takeaways" end section... The content is good and seems genuinely researched, but the text looks "AI enhanced", that's all

jokoon

26 days ago

Any insight on the quantity of paid Rust jobs out there?