Rusty.hpp: A Borrow Checker and Memory Ownership System for C++20
124 points | 13 days ago | 12 comments | github.com | HN
macgyverismo
13 days ago
This borrow checker runs at runtime, which I find not as interesting. Everything starts to look a lot like std::unique_ptr, which I think is mostly unneeded as it adds pointer indirection.

Could someone explain to me when one would use this? Is it for educational purposes perhaps?

jandrewrogers
13 days ago
I don't think it is intended to be used in a real system, this was more of an experiment to see what was possible. C++ as a language isn't well-suited to supporting a compile-time borrow checker. The difficulty of retrofitting C++20 modules to the language is probably just a glimmer of the pain that would be involved in making a borrow checker work.

There is a place for runtime borrow checking. Some safe cases in well-designed code are intrinsically un-checkable at compile-time. C++ is pretty amenable to addressing these cases using the type system to dynamically guarantee that references through a unique_ptr-like object are safe at the point of dereference. Much of what the borrow checker does at compile-time could potentially be done at runtime with the caveat that it has an overhead.
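
To make the "unique_ptr-like object that is checked at the point of dereference" idea concrete, here is a minimal sketch. The names `BorrowCell`, `Ref`, and `RefMut` are made up for illustration and are not rusty.hpp's API; the check is a plain counter, so this is single-threaded only.

```cpp
#include <stdexcept>
#include <utility>

// Hypothetical RefCell-like wrapper: many shared borrows OR one exclusive
// borrow, enforced by a counter that is checked at runtime. Not thread-safe.
template <typename T>
class BorrowCell {
public:
    explicit BorrowCell(T value) : value_(std::move(value)) {}

    // RAII guard for a shared (read-only) borrow.
    class Ref {
    public:
        explicit Ref(BorrowCell& cell) : cell_(cell) {
            if (cell_.state_ < 0) throw std::logic_error("already mutably borrowed");
            ++cell_.state_;
        }
        ~Ref() { --cell_.state_; }
        Ref(const Ref&) = delete;
        Ref& operator=(const Ref&) = delete;
        const T& operator*() const { return cell_.value_; }
    private:
        BorrowCell& cell_;
    };

    // RAII guard for an exclusive (read-write) borrow.
    class RefMut {
    public:
        explicit RefMut(BorrowCell& cell) : cell_(cell) {
            if (cell_.state_ != 0) throw std::logic_error("already borrowed");
            cell_.state_ = -1;
        }
        ~RefMut() { cell_.state_ = 0; }
        RefMut(const RefMut&) = delete;
        RefMut& operator=(const RefMut&) = delete;
        T& operator*() const { return cell_.value_; }
    private:
        BorrowCell& cell_;
    };

    Ref borrow() { return Ref(*this); }
    RefMut borrow_mut() { return RefMut(*this); }

private:
    T value_;
    int state_ = 0;  // > 0: number of shared borrows, -1: exclusive, 0: free
};
```

The overhead mentioned above shows up here as the extra counter and the branch on every borrow; a violation becomes an exception at runtime instead of a compile error.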

This has more than a passing resemblance to how deadlock-free locking systems work. They don't actually prevent the possibility of deadlocks, as that may not be feasible, but they can detect deadlock conditions and automatically edit/repair the execution graph to eliminate the deadlock instance. If a deadlock occurs in a database and no one notices, did it really happen?

jmax01
13 days ago
Hey, I am the author of this. I made this mostly for the purpose of experimenting, playing around and trying things out, rather than actually using it for production projects. Making a proper compile time checker is pretty complicated(possibly impossible) without actually getting into the compiler; this just intends to emulate that behavior to some extent and have a similar interface. "educational purposes" -> well, kinda. I had some free time and had an interesting idea, perhaps.
38
13 days ago
> pretty complicated(possibly impossible)

Rust does it at compile time, so why can't C++? To me this detail completely kills the usefulness of this project.

steveklabnik
13 days ago
C++ cannot because it does not have the necessary information present in its syntax. It’s really that simple. C++ could add such syntax, but outside of what Circle is doing, I’m not aware of any real proposal to add it.

Also, Google (more specifically, the Chrome folks) tried to make it work via templates, but found that it was not possible. There’s a limit to template magic, even.

arc619
13 days ago
Although it's not as extensive as Rust's lifetime management, Nim manages to infer lifetimes without specific syntax, so is it really a syntax issue? As you say, though, C++ template magic definitely has its limits.
steveklabnik
13 days ago
Nim has a garbage collector.

That said, you're right on some level that it's truly semantics that matter, not syntax, but you need syntax to control the semantics.

arc619
13 days ago
Nim is stack allocated unless you specifically mark a type as a reference, and "does not use classical GC algorithms anymore but is based on destructors and move semantics": https://nim-lang.org/docs/destructors.html

Where Rust won't compile when a lifetime can't be determined, IIRC Nim's static analysis will make a copy (and tell you), so it's more as a performance optimisation than for correctness.

Regardless of the details and extent of the borrow checking, however, it shows that it's possible in principle to infer lifetimes without explicit annotation. So, perhaps C++ could support it.

As you say, it's the semantics of the syntax that matter. I'm not familiar with C++'s compiler internals though so it could be impractical.

steveklabnik
12 days ago
I did not hear that Nim made ORC the default, thanks for that!

I still think that my overall point stands: sure, you can treat this as an optimization pass, but that kind of overhead isn't acceptable in the C++/Rust world. And syntax is how you communicate programmer intent, to resolve the sorts of ambiguous cases described in some other comments here.

I am again reminded of escape analysis https://steveklabnik.com/writing/borrow-checking-escape-anal...

aw1621107
12 days ago
> Where Rust won't compile when a lifetime can't be determined, IIRC Nim's static analysis will make a copy (and tell you), so it's more as a performance optimisation than for correctness.

Wait, how does that work? For example, take the following Rust function with insufficient lifetime specifiers:

    pub fn lt(x: &i32, y: &i32) -> &i32 {
        if x < y { x } else { y }
    }
You're saying Nim will change one/all of those references to copies and will also emit warnings saying it did that?
j-james
12 days ago
It will not emit warnings saying it did that. The static analysis is not very transparent. (If you can get the right incantation of flags working to do so and it works, let me know! The last time I did that it was quite bugged.)

Writing an equivalent program is a bit weird because: 1) Nim does not distinguish between owned and borrowed types in the parameters (except wrt. lent which is bugged and only for optimizations), 2) Nim copies all structures smaller than $THRESHOLD regardless (the threshold is only slightly larger than a pointer but definitely includes all integer types - it's somewhere in the manual) and 3) similarly, not having a way to explicitly return borrows cuts out much of the complexity of lifetimes regardless, since it'll just fall back on reference counting. The TL;DR here though is no, unless I'm mistaken, Nim will fall back on reference counting here (were points 1 and 2 changed).

For clarity as to Nim's memory model: it can be thought of as ownership-optimized reference counting. It's basically the same model as Koka (a research language from Microsoft). If you want to learn more about it, because it is very neat and an exceptionally good tradeoff between performance/ease of use/determinism IMO, I would suggest reading the papers on Perseus as the Nim implementation is not very well-documented. (IIRC the main difference between Koka and Nim's implementation is that Nim frees at the end of scope while Koka frees at the point of last use.)

aw1621107
10 days ago
Oh, that's interesting. I think not distinguishing between owned and borrowed types clears things up for me; it makes a lot more sense for copying to be an optimization here if reference-ness is not (directly?) exposed to the programmer.

Thanks for the explanation and the reading suggestions! I'll see about taking a look.

arc619
12 days ago
> It will not emit warnings saying it did that.

You're right. I was sure I read that it would announce when it does a copy over a sink but now I look for it I can't find it!

> The static analysis is not very transparent.

There is '--expandArc' which shows the compile time transformations performed but that's a bit more in depth.

gpderetta
13 days ago
I'm pretty sure you could embed a language with lifetimes in a DSL built with C++ templates. You wouldn't want to use it beyond toy programs though.
steveklabnik
13 days ago
Maybe, but nobody has demonstrated that it's actually possible. And even then, toys are fun, but still, at the end of the day, not good enough.
gpderetta
13 days ago
Of course, it would be completely impractical. Nobody has demonstrated it because they were interested in a practical solution.
jmax01
13 days ago
Well, that's how the current C++ compilers and standard are. There is a limit to what a header/library can do.
saurik
12 days ago
> pretty complicated(possibly impossible) without actually getting into the compiler
ramon156
13 days ago
I think it's more of a "can I do this" project than a product that can be used in prod.
Ygg2
13 days ago
> Could someone explain to me when one would use this?

For memes, obviously.

Me: I want Rust!

Tech lead: We have Rust at home!

Rust at home: rusty.hpp

CaptainOfCoit
13 days ago
> Could someone explain to me when one would use this? Is it for educational purposes perhaps?

The goal/why is, as almost always, explained in the README:

> rusty.hpp as the time or writing this is a very experimental thing. Its primary purpose is to experiment and test out different coding styles and exploring a different than usual C++ workspace.

TLDR: it's an experiment

cogman10
13 days ago
> Everything starts to look a lot like std::unique_ptr which I think is mostly unneeded as it ads pointer indirection.

Interesting, why is this? I would have assumed the compiler could have optimized away that indirection.

[1] https://godbolt.org/z/9Pqqqz5a7

steveklabnik
12 days ago
zozbot234
13 days ago
Rust does "borrow checking at runtime" with RefCell<>.
38
13 days ago
Right, but RefCell is optional. If you don't use that, you get checking at compile time.
cherryteastain
13 days ago
What's the point of adding Option<T>, Result<T,E> and Rc/Arc when std::optional, std::expected and std::shared_ptr exist?
tialaramex
13 days ago
std::optional is a poor shadow of Option. It's what happens when C++ programmers who've seen a Maybe type in a window (years ago by the way, this isn't inspired by Rust, it was just stuck in the standardization process until C++17) but are starved of proper types and basic features like pattern matching try to imitate what they saw.

As a result, for example, std::optional<T&> doesn't exist, because to a C++ programmer it seems as though this might have assign-through semantics (!), and so WG21 decided to kick this can down the road. C++26 might get std::optional<T&>.

Calavar
13 days ago
The lack of support for optional<T&> is not an issue at all in my opinion. The actual issue is that std::optional is not a monadic type in the vein of Rust's Option or Haskell's Maybe. So really, what does it buy you over std::pair<T, bool>? Except being unsafe by default since it allows you to access an unconstructed T. Basic monadic operations don't arrive for std::optional until C++23, which is an unforced error. They should have been there from the beginning.
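
For anyone stuck before C++23, a rough idea of what one of those monadic operations looks like as a free function. `map_optional` is a hypothetical helper name, not a standard API; it is only meant to approximate what C++23's `std::optional::transform` does.

```cpp
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for C++23's std::optional::transform:
// applies f to the contained value, or propagates the empty state.
template <typename T, typename F>
auto map_optional(const std::optional<T>& opt, F&& f)
    -> std::optional<std::decay_t<std::invoke_result_t<F, const T&>>> {
    if (opt.has_value()) {
        return std::forward<F>(f)(*opt);
    }
    return std::nullopt;
}
```

Something like `map_optional(maybe_int, [](int x) { return x + 1; })` then never touches an unconstructed T, which is the safety property a bare `std::pair<T, bool>` does not give you.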
AHTERIX5000
13 days ago
Funny how this just keeps happening in the C++ world. I've seen ten different promise/task frameworks successfully used in production with neat APIs but somehow std::future is still just a toy. Even std::expected was released without the usual map/then.
mgaunard
13 days ago
std::optional is based on boost::optional, which was written in 2003, before any sort of lambdas made monadic operations usable.

The main concern with that component was ensuring we can allocate stack storage for an object that may or may not be initialized.

The reference is easily achievable by using T* so is of minimal value, but also poses some more semantic problems since a reference is not copyable while an optional is.

tialaramex
12 days ago
I actually don't care that much about the monadic functions.

For me the important use case is pattern matching, which C++ doesn't yet have. Pattern matching really changes how you see the entire language.

mgaunard
12 days ago
C++ has pattern matching through overloading.
tialaramex
10 days ago
How do you figure?
mgaunard
10 days ago
I don't understand the question.

Here is the first example I found on Google if that helps you understand.

    std::variant<Fluid, LightItem, HeavyItem, FragileItem> package;

    std::visit(overload{
        [](Fluid& )       { std::cout << "fluid\n"; },
        [](LightItem& )   { std::cout << "light item\n"; },
        [](HeavyItem& )   { std::cout << "heavy item\n"; },
        [](FragileItem& ) { std::cout << "fragile\n"; }
    }, package);
tialaramex
9 days ago
But that's not really even a pattern match? Here's what a pattern match looks like: [This is from day 10 of last year's Advent of Code.]

            match (state, pipe) {
                (State::None, Pipe::Ground) => {
                    if inside {
                        n += 1;
                    }
                }
                (State::None, Pipe::Vert) => {
                    inside = !inside;
                }
                (State::None, Pipe::Se) => {
                    state = State::South;
                }
                (State::None, Pipe::Ne) => {
                    state = State::North;
                }

                // Horizontal lines make no difference to anything
                (State::North | State::South, Pipe::Horiz) => {}

                // U-turns
                (State::South, Pipe::Sw) | (State::North, Pipe::Nw) => {
                    state = State::None;
                }

                // Form a vertical line
                (State::South, Pipe::Nw) | (State::North, Pipe::Sw) => {
                    inside = !inside;
                    state = State::None;
                }

                _ => {
                    panic!("Unexpected sequence {state:?} {pipe:?}");
                }
            }
mgaunard
8 days ago
This is the exact same thing except you're visiting two arguments at a time.

Guess what, the same syntax I gave supports exactly that as well.
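
For reference, a sketch of what visiting two variants at once looks like with that approach. The `overloaded` helper is the common idiom (it is not in the standard library), and the `State`/`Pipe` alternatives are hypothetical stand-ins for a few of the Rust enum variants, not a full port of the match above.

```cpp
#include <string>
#include <variant>

// The usual "overloaded" helper; it is not part of the standard library.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// Hypothetical alternatives standing in for a (state, pipe) pair.
struct None {};
struct North {};
struct South {};
struct Vert {};
struct Horiz {};
using State = std::variant<None, North, South>;
using Pipe = std::variant<Vert, Horiz>;

std::string describe(const State& state, const Pipe& pipe) {
    // std::visit dispatches on the active alternative of *both* variants.
    return std::visit(overloaded{
        [](None, Vert) -> std::string { return "toggle inside"; },
        [](North, Horiz) -> std::string { return "no change"; },
        [](South, Horiz) -> std::string { return "no change"; },
        // Generic catch-all, playing the role of Rust's `_` arm.
        [](auto, auto) -> std::string { return "unexpected"; }
    }, state, pipe);
}
```

Note that the two `Horiz` cases need separate lambdas here, whereas the Rust version merges them with `|`; whether that counts as "the exact same thing" is the point under dispute.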

meindnoch
11 days ago
Easy. When you want std::optional<T&> just use T*.
jeroenhd
13 days ago
The Option type seems to have various standard Rust methods like expect() implemented that I don't believe std::optional has.

I haven't checked recent C++ standards, but I don't believe you can use partial classes/extensions in C++ like some other OO languages to add these methods to a native type. Many helper functions commonly used in Rust also only seem to exist in C++23, which not every project can be compiled under yet.

In normal C++ code, the native types would probably be better to use, but if you're going full Rust style code, you may as well use these new types.

blegr
13 days ago
> The Option type seems to have various standard Rust methods like expect()

Isn't that value()?

tialaramex
13 days ago
pub fn expect(self, msg: &str) -> T

So that says it's a method (its first parameter is the type itself, but named self rather than as a normal parameter so we can use method syntax instead of calling the function Option::expect) but it also takes an immutable reference to a string slice.

That second parameter, msg, is the text for a diagnostic if/ when you're wrong.

So, in a sense it's like value() but the diagnostic text means, when I was wrong...

  let goose = found.expect("Our goose finder should always find a goose");
... I get a diagnostic saying that the problem is with "Our goose finder should always find a goose". Huh. I think we know where to start troubleshooting.
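
A rough C++ counterpart, as a sketch only: `expect` below is a hypothetical free function, not something std::optional provides, and it throws `std::runtime_error` where Rust would panic.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper, not a standard API: like std::optional::value(),
// but the caller supplies the diagnostic text, as with Rust's expect().
// It throws rather than panics, since C++ has no direct panic equivalent.
template <typename T>
T expect(std::optional<T> opt, const std::string& msg) {
    if (!opt.has_value()) {
        throw std::runtime_error(msg);
    }
    return std::move(*opt);
}
```

Compared with `value()`, the only difference is that the exception carries the caller's message instead of a generic "bad optional access".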
blegr
13 days ago
Right, but that's redundant with the stack trace. It's not actually helpful to run a big program I don't know very well and panic with a single "your goose isn't cromulent!" message from a call 20 levels deep.

In your example, it's likely that the person who sees this message won't have enough context to understand it; it's more like a debugging assert. Since you'll need a debugger and a breakpoint anyway, the message isn't very helpful.

tialaramex
13 days ago
The nature of expect is that this is a bug. The person who wrote this code was wrong, they expected that this optional has Some value but it does not.

In most cases then, if you don't know this code very well, that's fine because it's not your bug. In the edge case that you just got handed a pile of poorly documented code somebody else wrote, perhaps over several years, well, at least you know what they thought is supposed to happen here and that they're wrong.

And no, I don't find it better to be told "It broke, break out a debugger and try to reproduce the fault". With this text we can revisit the Goose wrangling code and maybe, now that we're staring at it knowing a real customer saw this fault, we are inspired and realise that sometimes it won't find a Goose, then decide what to do about that.

filleduchaos
13 days ago
Maybe it's just me but a note from the developer stating why it's important that some particular value be present is exactly the sort of help I would like when looking at a call stack that's dozens of levels deep. Especially considering that a panic terminates execution - I very much would like to know what was so critical that the program had to preemptively crash up front and not after pawing through code and docs.

I think it's pretty odd to use a quick example someone rattled off on a web forum to explain a function's behaviour as evidence of its usefulness or lack thereof, as if the only thing a person could possibly write in a freeform error message is "Our goose finder should always find a goose".

blegr
13 days ago
I see your point, but my experience is that you need the stack trace first, and the developer’s explanation second. Asserts crashing with a message that makes perfect sense in its context but is completely useless for debugging are the bane of my workweek.

Now I appreciate a clear explanation for an uncommon assert and for example, OpenCV could do with more of those, but in most functions, seeing the line that throws the error is enough to understand.

vips7L
13 days ago
Are there no stack traces? Wouldn’t that point you to where to start troubleshooting?
tialaramex
13 days ago
No, you are not guaranteed a stack trace, in an optimised release build it may not even be possible to construct a valid trace. If you can reproduce the problem you can say you want this run to have a stack trace, but if your release builds just exit immediately on panic then there's no reason for them to be able to provide a stack trace of the fault.

On the other hand expect will provoke the message you wrote if it fails. Of course if it's inside a consumer's fitness tracker it probably doesn't have any way to show the message to a human, but that's a different problem - the fitness tracker presumably can't display stack traces either.

jeroenhd
13 days ago
.expect() also takes a message that it prints when the value is empty.

Unlike C++, Rust doesn't support throwing exceptions, so a failing expect() panics. By default, a panic in the main thread prints the message and terminates the program, and the message passed to expect() is printed right before the backtrace, if one is enabled.

For example:

    fn main() {
        let x: Option<i32> = None;
        x.expect("Oh no!");
    }

will print:

    thread 'main' panicked at src/main.rs:3:7:
    Oh no!
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
With RUST_BACKTRACE set to "full", it'll print:

    $ RUST_BACKTRACE=full ./target/release/demo
    thread 'main' panicked at src/main.rs:3:7:
    Oh no!
    stack backtrace:
       0:     0x5ff43befa755 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::ha52e99bffe3c0898
       1:     0x5ff43bf1769b - core::fmt::write::h5fdd5156f2480a24
       2:     0x5ff43bef8a5f - std::io::Write::write_fmt::ha2c0b019f448d2c3
       3:     0x5ff43befa52e - std::sys_common::backtrace::print::he84813a4ed1c2825
       4:     0x5ff43befb7e9 - std::panicking::default_hook::{{closure}}::h033521c27c9929b1
       5:     0x5ff43befb52d - std::panicking::default_hook::had42987aad9de78c
       6:     0x5ff43befbc83 - std::panicking::rust_panic_with_hook::h80fc1b429f5a5699
       7:     0x5ff43befbb64 - std::panicking::begin_panic_handler::{{closure}}::h5aa7b89233b1ae33
       8:     0x5ff43befac19 - std::sys_common::backtrace::__rust_end_short_backtrace::h0e4c5e6cee7f8a24
       9:     0x5ff43befb897 - rust_begin_unwind
      10:     0x5ff43bee0b63 - core::panicking::panic_fmt::h3bea7be9b6a41ace
      11:     0x5ff43bf16c6c - core::panicking::panic_display::h20da06138ce63f85
      12:     0x5ff43bee0b2c - core::option::expect_failed::h92448d4f1092eaaa
      13:     0x5ff43bee127a - demo::main::ha244b8f1ce6eaa44
      14:     0x5ff43bee1223 - std::sys_common::backtrace::__rust_begin_short_backtrace::hfc5c93265480da58
      15:     0x5ff43bee1239 - std::rt::lang_start::{{closure}}::h988fdfb65ef3da3b
      16:     0x5ff43bef6be6 - std::rt::lang_start_internal::h64c4082ce77a6bd6
      17:     0x5ff43bee12a5 - main
      18:     0x741d2a628150 - __libc_start_call_main
                               at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
      19:     0x741d2a628209 - __libc_start_main_impl
                               at ./csu/../csu/libc-start.c:360:3
      20:     0x5ff43bee1155 - _start
      21:                0x0 - <unknown>
Whether you would want to replicate this behaviour in C++, I don't know; I find panic!() to be quite destructive, and catastrophic when it's used in libraries or frameworks. I think the C++ implementation just throws an exception, but Rust's .expect() does not behave like .value() in C++.
n_plus_1_acc
13 days ago
std::optional<T> is fundamentally broken because it has an imolicit conversion (or operator* or something) to T. If you forget to check if it's empty you get UB.
SilasX
13 days ago
Love the typo, and it's fitting here. I'm going to use it for any time the implicit behavior risks burning (immolating) you, as the sibling comments note applies to std::optional.

Implicit conversion that immolates: imolicit conversion.

lorenzhs
13 days ago
There is no implicit conversion (except to bool, but that tells you whether the optional contains a value), and operator* / operator-> throw std::bad_optional_access if it’s empty. See https://en.cppreference.com/w/cpp/utility/optional
tialaramex
13 days ago
You're describing what it would do in a sane world where WG21 cared about safety.

In this world, as the document you've linked says: "The behavior is undefined if *this does not contain a value."

The operators for such access are actually `noexcept` - the exception you're apparently relying on would be illegal.

lorenzhs
13 days ago
Should’ve checked my own link instead of relying on memory — I might have some code to revisit on Monday. That’s insane, thanks for correcting me!
tialaramex
12 days ago
No problem, if I caused you to fix a bug before it happened that's great. Yes, I find that reading sources I'm about to cite is often eye-opening. Our memories are not as good as we think they are, and our condensed understanding of a complex situation may have ignored something which is now crucial.

Once in a while I go down a rabbit hole, but hey, it's not as though HN isn't a rabbit hole anyway.

kzrdude
13 days ago
Can we salvage this by forbidding * on optional with compiler warnings (as errors)?
lorenzhs
12 days ago
clang-tidy has a check for this -- it's not a compiler check but with clangd and LSP, almost every code editor can show an inline warning: https://clang.llvm.org/extra/clang-tidy/checks/bugprone/unch...
masklinn
13 days ago
> and operator* / operator-> throw std::bad_optional_access if it’s empty.

Of course not, they’re literally `noexcept`, what they do is UB if empty.

value() will throw.
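
A small sketch of the checked alternatives, using only standard std::optional members; the two function names are made up for illustration.

```cpp
#include <optional>

// Returns true if value() reports the empty state via an exception.
// operator* on the same empty optional would be undefined behaviour,
// so it is deliberately not exercised here.
inline bool value_throws_when_empty() {
    std::optional<int> empty;
    try {
        (void)empty.value();
    } catch (const std::bad_optional_access&) {
        return true;
    }
    return false;
}

// Checked access that neither throws nor hits UB: fall back to a default.
inline int get_or(const std::optional<int>& opt, int fallback) {
    return opt.value_or(fallback);
}
```

So the safe spellings exist; they are just longer than `*opt`.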

formerly_proven
13 days ago
Step 1 of API design: Always make the easiest and shortest way the wrong way.
steveklabnik
13 days ago
They were really in a pickle here. It’s easy to be snarky, but both options (no pun intended) have downsides. In short, do you choose consistency by default, or safety by default?

This feels like an easy choice in isolation, but at the time this was being developed (and arguably even now), there’s no definitive plan to holistically move C++ code to being safe by default. So whenever that happens, a ton of things will need to be dealt with, and there’s always the possibility that being an odd API here makes that overall move harder not easier. And C++ is regularly criticized for being inconsistent. Do you deepen those criticisms just so that one tiny corner of an API is better?

If I’m honest with myself, I probably would have made the same choices they did in this situation.

tialaramex
13 days ago
> If I’m honest with myself, I probably would have made the same choices they did in this situation.

Some of the more modern proposals (std::optional is quite old) actually make an explicit appeal to WG21 not to choose consistency at the price of safety because it just needlessly makes the language worse. "But we made the language worse before" is more like a plea for help than an excuse.

Barry Revzin did this in his "do expressions" which are an attempt to kludge compound expressions into C++ which really wants them to be compound statements instead. For consistency, all the obvious mistakes you'll make in do expressions could introduce UB like they would in equivalent C++ core features, but Barry argues they should be Ill-Formed instead - resulting in your mistakes not compiling rather than having undefined behaviour.

masklinn
13 days ago
C++ APIs follow the principle of most astonishment.
blegr
13 days ago
It sucks but it's easy to review and avoid, probably could be checked statically by linters too.
lorenzhs
12 days ago
HarHarVeryFunny
13 days ago
Can someone familiar with both please explain the benefit of Rust's borrow checker memory management model over C++'s std::unique_ptr and shared_ptr ? Is there some safety argument to prefer Rust's model, or is it something else ?

I'm not aware of any C++ compiler doing it, but it seems smart pointer overhead could be automatically and safely reduced (in same way one can do it manually) by the compiler lowering the generated code to use raw pointers where permissible.

mrtracy
13 days ago
The C++ smart pointers don't prevent multiple threads from mutating the pointed-to data at the same time; multiple threads can both access a unique_ptr at the same time and mutate its contents. Rust requires shared pointers (Arc) to also explicitly implement some sort of Mutex-equivalent runtime safety check in order to mutate the data. Rust also has an explicit notion of thread ownership, and of whether individual types are safe to pass to different threads; if a construct is not thread safe, Rust will prevent you from using it in multiple threads.

As a benefit of the thread-safety notion, Rust can have two reference-counting pointer types: Arc, which uses atomic reference counting and is roughly equivalent to std::shared_ptr, and Rc, which does not use atomics. Rc cannot be used across multiple threads at the same time, and the compiler will prevent you from doing this.

Rc is appropriate for data structures which internally benefit from multiple pointers (e.g. graphs) but where all of that information is internal to a single data structure - this becomes available without paying the price of atomics.

HarHarVeryFunny
13 days ago
Thanks.

So basically Rust is combining object ownership and thread safety while C++ keeps thread safety separate, which would seem to provide more flexibility, but also lets you shoot yourself in the foot.

Just thinking out loud, I wonder if C++ could better address this by also having a class of thread-aware smart pointers? -- but the problem is that C++ always has the old/new (C, C++) way of doing things - pthreads vs std::thread, std::mutex, etc, so even if the language provides easier ways of writing bug free code, there is no way to force developers to use those facilities.

In C++ there is also the issue of how to make statically allocated data structures thread safe in an enforceable way. Another kind of smart reference object, perhaps? Disallow global objects not accessed by such references?

C++ (which I have used since long before C++11) really wants to be two conflicting things - encompassing C's low level role as the ultimate systems programming language with no guardrails, while also wanting to compete as a much higher-level safer language for application developers. Perhaps the two safe+unsafe roles can be better combined into one language if one were to start from scratch. I'm not sure that Rust gets it right either - erring in the other direction by not being flexible enough.

umanwizard
13 days ago
In what ways do you think Rust is not flexible enough?

I ask because I can think of a few ways it’s less flexible than C, but I also think that effect is massively overstated by people who aren’t familiar with the language. There are OS kernels written in Rust, for example.

HarHarVeryFunny
13 days ago
From what I've read it seems that certain types of data structure (incl. anything with potentially circular references) are difficult to write in Rust - you are more fighting the language than it helping you. I'm really comparing to C++ rather than C (where of course anything is possible, as long as you DIY).
umanwizard
13 days ago
Yes, data structures with cyclic references are a bit harder to write in Rust than in C or C++. But it’s not impossible. And IMO, you write those so rarely that it really doesn’t matter.
plq
13 days ago
So ARC is something like the following?

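    #include <memory>
    #include <mutex>
    #include <shared_mutex>
    #include <utility>

    template <typename T>
    struct Locker {
        using M = std::shared_mutex;

        // Exclusive (read-write) access; holds the mutex for its lifetime.
        struct Locked {
            Locked(M& mtx, std::shared_ptr<T> value)
                : m_lock(mtx), m_value(std::move(value)) {}
            T* operator->() const { return m_value.get(); }
            T& operator*() const { return *m_value; }
        private:
            std::lock_guard<M> m_lock;
            std::shared_ptr<T> m_value;
        };

        // Shared (read-only) access; many readers may hold this at once.
        struct Shared {
            Shared(M& mtx, std::shared_ptr<const T> value)
                : m_lock(mtx), m_value(std::move(value)) {}
            const T* operator->() const { return m_value.get(); }
            const T& operator*() const { return *m_value; }
        private:
            std::shared_lock<M> m_lock;
            std::shared_ptr<const T> m_value;
        };

        // Forwarding ctor: always allocates, so m_value is never null.
        template <typename... Args>
        explicit Locker(Args&&... args)
            : m_value(std::make_shared<T>(std::forward<Args>(args)...)) {}

        Shared shared() { return Shared{m_mutex, m_value}; }
        Locked locked() { return Locked{m_mutex, m_value}; }

    private:
        std::shared_ptr<T> m_value;
        M m_mutex;
    };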
fathyb
13 days ago
Rust `Box` = C++ `std::unique_ptr`, both have the same ABI (just pointers)

Rust `Arc` = C++ `std::shared_ptr`

Rust `Rc` = C++ `std::shared_ptr` but using a simple integer instead of an atomic so it is not thread safe

`Arc` and `Rc` do not allow you to mutate their contents directly so instead you should use "interior mutability" using something like a `Mutex` (thread-safe) or `RefCell` (not thread-safe), which have runtime checks to ensure no undefined behaviour is introduced. So `Arc<Mutex<T>>` makes it possible to mutate `T`, but `Arc<T>` cannot. Some types like atomics do not require mutability at all, so an `Arc<AtomicBool>` can be mutated directly.

An example of a big C++ codebase using something similar is Chromium, where `std::shared_ptr` is forbidden and `base::RefCounted` (Rust `Rc`) and `base::RefCountedThreadSafe` (Rust `Arc`) should be used instead. WebKit does this too.

steveklabnik
13 days ago
> both have the same ABI (just pointers)

This is not actually true, but it's close enough for your purposes here.

But just to be clear about it, see stuff like this: https://stackoverflow.com/questions/58339165/why-can-a-t-be-...

fathyb
12 days ago
Another reason it is not true: Rust has fat pointers, e.g. `std::unique_ptr<const uint8_t[]>` and `Box<[u8]>` both point to the same kind of allocation, but the `Box` will be 128 bits wide on 64-bit systems.
HarHarVeryFunny
12 days ago
What's the utility of having a 128-bit pointer on a 64-bit system ?
fathyb
12 days ago
`Box<[u8]>` stores the pointer and its length (2 x size_t), `std::unique_ptr<const uint8_t[]>` only stores the pointer.

That's for slices; for trait objects (e.g. `Box<dyn ToString>`) the second word is a pointer to the virtual table instead of a length.

https://doc.rust-lang.org/nomicon/exotic-sizes.html
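If you want to see the sizes for yourself, a small sketch with `std::mem::size_of` shows the difference between thin and wide pointers. The helper function names are just for illustration.

```rust
use std::mem::size_of;

/// `Box<u8>` points to a sized type, so it is a thin (one word) pointer.
fn thin_box_size() -> usize {
    size_of::<Box<u8>>()
}

/// `Box<[u8]>` carries pointer + length.
fn slice_box_size() -> usize {
    size_of::<Box<[u8]>>()
}

/// `Box<dyn ToString>` carries pointer + vtable pointer.
fn trait_object_box_size() -> usize {
    size_of::<Box<dyn ToString>>()
}

fn main() {
    let word = size_of::<usize>();
    println!("machine word:      {word} bytes");
    println!("Box<u8>:           {} bytes", thin_box_size());
    println!("Box<[u8]>:         {} bytes", slice_box_size());
    println!("Box<dyn ToString>: {} bytes", trait_object_box_size());

    // The length stored in the wide pointer is what makes `len()` possible.
    let bytes: Box<[u8]> = vec![1u8, 2, 3].into_boxed_slice();
    println!("slice length read from the pointer: {}", bytes.len());
}
```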

reply
bitcharmer
12 days ago
[-]
Where can I find details like this about Rust?
reply
fathyb
12 days ago
[-]
The Rustonomicon is a good start, on fat pointers: https://doc.rust-lang.org/nomicon/exotic-sizes.html

> Because they lack a statically known size, these types can only exist behind a pointer. Any pointer to a DST consequently becomes a wide pointer consisting of the pointer and the information that "completes" them (more on this below).

reply
plq
13 days ago
[-]
You say:

> Rust `Arc` = C++ `std::shared_ptr`

GP says:

> Rust requires shared pointers (Arc) to also explicitly implement some sort of Mutex-equivalent runtime safety check in order to mutate the data.

Which is it?

> An example of a big C++ codebase using something similar is Chromium ...

Chromium's smart pointers are similar to their standard counterparts -- no mutexes for write access to pointed data.

Also, tangent but interesting: From https://www.chromium.org/developers/smart-pointer-guidelines...:

> Reference-counted objects make it difficult to understand ownership and destruction order, especially when multiple threads are involved. There is almost always another way to design your object hierarchy to avoid refcounting

reply
fathyb
13 days ago
[-]
Both are true, Rust just has more restrictions. It’s not completely equivalent, but you can think of `Arc<T>` as `std::shared_ptr<const T>` as in if you use `unsafe` or `const_cast` you can bypass mutability restrictions. Otherwise to mutate you need another abstraction doing `unsafe` things for you, such as `Mutex`.

I mentioned Chromium because they also differentiate between thread safe and non-thread safe shared pointers.

If anything, Rust shared pointers are more similar to C++ std pointers because in Chromium the reference count is inside the class, which is very handy because you can reconstruct a smart pointer from a raw pointer (like `this`), at the cost of needing `T` to extend `base::RefCounted`.

reply
plq
12 days ago
[-]
Perhaps I am not making myself clear here:

- RefCounted: It's like shared_ptr but refcount load/modify/store operation is not atomic, thus not thread-safe. No synchronization for pointed data.

- RefCountedThreadSafe: It's like shared_ptr. This means refcount load/modify/store is atomic, so has overhead, yet safe to pass across thread boundaries. Again, just like shared_ptr, no synchronization for pointed data.

- Locker class above: It's an (incomplete) wrapper around shared_ptr where read-only access goes through a shared lock and rw access goes through an exclusive lock. I suppose this is what Rust's Arc guarantees at compile time, with less overhead than the sketch above?

So;

> Both are true, Rust just has more restrictions.

No, both are not true, my understanding of ARC ~= Locker && ARC > shared_ptr

reply
fathyb
12 days ago
[-]
I think that's where you're confused: `Arc` does not do any synchronization, again it's pretty much the same as `std::shared_ptr` (hence the name Arc: Atomically Reference Counted).

Your `Locker` does not do what `Arc` does, even at compile time, because it does not allow concurrent access, like an `Arc<AtomicBool>` would. Your `Locker` is more like an `Arc<RwLock<T>>`.

Best equivalent you can get in C++ is `Arc<T>` = `std::shared_ptr<const T>`.

https://doc.rust-lang.org/std/sync/struct.Arc.html

> Shared references in Rust disallow mutation by default, and Arc is no exception: you cannot generally obtain a mutable reference to something inside an Arc. If you need to mutate through an Arc, use Mutex, RwLock, or one of the Atomic types.

I guess you could get the final pieces to get something similar by creating `Send` and `Sync` traits in C++: https://doc.rust-lang.org/nomicon/send-and-sync.html. I think the main pain point here is that you cannot auto-derive `Send` and `Sync` so it would end up being very verbose.
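For comparison, here is a rough sketch of what the `Arc<RwLock<T>>` analogue of that `Locker` looks like in use: `write()` plays the role of `locked()` and `read()` the role of `shared()`. The function name `append_then_sum` is hypothetical.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Pushes every value from its own thread (exclusive lock), then lets
/// several reader threads sum the vector concurrently (shared lock).
fn append_then_sum(values: &[i32]) -> i32 {
    let shared = Arc::new(RwLock::new(Vec::<i32>::new()));

    let writers: Vec<_> = values
        .iter()
        .copied()
        .map(|v| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Exclusive access: only one writer holds the lock at a time.
                shared.write().unwrap().push(v);
            })
        })
        .collect();
    for writer in writers {
        writer.join().unwrap();
    }

    let readers: Vec<_> = (0..4)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Shared access: any number of readers may hold the lock together.
                let sum: i32 = shared.read().unwrap().iter().sum();
                sum
            })
        })
        .collect();

    let sums: Vec<i32> = readers.into_iter().map(|r| r.join().unwrap()).collect();
    sums[0]
}

fn main() {
    println!("sum = {}", append_then_sum(&[1, 2, 3, 4]));
}
```

The `Arc` itself only does the reference counting here; all of the synchronization comes from the `RwLock`.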

reply
dasyatidprime
12 days ago
[-]
FWIW, in C++11, a class C can similarly cooperate to enable reconstructing a shared_ptr from a raw one by deriving from std::enable_shared_from_this<C>.
reply
umanwizard
13 days ago
[-]
No, Arc doesn’t require a mutex if you don’t plan on mutating the underlying value.
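A minimal sketch of that read-only case, assuming a made-up helper `parallel_sum` (and `workers > 0`): the data is shared across threads through a bare `Arc`, with no lock anywhere.

```rust
use std::sync::Arc;
use std::thread;

/// Sums `data` using `workers` threads that all read the same allocation.
/// Assumes `workers > 0` (`step_by(0)` would panic).
fn parallel_sum(data: Vec<u64>, workers: usize) -> u64 {
    let data = Arc::new(data);
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                // Worker `w` takes elements w, w + workers, w + 2 * workers, ...
                let partial: u64 = data.iter().skip(w).step_by(workers).sum();
                partial
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    let total = parallel_sum((1..=100).collect(), 4);
    println!("total = {total}");
}
```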
reply
lionkor
13 days ago
[-]
The point is that it keeps track of multiple references and disallows mutable and immutable references at the same time across threads, for example, and disallows multiple mutable references altogether.

The rust borrow checker works on values, and all that, not just on objects with RAII.

reply
umanwizard
12 days ago
[-]
Mutable references can never coexist with other references in Rust, regardless of whether they're on different threads.

This will not compile:

    let mut x = 42;
    let r1 = &x;
    let m1 = &mut x;
    println!("{r1}");
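The conflict only exists while `r1` is still in use. As a sketch of the flip side, this variant compiles because the shared borrow is finished before the mutable one starts (the function name is just for illustration):

```rust
fn borrow_in_sequence() -> i32 {
    let mut x = 42;
    let r1 = &x;
    let seen = *r1; // last use of `r1`: the shared borrow ends here
    let m1 = &mut x; // so an exclusive borrow is allowed now
    *m1 += 1;
    seen + x // 42 + 43
}

fn main() {
    println!("{}", borrow_in_sequence());
}
```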
reply
helloooooooo
13 days ago
[-]
Read this: https://alexgaynor.net/2019/apr/21/modern-c++-wont-save-us/

It will help you understand why "smart pointers" still won't help you.

reply
HarHarVeryFunny
13 days ago
[-]
I read that more as a valid criticism of other parts of C++ rather than about smart pointers as a way to track ownership.

e.g. std::string_view seems broken by design in wanting to support both raw-pointer based strings with zero ownership semantics as well as std::string. A string view (abstract concept) really needs to either have shared ownership of the underlying string, or have a non-owning reference that knows when it has been invalidated.

reply
umanwizard
12 days ago
[-]
Well the string view type you wish existed seems to be exactly what Rust gives you, no? Non-owning references that "know when they have been invalidated" (or rather, the compiler prevents you from using them after they have been invalidated).

I'm not sure why this means you shouldn't be able to create a string_view on top of std::string, though. You can create a Rust &str on top of String, it just doesn't participate in ownership.
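A small sketch of that `&str`-over-`String` relationship, with a hypothetical `first_word` helper: the returned view borrows from the input and never owns anything.

```rust
/// Returns a view into `text`; the result cannot outlive what `text` borrows from.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello borrow checker");
    let view: &str = first_word(&owned);
    println!("{view}");
    // Moving or dropping `owned` here and then using `view` again
    // would be rejected at compile time.
}
```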

reply
HarHarVeryFunny
12 days ago
[-]
My comment was just a reply to the parent - that the linked article wasn't really about smart pointers. I was just using string_view as an example.

There are lots of places where C++'s long history shows its ragged edges - where newer features really don't play so nice with older ones. One would certainly hope that a new language like Rust is at least initially more consistent. The question is what it will look like in 20 years' time, if it's still being actively developed by then.

reply
umanwizard
12 days ago
[-]
Rust's &str is basically identical to C++'s string_view, for what it's worth. I still don't understand your point about how string_view is inconsistent. The only reason &str is so much easier to use than string_view is because Rust supports borrow checking, making it safe to use, whereas C++ does not.
reply
HarHarVeryFunny
12 days ago
[-]
What I meant about "inconsistency" is that there are std::string_view constructors that accept raw pointers to indicate the range, and others that accept iterators. It's a mix of old (C) & new (C++) data structures, with neither indicating the ownership or longevity of the underlying object.

This is somewhat typical of where C++ is at nowadays - layering new functionality on top of old that wasn't designed to accommodate it. In an ideal world the language and libraries would be refactored and rationalized, but of course backwards compatibility precludes that. This is the fate of old languages - stay unchanged and become obsolete, or keep layering on new functionality and become messy and inconsistent.

reply
lifthrasiir
13 days ago
[-]
> it seems smart pointer overhead could be automatically and safely reduced (in same way one can do it manually) by the compiler lowering the generated code to use raw pointers where permissible.

The sheer difficulty of doing this is one of the motivations behind Rust's borrow checker, which uses a combination of type system and static analyses to prove the safety without running anything. In fact this problem is probably easier to solve for languages where everything is GC-managed; those languages would have a heavy runtime which can transparently handle that in principle!

reply
umanwizard
13 days ago
[-]
Rust has unique and shared pointers too (Box and Arc/Rc). But using them when unnecessary results in extra heap allocations. I’m not aware of a C++ compiler that can consistently rewrite uses of unique_ptr to heap-allocated objects to use raw pointers to stack-allocated objects instead.
reply
HarHarVeryFunny
13 days ago
[-]
I didn't mean trying to rewrite code to change dynamically allocated objects to stack based ones. That sounds more like an optimization that a managed language like C# might do.

C++'s unique_ptr and shared_ptr both have a get() method that will return you a raw pointer to the managed object, which can be a safe optimization within a function holding ownership to the object, as well as allowing you to use legacy functions on it that take raw pointer arguments.

I was thinking the C++ compiler could itself realize when it is safe to do so, save the raw pointer to a temp variable, and "rewrite" smart pointer accesses to use this temporary raw pointer. One could even imagine the compiler changing smart pointer function parameters to raw pointers in some circumstances.

reply
projektfu
11 days ago
[-]
As these are templates, processed statically, isn't this essentially happening already?

https://godbolt.org/z/ennj65v9z

reply
umanwizard
13 days ago
[-]
Then I’m afraid I don’t know what your point is. Rust’s borrow-checker isn’t a replacement for shared/unique pointers in C++. It’s a replacement for raw pointers.
reply
HarHarVeryFunny
13 days ago
[-]
My point was that overhead is one common objection to C++'s shared/unique pointers - everything is a method call - but that could be mitigated by the compiler itself doing the type of raw-pointer lowering, when safe, that the get() method permits.

From other replies in this thread it seems that Rust's borrow-checker addresses the high level issues of object ownership and thread safety - it's not just a replacement for raw pointers (i.e. a smart pointer), which is exactly what C++'s shared/unique pointers are.

reply
tialaramex
13 days ago
[-]
borrowck is a semantic check. So, it's not a replacement for some particular C++ feature per se, it's not a feature in the sense you mean at all, it's just that while C++ and Rust both have these same semantic rules in place, Rust checks them and C++ does not. When you as a programmer inevitably get something wrong and break the rules, in Rust your program won't compile, in C++ it just has some arbitrary misbehaviour, maybe you notice, maybe you don't, maybe it matters, maybe it seems benign... until 8:26 tomorrow morning when suddenly it blows up and makes your customer very angry.

In C++ the result of breaking semantic rules (not just those checked by the borrowck, most of the semantic rules in the language) is IFNDR - your program is Ill Formed, No Diagnostic Required - your entire program has no particular meaning, there is no explanation for what it does, shrug. In Rust it doesn't compile.

For people whose overriding mission is to get the code to compile, C++ is very attractive. Broken garbage? Meaningless nonsense? Not my problem it compiled so I went home. If you want to write software that works, that seems like you didn't do the hard part.

reply
HarHarVeryFunny
12 days ago
[-]
That last paragraph destroys your whole argument.

If you really believe that Google and FaceBook (etc, etc) hire morons who don't care if their code works, then you are not qualified to talk about programming languages.

reply
umanwizard
12 days ago
[-]
Not sure about Google, but Meta has tons of C++ code because it was written before Rust became a viable alternative. And of course, rewriting those millions of lines of code would be too expensive now.

But Rust now has a large amount of mindshare there and is being used a lot in new projects.

reply
umanwizard
13 days ago
[-]
The main overhead of using shared/unique ptr for everything where you could have used stack allocation is not the extra method call for get etc, it’s the extra heap allocation. Compilers can probably inline get, but they can’t change heap allocations to stack allocations in general.
reply
HarHarVeryFunny
12 days ago
[-]
If you're declaring an object on the stack, then there is no reason to be using a pointer to refer to it. You could take the address of it and assign that to a raw pointer if you wanted to for some (perverse!) reason, but you'd never then assign that to a shared/unique_ptr since that implies ownership.

T t1; // stack, reference as t1

T* t2 = new T(); // heap, raw pointer, reference as * t2

std::unique_ptr<T> t3 = std::make_unique<T>(); // heap, smart pointer, reference as * t3

T* pt = &t1; // Create a raw pointer to t1! Bad idea!

reply
umanwizard
12 days ago
[-]
> If you're declaring an object on the stack, then there is no reason to be using a pointer to refer to it.

Why not? What if you have some function f(T *) that you want to call?

But anyway, we're not _just_ talking about stack allocations, but also extra levels of indirection on the heap. For example, vectors store their elements in a heap-allocated buffer directly. If they kept them all in shared pointers, there would be an extra level of indirection. This means e.g. vector::operator[] has to return a reference (which is basically the same thing as a pointer under the hood); it can't return shared_ptr or similar (because storing all its elements as shared pointers would make it way slower due to the extra allocations).

In Rust, vector access is safe (due to the borrow checker), but in C++, it's not.

    vector<int> v {1, 2, 3};
    int& x = v[0];
    v.push_back(4);
    printf("%d\n", x);
This code is UB in C++. In Rust, it's impossible to write something like this.

    fn main() {
        let mut v = vec![1, 2, 3];
        let x = &v[0];
        v.push(4);
        println!("{x}");
    }
This code fails to compile.
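For completeness, a sketch of two ways to restructure it so that it does compile: either copy the element out before pushing, or take the reference only after the push. The function names are made up.

```rust
/// Copies the element out, so no borrow of `v` is alive during `push`.
fn first_then_push() -> (i32, usize) {
    let mut v = vec![1, 2, 3];
    let x = v[0]; // `i32` is `Copy`, so this is a value, not a reference
    v.push(4);
    (x, v.len())
}

/// Takes the reference after the mutation, so the two never overlap.
fn push_then_borrow() -> i32 {
    let mut v = vec![1, 2, 3];
    v.push(4);
    let x = &v[0];
    *x
}

fn main() {
    println!("{:?}", first_then_push());
    println!("{}", push_then_borrow());
}
```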
reply
HarHarVeryFunny
12 days ago
[-]
> Why not? What if you have some function f(T *) that you want to call?

In C++ (vs C), if the intent is to pass something large efficiently, then you'd use a reference parameter, not a pointer.

You seem to be confused about the meaning of C++ smart pointers - the whole point of them (as a replacement for C's raw pointers) is that they control and indicate ownership. You can't just assign a smart pointer to something you don't own (like an element of a vector). You can copy a shared_ptr to create an additional reference, or move a unique_ptr to move ownership.

A C++ compiler might generate a warning for that invalidated reference. clang++ is generally much better than g++, but I agree it'd be nice if a conforming compiler was forced to at least flag it, if not reject it.

The problem with doing this in the general case, where it's a user-defined (or library defined, as here) data structure, rather than one defined by the language, is that the compiler needs to inspect the implementation of that "push" method and realize that it might do something to invalidate references (& iterators). In the case of a library the compiler won't have access to the implementation to figure that out. How would Rust handle this if "vec" were a user-defined type where only the definition (not implementation) was available - how would it know that the push() was unsafe?

reply
umanwizard
12 days ago
[-]
> In C++ (vs C), if the intent is to pass something large efficiently, then you'd use a reference parameter, not a pointer.

Sure, sorry, I was using "pointer" and "reference" interchangeably. Indeed, references are pointers under the hood.

> You seem to be confused about the meaning of C++ smart pointers

I am not confused at all. I understand exactly what unique_ptr and shared_ptr are in C++. They are basically the equivalent of Rust's Box and Arc (except that they can be null), but I used C++ before Rust so I learned about unique_ptr and shared_ptr first.

You are the one who asked what the advantage of Rust's borrow-checker is over C++-style memory management with smart pointers, but you seem to understand that it doesn't make sense to use smart pointers everywhere. Aren't you answering your own question? The advantage of Rust over C++ is that the borrow checker helps you in the cases where it doesn't make sense to use smart pointers / heap allocations.

You are the one who is maybe confused about what the borrow checker even is/does.

> A C++ compiler might generate a warning for that invalidated reference.

Neither clang nor g++ does so, even with -Wall. I just checked. How could they?

> I agree it'd be nice if a conforming compiler was forced to at least flag it, if not reject it.

If you did this then you would have basically reinvented the borrow checker.

> The problem with doing this in the general case, where it's a user-defined (or library defined, as here) data structure, rather than one defined by the language, is that the compiler needs to inspect the implementation of that "push" method and realize that it might do something to invalidate references (& iterators).

Not in Rust. It only needs to inspect the declaration. That is the whole point of the borrow checker. The fact that you think this can only be done for built-in types is what made me suspect that you don't understand what the borrow checker is.

The declaration of the indexing operator for Vec<T> is roughly (getting rid of some irrelevant details):

    fn index(&self, i: usize) -> &T
This is shorthand for

    fn index<'a>(&'a self, i: usize) -> &'a T
Those references (the `&self` and the returned `&T`) have the same lifetime. That lifetime cannot overlap with any lifetime of a _mutable_ reference to the same data. `push` can be declared like so:

    fn push(&mut self, value: T)
Because this requires a mutable reference to `self`, the compiler statically checks that it does not overlap with any other reference to the same data, which includes the reference returned by the indexing operation, which is why the example I gave won't compile. This works the same way with user-defined types; Vec is not special in any way.

The reason you can't do a similar thing in C++ is because it has no syntax for lifetimes. If you had a function on vector like

    const T& index(size_t i)
you have no idea if the returned `T` is derived from `this` or from somewhere else, so you don't know what its lifetime should be.
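To back up the claim that Vec is not special, here is a sketch with a user-defined type. `Log`, `get` and `append` are hypothetical names; the compiler only looks at the `&self` / `&mut self` signatures.

```rust
struct Log {
    entries: Vec<String>,
}

impl Log {
    fn new() -> Self {
        Log { entries: Vec::new() }
    }

    /// Lifetime elision ties the returned `&str` to `&self`.
    fn get(&self, i: usize) -> &str {
        &self.entries[i]
    }

    /// Needs exclusive access, so it cannot overlap with a live `get` result.
    fn append(&mut self, entry: &str) {
        self.entries.push(entry.to_string());
    }
}

fn log_demo() -> (String, usize) {
    let mut log = Log::new();
    log.append("start");
    // Converting to an owned String ends the borrow of `log` on this line.
    let first = log.get(0).to_owned();
    // If `first` were still the `&str` from `get`, this call would be rejected
    // as long as `first` is used afterwards.
    log.append("stop");
    (first, log.entries.len())
}

fn main() {
    println!("{:?}", log_demo());
}
```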
reply
HarHarVeryFunny
12 days ago
[-]
Interesting - so essentially calling a "non-const" (mutable) method invalidates any existing references to the object, with this being implemented at compile time by not allowing the mutable method to be called while other references are still alive ?

How exactly is this defined for something like index() which is returning a reference to a different type than the object itself, and where the declaration doesn't indicate that the referred to T is actually part of the parent object? Does the language just define that all references (of any type) returned by member functions are "invalidated" (i.e. caught by compiler borrow checker) by the mutable member call?

What happens in Rust if you attempt to use a reference to an object after the object lifetime has ended? Will that get caught at compile time too, and if so at what point (when attempt is made to use the reference, or at end of object lifetime) ?

reply
umanwizard
12 days ago
[-]
> Interesting - so essentially calling a "non-const" (mutable) method invalidates any existing references to the object, with this being implemented at compile time by not allowing the mutable method to be called while other references are still alive ?

Yes, exactly.

> How exactly is this defined for something like index() which is returning a reference to a different type than the object itself, and where the declaration doesn't indicate that the referred to T is actually part of the parent object?

Only if they have the same lifetime (the 'a in my example). For example, imagine a function that gets an element of a vector and uses that to index into another vector. You might write it like this:

    fn indirect_index<'a, 'b, T>(v1: &'a Vec<usize>, v2: &'b Vec<T>, i: usize) -> &'b T {
        let j = v1[i];
        &v2[j]
    }
The returned reference is invalidated only by future mutations of the second vector, not the first, because it shares the lifetime parameter 'b with `v2` alone.
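A quick sketch of calling it, to show that the first vector stays free to mutate while the result is alive (`pick_letter` is a made-up wrapper):

```rust
fn indirect_index<'a, 'b, T>(v1: &'a Vec<usize>, v2: &'b Vec<T>, i: usize) -> &'b T {
    let j = v1[i];
    &v2[j]
}

fn pick_letter() -> char {
    let mut indices = vec![2, 0, 1];
    let letters = vec!['a', 'b', 'c'];
    let picked = indirect_index(&indices, &letters, 0);
    // Allowed: `picked` borrows only from `letters` (lifetime 'b).
    indices.push(1);
    // Mutating `letters` at this point would be rejected instead.
    *picked
}

fn main() {
    println!("{}", pick_letter());
}
```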

> What happens in Rust if you attempt to use a reference to an object after the object lifetime has ended?

This is prevented at compile time by the borrow checker. E.g.:

    // this takes ownership of the vec,
    // and just lets it go out of scope 
    fn drop_vec<T>(_v: Vec<T>) {
    }
    
    fn main() {
        let v = vec![1, 2, 3];
        let x = &v[0];
        drop_vec(v);
        println!("{x}");
    }
This program fails to compile with the following error:

    error[E0505]: cannot move out of `v` because it is borrowed
      --> src/main.rs:9:14
       |
    7  |     let v = vec![1, 2, 3];
       |         - binding `v` declared here
    8  |     let x = &v[0];
       |              - borrow of `v` occurs here
    9  |     drop_vec(v);
       |              ^ move out of `v` occurs here
    10 |     println!("{x}");
       |               --- borrow later used here
reply
HarHarVeryFunny
12 days ago
[-]
Thanks!
reply
HarHarVeryFunny
12 days ago
[-]
> Neither clang nor g++ does so, even with -Wall. I just checked. How could they?

Just by having built-in knowledge of standard library types such as std::vector, the same way the compiler has built-in knowledge of some library functions such as C's printf().

I wouldn't expect such policing to be perfect, but the compiler could at least catch simple cases where reference/iterator use follows an invalidating operation in the same function.

Don't get me wrong - I'm not defending C++. It's a beast of a language, and takes a lot of experience and self-discipline to use without creating bugs that are hard to find.

reply
umanwizard
12 days ago
[-]
> I'm not defending C++.

Right, but you were asking what advantage Rust has over C++, which is what I'm trying to explain. (If you had instead asked what advantage C++ has over Rust, I'd have given a very different answer!)

> It's a beast of a language, and takes a lot of experience and self-discipline to use without creating bugs that are hard to find.

Rust makes creating a certain class of these hard-to-find bugs much harder.

reply
lionkor
13 days ago
[-]
There's nothing special about unique_ptr. If you don't want allocations and you're ok with just moving your values around directly, you use value and move semantics.
reply
umanwizard
13 days ago
[-]
Move and value (deep copy) semantics exist in Rust too, but neither of those does the same thing as passing a raw pointer (or reference). Which you can do in c++, but not safely. That’s the difference with Rust.

In C or C++ if a function/method takes a raw pointer (or some other lifetime-constrained type like string_view), I have no idea if it’s going to stash it somewhere and try to look at it again later. If it returns a raw pointer or reference, I don’t know whether it is going to get invalidated by some future call. Iterator invalidation is a huge source of UB in C++ but completely unknown in rust.

Clearly having a hash map where all the values are stored indirectly in shared_ptr would let you provide a safe access API, but would be horrible for performance. In Rust you can have the safe API without compromising on efficiency.

reply
fulafel
13 days ago
[-]
Generally in C++ this kind of data transformation faces a lot of barriers. For example, the language semantics require struct fields to have the unoptimized memory layout and contents as far as the user can observe at the byte level.

Low-level programming would be quite a different scene if there were a lot of permitted data optimizations by compilers (profile guided more concise representations of structs, replacing pointer based data structures with indexed layouts, etc).

reply
vlovich123
13 days ago
[-]
One huge caution about this - this uses RefCell*-like semantics which means that the borrow/borrow_mut checking is not thread-safe. This is dangerous because in the docs they have examples of shared_ptr in there but using that from multiple threads would be UB - there's 0 cases where this + shared_ptr makes sense unless you transparently upgraded to an atomics-based variant. Similarly, in a thread-aware implementation you'd expect more efficient handling of locks as well (i.e. borrow / borrow_mut would just acquire a lock and return a proxy without any additional borrow checks).

The other footgun is that there's no concept of a non-owning pointer which is dangerous - there are several equally dominant conventions in C++: naked pointers might be heap allocated, it might represent an optional const&, or it might be a pointer to the stack. Ingesting naked pointers should probably require an explicit annotation instead of assuming it's a new'ed pointer.

It's a neat idea, but I suspect this particular implementation is likely to introduce more UB, not less, because of the thread-safety footguns. In a single-threaded system, the borrow checker doesn't add a huge amount. The biggest gain is of lifetime enforcement which this doesn't get you. Also because you have to construct these Vals at point of initialization of your value, it's viral. Upgrading input arguments to use this can be dangerous if dealing with pointers.

* For C++ users, RefCell is an escape hatch from the compile-time borrow checker that does the checking at runtime instead - you can borrow immutably any number of times xor borrow mutably once - anything else panics.
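A minimal sketch of those RefCell rules, using `try_borrow_mut` so the violation shows up as an `Err` instead of a panic (the function name is just for illustration):

```rust
use std::cell::RefCell;

/// Returns (shared borrows coexisted, mutable borrow was refused, final value).
fn refcell_rules() -> (bool, bool, i32) {
    let cell = RefCell::new(10);
    let shared_ok;
    let exclusive_blocked;
    {
        // Any number of shared borrows may coexist.
        let r1 = cell.borrow();
        let r2 = cell.borrow();
        shared_ok = *r1 + *r2 == 20;
        // While they are alive, a mutable borrow is refused at runtime.
        // `cell.borrow_mut()` would panic here; `try_borrow_mut` reports an error.
        exclusive_blocked = cell.try_borrow_mut().is_err();
    }
    // The shared borrows are gone, so the mutable borrow succeeds.
    *cell.borrow_mut() += 1;
    let value = *cell.borrow();
    (shared_ok, exclusive_blocked, value)
}

fn main() {
    println!("{:?}", refcell_rules());
}
```

Note that the borrow counter inside RefCell is a plain non-atomic integer, which is why the type cannot be shared between threads.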

reply
tialaramex
13 days ago
[-]
The C++ type system is completely inadequate for these tasks.

I thought of a rather nice way to picture it, the C++ type system is like you have Roman Numerals, and so now the notation itself fights trying to understand important concepts about numbers (types). Languages with a better type system are like having Arabic Numerals, it's not a panacea, but the notation allows significant improvements in expressiveness and teachability.

This analogy seems especially apt because Roman Numerals lacked zero as I understand it, and the C++ type system doesn't cope well with the idea of ZSTs nor with the Empty types which are analogous to zero in type arithmetic.

reply
pjmlp
13 days ago
[-]
Actually it is more like having both Roman and Arabic Numerals on the same source code, depending on the age of the project, and the C and C++ education background of the team.
reply
tialaramex
13 days ago
[-]
I don't see any way to express something like Option<Infallible> in C++

Regardless of "age of the project" or other considerations, this doesn't seem like a particularly tricky edge case of generic programming and yet C++ is stumped AFAICT

reply
acka
13 days ago
[-]
According[0] to Perplexity.ai, you could use std::optional<std::monostate> to get a C++ approximation of your Rust type.

I am neither an expert in modern C++ nor in Rust, but I have witnessed enough of C++'s evolution over time to know that if C++ language devs find a feature desirable enough they will do whatever it takes to frobnicate the language in order to claim support for that feature.

[0] https://www.perplexity.ai/search/is-it-possible-Sd3TML68TfKv...

reply
GrantMoyer
13 days ago
[-]
std::monostate is a "unit type"[1]; there's only one value with type monostate (the value is std::monostate{}), so all monostate values are equivalent.

Infallible is an "empty type"[2]; there are no values with type Infallible, so a value cannot be constructed, so Option<Infallible> is always None, never Some(infallible). Importantly, the compiler knows this and can use it to reason about the correctness of code.

C++ has no empty types. `void` is close, but it's sometimes used where a unit type would be used, and anyway it's not a first class type. For example, you can't use std::optional<void>. Even if it were possible to make an empty type in C++, it wouldn't give you anything, because the compiler isn't equipped to reason about them.

BTW, the Rust equivalent to std::optional<std::monostate> is Option<()>. The empty tuple is Rust's idiomatic unit type.

[1]: https://en.wikipedia.org/wiki/Unit_type

[2]: https://en.wikipedia.org/wiki/Empty_type
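A short sketch of the unit/empty distinction in Rust. The sizes are what current rustc produces in practice rather than a documented guarantee, and the helper names are made up.

```rust
use std::convert::Infallible;
use std::mem::size_of;

/// `Option<()>` has two values (None, Some(())), so it needs a discriminant.
fn option_unit_size() -> usize {
    size_of::<Option<()>>()
}

/// `Option<Infallible>` has exactly one value (None), so nothing is stored.
fn option_empty_size() -> usize {
    size_of::<Option<Infallible>>()
}

/// The `Some` arm can never run; an empty `match` on the payload proves that
/// to the compiler without writing any unreachable code by hand.
fn describe(value: Option<Infallible>) -> &'static str {
    match value {
        None => "none",
        Some(never) => match never {},
    }
}

fn main() {
    println!("Option<()>:         {} byte(s)", option_unit_size());
    println!("Option<Infallible>: {} byte(s)", option_empty_size());
    println!("{}", describe(None));
}
```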

reply
pjmlp
13 days ago
[-]
While I watch with some dismay one of my favourite languages going beyond PL/I levels of complexity, it isn't alone in this direction.

One of the reasons I am not able to follow C++ as closely as I did in the past isn't directly related to its complexity, but rather that my main work tools, the JVM, CLR and Web ecosystems, are reaching similar levels of complexity, especially with the 6-month release cadence, and there is only so much one can keep up with.

reply
tialaramex
13 days ago
[-]
std::optional<std::monostate> has two values, Option<Infallible> has one, so by my counting that's a 100% error.

It is likely the best that can be done, but that's my point, C++ can't do this because the foundational type system isn't up to the task.

reply
pjmlp
13 days ago
[-]
That wasn't really the point of my remark. I meant C with Classes, C++98 style with plenty of C style coding for strings and arrays (Roman Numerals), versus Modern C++ best practices with safety tooling (Arabic Numerals).
reply
jandrewrogers
13 days ago
[-]
Current version of C++ can handle empty and zero-size types quite well, though you are correct that older versions of C++ had limited support (and non-existent pre-C++11). I create and use them regularly when metaprogramming.

The bigger issue is that all of this new capability can't be easily grafted onto the old standard library. If you were to write a re-designed standard library from a C++20 baseline, and some people do, it is a dramatically different experience. Modern C++ is an amazing library-building language but the 'std' library it comes with is legacy rubbish in many regards.

reply
lsaljdljsljljsa
13 days ago
[-]
Cool idea.... I would say the secret sauce in rust is Match + Enumerations and serde... :)
reply
klabb3
13 days ago
[-]
Agreed. Maybe add immutability with copy semantics by default. And no null (through enumerations but worth pointing out).

Most of the Rust debates, praise and criticism are about higher level features, but just these sane pleasurable fundamentals is the main thing I miss in most languages (mostly Go and JS in my case).
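For anyone who hasn't used them, a tiny sketch of those fundamentals: an enum with data, an exhaustive `match`, and `Option` standing in for null. `Shape`, `area` and `find_rect` are made-up names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

/// `match` must cover every variant; adding a new one turns this into a compile error.
fn area(shape: Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
    }
}

/// "Not found" is an ordinary enum value (`None`), not a null pointer.
fn find_rect(shapes: &[Shape]) -> Option<Shape> {
    shapes
        .iter()
        .copied()
        .find(|s| matches!(s, Shape::Rect { .. }))
}

fn main() {
    // Bindings are immutable unless declared with `mut`.
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
    ];
    match find_rect(&shapes) {
        Some(rect) => println!("first rect has area {}", area(rect)),
        None => println!("no rect"),
    }
}
```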

reply
jmax01
13 days ago
[-]
I get it, the Rust enum system is such a convenience, but well, the secret sauce in the readme is what the "official people" say....
reply
andrewstuart
13 days ago
[-]
All the pain of rust PLUS all the pain of C++.
reply
xyst
13 days ago
[-]
Don’t threaten me with a good time :P
reply
dietr1ch
13 days ago
[-]
The best of both worlds
reply
projektfu
11 days ago
[-]

  auto foo2 = foo0; // foo0's ownership is not transfered to foo2
was this supposed to say "now transferred"?
reply
jmax01
10 days ago
[-]
Yes, that's a typo
reply
habibur
13 days ago
[-]
For others who are wondering how C++ programmers have managed memory until now -- check RAII.
reply
monax
13 days ago
[-]
I built a whole operating system using ideas transplanted from Rust into C++

https://github.com/skift-org/skift

reply
senkora
13 days ago
[-]
See also Circle: https://www.circle-lang.org/

I don’t think it’s available yet, but last I heard that dev is working on a borrow checker for C++ as well.

reply
jacobgorm
13 days ago
[-]
Anything to not have to use cargo and crates.io.
reply