Could someone explain to me when one would use this? Is it for educational purposes perhaps?
There is a place for runtime borrow checking. Some safe cases in well-designed code are intrinsically un-checkable at compile-time. C++ is pretty amenable to addressing these cases using the type system to dynamically guarantee that references through a unique_ptr-like object are safe at the point of dereference. Much of what the borrow checker does at compile-time could potentially be done at runtime with the caveat that it has an overhead.
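Rust itself ships a runtime-checked sibling of the borrow checker, RefCell, which makes that trade-off concrete: the aliasing rules are enforced with a borrow counter at runtime, and violations panic instead of failing to compile. A minimal sketch:

```rust
use std::cell::RefCell;

// Each borrow()/borrow_mut() updates a runtime counter; conflicting
// borrows panic rather than being rejected at compile time.
fn bump(c: &RefCell<i32>) -> i32 {
    {
        let _shared = c.borrow(); // shared borrow taken...
        // calling c.borrow_mut() here would panic: "already mutably borrowed"
    } // ...and released at end of scope
    *c.borrow_mut() += 1; // exclusive borrow now permitted
    *c.borrow()
}

fn main() {
    let c = RefCell::new(5);
    assert_eq!(bump(&c), 6);
}
```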
This has more than a passing resemblance to how deadlock-free locking systems work. They don't actually prevent the possibility of deadlocks, as that may not be feasible, but they can detect deadlock conditions and automatically edit/repair the execution graph to eliminate the deadlock instance. If a deadlock occurs in a database and no one notices, did it really happen?
Rust does it at compile time, so why can't C++? To me, this detail completely kills the usefulness of this project.
Also, Google (more specifically, the Chrome folks) tried to make it work via templates, but found that it was not possible. There’s a limit to template magic, even.
That said, you're right on some level that it's truly semantics that matter, not syntax, but you need syntax to control the semantics.
Where Rust won't compile when a lifetime can't be determined, IIRC Nim's static analysis will make a copy (and tell you), so it's more as a performance optimisation than for correctness.
Regardless of the details and extent of the borrow checking, however, it shows that it's possible in principle to infer lifetimes without explicit annotation. So, perhaps C++ could support it.
As you say, it's the semantics of the syntax that matter. I'm not familiar with C++'s compiler internals though so it could be impractical.
I still think that my overall point stands: sure, you can treat this as an optimization pass, but that kind of overhead isn't acceptable in the C++/Rust world. And syntax is how you communicate programmer intent, to resolve the sorts of ambiguous cases described in some other comments here.
I am again reminded of escape analysis https://steveklabnik.com/writing/borrow-checking-escape-anal...
Wait, how does that work? For example, take the following Rust function with insufficient lifetime specifiers:
pub fn lt(x: &i32, y: &i32) -> &i32 {
if x < y { x } else { y }
}
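(For reference, the version rustc does accept ties both inputs and the result to one named lifetime, so the result is known to live no longer than the shorter-lived input:)

```rust
// 'a says: the returned reference borrows from x and/or y,
// so it stays valid only while both inputs do.
pub fn lt<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x < y { x } else { y }
}

fn main() {
    let (a, b) = (1, 2);
    assert_eq!(*lt(&a, &b), 1);
}
```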
You're saying Nim will change one/all of those references to copies and will also emit warnings saying it did that?

Writing an equivalent program is a bit weird because: 1) Nim does not distinguish between owned and borrowed types in the parameters (except wrt. `lent`, which is bugged and only for optimizations), 2) Nim copies all structures smaller than $THRESHOLD regardless (the threshold is only slightly larger than a pointer but definitely includes all integer types - it's somewhere in the manual), and 3) similarly, not having a way to explicitly return borrows cuts out much of the complexity of lifetimes regardless, since it'll just fall back on reference counting.

The TL;DR here, though, is no: unless I'm mistaken, Nim would fall back on reference counting here (were points 1 and 2 changed).
For clarity as to Nim's memory model: it can be thought of as ownership-optimized reference counting. It's basically the same model as Koka (a research language from Microsoft). If you want to learn more about it - because it is very neat, and an exceptionally good tradeoff between performance/ease of use/determinism IMO - I would suggest reading the papers on Perceus, as the Nim implementation is not very well documented. (IIRC the main difference between Koka's and Nim's implementations is that Nim frees at the end of scope while Koka frees at the point of last use.)
Thanks for the explanation and the reading suggestions! I'll see about taking a look.
You're right. I was sure I read that it would announce when it does a copy over a sink but now I look for it I can't find it!
> The static analysis is not very transparent.
There is '--expandArc' which shows the compile time transformations performed but that's a bit more in depth.
For memes, obviously.
Me: I want Rust!
Tech lead: We have Rust at home!
Rust at home: rusty.hpp
The goal/why is, as almost always, explained in the README:
> rusty.hpp as the time or writing this is a very experimental thing. Its primary purpose is to experiment and test out different coding styles and exploring a different than usual C++ workspace.
TL;DR: it's an experiment.
Interesting, why is this? I would have assumed the compiler could have optimized away that indirection.
As a result, for example, std::optional<T&> doesn't exist, because to a C++ programmer it seems as though this might have assign-through semantics (!) and so WG21 decided to kick this can down the road. C++26 might get std::optional<T&>.
The main concern with that component was ensuring we can allocate stack storage for an object that may or may not be initialized.
The reference is easily achievable by using T* so is of minimal value, but also poses some more semantic problems since a reference is not copyable while an optional is.
For me the important use case is pattern matching, which C++ doesn't yet have. Pattern matching really changes how you see the entire language.
Here is the first example I found on Google if that helps you understand.
// `overload` is not standard; it's the usual two-line helper:
template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
template<class... Ts> overload(Ts...) -> overload<Ts...>; // deduction guide (implicit in C++20)

std::variant<Fluid, LightItem, HeavyItem, FragileItem> package;
std::visit(overload{
    [](Fluid&) { std::cout << "fluid\n"; },
    [](LightItem&) { std::cout << "light item\n"; },
    [](HeavyItem&) { std::cout << "heavy item\n"; },
    [](FragileItem&) { std::cout << "fragile\n"; }
}, package);
match (state, pipe) {
(State::None, Pipe::Ground) => {
if inside {
n += 1;
}
}
(State::None, Pipe::Vert) => {
inside = !inside;
}
(State::None, Pipe::Se) => {
state = State::South;
}
(State::None, Pipe::Ne) => {
state = State::North;
}
// Horizontal lines make no difference to anything
(State::North | State::South, Pipe::Horiz) => {}
// U-turns
(State::South, Pipe::Sw) | (State::North, Pipe::Nw) => {
state = State::None;
}
// Form a vertical line
(State::South, Pipe::Nw) | (State::North, Pipe::Sw) => {
inside = !inside;
state = State::None;
}
_ => {
panic!("Unexpected sequence {state:?} {pipe:?}");
}
}
Guess what, the same syntax I gave supports exactly that as well.
I haven't checked recent C++ standards, but I don't believe you can use partial classes/extensions in C++ like some other OO languages to add these methods to a native type. Many helper functions commonly used in Rust also only seem to exist in C++23, which not every project can be compiled under yet.
In normal C++ code, the native types would probably be better to use, but if you're going full Rust style code, you may as well use these new types.
Isn't that value()?
So that says it's a method (its first parameter is the type itself, but named self rather than as a normal parameter so we can use method syntax instead of calling the function Option::expect) but it also takes an immutable reference to a string slice.
That second parameter, msg, is the text for a diagnostic if/ when you're wrong.
So, in a sense it's like value() but the diagnostic text means, when I was wrong...
let goose = found.expect("Our goose finder should always find a goose");
... I get a diagnostic saying that the problem is with "Our goose finder should always find a goose". Huh. I think we know where to start troubleshooting.

In your example, it's likely that the person who sees this message won't have enough context to understand it; it's more like a debugging assert. Since you'll need a debugger and a breakpoint anyway, the message isn't very helpful.
In most cases then, if you don't know this code very well, that's fine because it's not your bug. In the edge case that you just got handed a pile of poorly documented code somebody else wrote, perhaps over several years, well, at least you know what they thought is supposed to happen here and that they're wrong.
And no, I don't find it better to be told "It broke, break out a debugger and try to reproduce the fault". With this text we can revisit the Goose wrangling code and maybe, now that we're staring at it knowing a real customer saw this fault, we are inspired and realise that sometimes it won't find a Goose, then decide what to do about that.
I think it's pretty odd to use a quick example someone rattled off on a web forum to explain a function's behaviour as evidence of its usefulness or lack thereof, as if the only thing a person could possibly write in a freeform error message is "Our goose finder should always find a goose".
Now I appreciate a clear explanation for an uncommon assert and for example, OpenCV could do with more of those, but in most functions, seeing the line that throws the error is enough to understand.
On the other hand expect will provoke the message you wrote if it fails. Of course if it's inside a consumer's fitness tracker it probably doesn't have any way to show the message to a human, but that's a different problem - the fitness tracker presumably can't display stack traces either.
Unlike C++, Rust doesn't support throwing exceptions, so expect() failing would panic. By default, this means dumping a stack trace and terminating the program, and the message provided in "expect" would be printed right before the stack trace.
For example:
fn main() {
let x: Option<i32> = None;
x.expect("Oh no!");
}
will print:

thread 'main' panicked at src/main.rs:3:7:
Oh no!
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
With RUST_BACKTRACE set to "full", it'll print:

$ RUST_BACKTRACE=full ./target/release/demo
thread 'main' panicked at src/main.rs:3:7:
Oh no!
stack backtrace:
0: 0x5ff43befa755 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::ha52e99bffe3c0898
1: 0x5ff43bf1769b - core::fmt::write::h5fdd5156f2480a24
2: 0x5ff43bef8a5f - std::io::Write::write_fmt::ha2c0b019f448d2c3
3: 0x5ff43befa52e - std::sys_common::backtrace::print::he84813a4ed1c2825
4: 0x5ff43befb7e9 - std::panicking::default_hook::{{closure}}::h033521c27c9929b1
5: 0x5ff43befb52d - std::panicking::default_hook::had42987aad9de78c
6: 0x5ff43befbc83 - std::panicking::rust_panic_with_hook::h80fc1b429f5a5699
7: 0x5ff43befbb64 - std::panicking::begin_panic_handler::{{closure}}::h5aa7b89233b1ae33
8: 0x5ff43befac19 - std::sys_common::backtrace::__rust_end_short_backtrace::h0e4c5e6cee7f8a24
9: 0x5ff43befb897 - rust_begin_unwind
10: 0x5ff43bee0b63 - core::panicking::panic_fmt::h3bea7be9b6a41ace
11: 0x5ff43bf16c6c - core::panicking::panic_display::h20da06138ce63f85
12: 0x5ff43bee0b2c - core::option::expect_failed::h92448d4f1092eaaa
13: 0x5ff43bee127a - demo::main::ha244b8f1ce6eaa44
14: 0x5ff43bee1223 - std::sys_common::backtrace::__rust_begin_short_backtrace::hfc5c93265480da58
15: 0x5ff43bee1239 - std::rt::lang_start::{{closure}}::h988fdfb65ef3da3b
16: 0x5ff43bef6be6 - std::rt::lang_start_internal::h64c4082ce77a6bd6
17: 0x5ff43bee12a5 - main
18: 0x741d2a628150 - __libc_start_call_main
at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
19: 0x741d2a628209 - __libc_start_main_impl
at ./csu/../csu/libc-start.c:360:3
20: 0x5ff43bee1155 - _start
21: 0x0 - <unknown>
Whether you would want to replicate this behaviour in C++, I don't know; I find panic!() to be quite destructive, and catastrophic when it's used in libraries or frameworks. I think the C++ implementation just throws an exception, but Rust's .expect() does not behave like .value() in C++.

Implicit conversion that immolates: imolicit conversion.
In this world, as the document you've linked says: "The behavior is undefined if *this does not contain a value."
The operators for such access are actually `noexcept` - the exception you're apparently relying on would be illegal.
Once in a while I go down a rabbit hole, but hey, it's not as though HN isn't a rabbit hole anyway.
Of course not, they’re literally `noexcept`, what they do is UB if empty.
value() will throw.
This feels like an easy choice in isolation, but at the time this was being developed (and arguably even now), there’s no definitive plan to holistically move C++ code to being safe by default. So whenever that happens, a ton of things will need to be dealt with, and there’s always the possibility that being an odd API here makes that overall move harder not easier. And C++ is regularly criticized for being inconsistent. Do you deepen those criticisms just so that one tiny corner of an API is better?
If I’m honest with myself, I probably would have made the same choices they did in this situation.
Some of the more modern proposals (std::optional is quite old) actually make an explicit appeal to WG21 not to choose consistency at the price of safety because it just needlessly makes the language worse. "But we made the language worse before" is more like a plea for help than an excuse.
Barry Revzin did this in his "do expressions" which are an attempt to kludge compound expressions into C++ which really wants them to be compound statements instead. For consistency, all the obvious mistakes you'll make in do expressions could introduce UB like they would in equivalent C++ core features, but Barry argues they should be Ill-Formed instead - resulting in your mistakes not compiling rather than having undefined behaviour.
I'm not aware of any C++ compiler doing it, but it seems smart pointer overhead could be automatically and safely reduced (in same way one can do it manually) by the compiler lowering the generated code to use raw pointers where permissible.
As a benefit of the thread-safety notion, Rust can have two reference-counting pointer types: Arc, which uses atomic reference counting and is roughly equivalent to std::shared_ptr, and Rc which does not use atomics. Rc cannot be used across multiple threads at the same time, and the borrow checker will prevent you from doing this.
Rc is appropriate for data structures which internally benefit from multiple pointers (e.g. graphs) but where all of that information is internal to a single data structure - this becomes available without paying the price of atomics.
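A small sketch of the non-atomic variant: cloning an Rc is just an ordinary integer increment, and the compiler refuses to let the handle cross a thread boundary.

```rust
use std::rc::Rc;

// Clone an Rc and report the resulting strong count.
// The increment is a plain integer op, not an atomic RMW.
fn rc_demo() -> usize {
    let a = Rc::new(vec![1, 2, 3]);
    let _b = Rc::clone(&a); // non-atomic refcount bump
    // std::thread::spawn(move || drop(_b)) would not compile:
    // Rc is !Send, so the unsynchronized count can never be raced.
    Rc::strong_count(&a)
}

fn main() {
    assert_eq!(rc_demo(), 2);
}
```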
So basically Rust is combining object ownership and thread safety while C++ keeps thread safety separate, which would seem to provide more flexibility, but also lets you shoot yourself in the foot.
Just thinking out loud, I wonder if C++ could better address this by also having a class of thread-aware smart pointers? -- but the problem is that C++ always has the old/new (C, C++) way of doing things - pthreads vs std::thread, std::mutex, etc, so even if the language provides easier ways of writing bug free code, there is no way to force developers to use those facilities.
In C++ there is also the issue of how to make statically allocated data structures thread safe in an enforceable way. Another kind of smart reference object, perhaps? Disallow global objects not accessed by such references?
C++ (which I have used since long before C++11) really wants to be two conflicting things - encompassing C's low level role as the ultimate systems programming language with no guardrails, while also wanting to compete as a much higher-level safer language for application developers. Perhaps the two safe+unsafe roles can be better combined into one language if one were to start from scratch. I'm not sure that Rust gets it right either - erring in the other direction by not being flexible enough.
I ask because I can think of a few ways it’s less flexible than C, but I also think that effect is massively overstated by people who aren’t familiar with the language. There are OS kernels written in Rust, for example.
template <typename T>
struct Locker {
    using M = std::shared_mutex;
    struct Locked {
        Locked(M& mtx, std::shared_ptr<T> value) : m_lock(mtx), m_value(std::move(value)) {}
        // operator->, operator*, get, etc.
    private:
        std::lock_guard<M> m_lock;
        std::shared_ptr<T> m_value;
    };
    struct Shared {
        Shared(M& mtx, std::shared_ptr<const T> value) : m_lock(mtx), m_value(std::move(value)) {}
        // operator->, operator*, get, etc.
    private:
        std::shared_lock<M> m_lock;
        std::shared_ptr<const T> m_value;
    };
    Shared shared() { return Shared{m_mutex, m_value}; }
    Locked locked() { return Locked{m_mutex, m_value}; }
    // a nice forwarding ctor that prevents null m_value
private:
    std::shared_ptr<T> m_value;
    M m_mutex;
};
Rust `Arc` = C++ `std::shared_ptr`
Rust `Rc` = C++ `std::shared_ptr` but using a simple integer instead of an atomic so it is not thread safe
`Arc` and `Rc` do not allow you to mutate their contents directly so instead you should use "interior mutability" using something like a `Mutex` (thread-safe) or `RefCell` (not thread-safe), which have runtime checks to ensure no undefined behaviour is introduced. So `Arc<Mutex<T>>` makes it possible to mutate `T`, but `Arc<T>` cannot. Some types like atomics do not require mutability at all, so an `Arc<AtomicBool>` can be mutated directly.
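A sketch of both flavours of interior mutability behind an Arc - a Mutex for ordinary data, and an atomic that can be mutated through a shared reference directly:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

// Spawn n threads that each bump a shared counter and set a flag.
fn parallel_count(n: usize) -> (i32, bool) {
    let counter = Arc::new(Mutex::new(0));
    let done = Arc::new(AtomicBool::new(false));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let c = Arc::clone(&counter);
            let d = Arc::clone(&done);
            thread::spawn(move || {
                *c.lock().unwrap() += 1; // interior mutability via Mutex
                d.store(true, Ordering::SeqCst); // atomics mutate through &self
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    (total, done.load(Ordering::SeqCst))
}

fn main() {
    assert_eq!(parallel_count(4), (4, true));
}
```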
An example of a big C++ codebase using something similar is Chromium, where `std::shared_ptr` is forbidden and `base::RefCounted` (Rust `Rc`) and `base::RefCountedThreadSafe` (Rust `Arc`) should be used instead. WebKit does this too.
This is not actually true, but it's close enough for your purposes here.
But just to be clear about it, see stuff like this: https://stackoverflow.com/questions/58339165/why-can-a-t-be-...
That's for slices, for dynamically sized types (eg. `Box<dyn ToString>`) it contains a pointer to the virtual table.
> Because they lack a statically known size, these types can only exist behind a pointer. Any pointer to a DST consequently becomes a wide pointer consisting of the pointer and the information that "completes" them (more on this below).
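The width difference is directly observable with size_of; a quick sketch:

```rust
use std::mem::size_of;

fn main() {
    // A slice reference carries (data pointer, length): a wide pointer.
    assert_eq!(size_of::<&[u8]>(), 2 * size_of::<usize>());
    // A trait-object reference carries (data pointer, vtable pointer).
    assert_eq!(size_of::<&dyn ToString>(), 2 * size_of::<usize>());
    // A reference to a statically sized type is a thin pointer.
    assert_eq!(size_of::<&u64>(), size_of::<usize>());
}
```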
> Rust `Arc` = C++ `std::shared_ptr`
GP says:
> Rust requires shared pointers (Arc) to also explicitly implement some sort of Mutex-equivalent runtime safety check in order to mutate the data.
Which is it?
> An example of a big C++ codebase using something similar is Chromium ...
Chromium's smart pointers are similar to their standard counterparts -- no mutexes for write access to pointed data.
Also, tangent but interesting: From https://www.chromium.org/developers/smart-pointer-guidelines...:
> Reference-counted objects make it difficult to understand ownership and destruction order, especially when multiple threads are involved. There is almost always another way to design your object hierarchy to avoid refcounting
I mentioned Chromium because they also differentiate between thread safe and non-thread safe shared pointers.
If anything, Rust shared pointers are more similar to C++ std pointers because in Chromium the reference count is inside the class, which is very handy because you can reconstruct a smart pointer from a raw pointer (like `this`), at the cost of needing `T` to extend `base::RefCounted`.
- RefCounted: It's like shared_ptr but refcount load/modify/store operation is not atomic, thus not thread-safe. No synchronization for pointed data.
- RefCountedThreadSafe: It's like shared_ptr. This means refcount load/modify/store is atomic, so has overhead, yet safe to pass across thread boundaries. Again, just like shared_ptr, no synchronization for pointed data.
- Locker class above: It's an (incomplete) wrapper around shared_ptr where read-only access goes through a shared lock and rw access goes through an exclusive lock. I suppose this is what Rust's Arc guarantees at compile time, with less overhead than the sketch above?
So;
> Both are true, Rust just has more restrictions.
No, both are not true; my understanding is Arc ~= Locker && Arc > shared_ptr
Your `Locker` does not do what `Arc` does, even at compile time, because it does not allow concurrent access, like an `Arc<AtomicBool>` would. Your `Locker` is more like an `Arc<RwLock<T>>`.
Best equivalent you can get in C++ is `Arc<T>` = `std::shared_ptr<const T>`.
https://doc.rust-lang.org/std/sync/struct.Arc.html
> Shared references in Rust disallow mutation by default, and Arc is no exception: you cannot generally obtain a mutable reference to something inside an Arc. If you need to mutate through an Arc, use Mutex, RwLock, or one of the Atomic types.
I guess you could get the final pieces to get something similar by creating `Send` and `Sync` traits in C++: https://doc.rust-lang.org/nomicon/send-and-sync.html. I think the main pain point here is that you cannot auto-derive `Send` and `Sync` so it would end up being very verbose.
The rust borrow checker works on values, and all that, not just on objects with RAII.
This will not compile:
let mut x = 42;
let r1 = &x;
let m1 = &mut x; // error[E0502]: cannot borrow `x` as mutable while `r1` is live
println!("{r1}");
It will help you understand why "smart pointers" still won't help you.
e.g. std::string_view seems broken by design in wanting to support both raw-pointer based strings with zero ownership semantics as well as std::string. A string view (abstract concept) really needs to either have shared ownership of the underlying string, or have a non-owning reference that knows when it has been invalidated.
I'm not sure why this means you shouldn't be able to create a string_view on top of std::string, though. You can create a Rust &str on top of String, it just doesn't participate in ownership.
There are lots of places where C++'s long history shows its ragged edges - where newer features really don't play so nicely with older ones. One would certainly hope that a new language like Rust is at least initially more consistent.. the question is what it will look like in 20 years' time, if it's still being actively developed then?
This is somewhat typical of where C++ is at nowadays - layering new functionality on top of old that wasn't designed to accommodate it. In an ideal world the language and libraries would be refactored and rationalized, but of course backwards compatibility precludes that. This is the fate of old languages - stay unchanged and become obsolete, or keep layering on new functionality and become messy and inconsistent.
The sheer difficulty of doing this is one of the motivations behind Rust's borrow checker, which uses a combination of type system and static analyses to prove the safety without running anything. In fact this problem is probably easier to solve for languages where everything is GC-managed; those languages would have a heavy runtime which can transparently handle that in principle!
C++'s unique_ptr and shared_ptr both have a get() method that will return you a raw pointer to the managed object, which can be a safe optimization within a function holding ownership to the object, as well as allowing you to use legacy functions on it that take raw pointer arguments.
I was thinking the C++ compiler could itself realize when it is safe to do so, save the raw pointer to a temp variable, and "rewrite" smart pointer accesses to use this temporary raw pointer. One could even imagine the compiler changing smart pointer function parameters to raw pointers in some circumstances.
From other replies in this thread it seems that Rust's borrow checker addresses the high-level issues of object ownership and thread safety - it's not just a replacement for raw pointers (i.e. a smart pointer), which is exactly what C++'s shared/unique pointers are.
In C++ the result of breaking semantic rules (not just those checked by the borrowck, most of the semantic rules in the language) is IFNDR - your program is Ill Formed, No Diagnostic Required - your entire program has no particular meaning, there is no explanation for what it does, shrug. In Rust it doesn't compile.
For people whose overriding mission is to get the code to compile, C++ is very attractive. Broken garbage? Meaningless nonsense? Not my problem it compiled so I went home. If you want to write software that works, that seems like you didn't do the hard part.
If you really believe that Google and FaceBook (etc, etc) hire morons who don't care if their code works, then you are not qualified to talk about programming languages.
But Rust now has a large amount of mindshare there and is being used a lot in new projects.
T t1; // stack, reference as t1
T* t2 = new T(); // heap, raw pointer, reference as *t2
std::unique_ptr<T> t3 = std::make_unique<T>(); // heap, smart pointer, reference as *t3
T* pt = &t1; // Create a raw pointer to t1! Bad idea!
Why not? What if you have some function f(T *) that you want to call?
But anyway, we're not _just_ talking about stack allocations, but also extra levels of indirection on the heap. For example, vectors store their elements in a heap-allocated buffer directly. If they kept them all in shared pointers, there would be an extra level of indirection. This means e.g. vector::operator[] has to return a reference (which is basically the same thing as a pointer under the hood); it can't return shared_ptr or similar (because storing all its elements as shared pointers would make it way slower due to the extra allocations).
In Rust, vector access is safe (due to the borrow checker), but in C++, it's not.
vector<int> v {1, 2, 3};
int& x = v[0];
v.push_back(4);
printf("%d\n", x);
This code is UB in C++. In Rust, it's impossible to write something like this:

fn main() {
let mut v = vec![1, 2, 3];
let x = &v[0];
v.push(4);
println!("{x}");
}
This code fails to compile.

In C++ (vs C), if the intent is to pass something large efficiently, then you'd use a reference parameter, not a pointer.
You seem to be confused about the meaning of C++ smart pointers - the whole point of them (as a replacement for C's raw pointers) is that they control and indicate ownership. You can't just assign a smart pointer to something you don't own (like an element of a vector). You can copy a shared_ptr to create an additional reference, or move a unique_ptr to move ownership.
A C++ compiler might generate a warning for that invalidated reference. clang++ is generally much better than g++, but I agree it'd be nice if a conforming compiler was forced to at least flag it, if not reject it.
The problem with doing this in the general case, where it's a user-defined (or library defined, as here) data structure, rather than one defined by the language, is that the compiler needs to inspect the implementation of that "push" method and realize that it might do something to invalidate references (& iterators). In the case of a library the compiler won't have access to the implementation to figure that out. How would Rust handle this if "vec" were a user-defined type where only the definition (not implementation) was available - how would it know that the push() was unsafe?
Sure, sorry, I was using "pointer" and "reference" interchangeably. Indeed, references are pointers under the hood.
> You seem to be confused about the meaning of C++ smart pointers
I am not confused at all. I understand exactly what unique_ptr and shared_ptr are in C++. They are basically the equivalent of Rust's Box and Arc (except that they can be null), but I used C++ before Rust so I learned about unique_ptr and shared_ptr first.
You are the one who asked what the advantage of Rust's borrow-checker is over C++-style memory management with smart pointers, but you seem to understand that it doesn't make sense to use smart pointers everywhere. Aren't you answering your own question? The advantage of Rust over C++ is that the borrow checker helps you in the cases where it doesn't make sense to use smart pointers / heap allocations.
You are the one who is maybe confused about what the borrow checker even is/does.
> A C++ compiler might generate a warning for that invalidated reference.
Neither clang nor g++ does so, even with -Wall. I just checked. How could they?
> I agree it'd be nice if a conforming compiler was forced to at least flag it, if not reject it.
If you did this then you would have basically reinvented the borrow checker.
> The problem with doing this in the general case, where it's a user-defined (or library defined, as here) data structure, rather than one defined by the language, is that the compiler needs to inspect the implementation of that "push" method and realize that it might do something to invalidate references (& iterators).
Not in Rust. It only needs to inspect the declaration. That is the whole point of the borrow checker. The fact that you think this can only be done for built-in types is what made me suspect that you don't understand what the borrow checker is.
The declaration of the indexing operator for Vec<T> is roughly (getting rid of some irrelevant details):
fn index(&self, i: usize) -> &T
This is shorthand for:

fn index<'a>(&'a self, i: usize) -> &'a T
Those references (the `&self` and the returned `&T`) have the same lifetime. That lifetime cannot overlap with any lifetime of a _mutable_ reference to the same data. `push` can be declared like so:

fn push(&mut self, value: T)
Because this requires a mutable reference to `self`, the compiler statically checks that it does not overlap with any other reference to the same data, which includes the reference returned by the indexing operation, which is why the example I gave won't compile. This works the same way with user-defined types; Vec is not special in any way.

The reason you can't do a similar thing in C++ is because it has no syntax for lifetimes. If you had a function on vector like
const T& index(size_t i)
you have no idea if the returned `T` is derived from `this` or from somewhere else, so you don't know what its lifetime should be.

How exactly is this defined for something like index(), which is returning a reference to a different type than the object itself, and where the declaration doesn't indicate that the referred-to T is actually part of the parent object? Does the language just define that all references (of any type) returned by member functions are "invalidated" (i.e. caught by the compiler's borrow checker) by a mutable member call?
What happens in Rust if you attempt to use a reference to an object after the object lifetime has ended? Will that get caught at compile time too, and if so at what point (when attempt is made to use the reference, or at end of object lifetime) ?
Yes, exactly.
> How exactly is this defined for something like index() which is returning a reference to a different type than the object itself, and where the declaration doesn't indicate that the referred to T is actually part of the parent object?
Only if they have the same lifetime (the 'a in my example). For example, imagine a function that gets an element of a vector and uses that to index into another vector. You might write it like this:
fn indirect_index<'a, 'b, T>(v1: &'a Vec<usize>, v2: &'b Vec<T>, i: usize) -> &'b T {
let j = v1[i];
&v2[j]
}
The returned value is not invalidated by future mutations of the first vector, only of the second vector, since they share the lifetime parameter 'b.

> What happens in Rust if you attempt to use a reference to an object after the object lifetime has ended?
This is prevented at compile time by the borrow checker. E.g.:
// this takes ownership of the vec,
// and just lets it go out of scope
fn drop_vec<T>(_v: Vec<T>) {
}
fn main() {
let v = vec![1, 2, 3];
let x = &v[0];
drop_vec(v);
println!("{x}");
}
This program fails to compile with the following error:

error[E0505]: cannot move out of `v` because it is borrowed
--> src/main.rs:9:14
|
7 | let v = vec![1, 2, 3];
| - binding `v` declared here
8 | let x = &v[0];
| - borrow of `v` occurs here
9 | drop_vec(v);
| ^ move out of `v` occurs here
10 | println!("{x}");
| --- borrow later used here
Just by having built-in knowledge of standard library types such as std::vector, the same way the compiler has built-in knowledge of some library functions such as C's printf().
I wouldn't expect such policing to be perfect, but the compiler could at least catch simple cases where reference/iterator use follows an invalidating operation in the same function.
Don't get me wrong - I'm not defending C++. It's a beast of a language, and takes a lot of experience and self-discipline to use without creating bugs that are hard to find.
Right, but you were asking what advantage Rust has over C++, which is what I'm trying to explain. (If you had instead asked what advantage C++ has over Rust, I'd have given a very different answer!)
> It's a beast of a language, and takes a lot of experience and self-discipline to use without creating bugs that are hard to find.
Rust makes creating a certain class of these hard-to-find bugs much harder.
In C or C++, if a function/method takes a raw pointer (or some other lifetime-constrained type like string_view), I have no idea whether it's going to stash it somewhere and try to look at it again later. If it returns a raw pointer or reference, I don't know whether it will get invalidated by some future call. Iterator invalidation is a huge source of UB in C++ but completely unknown in Rust.
Clearly having a hash map where all the values are stored indirectly in shared_ptr would let you provide a safe access API, but would be horrible for performance. In Rust you can have the safe API without compromising on efficiency.
Low-level programming would be quite a different scene if there were a lot of permitted data optimizations by compilers (profile guided more concise representations of structs, replacing pointer based data structures with indexed layouts, etc).
The other footgun is that there's no concept of a non-owning pointer, which is dangerous - there are several equally dominant conventions in C++: a naked pointer might be heap-allocated, it might represent an optional const&, or it might point to the stack. Ingesting naked pointers should probably require an explicit annotation instead of assuming it's a new'ed pointer.
It's a neat idea, but I suspect this particular implementation is likely to introduce more UB, not less, because of the thread-safety footguns. In a single-threaded system, the borrow checker doesn't add a huge amount. The biggest gain is lifetime enforcement, which this doesn't get you. Also, because you have to construct these Vals at the point of initialization of your value, it's viral. Upgrading input arguments to use this can be dangerous if dealing with pointers.
* For C++ users: RefCell is a compile-time borrow checker escape hatch that does the checking at runtime instead - you can borrow immutably as many times as you like xor borrow once mutably; anything else is a panic.
I thought of a rather nice way to picture it, the C++ type system is like you have Roman Numerals, and so now the notation itself fights trying to understand important concepts about numbers (types). Languages with a better type system are like having Arabic Numerals, it's not a panacea, but the notation allows significant improvements in expressiveness and teachability.
This analogy seems especially apt because Roman Numerals lacked zero as I understand it, and the C++ type system doesn't cope well with the idea of ZSTs nor with the Empty types which are analogous to zero in type arithmetic.
Regardless of "age of the project" or other considerations, this doesn't seem like a particularly tricky edge case of generic programming and yet C++ is stumped AFAICT
I am neither an expert in modern C++ nor in Rust, but I have witnessed enough of C++'s evolution over time to know that if C++ language devs find a feature desirable enough they will do whatever it takes to frobnicate the language in order to claim support for that feature.
[0] https://www.perplexity.ai/search/is-it-possible-Sd3TML68TfKv...
Infallible is an "empty type"[2]; there are no values of type Infallible, so a value cannot be constructed, so Option<Infallible> is always None, never Some(infallible). Importantly, the compiler knows this and can use it to reason about the correctness of code.
C++ has no empty types. void is close, but it's sometimes used where a unit type would be used, and anyway it's not a first-class type; for example, you can't use std::optional<void>. Even if it were possible to make an empty type in C++, it wouldn't give you anything, because the compiler isn't equipped to reason about them.
BTW, the Rust equivalent of std::optional<std::monostate> is Option<()>. The empty tuple is Rust's idiomatic unit type.
One of the reasons I am not able to follow C++ as closely as I did in the past isn't directly related to its complexity; rather, my main work tools - the JVM, CLR, and Web ecosystems - are reaching similar levels of complexity, especially with the six-month release cadence, and there is only so much one can keep up with.
It is likely the best that can be done, but that's my point, C++ can't do this because the foundational type system isn't up to the task.
The bigger issue is that all of this new capability can't be easily grafted onto the old standard library. If you were to write a re-designed standard library from a C++20 baseline, and some people do, it is a dramatically different experience. Modern C++ is an amazing library-building language but the 'std' library it comes with is legacy rubbish in many regards.
Most of the Rust debates, praise and criticism are about higher level features, but just these sane pleasurable fundamentals is the main thing I miss in most languages (mostly Go and JS in my case).
auto foo2 = foo0; // foo0's ownership is not transfered to foo2
was this supposed to say "now transferred"?

I don't think it's available yet, but last I heard that dev is working on a borrow checker for C++ as well.