I want it to be easier to have more crates. The overhead of converting a module tree into a new crate is high. Modules get to have hierarchy, but crates end up being flat. Some of this is a direct result of the flat crate namespace.
A lot of the toil ends up coming from the need to muck with toml files and the fact that rust-analyzer can’t do it for me. I want to have refactoring tools to turn module trees into crates easily.
I feel like when I want to do that, I have to play this game of copying files then playing whack-a-mole until I get all the dependencies right. I wish dependencies were expressed in the code files themselves à la Go. I think Go did a really nice job with the packaging and dependency structure. It’s what I miss most.
For example, classical object-oriented programming uses classes both as an encapsulation boundary (where invariants are maintained and information is hidden) and a data boundary, whereas in Rust these are separated into the module system and structs respectively. This allows for complex invariants cutting across types, whereas a private member of a class can only ever be accessed within that class, not by its siblings within a module.
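A minimal sketch of an invariant that cuts across two types, assuming made-up names (`bank`, `Account`, `Ledger`) that are not from any real codebase. The private `balance` field is private to the module, so `Ledger` can maintain "total equals the sum of balances" while code outside the module cannot break it.

```rust
// Hypothetical names throughout: `bank`, `Account`, and `Ledger` are illustrative only.
mod bank {
    pub struct Account {
        // Private to the *module*, not to the struct: `Ledger` below can touch it.
        balance: u64,
    }

    pub struct Ledger {
        accounts: Vec<Account>,
        // Invariant spanning both types: `total` always equals the sum of all balances.
        total: u64,
    }

    impl Ledger {
        pub fn new() -> Self {
            Ledger { accounts: Vec::new(), total: 0 }
        }

        /// Opens an account and returns its index.
        pub fn open(&mut self, deposit: u64) -> usize {
            self.accounts.push(Account { balance: deposit });
            self.total += deposit;
            self.accounts.len() - 1
        }

        /// Moves money between accounts; `total` is unchanged by construction.
        pub fn transfer(&mut self, from: usize, to: usize, amount: u64) -> bool {
            let n = self.accounts.len();
            if from >= n || to >= n || self.accounts[from].balance < amount {
                return false;
            }
            self.accounts[from].balance -= amount;
            self.accounts[to].balance += amount;
            true
        }

        pub fn balance(&self, id: usize) -> Option<u64> {
            self.accounts.get(id).map(|a| a.balance)
        }

        pub fn total(&self) -> u64 {
            self.total
        }
    }
}

fn main() {
    let mut ledger = bank::Ledger::new();
    let a = ledger.open(100);
    let b = ledger.open(50);
    ledger.transfer(a, b, 30);
    // Outside `bank`, the fields are unreachable; only the methods are.
    println!("{:?} {:?} {}", ledger.balance(a), ledger.balance(b), ledger.total());
}
```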
Another example is the trait object (dyn Trait), which allows the client of a trait to decide whether dynamic dispatch is necessary, instead of baking it into the specification of the type with virtual functions.
Notice also the compositionality: if you do want to mandate dynamic dispatch, you can use the module system to either only ever issue trait objects, or opaquely hide one in a struct. So there is no loss of expressivity.
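To make that concrete, here is a small sketch with hypothetical `Shape`, `Square`, `Rect`, and `AnyShape` types. The same trait is used with static dispatch by one caller and dynamic dispatch by another, and `AnyShape` shows the "opaquely hide a trait object in a struct" option.

```rust
// All type and function names here are made up for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// One client picks static dispatch: monomorphized per concrete type, no vtable.
fn area_static<S: Shape>(shape: &S) -> f64 {
    shape.area()
}

// Another client picks dynamic dispatch so it can mix types in one collection.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// A library that wants to mandate dynamic dispatch can hide the trait object
// behind an opaque struct (in a real crate this would sit behind a module boundary).
struct AnyShape(Box<dyn Shape>);

impl AnyShape {
    fn square(side: f64) -> Self {
        AnyShape(Box::new(Square(side)))
    }

    fn area(&self) -> f64 {
        self.0.area()
    }
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(1.0, 3.0))];
    println!("{}", area_static(&Square(2.0)));
    println!("{}", total_area(&shapes));
    println!("{}", AnyShape::square(3.0).area());
}
```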
Rust's users find the module system even more difficult than the borrow checker. I've tried to figure out why, and figure out how to explain it better, for years now. Never really cracked that nut. The modules chapter of TRPL is historically the least liked, even though I re-wrote it many times. I wonder if they've tried again lately; I should look into that.
> Another example is the trait object (dyn Trait), which allows the client of a trait to decide whether dynamic dispatch is necessary, instead of baking it into the specification of the type with virtual functions.
Here I'd disagree: this is separating the two features cleanly. Baking it into the type means you only get one choice. This is also how you can implement traits on foreign types so easily, which matters a lot.
I'm surprised the module system creates controversy. It's a bit confusing to get one's head around at first, especially when traits are involved, but the visibility rules make a ton of sense. It quite cleanly solves the problem of how submodules should interact with visibility. I've started using the Rust conventions in my Python projects.
I have only two criticisms:
First, the ergonomics aren't quite there when you do want an object-oriented approach (a "module-struct"), which is maybe the more common use case. However, I don't know if this is a solvable design problem, so I prefer the tradeoff Rust made.
Second, and perhaps a weaker criticism, the pub visibility qualifiers like pub(crate) seem extraneous when re-exports like pub use exist. I appreciate maybe these are necessary for ergonomics, but it does complicate the design.
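For readers less familiar with the two mechanisms being compared, a tiny sketch with hypothetical `engine` and `parts` modules: `pub(crate)` caps how far an item can be seen, while `pub use` surfaces an item from a private module under a different path.

```rust
// `engine`, `parts`, `tune`, and `piston_count` are made-up names.
mod engine {
    // Visibility qualifier: usable anywhere in this crate, never by downstream crates.
    pub(crate) fn tune() -> u32 {
        42
    }

    // A private submodule whose contents are surfaced selectively.
    mod parts {
        pub fn piston_count() -> u32 {
            4
        }
    }

    // Re-export: `parts` stays hidden, but `piston_count` appears as `engine::piston_count`.
    pub use self::parts::piston_count;
}

fn main() {
    println!("{} {}", engine::tune(), engine::piston_count());
    // engine::parts::piston_count(); // would not compile: `parts` is private
}
```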
There is one other piece of historical Rust design I am curious about, which is the choice to include stack unwinding in thread panics. It seems at odds with the systems programming principal use case for Rust. But I don't understand the design problem well enough to have an opinion.
> the pub visibility qualifiers like pub(crate) seem extraneous
I feel this way too, but some people seem to use them.
> which is the choice to include stack unwinding in thread panics. It seems at odds with the systems programming principal use case for Rust.
In what way?
The module system in Rust is conceptually huge, and I feel it needs a 'Rust modules: the good parts' resource to guide people.
(1) There are five different ways to use `pub`. That's pretty overwhelming, and in practice I almost never see `pub(in foo)` used.
(2) It's possible to have nested modules in a single file, or across multiple files. I almost never see modules with braces, except `mod tests`.
(3) It's possible to have either foo.rs or foo/mod.rs. It's also possible to have both foo.rs and foo/bar.rs, which feels inconsistent.
(4) `use` order doesn't matter, which can make imports hard to reason about. Here's a silly example:
use foo::bar;
use bar::foo;
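A compilable version of the same idea, using hypothetical `shapes` and `circle` modules: the first `use` goes through a name that is only imported by the second one, and the program still builds.

```rust
// `shapes`, `circle`, and `diameter` are made-up names for illustration.
mod shapes {
    pub mod circle {
        pub fn diameter(radius: f64) -> f64 {
            2.0 * radius
        }
    }
}

// This import goes through `circle`, which is only brought into scope on the NEXT line.
use circle::diameter;
use shapes::circle;

fn main() {
    println!("{}", diameter(1.5));
}
```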
(Huge fan of your writing, by the way!)
Thank you!
This bit me when trying to write a static analysis tool for Rust that finds missing imports: you essentially need to loop over imports repeatedly until you reach a fixpoint. Maybe it bites users rarely in practice.
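A rough sketch of what "loop until you reach a fixpoint" can look like, under a deliberately toy model that is nowhere near real Rust name resolution: each `use` is an (alias, path) pair, a path is resolvable once its first segment is either a known root or an already-resolved alias, and the `resolve` function and its inputs are hypothetical.

```rust
use std::collections::HashMap;

/// Toy resolver: returns (alias -> absolute path, aliases that never resolved).
/// Assumption: a path is absolute when its first segment appears in `roots`.
fn resolve(uses: &[(&str, &str)], roots: &[&str]) -> (HashMap<String, String>, Vec<String>) {
    let mut resolved: HashMap<String, String> = HashMap::new();
    let mut pending: Vec<(&str, &str)> = uses.to_vec();

    loop {
        let before = pending.len();
        pending.retain(|&(alias, path)| {
            // Split `head::rest`; a single-segment path has no rest.
            let (head, rest) = match path.split_once("::") {
                Some((h, r)) => (h, Some(r)),
                None => (path, None),
            };
            // The head is usable if it is a root or an alias resolved earlier.
            let base = if roots.contains(&head) {
                Some(head.to_string())
            } else {
                resolved.get(head).cloned()
            };
            match base {
                Some(base) => {
                    let full = match rest {
                        Some(r) => format!("{base}::{r}"),
                        None => base,
                    };
                    resolved.insert(alias.to_string(), full);
                    false // resolved: drop it from `pending`
                }
                None => true, // not resolvable yet: keep it for the next pass
            }
        });
        // Fixpoint: a full pass made no progress (this also covers "nothing left").
        if pending.len() == before {
            break;
        }
    }

    let unresolved = pending.iter().map(|(alias, _)| alias.to_string()).collect();
    (resolved, unresolved)
}

fn main() {
    // Declared in "worst" order, so it takes several passes to settle.
    let uses = [
        ("c", "b::inner"),
        ("b", "a::mid"),
        ("a", "std::collections"),
        ("x", "missing::thing"),
    ];
    let (resolved, unresolved) = resolve(&uses, &["std"]);
    println!("{resolved:?}");
    println!("unresolved: {unresolved:?}");
}
```

Anything still pending when a pass makes no progress is reported as unresolved, which is roughly the "missing import" case in this simplified model.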
It would also have solved the problem where you end up making a lot of things `public` not because the logic dictates it, but as the only way to share across crates.
It should have been all modules (even main.rs with mandatory `lib.rs` or whatever) and `crate` should have been a re-exported interface.
Very very old Rust had "crate files" which were this https://github.com/rust-lang/rust/blob/a8eeec1dbd7e06bc811e5...
.rc standing for "rust crate"
There's pros and cons here. I'm of two minds.
That's an interesting take. I feel like all three of these languages fit into pretty discrete lanes that the others don't. Python for quick hacking or scientific stuff, Go for web services and self-contained programs, Rust for portability (specifically sharing code as C ABI or WASM) and safety.
> It’s not hard to learn
I agree Rust is easy to learn. I've done it 4 or 5 times now.
No joke, it's true.
When I first saw Rust I agreed with everything: this is the way, it's correct, and it's nice (it hurt me a lot, given that I had used 10+ langs professionally before, and I came straight from F#, so I needed very little to be converted!).
And obviously that is how I would have done things if the other langs had the proper features!
Then I needed to actually program correctly and bam! It's so hard!
I needed to relearn it many times. And yes, the hard part is to stop doing all the things that I had done implicitly in all the other langs.
BTW, the hard part with Rust is that a) its syntax is too familiar and b) it's a totally different programming model. Until you get the second part and truly pay attention to `moves, borrow, share, lock, clone, copy` instead of `loops, iter, conditional, read, write, etc`, it is very hard to progress.
This is somewhat of a stretch: dyn Traits in Rust are sort of like compile time duck typing. OTOH, interfaces in Go and virtual functions in C++ are the same thing.
(Though you can emulate them with unsafe in Rust, like anyhow)
Also, the type system is very nice, probably one of the best amongst non-functional programming languages. Overloading, generics, type inference, distinct type aliases, first-class functions, subranges, etc. etc.
I've seen Go have praise over not supporting OOP. Nim got some of it too =D. No classes, only structs and functions. In fact every operator is a function and you can overload them for custom types. OOP is still possible, but it's harder to make an inheritance monster with factory factory factories.
Nim gives you power, just remember that with power comes responsibility.
This is my pitch for Nim language.
https://www.lurklurk.org/effective-rust/ could be for you; while it starts from the very basics—for a person that knows how to program—it does seem to cover a lot and in a structured manner.
I'd expect one to learn at least something new by page 70 :).
IO can’t be unit tested hence why you mock it. But his code didn’t do anything but confirm his mock worked. He’s writing mocks and testing mocks.
The functionality he referenced is just inherently not unit testable. Again, if you try to mock it and test things you end up testing your mocked code. That’s it.
I’ve seen this strange testing philosophy pop up time and time again where test code misses a shit load of errors because it’s just confirming that the mocks work.
For this area you need to move to integration tests if you want to confirm it works. This comes with the pain of rewriting tests should the implementations change but testing just mocks isn’t solving this problem.
Your unit tests only really matter if you’re doing a lot of big algorithm stuff and not much IO. Mocking helps if you have some IO sprinkled into a unit computation. In the example he gave every operation was IO and every operation had to be mocked so wtf was he thinking to want to place that in a unit test?
Say, I have this module that uses a private MongoDB as a cache. Its unit tests spin up a standard MongoDB container and use it (and then tear it down). Are they still unit tests or should I start calling them "integration tests"?
Unit tests should just test code units. All the external stuff should be mocked or not tested.
The example in the post is a unit test.
It’s good to keep it separate as unit tests are really easy to run and less complicated and much faster. Integration tests are much more complicated and often sort of freeze your code as it locks in the implementation.
Alternatively you mock _everything_ and then your "unit test" ends up just being a tautological test asserting that "the code I wrote executes in the way I wrote it". (Not to mention that every time you mock something you are also implicitly asserting what the expected behavior of that thing is).
The only truly reliable tests are E2E tests, but they are too expensive to cover all possible permutations as there are just too many.
This is the catch 22 with testing, and we're always forced to make pragmatic choices about where to draw our boundaries to maximize the value (i.e. actual bugs caught) of them.
That is, from Mongo, you use Serde and wind up with only valid records operated upon, of a table of such values.
Or anything that you have to use ipc like sockets or shared memory.
It's pretty easy to make testing stuff like "We'll conjure into existence a loopback network server" hermetic and likewise for the Entity Framework trick where it runs tests against a SQLite db even though that's not how your real DB works, it's often good enough. Containers are something which could be hermetic, but I am dubious.
It's an example for a blog post. I can't write thousands of lines of code for it, so I just sketched a vague outline.
Suppose the author doesn't use build.rs, which appears to have been composed of the listed things almost entirely.
It's within the rust ecosystem though. Perhaps cargo could expose a simpler way to use code generators.
In Rust there's at least 5 types of everything, in order of strength:
- Value / unqualified / "owned"
  - Generically, T
  - Optionally mutable
- Mutable Reference - &mut T
  - you can only have one of these for a given value
- Reference / Shared reference - &T
  - you can have arbitrarily many for a given value
- Raw constant pointer - *const T
  - you can have arbitrarily many, and they're not liveness checked
- Raw mutable pointer - *mut T
  - you can have arbitrarily many, and they're not liveness checked
Now I say at least because things get appreciably more complicated when you find yourself dealing with lifetimes which apply to "References", those are indeed themselves types, but ultimately represent a compiler-executed calculus regarding liveness relative to some Value. They also can 'fan out' like a multiple-dereference pointer, but the tricky part is how the APIs for Types conform to these, for example:
Since there are 3 different types of things in a collection, then there are 3 different ways to iterate over them `iter()`, `iter_mut()`, `into_iter()` in increasing order of strength. Most of the breadth or early complexity arises from the urge to treat these as a distraction, rather than a fundamental aspect of systems code.
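A short sketch of those three iteration methods on a `Vec<String>`, with hypothetical helper names (`total_len`, `shout`, `join`). Each helper's signature mirrors what the iterator hands out: `&String`, `&mut String`, or `String` by value.

```rust
// Helper names are made up; the three iterator methods are the standard ones.

// iter(): borrows each element immutably, yielding `&String`.
fn total_len(names: &[String]) -> usize {
    names.iter().map(|s| s.len()).sum()
}

// iter_mut(): borrows each element mutably, yielding `&mut String`.
fn shout(names: &mut [String]) {
    for name in names.iter_mut() {
        name.push('!');
    }
}

// into_iter(): takes ownership of the collection, yielding `String` by value.
fn join(names: Vec<String>) -> String {
    names.into_iter().collect()
}

fn main() {
    let mut names = vec![String::from("ab"), String::from("c")];
    println!("{}", total_len(&names));
    shout(&mut names);
    println!("{names:?}");
    let joined = join(names);
    // `names` has been moved into `join`; using it here would be a compile error.
    println!("{joined}");
}
```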
Crates / modules are a bit of a meme: https://www.reddit.com/r/rust/comments/ujry0b/media_how_to_c...
Bevy has done some work investigating build performance: https://bevyengine.org/learn/quick-start/getting-started/set...
I still do think about it all in C++ terms. Borrowing and ownership are just specific terminology for things that you must know to create a correct C++ program of any useful complexity and performance. Two common anti-Rust arguments are:
- It's hard to learn
- You should just git gud at C++
But the Euler diagram of people who struggle with the borrow checker and people who are gud enough at C++ has no overlap. Likewise String and &str.
I also think it means the performance user story is, somehow, underappreciated, especially for naive users. Immutable borrows and moves are just part of the development experience, and copying is the unobvious path. And if you still struggle you can often just toss in `rayon` in ways that you never could with `std::execution::par`
I've yet to see anyone demonstrate the elegance of Rust error handling for anything but the simplest of cases. It's all fun and games and question marks... until you hit this:
$ ./app
called `Result::unwrap()` on an `Err` value: no such file or directory
And then you start investigating and it turns out that the error value comes from somewhere deep in an unknown callstack that got discarded by the authors using '?' everywhere.
Yes, I know about anyhow and thiserror and eyre and ... ; point is none of this is ever shown in these 'look how elegant error handling is' posts. Come on, let's be a bit more honest with ourselves about Result<T, E> and '?' - it's not a full solution to error handling. After two years I'm sure you've hit this.
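To illustrate the complaint with only the standard library (so no anyhow, thiserror, or eyre), here is a sketch contrasting a bare `?` with an error that carries context. `ConfigError`, `read_bare`, `read_with_context`, and the file path are all hypothetical.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;

// Bare `?`: the caller gets the OS-level error with no hint about which file failed.
fn read_bare(path: &str) -> Result<String, io::Error> {
    let text = fs::read_to_string(path)?;
    Ok(text)
}

// A hand-rolled error type (hypothetical) that records what we were trying to do.
#[derive(Debug)]
struct ConfigError {
    path: String,
    source: io::Error,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to read config file {:?}", self.path)
    }
}

impl Error for ConfigError {
    // Keeps the original error reachable as the underlying cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// Same operation, but the error is wrapped with the path before propagating.
fn read_with_context(path: &str) -> Result<String, ConfigError> {
    fs::read_to_string(path).map_err(|source| ConfigError {
        path: path.to_string(),
        source,
    })
}

fn main() {
    let path = "definitely/missing/app.toml";
    if let Err(e) = read_bare(path) {
        println!("bare:    {e}");
    }
    if let Err(e) = read_with_context(path) {
        // Walk one level of the source chain to show the underlying cause.
        let cause = e.source().map(|s| s.to_string()).unwrap_or_default();
        println!("context: {e}: {cause}");
    }
}
```

This is the boilerplate that the libraries named above exist to reduce; it still records no backtrace, only the context you chose to attach.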
What makes this error checking good is that you can use it correctly and it is less cumbersome than the try/catch from C++.
>Come on, let's be a bit more honest with ourselves about Result<T, E> and '?' - it's not a full solution to error handling. After two years I'm sure you've hit this.
Nothing is ever a full solution. But it is meaningless to talk about this without doing comparisons. Do you think try/catch was the superior solution all along?
This isn't true. It really depends.
fn main() -> anyhow::Result<()> {
can be perfectly good, depending on your needs.
What Rust does with error handling is give you flexibility. It's true that means you can make a mess. I myself have a TODO on my current codebase where I'm not exactly happy with what I'm doing at the moment overall. But it can also be very elegant, and more importantly, it doesn't force you into one paradigm that may not be good for your needs, but allows you to decide, which is very important in Rust's conceptual space. I wouldn't want to be forced to use the above signature in a no_std context, for example.
This gives the error to the calling process though. In some sense that means it is handled.
I don't think I disagree though, but I think my point still stands. If you do not think about at what point your errors are actually resolved then your program does not have proper error handling. If you unwrap in the middle of your code, you have to accept that crashing there is a possibility, even from an error far away.
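A small sketch of "deciding where an error is resolved", using `Box<dyn Error>` from the standard library as a stand-in for the `anyhow::Result` signature above. `parse_port` and `port_or_default` are hypothetical helpers.

```rust
use std::error::Error;
use std::num::ParseIntError;

// Deep in the call stack: report the failure, decide nothing.
fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

// A boundary where the error is actually resolved: recover with a fallback.
fn port_or_default(raw: &str, default: u16) -> u16 {
    match parse_port(raw) {
        Ok(port) => port,
        Err(e) => {
            eprintln!("invalid port {raw:?} ({e}); using {default}");
            default
        }
    }
}

// The outermost boundary: returning `Err` from `main` hands the failure to the
// calling process as a non-zero exit status.
fn main() -> Result<(), Box<dyn Error>> {
    let optional = port_or_default("not-a-port", 8080);
    let required = parse_port("9090")?;
    println!("{optional} {required}");
    Ok(())
}
```

An `unwrap()` inside `parse_port` would instead turn every caller's bad input into a crash at that line, which is the situation described above.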
I agree, but that's kind of my point - that's all these Rust praise articles are ever showing :).
> But it is meaningless to talk about this without doing comparisons.
Not comparing to other languages/approaches however allows a discussion to stay about Rust and how to make things better instead of yet another fruitless discussion about which language or approach is better in a vacuum. I'm not interested in demonstrating that Rust is better than C++ or vice versa, I'm interested in Rust being good on its own merits.
Then they are completely misrepresenting what error handling is about. Rust's error handling is good because programs only crash where you allow them to crash. If you are using the error handling correctly, you only crash wherever there is an error you believe should be unrecoverable.
>Not comparing to other languages/approaches however allows a discussion to stay about Rust and how to make things better instead of yet another fruitless discussion about which language or approach is better in a vacuum.
If you are complaining about the error handling of some language, surely the most important and productive thing would be to compare it to the paradigms of other languages. If you are unwilling to consider that someone else is doing something correctly which you are doing wrong, you can't improve. Especially when there are two major paradigms, it seems important to talk about the alternative.
What exactly would you change about Rust's error handling?
That's fine, but the commenter didn't pull this out of nowhere. It's in the article. Your reply makes it sound like you didn't read it, as if OP is giving a rare hypothetical that most Rust programmers don't support. This is common error handling advice in the Rust community.
And this is representative of what you may run into when reading other people's code or trying to contribute to their library.
Rust's error handling works by defining potential crashes with unwraps. A program never crashes unexpectedly there, as this is where you expect it to crash. The general pattern is fine and widely used; the other commenter did not understand that this kind of behavior results from unwrapping where you really do not want to unwrap.
That's not the case, I understand the issue at hand quite well. Please don't do this.
I didn't reply to your other comment because this conversation isn't/wasn't going anywhere.
Yeah... maybe not, but I can see this being a project in an undergraduate course.
Rust has modules, crates and workspaces. To optimize builds, you'll eventually move shared resources to their own crate(s).
Thus you almost certainly need parametric polymorphism whereas other languages described would use implementation/interface/inheritance/duck polymorphism. Parametric polymorphism explodes rapidly if you aren't judicious and it doesn't feel very agile.
Once you are dealing in traits, does that trait have a copy bound or am I going to need to take a borrow and also grab a lifetime next to my trait parameter? Or should I just hide it all by slapping my mock with an `impl Trait for Arc<RefCell<Mock>>` or equivalent?
pub trait LibraryRepository: Send + Sync + 'static {
    async fn create_supplier(
        &self,
        request: supplier::CreateRequest,
    ) -> Result<Supplier, supplier::CreateError>;
}
I am splitting things "vertically" (aka by feature) rather than "horizontally" (aka by layer). So "library" is a feature of my app, and "suppliers" are a concept within that feature. This call ultimately takes the information in a CreateRequest and inserts it into a database.
My implementation looks something like this:
impl LibraryRepository for Arc<Sqlite> {
    async fn create_supplier(
        &self,
        request: supplier::CreateRequest,
    ) -> Result<Supplier, supplier::CreateError> {
        let mut tx = self
            .pool
            .begin()
            .await
            .map_err(|e| anyhow!(e).context("failed to start SQLite transaction"))?;
        let name = request.name().clone();
        let supplier = self.create_supplier(&mut tx, request).await.map_err(|e| {
            anyhow!(e).context(format!("failed to save supplier with name {name:?}"))
        })?;
        tx.commit()
            .await
            .map_err(|e| anyhow!(e).context("failed to commit SQLite transaction"))?;
        Ok(supplier)
    }
}
where Sqlite is
#[derive(Debug, Clone)]
pub struct Sqlite {
    pool: sqlx::SqlitePool,
}
You'll notice this basically:
1. starts a transaction
2. delegates to an inherent method with the same name
3. finishes the transaction
The inherent method has this signature:
impl Sqlite {
    async fn create_supplier(
        self: &Arc<Self>,
        tx: &mut Transaction<'_, sqlx::Sqlite>,
        request: supplier::CreateRequest,
    ) -> Result<Supplier, sqlx::Error> {
So, I can choose how I want to test: with a real database, or without.
If I want to write a test using a real database, I can do so, by testing the inherent method and passing it a transaction my test harness has prepared. sqlx makes this really nice.
If I'm testing some other function, and I want to mock the database, I create a mock implementation of LibraryService, and inject it there. Won't ever interact with the database at all.
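A simplified, synchronous sketch of the mock-injection side, using only the standard library. This is not the code from the application described above: `MockRepository`, `register_suppliers`, the `Duplicate` variant, and the simplified `Supplier` struct are all assumptions made for illustration. A `Mutex` is used rather than `RefCell` because the trait requires `Send + Sync`, which `RefCell` cannot satisfy.

```rust
use std::sync::{Arc, Mutex};

// Simplified stand-ins for the domain types; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Supplier {
    id: u32,
    name: String,
}

#[derive(Debug, PartialEq)]
enum CreateError {
    Duplicate(String),
}

// Synchronous version of the repository trait, to keep the sketch std-only.
trait LibraryRepository: Send + Sync + 'static {
    fn create_supplier(&self, name: &str) -> Result<Supplier, CreateError>;
}

// In-memory mock: records what was created and never touches a database.
#[derive(Default)]
struct MockRepository {
    created: Mutex<Vec<Supplier>>,
}

impl LibraryRepository for Arc<MockRepository> {
    fn create_supplier(&self, name: &str) -> Result<Supplier, CreateError> {
        let mut created = self.created.lock().unwrap();
        if created.iter().any(|s| s.name == name) {
            return Err(CreateError::Duplicate(name.to_string()));
        }
        let supplier = Supplier {
            id: created.len() as u32 + 1,
            name: name.to_string(),
        };
        created.push(supplier.clone());
        Ok(supplier)
    }
}

// The "other function" under test: it only knows about the trait.
fn register_suppliers<R: LibraryRepository>(
    repo: &R,
    names: &[&str],
) -> Vec<Result<Supplier, CreateError>> {
    names.iter().map(|n| repo.create_supplier(n.trim())).collect()
}

fn main() {
    let repo = Arc::new(MockRepository::default());
    let results = register_suppliers(&repo, &[" Acme ", "Acme", "Globex"]);
    println!("{results:?}");
}
```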
In practice, my application is 95% end-to-end tests right now because a lot of it is CRUD with little logic, but the structure means that when I've wanted to do some more fine-grained tests, it's been trivial. The tradeoff is that there's a lot of boilerplate at the moment. I'm considering trying to reduce it, but I'm okay with it right now, as it's the kind that's pretty boring: the worst thing that's happened is me copy/pasting one of these implementations of a method and forgetting to change the message in that format!. I am also not 100% sure if I like using anyhow! here, as I think I'm erasing too much of the error context. But it's working well enough for now.
I got this idea from https://www.howtocodeit.com/articles/master-hexagonal-archit..., which I am very interested to see the final part of. (and also, I find the tone pretty annoying, but the ideas are good, and it's thorough.) I'm not 100% sure that I like every aspect of this specific implementation, but it's served me pretty well so far.
Inject a short lived bee into your example. A database that is only going to live for a finite time.
Sure. "Dr, it hurts... well stop doing that." Sometimes, you can design around the issue. I don't claim that this specific pattern works for everything, just that this is how my real-world application is built.
> Moreover, if you slap `Send + Sync + 'static` then you can certainly avoid the problems I am hinting at: you've committed to never having a lifetime and won't have to deal with the borrow checker.
Yes. Sometimes, one atomic increment on startup is worth not making your code more complex.
> A database that is only going to live for a finite time.
This is just not the case for a web application.
Nice. I want to write about my experiences someday, but some quick random thoughts about this:
My repository files are huge. I need to break them up. More submodules can work, and defining the inherent methods in a different module than the trait implementation.
I've found the directory structure this advocates, that is,
├── src
│ ├── domain
│ ├── inbound
│ ├── outbound
gets a bit weird when you're splitting things up by feature, because you end up re-doing the same directories inside of all three of the submodules. I want to see if moving to something more like
├── src
│ ├── feature1
│ │ ├── domain
│ │ ├── inbound
│ │ ├── outbound
│ ├── feature2
│ │ ├── domain
│ │ ├── inbound
│ │ ├── outbound
feels better. Which is of course its own kind of repetition, but I feel like if I'm splitting by feature, having each feature in its own directory, with the repetition being the domain/inbound/outbound layer, makes more sense.
I'm also curious about whether coherence will allow me to move this to each feature being its own crate. Compile times aren't terrible right now, but as things grow... we'll see.
I didn’t use an actor framework, for better or worse, but rolled my own.
Great write up!
Could you please elaborate?
Rust on web frontends is just not super mature. You can do things with it, and it's very cool, but TypeScript is a very mature technology at this point, and gives a lot of similar benefits to Rust. And it can live natively in a browser context, without complex bindings.
I don't work on native macOS apps, but I'm assuming it's similar: Objective-C or Swift are expected, so you end up needing to bind to APIs that aren't always natural feeling. I could see why you'd want to do something similar: write your core in Rust, but make the UI stuff be in Swift, and call into it from there.
There's a typo at the end of the Error Handling section:
When you need to explicitly handle an error, you omit the question mark operator and use thw Result value directly.