Replacing Protobuf with Rust to go 5 times faster
86 points | 5 hours ago | 12 comments | pgdog.dev | HN
GuB-42
29 minutes ago
[-]
What I find particularly ironic is that the title makes it sound like Rust gives a 5x performance improvement, when it is actually Rust that slows things down.

The problem is that their software is written in Rust, and they need to use the libpg_query library, which is written in C. Because they can't use the C library directly, they had to go through a Rust-to-C binding library, which uses Protobuf for portability reasons. The problem is that it is slow.

So what they did is write their own non-portable but much more optimized Rust-to-C bindings, with the help of an LLM.

But had they written their software in C, they wouldn't have needed to do any conversion at all. It means they could have titled the article "How we lowered the performance penalty of using Rust".

I don't know much about Rust or libpg_query, but they probably could have gone even faster by getting rid of the conversion entirely. It would most likely have involved major adaptations and some unsafe Rust though. Writing a converter has many advantages: portability, convenience, security, etc... but it has a cost, and ultimately, I think it is a big reason why computers are so fast and apps are so slow. Our machines keep copying, converting, serializing and deserializing things.
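
To make that concrete, here is a minimal sketch of what a direct binding can look like. The C function names and ownership rules here are hypothetical, not libpg_query's actual API; the point is just that the data crosses the boundary as-is, with no Protobuf encode/decode round-trip:

    // Hypothetical C API (not libpg_query's real one): parse a query and
    // return a malloc'd, NUL-terminated string the caller must free.
    use std::ffi::{CStr, CString};
    use std::os::raw::c_char;

    extern "C" {
        fn parse_query(input: *const c_char) -> *mut c_char;
        fn free_result(result: *mut c_char);
    }

    fn parse(query: &str) -> String {
        let c_query = CString::new(query).expect("no interior NUL bytes");
        unsafe {
            // Call straight into the C library and copy the result out.
            let raw = parse_query(c_query.as_ptr());
            let out = CStr::from_ptr(raw).to_string_lossy().into_owned();
            free_result(raw);
            out
        }
    }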

Note: I have nothing against what they did, quite the opposite, I always appreciate those who care about performance, and what they did is reasonable and effective, good job!

reply
logicchains
7 minutes ago
[-]
> they had to use a Rust-to-C binding library, that uses Protobuf for portability reasons.

That sounds like a performance nightmare, putting Protobuf of all things between the language and Postgres, I'm surprised such a library ever got popular.

reply
cranx
2 hours ago
[-]
I find the title a bit misleading. I think it should be titled "It's Faster to Copy Memory Directly than to Send a Protobuf", which makes it rather obvious that removing a serialization and deserialization step reduces runtime.
reply
bluGill
30 minutes ago
[-]
Protobuf does something important that copying memory cannot do: it defines a protocol that can be changed separately on either end and still work. You have to build for "my client doesn't send some new data" (pick a good default) and "I got extra data I don't understand" (ignore it). The ability to upgrade part of the system is critical when the system is large and complex, since you can't fix everything to understand your new feature without making the new feature take ages to roll out.
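
As a rough sketch of that compatibility story (hand-written prost message structs here, where normally you'd generate them from a .proto file): an old writer talking to a new reader, and vice versa, both still work:

    // Requires the `prost` crate (with its default `derive` feature).
    use prost::Message;

    // Old schema: only `id`.
    #[derive(Clone, PartialEq, Message)]
    struct UserV1 {
        #[prost(uint64, tag = "1")]
        id: u64,
    }

    // New schema: adds `name` with tag 2.
    #[derive(Clone, PartialEq, Message)]
    struct UserV2 {
        #[prost(uint64, tag = "1")]
        id: u64,
        #[prost(string, tag = "2")]
        name: String,
    }

    fn main() {
        // Old writer -> new reader: the missing `name` decodes to its default ("").
        let old_bytes = UserV1 { id: 7 }.encode_to_vec();
        let v2 = UserV2::decode(old_bytes.as_slice()).expect("decode v1 as v2");
        assert_eq!(v2.name, "");

        // New writer -> old reader: the unknown `name` field is skipped.
        let new_bytes = UserV2 { id: 7, name: "hn".into() }.encode_to_vec();
        let v1 = UserV1::decode(new_bytes.as_slice()).expect("decode v2 as v1");
        assert_eq!(v1.id, 7);
    }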

Protobuf also handles a bunch of languages for you. If the other team wants to write in a "stupid language", you don't have to have a political fight to prove your preferred language is best for everything. You just let that team do what they want, and they can learn the hard way that it was a bad language. Either it isn't really that bad and the fight was pointless, or it is, and management can find other metrics to prove it, and then it becomes their problem to decide whether it is bad enough to be worth fixing.

reply
MrDarcy
1 hour ago
[-]
TIL serializing a protobuf is only 5 times slower than copying memory, which is way faster than I thought it’d be. Impressive given all the other nice things protobuf offers to development teams.
reply
nicman23
53 minutes ago
[-]
that's actually crazy fast
reply
miroljub
1 hour ago
[-]
Yep.

Just doing memcpy or mmap would be even faster. But the same Rust advocates who brag about Rust's speed frown upon such insecure practices in C/C++.
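
For what it's worth, here is a minimal sketch of that "just copy the memory" approach in Rust. It works, but it needs unsafe (or a crate like bytemuck), and it silently breaks the moment the two sides disagree about layout or endianness, which is exactly the class of bug serialization formats exist to prevent:

    // A hypothetical fixed-layout header, read straight out of a byte buffer.
    #[repr(C)]
    #[derive(Debug, Clone, Copy)]
    struct Header {
        version: u32,
        len: u32,
    }

    fn read_header(bytes: &[u8]) -> Header {
        assert!(bytes.len() >= std::mem::size_of::<Header>());
        // Safety: Header is plain old data, we checked the length, and
        // read_unaligned avoids any alignment requirement on `bytes`.
        unsafe { std::ptr::read_unaligned(bytes.as_ptr() as *const Header) }
    }

    fn main() {
        let bytes = [1u8, 0, 0, 0, 42, 0, 0, 0];
        // Header { version: 1, len: 42 } on a little-endian machine.
        println!("{:?}", read_header(&bytes));
    }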

reply
nottorp
4 hours ago
[-]
Are they sure it's because of Rust? Perhaps if they rewrote Protobuf in Rust it would be as slow as the current implementation.

They changed the persistence system completely. Looks like from a generic solution to something specific to what they're carrying across the wire.

They could have done it in Lua and it would have been 3x faster.

reply
consp
4 hours ago
[-]
If they had made the headline something along the lines of "replacing protobuf with a native, optimized implementation", it would not have gotten the same attention as putting Rust in the title to attract the everything-in-rust-is-better crowd.
reply
desiderantes
3 hours ago
[-]
That never happens. Instead, it always attracts the opposite group, the Rust complainers, who go and complain about how "the everything-in-rust-is-better crowd created yet another fake headline to pretend that Rust is the panacea". Which results in a lot of engagement. Old ragebait trick.
reply
hu3
1 hour ago
[-]
At the very least it gets more upvotes.
reply
timeon
37 minutes ago
[-]
Well it is keyword for RSS feeds.
reply
izacus
1 hour ago
[-]
"never" huh?
reply
embedding-shape
3 hours ago
[-]
It's devbait; not many of us can resist bikeshedding about a title which obviously doesn't accurately reflect the article contents. The article is self-aware enough to admit this too, yet the title remains.
reply
alias_neo
4 hours ago
[-]
I was equally confused by the headline.

I wonder if it's just poorly worded and they meant to say something like "Replacing Protobuf with some native calls [in Rust]".

reply
win311fwg
3 hours ago
[-]
The title would suggest that it was already written in Rust, and that it was the rewrite in Go that made it five times faster.
reply
misja111
4 hours ago
[-]
Correct, this has very little to do with Rust. But it wouldn't have made the front page without it.
reply
locknitpicker
4 hours ago
[-]
Yes you are absolutely right. The article even outright admits that Rust had nothing to do with it. From the article:

> Protobuf is fast, but not using Protobuf is faster.

The blog post reads like an unserious attempt to repeat a Rust meme.

reply
rozenmd
3 hours ago
[-]
"5 times faster" reminds me of Cap'n Proto's claim: in benchmarks, Cap’n Proto is INFINITY TIMES faster than Protocol Buffers: https://capnproto.org/
reply
7777332215
3 hours ago
[-]
In my experience capn proto is much less ergonomic.
reply
gf000
2 hours ago
[-]
I mean, Cap'n Proto is written by the same person who created Protobuf, so they are legit (and that somewhat joking claim simply means it requires no parsing).
reply
Sesse__
2 hours ago
[-]
> I mean, cap'n'proto is written by the same person who created protobuf

Notably, Protobuf 2, a rewrite of Protobuf 1. Protobuf 1 was created by Sanjay Ghemawat, I believe.

reply
7e
1 hour ago
[-]
Google loves to reinvent shit because they didn't understand it. And to get promo. In this case, ASN.1. And protobufs are so inefficient that they drive up latency and datacenter costs, so they were a step backwards. Good job, Sanjay.
reply
yodacola
4 hours ago
[-]
FlatBuffers are already faster than that. But that's not why we choose Protobuf. It's because a megacorp maintains it.
reply
nindalf
3 hours ago
[-]
You're saying we choose Protobufs [1] because Google maintains it but not FlatBuffers [2]?

[1] - https://github.com/protocolbuffers/protobuf: Google's data interchange format

[2] - https://github.com/google/flatbuffers: Also maintained by Google

reply
rafaelmn
3 hours ago
[-]
I get the OP is off base with his remark - but at the same time maintained by Google means shit in practice.

AFAIK they have a bunch of production infra on protobuf/gRPC - not so sure about FlatBuffers, which came out of the game dev side. That's the difference maker to me: where a project is actually rooted.

reply
dewey
3 hours ago
[-]
> but at the same time maintained by Google means shit in practice.

If you've worked on Go projects that import Google's protobuf / grpc / Kubernetes client libraries, you are often reminded of that fact.

reply
whoevercares
1 hour ago
[-]
FlatBuffers are fine - I think they're used in many places that need zero-copy. Also, outside Google, they power the Arrow format, which is the foundation of modern analytics.
reply
secondcoming
3 hours ago
[-]
Yet they've yet to release their internal optimisation that allows zero-copying string-type fields.
reply
linuxftw
1 hour ago
[-]
Many people are exclaiming that the title is baity, but I disagree. It seems like a perfectly fine title in the context of this blog, which is about a specific product. It's unlikely they wrote the blog with a HN submission in mind. They're not a news publication, either.
reply
t-writescode
3 hours ago
[-]
Just for fun, how often do regular-sized companies that deal in regular-sized traffic need Protobuf to accomplish their goals in the first place, compared to JSON or even XML with basic string marshalling?
reply
izacus
1 hour ago
[-]
I dunno, are you sure you can manually write correct de/serialization for JSON and XML so that strings, floats, and integer formats get parsed correctly between JavaScript, Java, Python, Go, Rust, C++, and whatever other languages are involved?

Do you want to maintain that and debug that? Do you want to do all of that without help of a compiler enforcing the schema and failing compiles/CI when someone accidentally changes the schema?

Because you get all of that with protobuf if you use them appropriately.

You can of course build all of this yourself... and maybe it'll even be as efficient, performant and supported. Maybe.
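
One concrete example of the cross-language footguns being described: a 64-bit id that is perfectly representable as a JSON integer in Rust, Go, or Java gets silently rounded by any consumer that parses JSON numbers as IEEE-754 doubles, which is what JavaScript does. A tiny Rust illustration of that rounding:

    fn main() {
        let id: u64 = (1 << 60) + 1;
        // What a double-based JSON parser (e.g. JavaScript's) sees.
        let as_double = id as f64;
        // The low bit is lost: 1152921504606846976 != 1152921504606846977.
        assert_ne!(as_double as u64, id);
    }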

reply
nicman23
48 minutes ago
[-]
I mean, you can always go mono- or dual-language, and then it's really not that much of an issue.
reply
tcfhgj
3 hours ago
[-]
Well, protobuf lets you generate easy-to-use parsing code and service stubs for many languages, and it is one of the faster and less bandwidth-wasting options.
reply
vouwfietsman
1 hour ago
[-]
Besides the other comments already here about codegen & contracts, a bigger reason for me to step away from JSON/XML is binary serialization.

It sounds weird, and it's totally dependent on your use case, but binary serialization can make a giant difference.

For me, I work with 3D data which is primarily (but not only) tightly packed arrays of floats & ints. I have a bunch of options available:

1. JSON/XML, readable, easy to work with, relatively bulky (but not as bad as people think if you compress) but no random access, and slow floating point parsing, great extensibility.

2. JSON/XML + base64, OK to work with, quite bulky, no random access, faster parsing, but no structure, extensible.

3. Manual binary serialization: hard to work with, OK size (esp compressed), random access if you put in the effort, optimal parsing, not extensible unless you put in a lot of effort.

4. Flatbuffers/protobuf/capn-proto/etc: easy to work with, great size (esp compressed), random access, close-to-optimal parsing, extensible.

Basically if you care about performance, you would really like to just have control of the binary layout of your data, but you generally don't want to design extensibility and random access yourself, so you end up sacrificing explicit layout (and so some performance) by choosing a convenient lib.

We are a very regularly sized company, but our 3D data spans hundreds of terabytes.

(also, no, there is no general purpose 3D format available to do this work, gltf and friends are great but have a small range of usecases)
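
To make "option 3" above concrete, a minimal sketch of a hand-rolled layout for a tightly packed float array (the format is made up: a u32 count followed by little-endian f32s). You get full control of the bytes, but every layout change is yours to version and migrate, which is the effort the format libraries save you:

    fn write_points(points: &[f32]) -> Vec<u8> {
        let mut out = Vec::with_capacity(4 + points.len() * 4);
        out.extend_from_slice(&(points.len() as u32).to_le_bytes());
        for p in points {
            out.extend_from_slice(&p.to_le_bytes());
        }
        out
    }

    fn read_points(bytes: &[u8]) -> Vec<f32> {
        let count = u32::from_le_bytes(bytes[0..4].try_into().unwrap()) as usize;
        bytes[4..4 + count * 4]
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes(c.try_into().unwrap()))
            .collect()
    }

    fn main() {
        let encoded = write_points(&[1.0, 2.5, -3.0]);
        assert_eq!(read_points(&encoded), vec![1.0, 2.5, -3.0]);
    }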

reply
physicsguy
1 hour ago
[-]
This was the norm many years ago. I worked on simulation software which existed long before Protobuf was even an apple in its author's eye. The whole thing was built on a server architecture with a Java (later ported to Qt) GUI and a C++ core. The solver periodically sent data in a custom binary format over TCP for vector fields and the like.
reply
tuetuopay
2 hours ago
[-]
Type safety. The contract is the law, instead of a suggestion as with JSON.

Having a way to describe your whole API and generate bindings is a godsend. Yes, it can be done with JSON and OpenAPI, yet there it's not mandatory.

reply
bluGill
3 hours ago
[-]
In most languages protobuf is easier because it generates the boilerplate. And protobuf is cross-language, so even if you are working in JavaScript, where JSON is native, protobuf is still faster, because the other side can be whatever and you are not spending their time on parsing.
reply
Chiron1991
2 hours ago
[-]
It's not just about traffic. IoT devices (or any other low-powered devices for that matter) also like protobuf because of its comparatively high efficiency.
reply
pjmlp
1 hour ago
[-]
I never used it, coding since 1986.
reply
jonathanstrange
2 hours ago
[-]
Protobuf is fantastic because it separates the definition from the language. When you make changes, you recompile your definitions to native code and you can be sure it will stay compatible with other languages and implementations.
reply
speed_spread
1 hour ago
[-]
You mean like WSDL, OpenAPI and every other schema definition format?

Well I agree. Contract-first is great. You provide your clients with the specs and let them generate their own bindings. And as a client they're great too because I can also easily generate a mock server implementation that I can use in tests.

reply
lowdownbutter
3 hours ago
[-]
Don't read clickbaity headlines and scan hacker news five times faster.
reply
chuckadams
1 minute ago
[-]
Become a 5X Hacker News reader with this One Weird Trick.
reply
spwa4
1 hour ago
[-]
You should be terrified of the instability you're introducing to achieve this. Memory sharing between processes is very difficult to keep stable; it's half the reason kernels exist.
reply
sylware
2 hours ago
[-]
I don't understand. I used protobuf for map data, but it is a hardcore simple format - that is the whole purpose of it.

I wrote memory-mapping-oriented protobuf software... in assembly. Then what? Am I allowed to say I am going 1000 times faster than Rust now???

reply
IshKebab
4 hours ago
[-]
I vaguely recall that there's a Rust macro to automatically convert recursive functions to iterative.

But I would just increase the stack size limit if it ever becomes a problem. As far as I know the only reason it is so small is because of address space exhaustion which only affects 32-bit systems.
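
The "just increase the stack size" route doesn't even need linker flags; you can run the deep recursion on a thread with an explicit stack size. The recursive `parse` here is just a stand-in for whatever function is blowing the stack:

    use std::thread;

    fn parse(depth: u64) -> u64 {
        if depth == 0 { 0 } else { 1 + parse(depth - 1) }
    }

    fn main() {
        let handle = thread::Builder::new()
            .stack_size(256 * 1024 * 1024) // 256 MiB instead of the platform default
            .spawn(|| parse(1_000_000))
            .expect("failed to spawn thread");
        println!("{}", handle.join().unwrap());
    }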

reply
jeroenhd
3 hours ago
[-]
Explicit tail call optimization is in the works, but I don't think it's available in stable just yet.

The `become` keyword has already been reserved and work continues to happen (https://github.com/rust-lang/rust/issues/112788). If you enable #![feature(explicit_tail_calls)] you can already use the feature in the nightly compiler: https://play.rust-lang.org/?version=nightly&mode=debug&editi...

(Note that enabling release mode on that link will have the compiler pre-calculate the result so you need to put it to debug mode if you want to see the assembly this generates)
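
For anyone curious, a rough sketch of what the nightly feature looks like today (the syntax may still change before stabilization, and this only builds on a nightly compiler):

    // Nightly only; requires the feature gate from the tracking issue above.
    #![feature(explicit_tail_calls)]

    fn fact(n: u64, acc: u64) -> u64 {
        if n == 0 {
            return acc;
        }
        // `become` guarantees the call is a tail call: the current frame is reused.
        become fact(n - 1, acc * n)
    }

    fn main() {
        println!("{}", fact(10, 1));
    }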

reply
embedding-shape
3 hours ago
[-]
> I vaguely recall that there's a Rust macro to automatically convert recursive functions to iterative.

Isn't that just TCO or similar? Usually a part of the compiler/core of the language itself, AFAIK.

reply
koverstreet
3 hours ago
[-]
I haven't been following become/TCO in Rust - but what I've usually seen is TCO getting flipped off because it interferes with backtraces and debugging.

So I think there's value in providing it as an explicit opt-in; that way when you're reading the code, you know to account for it when you're looking at backtraces.

Additionally, if you're relying on TCO it might be a major bug if the compiler isn't able to apply it - and optimizations that aren't applied are normally invisible. This might mean you could get an error if you're expecting TCO and you or the compiler screwed something up.

reply
tialaramex
2 hours ago
[-]
In a language like Rust, where local variables are explicitly destroyed when their scope ends, naive TCO is very annoying, and `become` also helps fix that.

Suppose I have a recursive function f(n: u8) where f(0) is 0 and otherwise f(n) is n * bar(n) + f(n-1)

I might well write that with a local temporary to calculate bar(n) and then do the sum, but this would inhibit TCO because that temporary would still exist after the recursive call, even though it doesn't matter in practice.

A compiler could try to cleverly figure out whether it matters, destroy that local temporary earlier, and then apply TCO, but now your TCO is fragile: a seemingly minor code change might fool that "clever" logic by making the transformation incorrect, silently breaking your optimisation.

The `become` keyword is a claim by the programmer that we can drop all these locals and do TCO. Because the programmer claimed this should work, they're giving the compiler permission to attempt the early drop, and, if it doesn't work and can't be TCO'd, to complain that the program is wrong.
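
A small illustration of that on stable Rust (`Guard` is a stand-in for any local with a destructor): written naively, the local would only be dropped after the recursive call returns, so the call isn't really in tail position; dropping it explicitly first is the discipline `become` is meant to enforce:

    struct Guard(u64);

    impl Drop for Guard {
        fn drop(&mut self) {
            // e.g. flush a buffer, release a lock
        }
    }

    fn countdown(n: u64) -> u64 {
        let guard = Guard(n);
        if n == 0 {
            return guard.0;
        }
        drop(guard); // drop the local first, so nothing outlives the call
        countdown(n - 1) // now genuinely in tail position; `become` could guarantee it
    }

    fn main() {
        println!("{}", countdown(5));
    }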

reply
steeve
3 hours ago
[-]
tldr: they replaced using protobuf as the type system across language boundaries for FFI with true FFI
reply
ahartmetz
1 hour ago
[-]
Title is as nonsensical as "We replaced Windows with ARM CPUs"
reply
Xunjin
3 hours ago
[-]
I loved it. Every clickbait title should come with a tl;dr just like this one.
reply
xxs
1 hour ago
[-]
If I see an order-of-magnitude difference and a language in the title, it's something I refuse to read (unless it's an obvious case, like interpreted vs. compiled/JITed).
reply