Why Is SQLite Coded In C
344 points | 2 days ago | 42 comments | sqlite.org | HN
jasonthorsness
2 days ago
[-]
“None of the safe programming languages existed for the first 10 years of SQLite's existence. SQLite could be recoded in Go or Rust, but doing so would probably introduce far more bugs than would be fixed, and it may also result in slower code.”

Modern languages might do more than C to prevent programmers from writing buggy code, but if you already have bug-free code due to massive time, attention, and testing, and the rate of change is low (or zero), it doesn’t really matter what the language is. SQLite could be in assembly language for all it would matter.

reply
oconnor663
2 days ago
[-]
> and the rate of change is low (or zero)

This jibes with a point that the Google Security Blog made last year: "The [memory safety] problem is overwhelmingly with new code...Code matures and gets safer with time."

https://security.googleblog.com/2024/09/eliminating-memory-s...

reply
miohtama
2 days ago
[-]
You can find historical SQLite CVEs here

https://www.sqlite.org/cves.html

Note that although code matures, the chance of human-error bugs in C will never go to zero. We have bad incidents like Heartbleed to show this.

reply
hnlmorg
2 days ago
[-]
Heartbleed was a great demonstration of critical systems that were underappreciated.

Too few maintainers, too few security researchers, and too little funding.

When writing systems as complicated and as sensitive as the leading encryption suite used globally, no language choice will save you from under-resourcing.

reply
ziotom78
2 days ago
[-]
Right, but I believe nobody can claim that human-error bugs go to zero for Rust code.
reply
john_the_writer
2 days ago
[-]
Agreed. I rather dislike the idea of "safe" coding languages. I've been fighting a memory leak in an Elixir app for the past week. I never viewed C or C++ as unsafe. Writing code is hard; it always has been and always will be. It is never safe.
reply
simonask
2 days ago
[-]
This is a bit of a misunderstanding.

Safe code is just code that cannot exhibit undefined behavior. C and C++ have the same concept of "soundness" as Rust, just no way to statically guard against violating it.

reply
rileymat2
1 day ago
[-]
There is more to safety than undefined behavior: if I were to define using an uninitialized variable as yielding whatever data was previously at that location, it would be well defined but still unsafe.
reply
oconnor663
1 day ago
[-]
That could be well defined for POD types like arrays of bytes (I think in that case they call it "freezing"), but you'd still have undefined behavior if the type in question contained pointers. Also at least in Rust it's UB to create illegal values of certain types, like a bool that's anything other than 0 or 1.

Which is all kind of to say, all of Rust's different safety rules end up being surprisingly densely interconnected. It's really hard to guarantee one of them (say "no use-after-free") without ultimately requiring all of them ("no uninitialized variables", "no mutable aliasing", etc).

reply
gcr
2 days ago
[-]
Modern compilers like clang and GCC both have static analysis for some of this. Check out the undefined behavior sanitizer.
reply
simonask
2 days ago
[-]
As the other person pointed out, these are two different things. Sanitizers add runtime checks (with zero consideration for performance; don’t use these in production). Static analysis runs at compile time, and while both GCC and Clang are doing amazing jobs with it, it’s still very easy to run into trouble. They mostly catch the low-hanging fruit.

The technical reason is that Rust-the-language gives the compiler much more information to work with, and it doesn’t look like it is possible to add this information to the C or C++ languages.

reply
humanrebar
2 days ago
[-]
Sanitizers are technically dynamic analysis. They instrument built programs and analyze them as they run.
reply
jen20
2 days ago
[-]
A memory leak is not a memory safety issue.
reply
rileymat2
1 day ago
[-]
No, but it can be dangerous when you run out of memory for critical systems.
reply
1vuio0pswjnm7
2 days ago
[-]
This begs the question of why Rust evangelists keep targeting existing projects instead of focusing writing new, better software. In theory these languages should allow software developers to write programs that they would not, or could not, attempt using languages without automatic memory management

Instead what I see _mostly_ is re-writes and proposed re-writes of existing software, often software that has no networking functions, and/or relatively small, easily audited software that IMHO poses little risk of memory-related bugs

This is concerning to me as an end-user who builds their software from source because the effects on compilation, e.g., increased resource requirements, increased interdependencies, increased program size, decreased compilation speed, are significant

reply
1vuio0pswjnm7
23 hours ago
[-]
https://sqlite.org/copyright.html

Public domain. No "copyleft" license needed

Being written in a small, fast, "old and boring" language may be part of what makes SQLite appealing. Another (related) part may be the thoughtfulness and carefulness of its author, e.g., "time, attention and testing"

The latter, i.e., the author's "time, attention and testing", may matter more than the former, i.e., the author's language choice

As suggested by the top comment, in effect the author's language choice, by itself, may not matter with respect to the issue of "safety". If true, then even an "unsafe language" may not reduce the "safety" of SQLite^1

djb's software is also public domain and written in an "unsafe language", mostly the same "old and boring" one as used to write SQLite. Like SQLite it is appealing to many people and is found in many places^2

1. But the thoughtlessness and carelessness of an author, no matter what language they choose, is still relevant to "safety". In sum, "safety" is partly a function of human effort, e.g., "time, attention and testing", not simply the result of a language choice. Perhaps "safe" and "unsafe" are adjectives that can apply to authors as well as languages

2. https://ianix.com

reply
steveklabnik
2 days ago
[-]
There's no such thing as "Rust evangelists targeting existing projects" as some sort of broad strategy. What you're observing is that some people like to write programs in Rust, and so they choose to write new code in Rust. For some people, writing versions of software they understand already is a good way to learn a language, for some, it's that they don't like some aspect of the existing software and want to make something similar, but different.

That is, nobody perceives, say, "the silver searcher" as being some sort of nefarious plot to re-write grep, but they did with ripgrep, even though that's not what it is trying to do.

There are a few projects that are deliberately doing a re-write for reasons they consider important, but they're not "just because it's in Rust," they're things like "openssl has had memory safety issues, we should use tools that eliminate those" or "sudo is very important and has grown a lot of features and has had memory safety issues, we should consider cutting some scope and using tools that help eliminate the memory safety issues." But the number of such projects is very small. Heck, even things like uutils were originally "I just want to learn Rust" projects and not some sort of "we must re-write the world in Rust."

reply
LexiMax
1 day ago
[-]
> That is, nobody perceives, say, "the silver searcher" as being some sort of nefarious plot to re-write grep, but they did with ripgrep, even though that's not what it is trying to do.

I have a suspicion as to why this perception exists in the C++ crowd.

I don't think it's because of evangelists. C++ was the evangelized language of the early '90s; after its period in the sun, it survived the evangelism of Java, Python, Go, and others, despite losing ground to them in general-purpose tasks.

And that's because all of those other languages, while being much safer than C++, came at the cost of performance or the overhead of a runtime. There was really no other language that had the expressiveness of C++ while allowing the developer to get down to the bare metal if they needed speed.

The existence and growing popularity of Rust changes that calculus, and I imagine that makes certain developers who might have a lot of investment in the C++ ecosystem defensive, causing them to overreact to any perceived slight in a way that other languages simply don't provoke.

reply
throwawaymaths
2 hours ago
[-]
"Broad strategy" is your phrase, not gp's. You are fighting a strawman. gp is just arguing that this is happening, and that it's annoying.
reply
MangoToupe
2 days ago
[-]
> This begs the question of why Rust evangelists keep targeting existing projects instead of focusing writing new, better software.

Designing new software is orders of magnitude more difficult than iterating on existing software

reply
itopaloglu83
1 day ago
[-]
Yes, of course, but if a project owner wants to stay with a non-Rust programming language, maybe we should just let them be instead of nagging them multiple times a day about why they haven't switched. In that case, writing your own project in Rust to prove that it can be done better is the way to go.
reply
burntsushi
1 day ago
[-]
And when they do that, they'll be vilified for "rewriting in Rust" and creating nothing new.

Who actually is nagging people multiple times per day to do free labor for them and rewrite a project in a completely different programming language?

reply
burntsushi
2 days ago
[-]
<rewind to the 90s>: This begs [sic] the question why copyleft evangelists keep targeting existing projects instead of focusing writing new, better software.

Few things in this life are novel. Regardless, when I wrote ripgrep, I was focusing on writing new and better software. I perceived several problems with similar tools at the time and set out to do something better.

Imagine if people actually listened to whinging like your comment. I'm glad I never have.

reply
wolvesechoes
1 day ago
[-]
Good analogy with copyleft rewrites - most of those Rust rewrites tend to use permissive licenses, so it is clear they aim to destroy civilizational achievements.
reply
burntsushi
1 day ago
[-]
That doesn't make any sense.
reply
friendly_wizard
1 day ago
[-]
Close to 100% positive that comment was tongue in cheek
reply
weinzierl
2 days ago
[-]
"SQLite could be recoded in Go or Rust, but doing so would probably introduce far more bugs than would be fixed, and it may also result in slower code."

We will see. On the Rust side there is Turso which is pretty active.

https://turso.tech/

reply
Sammi
2 days ago
[-]
The SQLite team is stuck in the classic dilemma. They are stuck with their existing thing because it is so big that you can't just stop the world for your existing users and redo it. Meanwhile, some small innovator comes along and builds the next thing, because why not; they don't have anything holding them back. This is classic Innovator's Dilemma and Creative Destruction. It has of course not happened yet, and we have to wait and see whether Turso can actually deliver, but the Turso team is extremely talented and their git repo history is on fire, so it is definitely a possible scenario.
reply
saalweachter
2 days ago
[-]
You say that like SQLite is a for profit company competing for market share or one of several rival projects in a corporation trying to not be canceled.

It's an open sourced project! It's public domain!

If you make an open source project that is heavily used and widely lauded for a quarter century before being supplanted by a newer solution that's better, do you know what that is?

A success! You did it! You made a thing that was useful and loved!

Nothing lasts forever; there's nothing wrong with a project filling a niche and then gracefully fading away when that niche goes away.

reply
Sammi
2 days ago
[-]
Funny you should say so, because I actually made an effort _not_ to use corporate business terms. Open source projects definitely do compete with each other, both for developer and user attention and for general prestige, which in turn may be leveraged to get access to funding and other development resources in various ways. And in an even funnier turn of events, SQLite development is actually funded by a for-profit company that sells SQLite support contracts: https://www.sqlite.org/consortium.html
reply
steveklabnik
2 days ago
[-]
I agree with your broad point, but it's worth pointing out that SQLite very much is a product that is being sold and that aspect of things is not open source.
reply
immibis
1 day ago
[-]
SQLite is a for profit company competing for market share. They're one of those companies that gives the product away for free, and sells professional support and custom development, as well as a few add-on modules. Pricing table here: https://sqlite.org/prosupport.html

You may think these are ludicrous prices, but think of it as market segmentation. They seem to have only a few employees, so if they get, say, 300 companies in the entire world to sign up for email support, they earn a pretty respectable salary. Or if they get, say, four companies in the whole world to join their consortium, and nobody else buys anything. In fact, there are four consortium members on the homepage: https://sqlite.org/index.html and possibly others who chose not to be listed. So we have (it seems) two people getting paid $600k per year to work on this. This is a software SMB; they're not trying to hyperscale or squeeze every penny, just to make a living selling a good product at a steady rate.

This model only works, of course, because SQLite is a genuinely good product that everyone loves and uses for free in every open-source project. It wouldn't work if a copy of SQLite cost even $10, because then we'd all be using MariaDB. It might not even work if it was proprietary but free.

reply
0xffff2
1 day ago
[-]
The fact that there are "cloud pricing" and "schedule a call" links already tells me all I need to know. Doesn't seem like this is a product that is really competing with SQLite at all.
reply
just6979
1 day ago
[-]
That's just because they're using a different scheme to fund development. SQLite has its paid support and consortium, while Turso is leaning on cloud hosting and the paid support that comes with that. Both can still be used standalone and unsupported, completely for free.

Turso is arguably positioned slightly better as a standalone product seeing as it's using a more traditional open source "bazaar" model, as opposed to SQLite's source available "cathedral" model.

reply
no_flaks_given
1 day ago
[-]
But Turso is a for profit company that's bound to rug pull eventually.

So SQLite is still the bar

reply
devjab
2 days ago
[-]
I quite like that Zig works as a drop-in for C in a few use cases. It's been very nice to utilize it alongside our Python and regular C binaries. We attempted to move to Go because we really like the philosophy and opinions it forces upon its developers, but, similar to interpreted languages, it can be rather hard to optimize. I'm sure people more talented than us would have an easy time with it, but they don't work for us. So it was easier to just go with Python with a few parts handled by C (and in even fewer cases Zig).

I guess we could use Rust, and I might be wrong on this, but it seemed like it would be a lot of work to utilize it compared to just continuing with C and gradually incorporating Zig, and we certainly don't write bug-free C.

reply
Hendrikto
2 days ago
[-]
> We attempted to move into Go […], but similar to interpreted languages it can be rather hard to optimize it. […] So it was easier to just go with Python

I don’t get that. You had trouble optimizing Go, so you went with Python?

reply
devjab
2 days ago
[-]
We had Python and C. We aimed for Go. Now we have Python and C. The deeper story is more about change management than technology. We hoped we could gain advantages from Go because we, perhaps naively, figured it would lessen the gap between programming and software engineering. We have a lot of people who can build software, but few who can optimise it. We hoped Go would give us a lot of "free" optimisation, but it didn't. It also wasn't as easy to transition non-SWEs into something other than Python as we had hoped. We made no major rewrites; we instead built some of our new tools and services in Go. Some of these have been phased out; others will live out their lifecycles as is.

I personally really like Go, but I feel like I now have a better understanding of why so many teams stick with C/C++ without even considering adopting Go, Rust or similar.

reply
actionfromafar
2 days ago
[-]
Python + C. Probably C for the optimized parts.
reply
Hendrikto
2 days ago
[-]
Why not Go + C then?
reply
actionfromafar
2 days ago
[-]
I'm not them but it seems they already had Python.
reply
just6979
1 day ago
[-]
Because why bother if you're keeping the C? Part of the reason for moving to Go was safety by replacing the C, not just moving away from Python. I'd say the mistake was thinking Python programmers would enjoy moving to Go. I've done it, and it was not enjoyable. I wouldn't mind doing just the tight performance-critical things in Go instead of C... but using Go for the high-level things that Python is great at, where performance is not an issue, is just silly.
reply
WesolyKubeczek
2 days ago
[-]
Have you tried writing Python extension modules in Zig instead of C? How is it?
reply
devjab
2 days ago
[-]
No, just C ABI compatible libraries. Maybe when there are two fridays in a week we will have enough time to do some actual adoption.
reply
ChrisRR
2 days ago
[-]
This is the argument against rewriting the Linux base utils in Rust: when they've had such widespread use for decades, a hell of a lot of bugs have been ironed out.
reply
bombcar
2 days ago
[-]
Especially since memory bugs are only a subset of all bugs, and (perhaps) not even the most important subset.

Memory bugs are often implicated in security issues, but other bugs are more likely to cause data loss, corruption, etc.

reply
pjmlp
2 days ago
[-]
The author cleverly leaves out all the safer alternatives that existed outside UNIX, and what was happening with computers outside Bell Labs during the 1970s.

Not only was Apple able to launch the Mac Classic with zero lines of C code, but their Pascal dialect lives on in Delphi and Free Pascal.

As one example.

reply
wat10000
2 days ago
[-]
Isn’t Pascal just as problematic as C in this respect? And the original Mac was mostly (all?) assembly, not Pascal. They couldn’t have used a higher level language anyway. The system was just too small for that. SQLite wouldn’t fit on it.
reply
pjmlp
2 days ago
[-]
Not at all: Pascal is more strongly typed, and has type-safety features that C has yet to acquire.

A non-exhaustive list:

- proper strings with bounds checking

- proper arrays with bounds checking

- no pointer decay; you need to be explicit about getting pointers to arrays

- fewer cases of implicit conversion, requiring more typecasts

- reference parameters reduce the need for pointers

- variant records directly support tags

- enumerations are stronger typed without implicit conversions

- modules with better control what gets exposed

- range types

- set types

- arenas

There was plenty of Pascal code on Mac OS, including a Smalltalk-like OOP framework, until C++ took over Object Pascal's role at Apple; and C++, again, isn't C.

I love the usual "but it used Assembly!" rebuttal, as if OSes written in C weren't full of assembly, or inline assembly and language extensions that certainly aren't C (ISO C proper) either.

If you prefer, Zig is a modern take on what Modula-2 in 1978 and Object Pascal in the 1980s already offered, with nullable types and comptime as the key differentiators added in 40 years, packaged in a more appealing syntax for current generations.

reply
wat10000
2 days ago
[-]
My point is that the original Mac used little to no Pascal. The assembly isn’t a “rebuttal,” it’s just what was actually used.
reply
pjmlp
2 days ago
[-]
Sure if the only thing that matters is what happened in 1990, and nothing else that came afterwards.

Also, if we ignore the historical accounts from Apple employees of the time, in places like the Folklore book and CHM interviews.

reply
wat10000
2 days ago
[-]
What is with people on HN using a specific example and then getting annoyed when I respond to it? You specifically said Apple launched the original Mac without C. Which is true, but the implication that it used Pascal is not. I'm not addressing what happened years later.

Can you elaborate on these historical tellings? From what I found on folklore.org, Lisa OS had a bunch of Pascal, and the Mac system borrowed a bunch of Lisa code, but it was all hand-translated to assembly in the process.

reply
pjmlp
1 day ago
[-]
http://pascal.hansotten.com/ucsd-p-system/apple-pascal/

> When Apple began development of the Macintosh (1982) Apple used Lisa Pascal and the Lisa Workshop for system software development.

> Object Pascal for the Macintosh was developed by Apple starting in 1985 to support more rapid and more standardized development of Macintosh programs. Available for only MPW, Object Pascal is a descendant of the Lisa Clascal compiler.

> The key Apple player behind Object Pascal was Larry Tesler who recruited the help of Niklaus Wirth, the creator of Pascal, to clean up the syntax of Clascal. Object Pascal was used to develop the extensive MacApp class library. This library was fully documented by Apple via several books and the source code for MacApp was provided to developers.

https://www.folklore.org/3rd_Party_Developers_and_Macintosh_...

> Macintosh development in the early days (circa 1983-1985) was done using the Apple Lisa computer and its Lisa Workshop development environment. I originally used a Lisa 2/5 model which contained 1M byte of RAM, an internal 400K 3.5" Sony floppy drive, and an external 5M byte ProFile hard drive (yes, 5M as in mega bytes was considered a rather large drive in those days). I later used a Lisa 2/10 model which had an additional 10M byte internal Widget hard drive which gave me a total of 15M bytes of hard drive storage.

> The Lisa Pascal language was very powerful and compiled Pascal source files to Motorola 68000 object code files. I never found a need to use the Workshop's 68000 assembler since everything I needed for my application could be written in the higher level Lisa Pascal language. Macintosh application resource information was created as text files which were then compiled to a binary format using the RMaker resource compiler. Transferring a Macintosh object program from the Lisa to the Macintosh required the Lisa utility program MacCom which copied Lisa files to a Macintosh formatted disk in the Lisa's 400K internal disk drive. MacCom combined separate Lisa data and resource fork files which were stored on the Lisa's hard drive and stored them as single documents on the Macintosh floppy.

> Macintosh programming was based on a collection of programming libraries called "units" in Pascal parlance. These resided on the Lisa and implemented the Macintosh application programming interface (API) called the Toolbox and Operating System by Apple. These libraries came on Lisa formatted disks called the Lisa Macintosh Supplement. I recall receiving around 3 or 4 supplements each with around a half dozen disks with these libraries. These disks also contained Macintosh utility and sample applications such as the Uriah Heap desk accessory by Andy Hertzfeld (called desk ornaments in the early days), the Edit text editor, and the File application by Cary Clark which showed detailed examples of Macintosh programming.

reply
wat10000
1 day ago
[-]
Folklore.org has many stories about writing Lisa OS code in Pascal, about rewriting various pieces in assembly for the Macintosh, and developing apps in Pascal, but I can’t find any mention of actually writing any part of the original Mac system in Pascal.

Nothing you’ve quoted says otherwise. The closest is the very first sentence, but all it says is that Pascal was in use at Apple when the Macintosh project began, not that it was used for that project.

reply
pjmlp
1 day ago
[-]
Whatever makes you happy to keep your view on the matter.

https://bitsavers.org/pdf/apple/mac/Inside_Macintosh_Vol_1_1...

reply
wat10000
1 day ago
[-]
Now I'm wondering if you actually understand the difference between providing a Pascal interface and actually using Pascal to implement the stuff. That manual discusses the interface.

I'm happy to change my view given evidence, but you have yet to provide a single word of evidence that there was any Pascal in the original Macintosh system.

reply
throwaway81523
1 day ago
[-]
> Go or Rust

[Cries in Ada]

reply
nabhasablue
2 days ago
[-]
There is already an SQLite port in Go :) https://gitlab.com/cznic/sqlite
reply
ncruces
2 days ago
[-]
That's not a port. That's an extremely impressive machine translation of C to Go.

The output is a non-portable half-a-million LoC Go file for each platform.

reply
lanstin
1 day ago
[-]
And it works very well, both as SQLite and as a pure-Go entity. I have used it for a few years to do async backups of an in-memory counting database (read from only at startup, written to by a worker goroutine that batches writes) without incident. It doesn’t really show up in the profiler.
reply
cratermoon
2 days ago
[-]
also unmaintainable and full of unsafe
reply
ahoka
2 days ago
[-]
Sure they did exist, almost no one cared though.
reply
pizza234
2 days ago
[-]
> you already have bug-free code due to massive time, attention, and testing, and the rate of change is low (or zero), it doesn’t really matter what the language is. SQLIte could be assembly language for all it would matter.

This is the C/C++ delusion: "if one puts in enough effort, a [complex] memory-unsafe program can be made memory safe". The year after this page was published, the Magellan series of RCEs was released.

Keeping SQLite in C is certainly a valid design choice, but it's important to be aware of the practical implications of the language.

reply
saalweachter
2 days ago
[-]
I think beyond the historical reasons why C was the best choice when SQLite was being developed, or the advantages it has today, there's also just no reason to rewrite SQLite in another language.

We don't have to have one implementation of a lightweight SQL database. You can go out right now and start your own implementation in Rust or C++ or Go or Lisp or whatever you like! You can even make compatible APIs for it so that it can be a drop-in replacement for SQLite! No one can stop you! You don't need permission!

But why would we want to throw away the perfectly good C implementation, and why would we expect the C experts who have been carefully maintaining SQLite for a quarter century to be the ones to learn a new language and start over?

reply
jacquesm
2 days ago
[-]
> But why would we want to throw away the perfectly good C implementation, and why would we expect the C experts who have been carefully maintaining SQLite for a quarter century to be the ones to learn a new language and start over?

Because a lot of language advocacy has degraded to telling others what you want them to do instead of showing by example what to do. The idea behind this is that language adoption is some kind of zero sum game. If you're developing project 'x' in language 'y' then you are by definition not developing it in language 'z'. This reduces the stature of language 'z' and the continued existence of project 'x' in spite of not being written in language 'z' makes people wonder if language 'z' is actually as much of a necessity as its proponents claim. And never mind the fact that if the decision in what language 'x' would be written were to be revisited by the authors of 'x' that not only language 'z' would be on the menu, but also languages 'd', 'l', 'j' and 'g'.

reply
waterTanuki
2 days ago
[-]
Given that the common retort to "why not try project X in new language Y" is "it's barely used in other things; let's wait and see it get industry adoption before trying it out", it's hard to see it as anything OTHER than a zero-sum game. As much as I like Rust, I recognize that some things, like SQLite, are better off in C. But the reason you see so much push for some new languages is that if they don't get and maintain regular adoption, they will die off.
reply
jacquesm
2 days ago
[-]
Plenty of programming languages gained mass adoption without such tactics.
reply
john_the_writer
2 days ago
[-]
Yeah.. I always remind myself of the Netscape browser, a lesson in "if it's working, don't mess with it". My question is always the reverse: why try it in new language Y? Is there some feature that Y provides that was missing in X? How often do those features come up?

A company I worked for decided to build out a new microservice in language Y. The whole company was writing in W and X, but they decided to write the new service in Y. When something goes wrong or a bug needs fixing, 3 people in a company of over 100 devs know Y. Guess what management is doing: rewriting it in X.

reply
wolvesechoes
1 day ago
[-]
> there's also just no reason to rewrite SQLite in another language

But think about all those karma points here and on Reddit, or GitHub stars!

reply
AdamJacobMuller
2 days ago
[-]
One good reason is that people have written golang adapters, so that you can use sqlite databases without cgo.

I agree with what I think you're saying, which is that "sqlite" has, to some degree, become so ubiquitous that it's evolved beyond a single implementation.

We, of course, have SQLite the C library, but there is also SQLite the database file format, and there is no reason we can't have an SQLite implementation in Go (we already do) and one in pure Rust too.

I imagine that in the future that will happen (pure rust implementation) and that perhaps at some point much further in the future, that may even become the dominant implementation.

reply
zimpenfish
2 days ago
[-]
> One good reason is that people have written golang adapters, so that you can use sqlite databases without cgo.

There's also the Go-wrapped WASM build of the C sqlite[0] which is handy.

[0] https://github.com/ncruces/go-sqlite3

reply
glandium
2 days ago
[-]
And, in fact, these implementations exist. At least in Rust, there's rqlite and turso.
reply
otoolep
2 days ago
[-]
rqlite[1] author here. Just to be clear, rqlite is not SQLite rewritten in Go. rqlite uses the vanilla C code and calls it from Go[2]. I consider that an important advantage over other approaches -- rqlite gets all the benefits of rock-solid[3] SQLite. As a result there are no questions about the quality of the database engine.

[1] https://rqlite.io

[2] https://rqlite.io/docs/design/

[3] https://www.sqlite.org/testing.html

reply
biohazard2
2 days ago
[-]
> there's also just no reason to rewrite SQLite in another language. […] But why would we want to throw away the perfectly good C implementation, and why would we expect the C experts who have been carefully maintaining SQLite for a quarter century to be the ones to learn a new language and start over?

The SQLite developers are actually open to the idea of rewriting SQLite in Rust, so they must see an advantage to it:

> All that said, it is possible that SQLite might one day be recoded in Rust. Recoding SQLite in Go is unlikely since Go hates assert(). But Rust is a possibility. Some preconditions that must occur before SQLite is recoded in Rust include: […] If you are a "rustacean" and feel that Rust already meets the preconditions listed above, and that SQLite should be recoded in Rust, then you are welcomed and encouraged to contact the SQLite developers privately and argue your case.

reply
yomismoaqui
2 days ago
[-]
My theory is they wrote this just to get the ‘rewrite everything in Rust’ crowd off their backs.
reply
rirze
2 days ago
[-]
I think it’s the opposite. They want to at least explore rewriting in Rust but are afraid of backlash, which is why they’re open to private discussion. I can imagine they are split internally.
reply
biohazard2
1 day ago
[-]
I find it a bit too specific, because it won't get rid of the `rewrite everything in (Go|Zig|…)` crowds. But who knows…?
reply
turtletontine
2 days ago
[-]
Thanks for this, I fully agree. One frustration I have with the modern moment is the tendency to view anything more than five years old with disdain, as utterly irrelevant and obsolete. Maybe I’m just getting old, but I like my technology dependable and boring, especially software. Glad to see someone express respect for the decades of expertise that have gone into things we take for granted.
reply
friendly_wizard
1 day ago
[-]
I think we owe an equally proportionate measure of respect to the authors of boring old sqlite who, given the depth of their experience, recognize that there may in fact be benefits to be gained from the rewrite and are open to exploring the possibility. The blockers as stated, I have no doubt, were carefully considered with a level of insight few of us could match. If and when the time is right, if they choose to undertake the effort I'm sure the juice will be worth the squeeze. The fact that they're not jumping in blindly today is telling. Even more telling will be if they do eventually go that route.
reply
eusto
2 days ago
[-]
I think that if SQLite suddenly had to add a bunch of new features, the discussion about rewriting it would be very relevant.

I think we like to fool ourselves that decisions like these are based on performance considerations or maintainability or whatever, but in reality they would be based on time to market and skill availability in the areas where the team is being built.

At the end of the day, SQLite is not being rewritten because the cost of doing so is not justifiable.

reply
RhysU
2 days ago
[-]
These guys are, after all, running a business. If they thought the best thing for their business was a rewrite, they'd do it.
reply
etruong42
2 days ago
[-]
To build excitement in a project and potentially release new versions with new features, all more safely than adding C code.
reply
bfkwlfkjf
2 days ago
[-]
> Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy.

Huh it's not everyday that I hear a genuinely new argument. Thanks for sharing.

reply
Aurornis
2 days ago
[-]
I guess I don’t find that argument very compelling. If you’re convinced the code branch can’t ever be taken, you also should be confident that it doesn’t need to be tested.

This feels like chasing arbitrary 100% test coverage at the expense of safety. The code quality isn’t actually improved by omitting the checks even though it makes testing coverage go up.

reply
nimih
2 days ago
[-]
> If you’re convinced the code branch can’t ever be taken, you also should be confident that it doesn’t need to be tested.

I don't think I would (personally) ever be comfortable asserting that a code branch in the machine instructions emitted by a compiler can't ever be taken, no matter what, with 100% confidence, during a large fraction of situations in realistic application or library development, as to do so would require a type system powerful enough to express such an invariant, and in that case, surely the compiler would not emit the branch code in the first place.

One exception might be the presence of some external formal verification scheme which certifies that the branch code can't ever be executed, which is presumably what the article authors are gesturing towards in item D on their list of preconditions.

reply
timv
2 days ago
[-]
The argument here is that they're confident that the bounds check isn't needed, and would prefer the compiler not insert one.

The choices therefore are:

1. No bound check

2. Bounds check inserted, but that branch isn't covered by tests

3. Bounds check inserted, and that branch is covered by tests

I'm skeptical of the claim that if (3) is infeasible then the next best option is (1)

Because if it is indeed an impossible scenario, then the lack of coverage shouldn't matter. If it's not an impossible scenario then you have an untested case with option (1) - you've overrun the bounds of an array, which may not be a branch in the code but is definitely a different behaviour than the one you tested.

reply
nimih
1 day ago
[-]
> Because if it is indeed an impossible scenario, then the lack of coverage shouldn't matter.

At the point where a load-bearing piece of your quality assurance strategy is 100% branch coverage of the generated machine code, it very much does matter.

> I'm skeptical of the claim that if (3) is infeasible then the next best option is (1)

In the general case, obviously not. But, in the specific case we’re discussing, which is that (2) has the rider of “the development team will be forced to abandon a heretofore important facet of their testing strategy at the exact moment they are rewriting the entire codebase in a language they are guaranteed to have less expertise in,” I think (1) seems pretty defensible.

reply
skywhopper
2 days ago
[-]
I think you’re misreading their statement. They aren’t saying they don’t want the compiler to insert the additional code. They’re saying they want to test all code the compiler generates.
reply
estebank
2 days ago
[-]
In safety-critical spaces you need to be able to trace any piece of a binary back to source code, and that code back to requirements. If a piece of running code is only implicit in the source, it makes that traceability back to requirements harder. But I'd be surprised if things like bounds checks are really a problem for that kind of analysis.
reply
Aurornis
2 days ago
[-]
I don’t see the issue. The operations which produce a bounds check are traceable back to the code which indexes into something.
reply
0xWTF
2 days ago
[-]
What tools do you use for this? PlantUML?
reply
refulgentis
2 days ago
[-]
Yeah sounds too clever by half, memory safe languages are less safe because they have bounds checks...maybe I could see it on a space shuttle? Well, only in the most CYA scenarios, I'd imagine.
reply
evil-olive
2 days ago
[-]
> maybe I could see it on a space shuttle?

"Airbus confirms that SQLite is being used in the flight software for the A350 XWB family of aircraft."

https://www.sqlite.org/famous.html

reply
sgarland
2 days ago
[-]
Bear in mind that SQLite is used in embedded systems, and I absolutely wouldn’t be surprised to learn it’s in space.
reply
manwe150
2 days ago
[-]
Critical applications like that used to use Ada to get much more sophisticated checking than just bounds. No certified engineer would (or should) ever design a safety-critical system without multiple “unreachable” fail-safe mechanisms.

Next they’ll have to tell me about how they had to turn off inlining because it creates copies of code which adds some dead branches. Bounds checks are just normal inlined code. Any bounds checked language worth its salt has that coverage for all that stuff already.

reply
skywhopper
2 days ago
[-]
SQLite is used in a lot of hypercritical application areas. I’d almost be surprised if it’s not part of some if not all modern spaceflight stacks.
reply
dimitrios1
2 days ago
[-]
There is a whole 'nother level of safety validation that goes beyond your everyday OWASP, or heck even what we consider "highly regulated" industry requirements that 95-99% of us devs care about. SQLite is used in some highly specialized, highly sensitive environments, where they are concerned about bit flips, and corrupted memory. I had the luxury of sitting through Richard Hipp's talk about it one time, but I am certainly butchering it.
reply
jacquesm
2 days ago
[-]
I'm confused about the claim though. These branches are not at the source level, and test coverage usually is measured at the source level.
reply
Deanoumean
2 days ago
[-]
You didn't understand the argument. The testing is what instills the confidence.
reply
throw0101d
2 days ago
[-]
> If you’re convinced the code branch can’t ever be taken, you also should be confident that it doesn’t need to be tested.

“What gets us into trouble is not what we don't know. It's what we know for sure that just ain't so.” — Mark Twain, https://www.goodreads.com/quotes/738123

reply
Ekaros
2 days ago
[-]
If a code branch can't ever be taken, doesn't that mean you don't need it? Basically, it's code that will never get executed, so leaving it out doesn't matter.

If you can then come up with a scenario where you do need it, well, in fully tested code you do need to test it.

reply
jonahx
2 days ago
[-]
So is the argument that safe langs produce stuff like:

    // pseudocode
    if (i >= array_length) panic("index out of bounds")
that are never actually run if the code is correct? But (if I understand correctly) these are checks implicitly added by the compiler. So the objection amounts to questioning the correctness of this auto-generated code, and is predicated upon mistrusting the correctness of the compiler? But presumably the Rust compiler itself would have thorough tests that these kinds of checks work?

Someone please correct me if I'm misunderstanding the argument.

reply
btilly
2 days ago
[-]
One of the things that SQLite is explicitly designed to do is have predictable behavior in a lot of conditions that shouldn't happen. One of those predictable behaviors is that it does its best to stay up and running, continuing to do the best it can. Conditions where it should succeed in doing this include OOM, the possibility of corrupted data files, and (if possible) misbehaving CPUs.

Automatic array bounds checks can get hit by corrupted data. Thereby leading to a crash of exactly the kind that SQLite tries to avoid. With complete branch testing, they can guarantee that the test suite includes every kind of corruption that might hit an array bounds check, and guarantee that none of them panic. But if the compiler is inserting branches that are supposed to be inaccessible, you can't do complete branch testing. So now how do you know that you have tested every code branch that might be reached from corrupted data?

Furthermore those unused branches are there as footguns which are reachable with a cosmic ray bit flip, or a dodgy CPU. Which again undermines the principle of keeping running if at all possible.

reply
vlovich123
2 days ago
[-]
In rust at least you are free to access an array via .get which returns an option and avoids the “compiler inserted branch” (which isn’t compiler inserted by the way - [] access just implicitly calls unwrap on .get and sometimes the compiler isn’t able to elide).

Also you rarely need to actually access by index - you could just access using functional methods on .iter() which avoids the bounds check problem in the first place.
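
A minimal sketch of both alternatives (the function name is mine, purely illustrative):

```rust
/// Sum the first `n` elements without ever taking a panic branch:
/// `get` with a range returns None for an out-of-range request, and
/// iterator access needs no per-element bounds check.
fn sum_first(data: &[i32], n: usize) -> Option<i32> {
    let head = data.get(..n)?; // None if n > data.len()
    Some(head.iter().sum())    // no indexing, hence no check to elide
}
```

Here the failure case is a value (`None`) the caller must handle explicitly, rather than a compiler-inserted panic branch.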

reply
OptionOfT
2 days ago
[-]
For slices the access is handled inside of the compiler: https://github.com/rust-lang/rust/blob/235a4c083eb2a2bfe8779...

I'm checking to see how array access is implemented, whether through deref to slice, or otherwise.

reply
vlovich123
2 days ago
[-]
I had Vec in mind, but regardless, nothing forces you to use the bounds-checked variant vs. one that returns Option<T>. And if you really are sure the bounds hold you can always use the assume crate or just unwrap_unchecked explicitly.
reply
jemmyw
2 days ago
[-]
Keeping running if possible doesn't sound like the best strategy for stability. If data was corrupted in memory in a way that would cause a bounds check to fail, then carrying on is likely to corrupt more data. Panic, dump a log, let a supervisor program deal with the next step, or a human, but don't keep going potentially persisting corrupted data.
reply
btilly
2 days ago
[-]
What the best strategy is depends on your use case.

The use case that SQLite has chosen to optimize for is critical embedded software. As described in https://www.sqlite.org/qmplan.html, the standard that they base their efforts on is a certification for use in aircraft. If mission critical software on a plane is allowed to crash, this can render the controls inoperable. Which is likely to lead to a very literal crash some time later.

The result is software that has been optimized to do the right thing if at all possible, and to degrade gracefully if that is not possible.

Note that the open source version of SQLite is not certified for use in aviation. But there are versions out there that have been certified. (The difference is a ton of extra documentation.) And in fact SQLite is in use by Airbus. Though the details of what exactly for are not, as far as I know, public.

If this documented behavior is not what you want for your use case, then you should consider using another database. Though, honestly, no other database comes remotely close when it comes to software quality. And therefore I doubt that "degrade as documented rather than crash" is a good reason to avoid SQLite. (There are lots of other potential reasons for choosing another database.)

reply
jemmyw
1 day ago
[-]
You're right and when I thought about it more I considered that "supervisor" isn't what I would want. Rather I'm thinking raising errors to the program that embeds sqlite so that it can decide what to do. I do have a desktop app that uses sqlite and I'd rather it raised an error than tried to recover.
reply
Groxx
2 days ago
[-]
outside political definitions, I'm not sure "crash and restart with a supervisor" and "don't crash" are meaningfully different? they're both error-handling tactics, likely perfectly translatable to each other, and Erlang stands as an existence proof that crashing is a reasonable strategy in extremely reliable software.

I fully recognize that political definitions drive purchases, so it's meaningful to a project either way. but that doesn't make it a valid technical argument.

reply
btilly
2 days ago
[-]
Yes, Erlang demonstrates that "crash and restart with a supervisor" is a potentially viable strategy to reliability.

But the choice is not just political. There are very meaningful technical differences for code that potentially winds up embedded in other software, and could be inside of literal embedded software.

The first is memory. It takes memory to run whatever is responsible for detecting the crash, relaunching, and starting up a supervisor. This memory is not free. Which is one of the reasons why Erlang requires at a minimum 10 MB or so of memory. By contrast the overhead of SQLite is something like half a MB. This difference is very significant for people putting software into medical devices, automotive controllers, and so on. All of which are places where SQLite is found, but Erlang isn't.

The second is concurrency. Erlang's concurrency model leaks - you can't embed it in software without having to find a way to fit Erlang concurrency in. This isn't a problem if Erlang already is in your software stack. But that's an architectural constraint that would be a problem in many of the contexts that SQLite is actually used in.

Remember, SQLite is not optimized for your use case. It is optimized for embedded software that needs to try to keep running when things go wrong. It just happens to be so good that it is useful for you.

reply
Izkata
2 days ago
[-]
If the cause of the crash is in any way related to the persisted data, there's a good chance you're now stuck in a crashloop.

If it can avoid crashing, other functionality may continue to work fine.

reply
Groxx
1 day ago
[-]
this is true for any kind of error branch - if the cause is persisted, it's going to still be there on the next iteration.
reply
hoppp
2 days ago
[-]
It still needs to detect that there is corrupted data and dump the log, and an external supervisor would not be best, since in some runtimes it could be missing. So they just build it into the library, and we come full circle.
reply
NobodyNada
2 days ago
[-]
> But (if I understand correctly) these are checks implicitly added by the compiler.

This is a dubious statement. In Rust, the array indexing operator arr[i] is syntactic sugar for calling the function arr.index(i), and the implementation of this function on the standard library's array types is documented to perform a bounds-check assertion and access the element.

So the checks aren't really implicitly added -- you explicitly called a function that performs a bounds check. If you want different behavior, you can call a different, slightly-less-ergonomic indexing function, such as `get` (which returns an Option, making your code responsible for handling the failure case) or `get_unchecked` (which requires an unsafe block and exhibits UB if the index is out of bounds, like C).
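
For concreteness, the three indexing flavors side by side (a sketch; the wrapper names are mine):

```rust
fn read_indexed(xs: &[u8], i: usize) -> u8 {
    xs[i] // sugar for Index::index; documented to bounds-check and panic
}

fn read_checked(xs: &[u8], i: usize) -> Option<u8> {
    xs.get(i).copied() // no panic branch; the caller handles None
}

fn read_unchecked(xs: &[u8], i: usize) -> u8 {
    // SAFETY: the caller must guarantee i < xs.len();
    // out of bounds here is UB, exactly as in C.
    unsafe { *xs.get_unchecked(i) }
}
```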

reply
nubbler
2 days ago
[-]
Another commenter in this thread used the phrase "complex abomination" which seems more and more apt the more I learn about Rust.
reply
J_Shelby_J
2 days ago
[-]
Nothing in this world is perfect, but this behavior is less of an abomination than whatever a junior dev on a timeline might write to handle this condition.
reply
binary132
2 days ago
[-]
I think it’s less like doubting that the given panic works and more like an extremely thorough proof that all possible branches of the control flow have acceptable behavior. If you haven’t tested a given control flow, the issue is that it’s possible that the end result is some indeterminate or invalid state for the whole program, not that the given bounds check doesn’t panic the way it’s supposed to. On embedded for example (which is an important usecase for SQLite) this could result in orphaned or broken resources.
reply
jonahx
2 days ago
[-]
> I think it’s less like doubting that the given panic works and more like an extremely thorough proof that all possible branches of the control flow have acceptable behavior.

The way I was thinking about it was: if you somehow magically knew that nothing added by the compiler could ever cause a problem, it would be redundant to test those branches. Then wondering why a really well tested compiler wouldn't be equivalent to that. It sounds like the answer is, for the level of soundness sqlite is aspiring to, you can't make those assumptions.

reply
thayne
2 days ago
[-]
But does it matter if that control flow is unreachable?

If the check never fails, it is logically equivalent to not having the check. If the code isn't "correct" and the panic is reached, then the equivalent c code would have undefined behavior, which can be much worse than a panic.

reply
nubbler
2 days ago
[-]
In the first case, if it is actually unreachable, I would never want that code ending up in my binary at all. It must be optimised out.

Your second case implies that it is reachable.

reply
thayne
2 days ago
[-]
In the first case, it often is optimized out. But the optimizer isn't perfect, and can't detect every case where it is unreachable.

If you have the second case, I would much rather have a panic than undefined behavior. As mentioned in another comment, in C indexing an array is semantically equivalent to:

    if (i < len(arr)) arr[i] else UB()
In fact a C compiler could insert a check and abort if the index is out of bounds, like Rust does, and still be in spec. But the undefined behavior could also cause memory corruption, or some other subtle bug.
reply
oconnor663
2 days ago
[-]
> questioning the correctness of this auto-generated code

I wouldn't put it that way. Usually when we say the compiler is "incorrect", we mean that it's generating code that breaks the observable behavior of some program. In that sense, adding extra checks that can't actually fail isn't a correctness issue; it's just an efficiency issue. I'd usually say the compiler is being "conservative" or "defensive". However, the "100% branch testing" strategy that we're talking about makes this more complicated, because this branch-that's-never-taken actually is observable, not to the program itself but to its test suite.

reply
dathinab
2 days ago
[-]
no, it's an (accidental) red herring argument

sure, safety checks are added, but

it's ignoring that many such checks get reliably optimized away

worse, it's a bit like saying "in case of a broken invariant I prefer arbitrary, potentially highly problematic behavior over clean aborts (or errors) because my test tooling is inadequate"

instead of saying "we haven't found adequate test tooling for our use case"

Why inadequate? Because technically test setups can use

1. fault injection, to test such branches even if normally you would never hit them

2. for many such checks (especially array bounds checks), you can pretty reliably identify them and then exclude them from your test coverage statistic

idk what the Rust tooling for this is in 2025, but around the Rust 1.0 days you mainly had C tooling applied to Rust, so you had problems like that back then.

reply
sixthDot
2 days ago
[-]
Bounds checks are usually conditionally compiled. That's more a kind of "contract" you'll verify during testing. In the end, the software actually used will not check anything.

    #ifdef CONTRACTS
    if (i >= array_length) panic("index out of bounds")
    #endif
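
Rust's nearest equivalent to this pattern is `debug_assert!`, which is active in debug and test builds but compiled out of release builds (a sketch; the function is hypothetical):

```rust
fn store(buf: &mut [u8], i: usize, v: u8) {
    // Checked in debug/test builds, compiled out with --release,
    // much like the #ifdef CONTRACTS block above.
    debug_assert!(i < buf.len(), "index out of bounds");
    buf[i] = v; // note: the implicit bounds check here still remains
}
```

One caveat, noted in the comment: unlike the C pattern, the implicit check on `buf[i]` itself is not removed in release builds unless the optimizer can prove it redundant.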
reply
sixthDot
1 day ago
[-]
A hygienic way to handle that is often "assert", which can be a macro or a built-in statement. The main problem with assertions is side effects: the verification must be pure...
reply
lionkor
2 days ago
[-]
It's not like that, the compiler explicitly doesn't do compile-time checks here and offloads those to the runtime.

Rust does not stop you from writing code that accesses out of bounds, at all. It just makes sure that there's an if that checks.

reply
selcuka
2 days ago
[-]
Ok, but you can still test all the branches in your source code and have 100% coverage. Those additional `if` branches are added by the compiler. You are responsible for testing the code you write, not the one that actually runs. Your compiler's test suite is responsible for the rest.

By the same logic one could also claim that tail recursion optimisation, or loop unrolling are also dangerous because they change the way code works, and your tests don't cover the final output.

reply
binary132
2 days ago
[-]
If they produce control flow _in the executable binary_ that is untested, then they could conceivably lead to broken states. I don’t believe most of those sorts of transformations cause alternative control flows to be added to the executable binary.

I don’t think anyone would find the idea compelling that “you are only responsible for the code you write, not the code that actually runs” if the code that actually runs causes unexpected invalid behavior on millions of mobile devices.

reply
okanat
2 days ago
[-]
Well, this way of arguing may seem smart, but it is not fully correct.

Google already ships binaries compiled with Rust in Android. They are actually system services which are more critical than SQLite storage of apps.

Moreover Rust version of SQLite can ship binaries compiled with a qualified compiler like Ferrocene: https://ferrocene.dev/en/ (which is the downstream, qualified version of standard Rust compiler rustc). In qualification process the compiler is actually checked whether it generates reasonable machine code against a strict set of functional requirements.

Most people don't compile SQLite with qualified versions of GCC either. So this exact argument actually can be turned against them.

reply
foul
2 days ago
[-]
>You are responsible for testing the code you write, not the one that actually runs.

Hipp worked as a military contractor on battleship software; furthermore, years later SQLite was under contract with every proto-smartphone company in the USA. Under those constraints you maybe aren't responsible for testing what the compiler spits out across platforms and different compilers, but doing so makes the project a lot more reliable, and makes it sexier for embedded and weapons work.

reply
lionkor
1 day ago
[-]
You're right but only in software that isn't a database. In a database, when the program panic!()s, you're SOL. These extra branches panic.

If sqlite were to read one byte over the end of an array, it's unlikely to lose your data. Rust would guarantee that you lose data.

reply
tialaramex
2 days ago
[-]
I believe there's a Rust RFC for a way to write mandatory tail calls with the become keyword. So then the code is actually defined to have a tail call, if it can't have a tail call it won't compile, if it can have one then that's what you get.

Some languages I was aware of are defined so that if what you wrote could be a tail call it is. However you might write code you thought was a tail call and you were wrong - in such languages it only blows up when it recurses too deep and runs out of stack. AIUI the Rust feature would reject this code.

reply
thfuran
1 day ago
[-]
>You are responsible for testing the code you write, not the one that actually runs

That's a bizarre claim. The source code isn't the product, and the product is what has to work. If a compiler or OS bug causes your product to function incorrectly, it's still your problem. The solution is to either work around the bug or get the bug fixed, not just say "I didn't write the bug, so deal with it."

reply
selcuka
1 day ago
[-]
You are a better developer than me, then. I take it you have tests in your product repos that test your compiler behaviour, including optimisations that you enable while building binaries, and all third party dependencies you use. Is that accurate?

There is a difference between "gcc 4.8 is buggy, let's not use it" and "let's write unit tests for gcc". If you are suspicious about gcc, you should submit your patches to gcc, not vendor them in your own repo.

reply
thfuran
1 day ago
[-]
>I take it you have tests in your product repos that test your compiler behaviour, including optimisations that you enable while building binaries, and all third party dependencies you use. Is that accurate?

Are you asking whether I write integration tests? Yes, I do. And at work there's a whole lot of acceptance testing too.

>There is a difference between "gcc 4.8 is buggy, let's not use it" and "let's write unit tests for gcc".

They're not proposing writing unit tests for gcc, only actually testing what gcc produces from their source. You know, by executing it like tests tend to do. Testing only the first party source would mean relying entirely on static source code analysis instead.

reply
selcuka
1 day ago
[-]
> Are you asking whether I write integration tests? Yes, I do.

Exactly. You don't need unit tests for the binary output. You want to test whether the executable behaves as expected. Therefore "rust adds extra conditional branches that are never entered, and we can't test those branches" argument is not valid.

reply
unclad5968
2 days ago
[-]
I don't see anything wrong with taking responsibility for the code that actually runs. I would argue that level of accountability has played a part in SQLite being such a great project.
reply
estebank
2 days ago
[-]
> You are responsible for testing the code you write, not the one that actually runs.

This is not correct for every industry.

reply
anitil
2 days ago
[-]
It's the sort of argument that I wouldn't accept from most people and most projects, but Dr Hipp isn't most people and SQLite isn't most projects.
reply
cogman10
2 days ago
[-]
It's a bad argument.

Certainly don't get me wrong, SQLite is one of the best and most thoroughly tested libraries out there. But this feels like an argument included just to have 4 arguments. That's because 2 of the arguments break down as "Those languages didn't exist when we first wrote SQLite and we aren't going to rewrite the whole library just because a new language came around."

Any language, including C, will emit or not emit instructions that are "invisible" to the author. For example, whenever the C compiler decides it can autovectorize a section of a function it'll be introducing a complicated set of SIMD instructions and new invisible branch tests. That can also happen if the C compiler decides to unroll a loop for whatever reason.

The entire point of compilers and their optimizations is to emit instructions which keep the semantic intent of higher level code. That includes excluding branches, adding new branches, or creating complex lookup tables if the compiler believes it'll make things faster.

Dr Hipp is completely correct in rejecting Rust for SQLite. Sqlite is already written and extremely well tested. Switching over to a new language now would almost certainly introduce new bugs that don't currently exist as it'd inevitably need to be changed to remain "safe".

reply
Ferret7446
2 days ago
[-]
> Any language, including C, will emit or not emit instructions that are "invisible" to the author

Presumably this is why they do 100% test coverage. All of those instructions would be tested and not invisible to the test suite

reply
cogman10
2 days ago
[-]
How could they know? Any changes to the compiler will potentially generate new code.

A new compiler, new flags, a new version. These all can create new invisible untested branches.

reply
joshkel
2 days ago
[-]
The way you know is by running the full SQLite test suite, with 100% MC/DC coverage (slightly stricter than 100% branch coverage), on each new compiler, version, and set of flags you intend to support. It's my understanding that this is the approach taken by the SQLite team.

Dr. Hipp's position is paraphrased as, “I cannot trust the compilers, so I test the binaries; the source code may have UBs or run into compiler bugs, but I know the binaries I distribute are correct because they were thoroughly tested" at https://blog.regehr.org/archives/1292. There, Dr. John Regehr, a researcher in undefined behavior, found some undefined behavior in the SQLite source code, which kicked off a discussion of the implications of UB given 100% MC/DC coverage of the binaries of every supported platform.

(I suppose the argument at this point is, "Users may use a new compiler, flag, or version that creates untested code, but that's not nearly as bad as _all_ releases and platforms containing untested code.")

reply
ynik
2 days ago
[-]
Autovectorization / unrolling can maybe still be handled with a couple of additional tests. The main problem I see with doing branch coverage on compiled machine code is inlining: instead of two tests for one branch, you now need two tests for each function that a copy of the branch was inlined into.
reply
manwe150
2 days ago
[-]
If it were as completely tested as claimed, then switching to Rust would be trivial: all you need to do is pass the test suite, and all bugs would be gone. I can think of other reasons not to jump to Rust (it is a lot of code, SQLite already works well, test coverage is very good but still incomplete, and Rust only solves a few correctness problems), just not the claim that SQLite is already tested well enough to be free of the kinds of bugs that Rust might actually prevent.
reply
dathinab
2 days ago
[-]
> to rust would be trivial.

no, you would still need to rewrite, re-optimize, etc. everything

it would make it much easier to be fully compatible, sure, but that doesn't make it trivial

furthermore, parts of its (mostly internal) design are strongly influenced by C-specific dev-UX aspects, so you wouldn't write them the same way, and tests for them (as opposed to integration tests) may not apply

which in general also means you would most likely break some special-purpose/unusual users who rely on "brittle" (not guaranteed) assumptions about SQLite

if you have code that changes very little, if at all, and has no major issues, don't rewrite it

but most of the new "external" things written around SQLite, alternative VFS implementations etc., tend to be at most partially written in C

reply
jacquesm
2 days ago
[-]
> If it was as completely tested as claimed

It is.

> then switching to rust would be trivial

So prove it. Hint: it's not trivial.

reply
hypeatei
2 days ago
[-]
Couldn't a method like `get_unchecked()` be used to avoid the bounds check[0] if you know it's safe?

0: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.get...

reply
oconnor663
2 days ago
[-]
Yes. You have to write `unsafe { ... }` around it, so there's an ergonomic penalty plus a more nebulous "sense that you're doing something dangerous that might get some skeptical looks in code review" penalty, but the resulting assembly will be the same as indexing in C.
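A minimal sketch of the pattern (the function name is invented for the example); hoisting a single checked assertion up front lets the loop body use the unchecked access, which in principle compiles to a plain load like `arr[i]` in C:

```rust
// Sums the first `n` elements of `xs` without per-element bounds checks.
fn sum_first_n(xs: &[u64], n: usize) -> u64 {
    // One branch here instead of one per iteration.
    assert!(n <= xs.len());
    let mut total = 0;
    for i in 0..n {
        // SAFETY: i < n and n <= xs.len() were checked above,
        // so the access is always in bounds.
        total += unsafe { *xs.get_unchecked(i) };
    }
    total
}
```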
reply
hypeatei
2 days ago
[-]
I figured, but I guess I don't understand this argument then. SQLite as a project already spends a lot of time on quality so doing some `unsafe` blocks with a `// SAFETY:` comment doesn't seem unreasonable if they want to avoid the compiler inserting a panic branch for bounds checks.
reply
Ferret7446
2 days ago
[-]
If you put unsafe around almost all of your code (array indexing) aren't you better off just writing C?
reply
aw1621107
2 days ago
[-]
Perhaps if the only thing you're doing is array indexing? Though I'm not sure that would apply in this particular case anyways.
reply
tomjakubowski
2 days ago
[-]
In many cases LLVM can prove the bounds check is redundant or otherwise is unnecessary and will optimize it away.
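A sketch of the two shapes this can take (function names invented); in the iterator form there is no index at all, so there's no bounds check for LLVM to prove away in the first place:

```rust
// Indexing form: each `xs[i]` conceptually carries a bounds check,
// though in a loop like this LLVM can typically prove i < xs.len()
// and drop it.
fn sum_indexed(xs: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..xs.len() {
        total += xs[i];
    }
    total
}

// Iterator form: no index, hence no bounds check to optimize away.
fn sum_iter(xs: &[u64]) -> u64 {
    xs.iter().sum()
}
```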
reply
ChadNauseam
2 days ago
[-]
I wonder if this problem could be mitigated by not requiring coverage of branches that unconditionally lead to panics. or if there could be some kind of marking on those branches that indicate that they should never occur in correct code
reply
accelbred
2 days ago
[-]
You'd want to statically prove that any panic is unreachable
reply
jkafjanvnfaf
2 days ago
[-]
It's new because it makes no sense.

There already is an implicit "branch" on every array access in C: it's called an access violation.

Do they test for a segfault on every single array access in the code base? No? Then they don't really have 100% branch coverage, do they?

reply
prein
2 days ago
[-]
Take a look at their description of how SQLite is tested: https://www.sqlite.org/testing.html

I think a lot of projects that claim to have 100% coverage are overselling their testing, but SQLite is in another category of thoroughness entirely.

reply
beached_whale
2 days ago
[-]
I think those branches are often not there because it's provably never going out of bounds. There are ways to ensure the compiler knows the bounds cannot be broken.
reply
NobodyNada
2 days ago
[-]
It's interesting to consider (and the whole page is very well-reasoned), but I don't think that the argument holds up to scrutiny. If such an automatic bounds-check fails, then the program would have exhibited undefined behavior without that branch -- and UB is strictly worse than an unreachable branch that does something well-specified like aborting.

A simple array access in C:

    arr[i] = 123;
...can be thought of as being equivalent to:

    if (i >= array_length) UB();
    else arr[i] = 123;
where the "UB" function can do literally anything. From the perspective of exhaustively testing and formally verifying software, I'd rather have the safe-language equivalent:

    if (i >= array_length) panic();
    else arr[i] = 123;
...because at least I can reason about what happens if the supposedly-unreachable condition occurs.

Dr. Hipp mentions that "Recoding SQLite in Go is unlikely since Go hates assert()", implying that SQLite makes use of assert statements to guard against unreachable conditions. Surely his testing infrastructure must have some way of exempting unreachable assert branches -- so why can't bounds checks (that do nothing but assert undefined behavior does not occur) be treated in the same way?
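For what it's worth, Rust's `debug_assert!` already behaves like C's assert() under NDEBUG -- checked in debug builds, compiled out entirely in release builds -- so such guard branches could plausibly be exempted the same way. A tiny sketch (the function is invented for illustration):

```rust
// Like assert() in C: the check exists in debug builds, and release
// builds (where cfg(debug_assertions) is off) compile it out, so no
// extra branch appears in the shipped binary.
fn checked_midpoint(lo: usize, hi: usize) -> usize {
    debug_assert!(lo <= hi, "interval is inverted");
    lo + (hi - lo) / 2
}
```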

reply
eesmith
2 days ago
[-]
The 100% branch testing is on the compiled binary. To exempt unreachable assert branches, turn off assertions, compile, and test.

A more complex C program can have index range checking at a different place than the simple array access. The compiler's flow analysis isn't always able to confirm that the index is guaranteed to be checked. If it therefore adds a cautionary (and unneeded) range check, then this code branch can never be exercised, making the code no longer 100% branch tested.

reply
dathinab
2 days ago
[-]
the problem is it's kind of an anti-argument

you're basically saying that if deeply unexpected things happen, you'd prefer your program doing wildly arbitrary and thus potentially dangerous things over a clean abort or a proper error... that doesn't seem right

worse, it's a shortcoming of the tooling used, not a fundamental problem: not only can you test these branches (using fault injection), you can also often (not always) separate them from the relevant branches when collecting branch statistics

so the whole argument misses the point (which is that the tooling is lacking, not that checks for array bounds and similar are extra)

lastly, array bounds checking is probably the worst example they could have given, as it:

- often can be disabled/omitted in optimized builds

- is quite often optimized away

- often has quite low perf overhead

- bounds-check branches are often very easy to identify, i.e. excluding them from a 100% branch-testing statistic is viable

- out-of-bounds reads/writes are some of the most common cases of memory unsafety leading to security vulnerabilities (including full RCE cases)

reply
sgbeal
2 days ago
[-]
> you'd prefer your program doing wildly arbitrary and thus potentially dangerous things over a clean abort or a proper error.

SQLite isn't a program, it's a library used by many other programs. As such, aborting is not an option. It doesn't do "wildly arbitrary" things - it reports errors to the client application and takes it on faith that they will respond appropriately.

reply
coolThingsFirst
2 days ago
[-]
This is a dumb argument; it's like saying that for a perfect human being there's no need for smart pointers, garbage collection, or the borrow checker.
reply
ChrisRR
2 days ago
[-]
I can't figure out how you've come to that equivalence
reply
kazinator
2 days ago
[-]
> In incorrect code, the branches are taken, but code without the branches just behaves unpredictably.

It's like seat belts.

E.g. what if we drive four blocks and then hit something, so that the seatbelt is needed? Okay, we have an explicit test for that.

But we cannot test everything. We have not tested what happens if we drive four blocks, and then take a right turn, and hit something half a block later.

Screw it, just remove the seatbelts rather than have this insane untested space whereby we are never sure whether the seat belt will work properly and prevent injury!

reply
DarkNova6
2 days ago
[-]
> All that said, it is possible that SQLite might one day be recoded in Rust. Recoding SQLite in Go is unlikely since Go hates assert(). But Rust is a possibility. Some preconditions that must occur before SQLite is recoded in Rust include:

- Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.

- Rust needs to demonstrate that it can be used to create general-purpose libraries that are callable from all other programming languages.

- Rust needs to demonstrate that it can produce object code that works on obscure embedded devices, including devices that lack an operating system.

- Rust needs to pick up the necessary tooling that enables one to do 100% branch coverage testing of the compiled binaries.

- Rust needs a mechanism to recover gracefully from OOM errors.

- Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.

reply
steveklabnik
2 days ago
[-]
1. Rust has had ten years since 1.0. It changes in backward compatible ways. For some people, they want no changes at all, so it’s important to nail down which sense is meant.

2. This has been demonstrated.

3. This one hinges on your definition of “obscure,” but the “without an operating system” bit is unambiguously demonstrated.

4. I am not an expert here, but given that you’re testing binaries, I’m not sure what is Rust specific. I know the Ferrocene folks have done some of this work, but I don’t know the current state of things.

5. Rust as a language does no allocation. The OOM behavior in question belongs to the standard library, which you're not using in these embedded cases anyway. There, you're free to do whatever you'd like, as it's all just library code.

6. This also hinges on a lot of definitions, so it could be argued either way.
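On point 5, it's also worth noting that even within std there are now fallible allocation APIs: `Vec::try_reserve` surfaces allocation failure as a `Result` instead of aborting. A small sketch (the function name is invented):

```rust
use std::collections::TryReserveError;

// Build a zero-filled buffer of `n` bytes, reporting allocation
// failure (or capacity overflow) as an error value rather than
// aborting the process.
fn make_buffer(n: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve(n)?; // fails gracefully instead of aborting on OOM
    buf.resize(n, 0);
    Ok(buf)
}
```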

reply
dathinab
2 days ago
[-]
> 2.

ironically, if we look at how things play out in practice, Rust is far better suited as a general-purpose language than C, to the point where I would argue C is a general-purpose language only on a technicality, not on a practical IRL basis

this is especially ridiculous when they argue C is the fastest general-purpose language, when that has proven to simply not hold up in larger IRL projects (i.e. not microbenchmarks)

C has terrible UX for generic code reuse and memory management, which often means that in IRL projects people don't write the fastest code. Wrt. memory management, it's not rare to see unnecessary copies, as avoiding them too easily leads to bugs. Wrt. data structures, you write the code which is maintainable, robust, and fast enough, and sometimes add the 10th maximally simple reimplementation (or C macro or similar) of some data structure instead of reusing a data structure people have spent years fine-tuning.

When people switched in large numbers from C to C++, most general-purpose projects got faster, not slower. And even for the C++ to Rust case, it's not rare that companies end up with faster projects after the switch.

Both C++ and Rust also allow more optimization in general.

So C is only fastest in microbenchmarks, after excluding things like Fortran for not being general-purpose, while itself not really being used much anymore for general-purpose projects...

reply
drnick1
2 days ago
[-]
I think Rust (and C++) are just too complicated and visually ugly, and ultimately that hurts the maintainability of the code. C is simple, universal, and arguably beautiful to look at.
reply
simonask
2 days ago
[-]
C is simple. As a result, programming in C is not simple in any way.
reply
metaltyphoon
2 days ago
[-]
These are all opinions.
reply
krior
2 days ago
[-]
C is so simple that you will need to read a 700-page, committee-written manual before you can attempt to write it correctly.
reply
guest_reader
2 days ago
[-]
> C is so simple that you will need to read a 700-page, committee-written manual before you can attempt to write it correctly.

The official C99 standard document is about 210 pages.

reply
krior
2 days ago
[-]
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf

Admittedly the 700 pages include the appendix and it is only a draft version, but still...

reply
pjmlp
2 days ago
[-]
Except the little detail that you need to complement that with all the manuals of the C compilers being used.
reply
hoppp
2 days ago
[-]
Rust has dependency hell and supply chain attacks like with npm.
reply
jeroenhd
2 days ago
[-]
C has the same problem, except it lacks the common package manager that other languages have. Just because you need to clone git submodules or run package manager commands (hope you're on a supported OS version!) doesn't mean C doesn't have package manager issues.

C projects avoiding dependencies entirely just end up reimplementing work. You can do that in any language.

reply
mamcx
2 days ago
[-]
But it's optional. For this kind of project, it's logical to adopt something like the tiger battle ethos and own all the code and have no external deps (or vendor them). Even do your own std if you wanna.

Is it hard work? Sure, but it's not that different from what you see in certain C projects that likewise use no external deps

reply
bsder
2 days ago
[-]
Tigerbeetle. Your autocorrect really mangled that one ...
reply
krior
2 days ago
[-]
The lack of dependency hell is a bit of an illusion when it comes to C. What other languages solve via libraries, most C projects will reimplement themselves, which of course increases the chance of bugs.
reply
steveklabnik
2 days ago
[-]
You control the dependencies you put in Cargo.toml.
reply
hoppp
2 days ago
[-]
What about the dependencies of your dependencies?

I don't put too many things in Cargo.toml and it still pulls like a hundred things

reply
ghosty141
2 days ago
[-]
Then don't? In C you would just implement everything yourself, so go do that in Rust if you don't want dependencies.

In C I've seen more half-baked json implementations than I can count on my fingers because using dependencies is too cumbersome in that ecosystem and people just write it themselves but most of the time with more bugs.

reply
steveklabnik
2 days ago
[-]
If you care about not having too many dependencies, then choosing dependencies that themselves don't have many dependencies should factor into which ones you pick.
reply
rendaw
2 days ago
[-]
Direct and transitive dependencies are locked and hashed.
reply
BrouteMinou
2 days ago
[-]
Your system is going to be owned, but at least, it's going to be "memory safely" owned!

P. S.

That's if you don't count all the unsafe sections scattered everywhere in all those dependencies.

reply
gerdesj
2 days ago
[-]
"1. Rust has had ten years since 1.0. ..."

Rust insists on its own package manager "rustup" and frowns on distro maintainers. When Rust is happy to just be packaged by the distro and rustup has gone away, then it will have matured to at least adolescence.

reply
steveklabnik
2 days ago
[-]
Rust has long worked with distro package maintainers, and as far as I know, Rust is packaged in every major Linux distribution.

There are other worlds out there than Linux.

reply
gerdesj
2 days ago
[-]
So why insist on rustup?
reply
dathinab
2 days ago
[-]
different goals

the rust version packaged in distros is for compiling rust code shipped as part of the distro. This means it

- is normally not the newest version (which, to be clear, is not bad per se, but not necessarily what you need)

- might not have all optional components (e.g. no clippy)

but if you, idk., write a server deployed by your company

- you likely want all components

- you don't need to care what version the distro pinned

- you have little reason not to use the latest rust compiler

for other use cases you have other reasons, some need nightly rust, some want to test against beta releases, some want to be able to test against different rust versions etc. etc.

rustup exists (today) for the same reason a lot of dev projects use project-specific copies of all kinds of tooling and libraries which do not match whatever their distro ships: the distro use case and the generic dev use case have diverging requirements! (Other examples: nvm (node), flutter, java, etc.)

Also some distros are notorious for shipping outdated software (debian "stable").

And not everything is Linux, rustup works on OSX.

reply
steveklabnik
2 days ago
[-]
Distributions generally package the versions of compilers that are needed to build the programs in their package manager. However, many developers want more control than that. They may want to use different versions of the compiler on different projects, or a different version than what’s packaged.

Basically, people use it because they prefer it.

reply
gspr
2 days ago
[-]
I'm a Debian Developer, and do some Rust both professionally and for fun. I restrict myself to using only libraries and tooling from Debian. The experience is quite OK. And I find the Rust language team to be quite friendly and sympathetic to our needs.

Rather, what makes it hard is the culture and surrounding ecosystem of pinned versions or the latest of everything. That's probably in part the fault of Rustup being recommended, I agree. But it's not nefarious.

reply
csande17
2 days ago
[-]
One question towards maturity: has any working version of the Rust compiler ever existed? By which I mean one that successfully upholds the memory-safety guarantees Rust is supposed to make, and does not have any "soundness holes" (which IIRC were historically used as a blank check / excuse to break backwards compatibility).

The current version of the Rust compiler definitely doesn't -- there's known issues like https://github.com/rust-lang/rust/issues/57893 -- but maybe there's some historical version from before the features that caused those problems were introduced.

reply
dathinab
2 days ago
[-]
has there ever been a modern optimizing C compiler free of pretty serious bugs? (it's a rhetorical question, there hasn't been one)
reply
steveklabnik
2 days ago
[-]
Every compiler has soundness bugs. They’re just programs like any other. This isn’t exclusive to Rust.
reply
csande17
2 days ago
[-]
In general, the way Rust blurs the line between "bugs in the compiler" and "problems with how the language is designed" seems pretty harmful and misleading. But it's also a core part of the marketing strategy, so...
reply
steveklabnik
2 days ago
[-]
What makes you say this is a core part of the marketing strategy? I don’t think Rust’s marketing has ever focused on compiler bugs or their absence.
reply
csande17
2 days ago
[-]
You are correct that Rust's marketing does not claim that there are no bugs in its compiler. In fact it does the opposite: it suggests that there are no problems with the language, by asserting that any observed issue in the language is actually a bug in the compiler.

Like, in the C world, there's a difference between "the C specification has problems" and "GCC incorrectly implements the C specification". You can make statements about what "the C language" does or doesn't guarantee independently of any specific implementation.

But "the Rust language" is not a specification. It's just a vague ideal of things the Rust team is hoping their compiler will be able to achieve. And so "the Rust language" gets marketed as e.g. having a type system that guarantees memory safety, when in fact no such type system has been designed -- the best we have is a compiler with a bunch of soundness holes. And even if there's some fundamental issue with how traits work that hasn't been resolved for six years, that can get brushed off as merely a compiler bug.

This propagates down into things like Rust's claims about backwards compatibility. Rust is only backwards-compatible if your programs are written in the vague-ideal "Rust language". The Rust compiler, the thing that actually exists in the real world, has made a lot of backwards-incompatible changes. But these are by definition just bugfixes, because there is no such thing as a design issue in "the Rust language", and so "the Rust language" can maintain its unbroken record of backwards-compatibility.

reply
aw1621107
2 days ago
[-]
> And even if there's some fundamental issue with how traits work that hasn't been resolved for six years, that can get brushed off as merely a compiler bug.

Is it getting brushed off as merely a compiler bug? At least if I'm thinking of the same bug as you [0] the discussion there seems to be more along the lines of the devs treating it as a "proper" language issue, not a compiler bug. At least as far as I can tell there hasn't been a resolution to the design issue, let alone any work towards implementing a fix in the compiler.

The soundness issue that I see more frequently get "brushed off as merely a compiler bug" is the lifetime variance one underpinning cve-rs [1], which IIRC the devs have long decided what the proper behavior should be but actually implementing said behavior is blocked behind some major compiler reworks.

> has made a lot of backwards-incompatible changes

Not sure I've seen much evidence for "a lot" of compatibility breaks outside of the edition system. Perhaps I'm just particularly (un)lucky?

> because there is no such thing as a design issue in "the Rust language"

I'm not sure any of the Rust devs would agree? Have any of them made a claim along those lines?

[0]: https://github.com/rust-lang/rust/issues/57893

[1]: https://github.com/Speykious/cve-rs

reply
csande17
2 days ago
[-]
> Is it getting brushed off as merely a compiler bug?

Yes, this thread contains an example: https://news.ycombinator.com/item?id=45587209 . (I linked the same bug you did in the comment that that's a reply to.)

The Rust team may see this as a language design issue internally, and I'd be inclined to agree. Rust's outward-facing marketing does not reflect this view.

reply
aw1621107
2 days ago
[-]
> I linked the same bug you did in the comment that that's a reply to

Ah, my apologies. Not sure exactly how I managed to miss that.

That being said, I guess I might have read that bit of your comment different than you had in mind; I was thinking of whether the Rust devs were dismissing language design issues as compiler bugs, not what third parties (albeit one with an unusually relevant history in this case) may think.

> Rust's outward-facing marketing does not reflect this view.

As above, perhaps I interpret the phrase "outward-facing marketing" differently than you do. I typically associate that (and "marketing" in general, in this context) with more official channels, whether that's official posts or posts by active devs in an official capacity.

reply
csande17
2 days ago
[-]
Oh, I didn't realize steveklabnik wasn't an official member of the project anymore (as of 2022 apparently: https://blog.rust-lang.org/2022/01/31/changes-in-the-core-te... ). I do think he still expressed this position back when he was a major public face of the language, but it seems unfair to single him out and dig through his comment history.

Rust's marketing is pretty grassroots in general, but even current official sources like https://rust-lang.org/ say things like "Rust’s rich type system and ownership model guarantee memory-safety" that are only true of the vague-ideal "Rust language" and are not true of the type system they actually designed and implemented in the Rust compiler.

reply
aw1621107
2 days ago
[-]
Yeah, Steve has been "just" a well-informed third party for a while now. I would be curious if he has commented on that specific issue before; usually when unsoundness comes up it's cve-rs which is mentioned.

> but even current official sources like https://rust-lang.org/ say things like "Rust’s rich type system and ownership model guarantee memory-safety" that are only true of the vague-ideal "Rust language" and are not true of the type system they actually designed and implemented in the Rust compiler.

That's an understandable point, though I think something similar would arguably still apply even if Rust had a "proper" spec since a "proper" spec doesn't necessarily rule out underspecification/omissions/mistakes/etc, both in the spec and in the implementation. A "real" formal spec à la WebAssembly might solve that issue, but given the lack of time/resources for a "normal" spec at the time a "real" one would have been a pipe dream at best.

That being said, I think it's an interesting question as to what should be done if/when you discover an issue like the trait coherence one, whether you have a spec or not. "Aspirational" marketing doesn't exactly feel nice, but changing your marketing every time you discover/fix a bug also doesn't exactly feel nice for other reasons.

Bit of a fun fact - it appears that the particular trait coherence issue actually has existed in some form since Rust 1.0, and was only noticed a few years later when the issue was filed. Perhaps a proper specification effort would have caught it (especially since one of the devs said they had concerns when implementing a relevant check), but given it had taken that long to discover I wouldn't be too surprised if it would have been missed anyway.

reply
csande17
2 days ago
[-]
I agree that it's a tough situation. "The type system guarantees memory safety" is an extremely important pillar of Rust's identity. They kind of have to portray all soundness issues as "more compiler bugs than something broken in the language itself" (see eg https://news.ycombinator.com/item?id=21930599 which references a GitHub label that AIUI would've included the trait coherence thing at the time) to keep making that claim. It is a core part of the marketing strategy.
reply
steveklabnik
2 days ago
[-]
Yes, so there's a few things going on here: the first is, I absolutely pattern matched on the cve-rs link. Most people bringing that up are trying to bring up a quick gotcha. I did not follow the first link, I assumed it was to that. I am not educated on that specific bug at all.

I still ultimately think that the framing of Rust being any different than other languages here is actively trying to read the worst into things; Rust is working on having a spec, and formally proving things out. This takes a long time. But it's still ongoing. That doesn't mean Rust marketing relies on lying, I don't think most people even understand "soundness" at all, let alone assume that when Rust says "there's no UB in safe code" or similar that there's a promise of zero soundness bugs or open questions. That backwards incompatible changes are made in spite of breaking code at times to fix soundness issues is an acknowledgement of how sometimes there are in fact bugs, this doesn't change that for virtually all Rust users most of the time, updating the compiler is without fanfare, and so in practice, it is backwards compatible. I have heard of people struggling to update their C or C++ compilers to new standards, that doesn't mean that those languages are horribly backwards incompatible, just that there is a spectrum here, and being on one side of it as close as realistically possible doesn't mean that it's a lie.

But, regardless of all of that, it does appear that the issue you linked specifically may be not just a bug, but a real issue. That's my bad, and I'll try to remember that specific bug in the future.

reply
aw1621107
1 day ago
[-]
> They kind of have to portray all soundness issues as "more compiler bugs than something broken in the language itself" [] to keep making that claim.

That's part of the "interesting question" I referred to in my comment. There's probably multiple factors that go into the decision of what to put onto the front page, and the presence/absence of soundness issues is just one of those factors.

reply
wrs
2 days ago
[-]
For a little more color on 5, as a user of no_std Rust on embedded processors I use crates like heapless or trybox that provide Vec, String, etc. APIs like the std ones, but fallible.

Of course, two libraries that choose different no_std collection types can't communicate...but hey, we're comparing to C here.
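The shape of those heapless-style APIs can be sketched without any external crate (this is not the actual heapless API; all names are invented): a fixed-capacity, stack-allocated buffer whose push is fallible rather than allocating:

```rust
// A tiny fixed-capacity vector: storage lives inline (no heap), and
// push reports "full" as an error value instead of allocating.
struct FixedVec<T: Copy + Default, const N: usize> {
    items: [T; N],
    len: usize,
}

impl<T: Copy + Default, const N: usize> FixedVec<T, N> {
    fn new() -> Self {
        Self { items: [T::default(); N], len: 0 }
    }

    // Fallible push: on a full buffer the value comes back to the
    // caller, who decides how to recover.
    fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}
```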

reply
dathinab
2 days ago
[-]
even OOM isn't that different

like, there are some things you can handle well in C

and those things you can do in Rust too, though with a bit of pain and limitations on how you write your Rust

and then there is the rest, which looks "hard but doable" in C, but the more you learn about it the more it's an "uh, wtf, nightmare" case where "let's kill+restart, and have robustness even in the presence of the process/error kernel dying" is nearly always the right answer.

reply
QuiEgo
2 days ago
[-]
> Rust has had ten years since 1.0. It changes in backward compatible ways. For some people, they want no changes at all, so it’s important to nail down which sense is meant.

I’d love to see Rust be so stable that MSRV is an anachronism. I want it to be unthinkable not to support Rust versions from forever ago, because the feature set is so stable.

reply
aw1621107
2 days ago
[-]
> I want it to be unthinkable not to support Rust versions from forever ago, because the feature set is so stable.

What other languages satisfy this criteria?

reply
QuiEgo
2 minutes ago
[-]
This is extremely common in the embedded and systems programming space, which Rust is otherwise attractive to use in.

For example, a very popular library in the systems/embedded space, cjson, works with C89. FreeRTOS works with C99. This is a very common pattern in major libraries. It is very rare that taking a dependency could force you to update your toolchain in the embedded space.

Toolchain updates invalidate your entire testing history and make you revalidate everything, which is often a giant PITA when your testing involves physical things (e.g. you may have to plug and unplug things, put the system in chambers that replicate certain environmental conditions, etc).

reply
hoppp
2 days ago
[-]
Fortran, cobol, C or other old languages that stopped changing but are still used.
reply
aw1621107
2 days ago
[-]
All three of the languages you list are still actively updated. Coincidentally, the latest standard for all three of them is from 2023(ish):

- C23: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3096.pdf

- Cobol 2023: https://www.incits.org/news-events/news-coverage/available-n... (random press release since a PDF of the standard didn't immediately show up in a search)

- Fortran 2023: https://wg5-fortran.org/N2201-N2250/N2212.pdf

C2Y has a fair number of already-accepted features as well and it's relatively early in the standard release cycle: https://thephd.dev/c2y-hitting-the-ground-running

reply
pklausler
2 days ago
[-]
Can’t compile with just a PDF file, though.
reply
aw1621107
2 days ago
[-]
Yes, compilers will take some time to implement the new standards.

C23 seems to have decent support from a few compilers, with GCC leading the pack: https://en.cppreference.com/w/c/compiler_support/23.html

gcobol supports (or at least aims to support?) COBOL 2023: https://gcc.gnu.org/onlinedocs/gcc-15.1.0/gcobol/gcobol.html. Presumably there are other compilers working on support as well.

Intel's Fortran compiler and LFortran have partial support for Fortran 2023 (https://www.intel.com/content/www/us/en/developer/articles/t..., https://docs.lfortran.org/en/usage/). I'd guess that support both from these compilers and from other compilers (Flang?) would improve over time as well..

reply
pklausler
2 days ago
[-]
Flang does already support warnings for features that have breaking changes in F’2023.
reply
aw1621107
1 day ago
[-]
Glad to hear it! Didn't turn up anything obvious in my brief search, but I was expecting some work towards newer standards at least.
reply
pklausler
1 day ago
[-]
Why would you expect that? There are few incentives to work on features that nobody is using.
reply
aw1621107
1 day ago
[-]
Because given the work needed to actually get a feature into the standard I'd assume there's at least some demand motivating the addition.

I'd also be a bit hesitant about claiming that nobody is using said features. It's quite possible that the "new" feature is actually "just" a standardization of something that exists in practice. WG14 is hardly a stranger to that kind of thing, from what I understand; it wouldn't surprise me if something similar occurs for the Fortran/COBOL working groups as well.

reply
pklausler
1 day ago
[-]
That's a reasonable assumption, but it turns out to rarely be the case with WG5/SC22/J3. There are exceptions (like SIND, SINPI, &c.), but most new features in Fortran standards are committee inventions without prototypes, reference implementations, or official conformance test suites. This leads to incompatible implementations, and then lack of use in codes that need to be portable. It's a mess, really. I have a test suite of examples of this kind of thing that accumulated during the implementation of flang-new, in which it was a difficult task to figure out what portable Fortran really means in a way that's meaningful to users.
reply
aw1621107
1 day ago
[-]
I'll defer to your expertise here. If C++'s experience with similar situations is any indication it really does not seem like a fun spot to be in.

Does make me curious how the dynamics of the Fortran/COBOL committees differ from those of the C/C++ committees.

reply
casparvitch
2 days ago
[-]
Why can't `if condition { panic(err) }` be used in Go as an assert equivalent?
reply
Jtsummers
2 days ago
[-]
Because C's assert gets compiled out if you have NDEBUG defined in your program. How do you do conditional compilation in Go (at the level of conditionally including or not including a statement)?
reply
echoangle
2 days ago
[-]
> How do you do conditional compilation in Go (at the level of conditionally including or not including a statement)?

https://stackoverflow.com/questions/36703867/golang-preproce...

Wouldn't this work? Surely the empty function would be removed completely during compilation?

reply
Jtsummers
2 days ago
[-]
That includes or excludes an entire file. Assert works as a statement. There is no equivalent in Go for conditionally removing just a statement in a function based on a compile-time option.
reply
echoangle
2 days ago
[-]
Can’t you include or not include a function that contains a single assert, and depending on the condition, the function call is removed or included?
reply
Jtsummers
2 days ago
[-]
That defeats the point of asserts. Now you have two copies to keep in sync with each other, whereas asserts are inline with the rest of your code and you have one file that can be built with or without them. They could use a separate tool to produce the assert free version, but that adds tooling beyond what Go provides. Nearly every mainstream language allows you to do this without any extra steps, except Go.
reply
casparvitch
2 days ago
[-]
Ah apologies I misunderstood, thanks
reply
dathinab
2 days ago
[-]
It's kinda sad to read, as most of their arguments might seem right at first but really fall apart under scrutiny.

Like why defend C in 2025 when you only have to defend C in 2000, and then argue that you now have an old, stable, deeply tested C code base which has no problem with anything like "commonly having memory safety issues" and is maintained by a small group of people very highly skilled in C.

Like that argument alone is all you need, a win, simple straight forward, hard to contest.

But most of the other arguments they list can be picked apart and are only half true.

reply
saghm
25 minutes ago
[-]
At the very least, I find this argument to be fairly reasonable:

> Safe languages usually want to abort if they encounter an out-of-memory (OOM) situation. SQLite is designed to recover gracefully from an OOM. It is unclear how this could be accomplished in the current crop of safe languages.

I don't think most Rust code written today has guardrails in the case of OOM. I don't think this disqualifies Rust for most things, because I happen to find the trade-off worth it compared to the things it does protect against that C doesn't, but I don't think it's a particularly controversial take that Rust still could use some ergonomic improvements around handling allocation failures. Right now, trying to create a Box or Vec can theoretically fail at runtime if no memory is available, and those failures aren't returned from the functions called to create them. Handling panics is something you can normally do in Rust, but if you're already OOM, things get complicated pretty fast.

I agree that in the long run it would probably make sense to have something like this in Rust when eventually the current maintainers aren't around, but I also don't think it makes much sense to criticize them for continuing to maintain the code that already exists.

reply
mungaihaha
2 days ago
[-]
> But most of the other arguments they list can be picked apart and are only half true

I'd like to see you pick the other arguments apart

reply
scuff3d
2 days ago
[-]
> Other programming languages sometimes claim to be "as fast as C". But no other language claims to be faster than C for general-purpose programming, because none are.

Not OP, and I'm not really arguing with the post, but this struck me as a really odd thing to include in the article. Of course nothing is going to be faster than C, because it compiles straight to machine code with no garbage collection. Literally any language that does the same will be the same speed but not faster, because there's no way to be faster. It's physically impossible.

A much better statement, and one in line with the rest of the article, would be that at the time, C and C++ were really the only viable languages that gave them the performance they wanted, and C++ wouldn't have given them the interoperability they wanted. So their only choice was C.

reply
steveklabnik
2 days ago
[-]
> Literally any language that does the same will be the same speed but not faster, because there's no way to be faster. It's physically impossible.

There is nothing special about C that makes this true. C has semantics, just like any language, that are higher level than assembly, and sometimes, those semantics make the code slower than other languages that have different semantics.

Consider this C function:

    void redundant_store(int *a, int *b) {
        int t = *a + 1;
        *a = t;
        *b = 0; // May clobber *a if (b == a)
        *a = t;
    }
Because a and b may point to the same address, you get this code (on clang trunk):

  redundant_store:
          mov     eax, dword ptr [rdi]
          inc     eax
          mov     dword ptr [rdi], eax
          mov     dword ptr [rsi], 0
          mov     dword ptr [rdi], eax
          ret
That fifth line there has to be kept in, because the final `*a = t;` is semantically meaningful; if a == b, then *a is also set to 0 on line four, and so we need to reset it to t on line five.

Consider the Rust version:

    pub fn redundant_store(a: &mut i32, b: &mut i32) {
        let t = *a + 1;
        *a = t;
        *b = 0; // a and b must not alias, so can never clobber
        *a = t;
    }
You get this output (on Rust 1.90.0):

  redundant_store:
          mov     eax, dword ptr [rdi]
          inc     eax
          mov     dword ptr [rsi], 0
          mov     dword ptr [rdi], eax
          ret
Because a and b can never alias, we know that the extra store to *a is redundant, as it's not possible for the assignment to *b to modify *a. This means we can get rid of this line.

Sure, eliminating one single store isn't going to have a meaningful difference here. But that's not the point, the point is that "it's not possible to be faster than C because C is especially low level" just simply isn't true.

reply
scuff3d
2 days ago
[-]
Leave it to hackernews to be overly pedantic.

Yes, obviously different languages will produce different assembly based on the language semantics. So you will get performance differences. And it's certainly possible for code written in Rust or Zig or Odin to be more performant than C code depending on how it's written.

My point was about classes of languages (for lack of a better term). Loosely, from fastest to slowest you have:

1. Compiled languages (meaning straight to machine code) with manual memory management

2. Compiled languages with garbage collection

3. Languages that run in VMs but are AOT compiled or JITed

4. Purely interpreted languages.

I acknowledge that not all languages fit nicely into these and there will be exceptions, but it's a convenient mental model that's close enough for these purposes.

Languages in the first category are going to be the most performant. Obviously there will be some variation between them based on how the code is written, but unless it's written really poorly it's not going to drop into an entirely different category. Whereas languages in other categories are going to find it far more difficult, if not impossible, to get close to the same kind of performance.

And there are no meaningfully huge jumps left after the first group. We are all the way down at optimizing assembly code, and that's where you start to hit physical limitations: some number of operations have to be executed, and the CPU can only execute them so fast.

reply
steveklabnik
2 days ago
[-]
I agree with you that there are broad, loose categories. The only thing I object to is splitting C out from category 1 and putting it into some kind of category 0.
reply
scuff3d
1 day ago
[-]
I'm not? I'm literally saying C, Rust, Zig, Odin, and any other manually memory-managed language that compiles straight to machine code is the fastest category you can have, because at that point you're bumping against literal hardware limitations. There is no category below them in terms of performance.

"None faster" means you can't just change languages like you could from Java to C (assuming you can write quality code in both) and see a substantial performance boost.

reply
steveklabnik
1 day ago
[-]
> Of course nothing is going to be faster than C, ... because there's no way to be faster. It's physically impossible.

This reads to me as if you're saying C is in a class of its own. That may not be what you meant! But it's what I understood. C is the fastest language, period, and others may approach its speed (which is the ... part) but cannot surpass it. This is different than something like "C, Rust, Zig, and Odin are roughly the fastest languages."

Anyway, it's all good, we understand each other now. Sorry for appearing overly pedantic.

reply
scuff3d
1 day ago
[-]
Fair. I was using C as short hand for that entire class of languages. I could have been clearer.
reply
tialaramex
2 days ago
[-]
"Because none are" is a particularly hollow claim because to support it you have to caveat things so heavily.

You have to say OK, I allow myself platform specific intrinsics and extensions even though those aren't standard ISO C, and that includes inline assembler. I can pick any compiler and tooling. And I won't count other languages which are transpiled to C for portability because hey in theory I could just write that C myself, couldn't I so they're not really faster.

At the end you're basically begging the question. "I claim C is fastest because I don't count anything else as faster" which is no longer a claim worth disputing.

The aliasing optimisations in Fortran and Rust stand out as obvious examples where getting the same perf in C requires global analysis (this is what Rust side-steps via language rules and the borrowck), which you can't afford in practice.

But equally the monomorphisation in C++ or Rust can be beneficial in a similar way, you could in principle do all this by hand in your C project but you won't, because time is finite, so you live without the optimisations.

reply
aw1621107
2 days ago
[-]
I think one additional factor that should be taken into account is the amount of effort required to achieve a given level of performance, as well as what extensions you're willing to accept. C with potentially non-portable constructs (intrinsics, inline assembly, etc.) and an unlimited amount of effort put into it provides a performance ceiling, but it's not inconceivable that other programming languages could achieve an equal level of performance with less effort, especially if you compare against plain standard C. Languages like ISPC that expose SIMD/parallelism in a more convenient manner are examples of this.

Another somewhat related example is Fortran and C, where one reason Fortran could perform better than C is the restrictions Fortran places on aliasing. In theory, one could use restrict in C to replicate these aliasing restrictions, but in practice restrict is used fairly sparingly, to the point that when Rust tried to enable its equivalent it had to back out the change multiple times because it kept exposing bugs in LLVM's optimizer.

reply
Deanoumean
2 days ago
[-]
The argument you propose only works for justifying a maintenance mode for an old codebase. If you want to take the chance to turn new developers away from complex abominations like C++ and Rust and garbage-collected sloths like Java, and get them to consider a comparatively simple but ubiquitous language like C, you have to offer more.
reply
dangus
2 days ago
[-]
Is SQLite looking for new developers? Will they ever need a large amount of developers like a mega-corp that needs to hire 100 React engineers?
reply
metaltyphoon
2 days ago
[-]
No, but as morbid as this sounds, the three(?) devs one day will pass away so now what?
reply
sgbeal
2 days ago
[-]
> No, but as morbid as this sounds, the three(?) devs...

Two full-time core devs and three part-time "peripheral" devs.

> ... one day will pass away ...

And not a one of us are young :/.

reply
dangus
2 days ago
[-]
Well the point is that it’s not hard to find 3 people who are C experts. Yes, even young ones.
reply
hoppp
2 days ago
[-]
Then the rights will be sold to a FAANG, or an open source fork like libSQL will live on.
reply
colejohnson66
2 days ago
[-]
SQLite is public domain (as much as is legally possible). So there's no "rights" to "sell" except the trademark.
reply
metaltyphoon
2 days ago
[-]
The testing suite is not open, and it is one of the most important parts of the project.
reply
skywhopper
2 days ago
[-]
I assume they have written this extensive document with lots of details in response to two and a half decades of thousands of “why not rewrite in X?” questions they’ve had to endure.
reply
cmrx64
2 days ago
[-]
(it’s from 2017)
reply
sema4hacker
2 days ago
[-]
"Why SQLite is coded in C..." is an explanation, as documented at sqlite.org.

"Why is SQLite coded in C and not Rust?" is a question, which immediately makes me want to ask "Why do you need SQLite coded in Rust?".

reply
lifthrasiir
2 days ago
[-]
Because the title has been editorialized.
reply
t14n
2 days ago
[-]
fwiw there's a project doing just that: https://github.com/tursodatabase/turso

they have a blog hinting at some answers as to "why": https://turso.tech/blog/introducing-limbo-a-complete-rewrite...

reply
urbandw311er
2 days ago
[-]
Indeed. Why is SQLite coded in C and not BASIC?
reply
wodenokoto
2 days ago
[-]
I think it’s more interesting that DuckDB is written in C++ and not Rust than it is for SQLite.

SQLite is old, huge and known for its gigantic test coverage. There’s just so much to rewrite.

DuckDB is from 2019, so new enough to have jumped on the “Rust is safe and fast” bandwagon.

reply
tomjakubowski
2 days ago
[-]
If I'm remembering a DuckDB talk I attended correctly, they chose C++ because they were most confident in their ability to write clear code in it which would be autovectorized by the compilers they were familiar with. Rust in 2019 didn't have a clear high level SIMD story yet and the developers (wisely) did not want to maintain handrolled SIMD code.
reply
xiphias2
11 hours ago
[-]
I'm not sure it has changed.

Maybe autovectorization works, but can I just write a few ARM64 instructions on my Mac in stable Rust (not experimental/nightly) the way I can in C/C++ by just including a few ARM-specific header files?

reply
jandrewrogers
2 days ago
[-]
If maximum performance is a top objective, it is probably because C++ produces faster binaries with less code. Modern C++ specifically also has a lot of nice compile-time safety features, especially for database-like code.
reply
wodenokoto
2 days ago
[-]
I can’t verify those claims one way or another, but I’m interested to hear why they were downvoted.
reply
jandrewrogers
2 days ago
[-]
I've worked on a couple different projects that did substantial parallel development in C++20 and Rust, which created interesting opportunities for concrete comparison. It was performance-engineered code and we needed to validate their equivalence by testing them against each other.

The practical differences are larger than the theoretical differences, so I would expect the gap to diminish over time.

Rust reminded me of when I used to write database engines in Java. It required a lot more code, which has its own costs, but never really delivered on claims of comparable performance. The "more code" part largely comes down to the more limited ability to build good abstractions compared to C++20 and more limited composability. The "slower binaries" part comes down to worse codegen, which you can't blame on Rust per se, and a lot of extra overhead introduced in the code to satisfy the Rust safety model that would simply not be required in other systems languages.

Safety is a mixed bag. Rust can check several things at compile-time that C++20 cannot. C++20 can check several things at compile-time that Rust cannot.

For high-performance database-y code, memory is allocated at startup and is accessed via managed index handles. Rust does the same thing. In these types of memory models, i.e. no dynamic allocation and no raw pointers, both Rust and C++20 offer similar memory safety guarantees. Most high-performance software is thread-per-core that is almost purely single-threaded, so thread-safety concerns are limited.

That said, stripping away all of the above, the only real advantage that C++20 has its much more powerful toolset for building abstractions. Its performance and unique safety elements are based almost entirely on the ability to build concise, contextual, and highly composable abstractions as needed. This is not a feature that should be downplayed, I immediately miss it when I use most other languages.

reply
infinite8s
1 day ago
[-]
As someone coming back to C++ after more than a decade away, do you have any recommended resources on C++20 or open source projects you've seen that utilize the language this way?
reply
tonyhart7
2 days ago
[-]
if they write it in modern C++ then it's alright tbh
reply
unsungNovelty
2 days ago
[-]
As I write more code, use more software and read about rewrites...

The biggest gripe I have with a rewrite is... a lot of the time we rewrite for feature parity, not the exact same thing. So you end up ignoring/missing/forgetting all those edge cases and patches that were added along the way for so many niche or otherwise reasons.

This means broken software. Something which used to work before but not anymore. They'll have to encounter all of them again in the wild and fix it again.

Obviously if we are to rewrite an important piece of software like this, you'd emphasise more on all of these. But it's hard for me to comprehend whether it will be 100%.

But other than SQLite, think SDL. If it is to be rewritten, it's really hard for me to believe the effect would be negligible. I'm guessing horrible releases before it gets better, users complaining about things that used to work.

C is going to be there long after the next Rust is where my money is. And even if Rust is still present, there would be a new Rust then.

So why rewrite? Rewrites shouldn't be the default thinking no?

reply
Jtsummers
2 days ago
[-]
Two previous, and substantial, discussions on this page:

https://news.ycombinator.com/item?id=28278859 - August 2021

https://news.ycombinator.com/item?id=16585120 - March 2018

reply
bravura
2 days ago
[-]
I'm curious about tptacek's comment (https://news.ycombinator.com/item?id=28279426). 'the "security" paragraphs in this page do the rest of the argument a disservice. The fact is, C is a demonstrable security liability for sqlite.'

The current doc no longer has any paragraphs about security, or even the word security once.

The 2021 edition of the doc contained this text which no longer appears: 'Safe languages are often touted for helping to prevent security vulnerabilities. True enough, but SQLite is not a particularly security-sensitive library. If an application is running untrusted and unverified SQL, then it already has much bigger security issues (SQL injection) that no "safe" language will fix.

It is true that applications sometimes import complete binary SQLite database files from untrusted sources, and such imports could present a possible attack vector. However, those code paths in SQLite are limited and are extremely well tested. And pre-validation routines are available to applications that want to read untrusted databases that can help detect possible attacks prior to use.'

https://web.archive.org/web/20210825025834/https%3A//www.sql...

reply
vincent-manis
2 days ago
[-]
The point about bounds checking in `safe' languages is well taken: the implicit check branches are what prevent 100% test coverage. As we all agree, SQLite has been exhaustively tested, and arguments for bounds checking in it are therefore weakened. Still, that's not an argument for replicating this practice elsewhere, not unless you are Dr Hipp and willing to work very hard at testing. C.A.R. Hoare's comment on eliminating runtime checks in release builds applies here: “What would we think of a sailing enthusiast who wears his life-jacket when training on dry land but takes it off as soon as he goes to sea?”

I am not Dr Hipp, and therefore I like run-time checks.

reply
daxfohl
2 days ago
[-]
It sounds like the core doesn't even allocate, and presumably the extended library allocates in limited places using safe patterns. So there wouldn't be much benefit from Rust anyway, I'd think. Has SQLite ever had a memory leak or use-after-free bug in a production release? If so, that answers the question. But I've never heard of one.

Also, does it use doubly linked lists or graphs at all? Those can, in a way, be safer in C since Rust makes you roll your own virtual pointer arena.

reply
thinkharderdev
2 days ago
[-]
> Also, does it use doubly linked lists or graphs at all? Those can, in a way, be safer in C since Rust makes you roll your own virtual pointer arena.

You can implement a linked list in Rust the same as you would in C using raw pointers and some unsafe code. In fact there is one in the standard library.

reply
steveklabnik
2 days ago
[-]
Rust’s memory safety guarantees aren’t exclusive to heap allocation. In fact, the language itself doesn’t heap allocate at all.

You can write a linked list the same way you would in C if you wish.

reply
dathinab
2 days ago
[-]
> Has SQLite ever had a memory leak or use-after-free bug in a production release?

sure, it's an old library; they've had pretty much everything (not because they don't know what they are doing but because shit happens)

lets check CVEs of the last few years:

- CVE-2025-29088 type confusion

- CVE-2025-29087 out of bounds write

- CVE-2025-7458 integer overflow, possible in optimized rust but test builds check for it

- CVE-2025-6965 memory corruption, rust might not have helped

- CVE-2025-3277 integer overflow, rust might have helped

- CVE-2024-0232 use after free

- CVE-2023-36191 segmentation violation, unclear if rust would have helped

- CVE-2023-7104 buffer overflow

- CVE-2022-46908 validation logic error

- CVE-2022-35737 array bounds overflow

- CVE-2021-45346 memory leak

...

as you can see the majority of CVEs of sqlite are much less likely in rust (but a rust sqlite impl. likely would use unsafe, so not impossible)

as a side note, there being so many CVEs in 2025 seems to be related to some companies (e.g. Google) having done quite a bit of fuzz testing of SQLite

other takeaways:

- 100% branch coverage is nice, but doesn't guarantee memory soundness in C

- given how deeply people look for CVEs in SQLite the number of CVEs found is not at all as bad as it might look

but also one final question:

SQLite has some of the best C programmers out there; only they merge anything into the code, and it has a very limited rate of change compared to a typical company project. And we still have memory vulnerabilities. How is anyone still arguing for C for new projects?

reply
daxfohl
2 days ago
[-]
Wow that's a great analysis!

Yeah I essentially agree. I'm sure there are still plenty of good cases for C, depending on project size, experience of the engineers, integration with existing libraries, target platform, etc. But it definitely seems like Rust would be the better option in scenarios where there's not some a priori thing that strongly skews toward or forces C.

reply
oguz-ismail
2 days ago
[-]
> How is anyone still arguing for C for new projects?

It just works

reply
saghm
16 minutes ago
[-]
If your definition of "works" includes out of bounds memory access, use after free, etc., then yes. If your definition does not include those, then it demonstrably does not.

Alternately, maybe there's a spectrum of undesirable behaviors, some of which are preventable by choice of language, some of which aren't, and trying to reduce a complex set of tradeoffs to a simple binary of whether it "just works" only restates the conclusion someone has already come to because you need to actually reason about those tradeoffs to come to an informed decision of where to implicitly draw the line in the first place.

reply
krior
2 days ago
[-]
That list alone sounds like it does not work.
reply
uecker
2 days ago
[-]
As long as it is possible to produce a OOB in something as simple as a matrix transpose, Rust also does not work: https://rustsec.org/advisories/RUSTSEC-2023-0080.html.
reply
saghm
12 minutes ago
[-]
And something as simple as a for loop to iterate over an array of elements with an off-by-one error can cause undefined behavior in C. Let's not pretend that there's some universally-agreed-upon hierarchy of what types of bugs are unconscionable and which ones are unfortunate unavoidable facts of life just because certain ones existed in the older language and others didn't.
reply
dwattttt
2 days ago
[-]
While a package with 10 million all-time downloads is nothing to sneeze at, it's had one memory corruption bug reported in its ~7 year life.

It's being compared to a C library that's held to extremely high standards, yet this year had two integer overflow CVEs and two other memory corruption CVEs.

SQLite is a lot more code, but it's also been around a lot longer.

reply
uecker
2 days ago
[-]
The point is that matrix transpose should be trivial. But my main point really is that looking at CVEs is just nonsense. In both cases it is rather meaningless.
reply
dathinab
2 days ago
[-]
except that if you read into the actual issue you will realize that transposing matrices in a high-performance way is surprisingly non-trivial, e.g. see this code: https://github.com/ejmahler/transpose/blob/e70dd159f1881d86a...

furthermore, the issue at its core was an integer overflow, which is tricky in all languages and e.g. popped up on HN recently in the context of "proven correct" code still having bugs (because the proof didn't use finite-precision integers)

it's also less tricky in Rust than in C due to no implicit casts and debug builds checking for integer overflows, with tests normally running against debug builds

Projects do sometimes enable it even on release builds for security sensitive code(1).

so if anything the linked issue is in favor of using rust over C while acting as a reminder that no solution is perfect

(1): It comes at a high performance cost, but sometimes for some things it's an acceptable cost. Also, you can change such settings per crate. E.g. at a company I worked at a few years ago, we built some sensitive and iffy but not hot parts always with such checks enabled, and some super hot ML parts always with optimizations enabled even for "debug/test" builds.

reply
slashdev
2 days ago
[-]
This is ignoring the elephant in the room: SQLite is being rewritten in Rust and it's going quite well. https://github.com/tursodatabase/turso

It has async I/O support on Linux with io_uring, vector support, BEGIN CONCURRENT for improved write throughput using multi-version concurrency control (MVCC), Encryption at rest, incremental computation using DBSP for incremental view maintenance and query subscriptions.

Time will tell, but this may well be the future of SQLite.

reply
3eb7988a1663
2 days ago
[-]
It should be noted that project has no affiliation with the SQLite project. They just use the name for promotional/aspirational purposes. Which feels incredibly icky.

Also, this is a VC backed project. Everyone has to eat, but I suspect that Turso will not go out of its way to offer a Public Domain offering or 50 year support in the way that SQLite has.

reply
slashdev
2 days ago
[-]
> They just use the name for promotional/aspirational purposes. Which feels incredibly icky.

The aim is to be compatible with sqlite, and a drop-in replacement for it, so I think it's fair use.

> Also, this is a VC backed project. Everyone has to eat, but I suspect that Turso will not go out of its way to offer a Public Domain offering or 50 year support in the way that SQLite has.

It's MIT license open-source. And unlike sqlite, encourages outside contribution. For this reason, I think it can "win".

reply
frumplestlatz
2 days ago
[-]
Calling it “SQLite-compatible” would be one thing. That’s not what they do. They describe it as “the evolution of SQLite”.

It’s absolutely inappropriate and appropriative.

They’ve been poor community members from the start when they publicized their one-sided spat with SQLite over their contribution policy.

The reality is that they are a VC-funded company focused on the “edge database” hypetrain that’s already dying out as it becomes clear that CAP theorem isn’t something you can just pretend doesn’t exist.

It’ll very likely be dead in a few years, but even if it’s not, a VC-funded project isn’t a replacement for SQLite. It would take incredibly unique advantages to shift literally the entire world away from SQLite.

It’s a new thing, not the next evolution of SQLite.

reply
penberg
12 hours ago
[-]
The actual reality is that I personally started the project because SQLite's synchronous architecture is holding back performance. You can read all about it in https://penberg.org/papers/penberg-edgesys24.pdf. The design is literally the next evolution of SQLite's architecture.
reply
slashdev
1 day ago
[-]
The founders came from ScyllaDb, I wouldn't be so quick to count them out. The repo has a lot of contributors and traction. As long as the company survives, I think it has a bright future.
reply
penberg
12 hours ago
[-]
The project is MIT licensed with a growing community of contributors. It does not even matter how long the company lives, all that matters is that some of the core contributors live.
reply
frumplestlatz
1 day ago
[-]
So they have a history of using the legitimacy, trust, and infrastructure of the open-source ecosystem to grow adoption and contributions, then gradually shifting constraints in favor of monetization and control?

Either way, the math is different this time. SQLite isn’t heavy server-side software written in Java with weaknesses that leave it obviously open for market disruption.

It’s also a public domain gift to the world that literally everything has deployed — often in extremely demanding and complex environments.

I work for a major consumer product manufacturer, and I can guarantee that we will not be switching away from SQLite anytime soon, and if we ever do, it will not be to a VC-backed project with a history like this one has, no matter how much hype startup bros try to create around the idea of disrespectful and appropriative disruption.

VC-funded ‘open’ databases almost always follow the same arc: borrow legitimacy, capture attention, then fence it off. It’s the inevitability of the incentives they’ve chosen.

reply
penberg
12 hours ago
[-]
You're not wrong about the VC-funded database arc, but what history are you even talking about?

I am sure you understand that I have absolutely nothing to do with Scylla's licensing. I have not worked there for four years, nor was I ever in a position there where I would even have had the opportunity to influence such decisions.

I am also sure you understand that Scylla's development model was completely different: they had AGPL license and contributors had to sign a CLA, which is why they were able to relicense in the first place. Turso is MIT licensed and there's no barrier to contributing and, therefore, already a much bigger contributor base.

I fully understand the scepticism, but you're mistaken about the open source history of Turso's founders.

reply
blibble
2 days ago
[-]
> The aim is to be compatible with sqlite, and a drop-in replacement for it, so I think it's fair use.

try marketing your burger company as "The Next Evolution of McDonalds" and see what happens

reply
saghm
4 minutes ago
[-]
This might be a compelling argument if McDonalds were the name of a public domain project rather than a trademarked corporation
reply
assimpleaspossi
2 days ago
[-]
>>SQLite is being rewritten in Rust

SQLite is NOT being rewritten in Rust!

>>Turso Database is an in-process SQL database written in Rust, compatible with SQLite.

reply
slashdev
2 days ago
[-]
It's a ground-up rewrite. It's not an official rewrite, if that's what you mean. Words are hard.
reply
stoltzmann
2 days ago
[-]
So a reimplementation, not a rewrite.
reply
saghm
2 minutes ago
[-]
How does one implement software other than writing it?
reply
slashdev
1 day ago
[-]
That's a better word, thank you
reply
blibble
2 days ago
[-]
> Time will tell, but this may well be the future of SQLite.

turdso is VC funded so will probably be defunct in 2 years

reply
daxfohl
2 days ago
[-]
Or, so it's being written mostly by AI.
reply
slashdev
2 days ago
[-]
Could also be an outcome. It is MIT open-source though.
reply
lionkor
2 days ago
[-]
So they have much worse test coverage than sqlite
reply
zvmaz
2 days ago
[-]
In the link you provided, this is what I read: "An in-process SQL database, compatible with SQLite."

Compatible with SQLite. So it's another database?

reply
simonw
2 days ago
[-]
Yeah, I don't think it even counts as a fork - it's a ground-up re-implementation which is already adding features that go beyond the original.
reply
ForHackernews
2 days ago
[-]
It's a fork and a rewrite.
reply
tonyhart7
2 days ago
[-]
so it's sqlite++, since they added a bunch of things on top of that
reply
metaltyphoon
2 days ago
[-]
The moment turso becomes stable, SQLite will inevitably fade away with time if they don’t rethink how contributions should be taken. I honestly believe the Linux philosophy of software development will be what catapults turso forward.
reply
matt3210
2 days ago
[-]
I can compile C anywhere and for any processor, which can’t be said for Rust
reply
mikece
2 days ago
[-]
The fact that a C library can easily be wrapped by just about any language is really useful. We're considering writing a library for generating a UUID (that contains a key and value) for reasons that make sense to us, and I proposed writing it in C so we could simply wrap it as a library for all of the languages we use internally rather than having to re-implement it several times. Not sure if we'll actually build this library, but if we do it will be in C (I did manage to get the "wrap it for each language" proposal pre-approved).
reply
01HNNWZ0MV43FF
2 days ago
[-]
It is. You can also write it in C++ or Rust and expose a C API+ABI, and then you're distributing a binary library that the OS sees as very similar to a C library.

Occasionally when working in Lua I'd write something low-level in C++, wrap it in C, and then call the C wrapper from Lua. It's extra boilerplate but damn is it nice to have a REPL for your C++ code.

Edit: Because someone else will say it - Rust binary artifacts _are_ kinda big by default. You can compile libstd from scratch on nightly (it's a couple flags) or you can amortize the cost by packing more functions into the same binary, but it is gonna have more fixed overhead than C or C++.

reply
bsder
2 days ago
[-]
> It is. You can also write it in C++ or Rust and expose a C API+ABI, and then you're distributing a binary library that the OS sees as very similar to a C library.

If I want a "C Library", I want a "C Library" and not some weird abomination that has been surgically grafted to libstdc++ or similar (but be careful of which version as they're not compatible and the name mangling changes and ...).

This isn't theoretical. It's such a pain that the C++ folks started resorting to header-only libraries just to sidestep the nightmare.

reply
uecker
2 days ago
[-]
Rust libraries also impose an - in my opinion - unacceptable burden to the open source ecosystem: https://www.debian.org/releases/trixie/release-notes/issues....

This makes me less safe rather than more. Note that there is a substantial double standard here: we could never, in the name of safety, impose this level of burden from the C tooling side, because maintainers would rightfully be very upset (even toggling a warning in the default set causes discussions). For the same reason it should be unacceptable to use Rust before this is fixed, but somehow the memory safety absolutists convinced many people that this is more important than everything else. (I also think memory safety is important, but I can't help thinking that pushing for Rust does me more harm than good.)

reply
pjmlp
2 days ago
[-]
As someone that also cares about C++, header-only libraries are an abomination from folks that think C and C++ are scripting languages.
reply
mellinoe
2 days ago
[-]
You can expose a C interface from many languages (C++, Rust, C# to name a few that I've personally used). Instead of introducing a new language entirely, it's probably better to write the library in one of the languages you already use.
reply
psyclobe
2 days ago
[-]
SQLite is a true landmark. C notwithstanding, it just happened to be the right tool at the right time, and by now anything else is, well, not as interesting as what they have going on; it totally bucks the trend of throwaway software.
reply
kazinator
2 days ago
[-]
> The C language is old and boring. It is a well-known and well-understood language.

So you might think, but there is a committee actively undermining this, not to mention compiler people keeping things exciting also.

There is a dogged adherence to backward compatibility, so you can pretend C has not gone anywhere in thirty-five years if you like, provided you aren't invoking too much undefined behavior. (You can't as easily pretend that your compiler has not gone anywhere in 35 years with regard to things you are doing out of spec.)

reply
steeleduncan
2 days ago
[-]
> SQLite could be recoded in Go

SQLite was recoded (automatically) into Go a while ago [1], and it is widely deployed

> would probably introduce far more bugs than would be fixed

It runs against the same test suite with no issues

> and it may also result in slower code

It is quite a lot slower, but it is still widely used as it turns out that the convenience of a native port outweighs the performance penalty in most cases.

I don't think SQLite should be rewritten in Go, Rust, Zig, Nim, Swift ... but ANSI C is a subset of the feature set of most modern programming languages. Projects such as this could be written and maintained in C indefinitely, and be automatically translated to other languages for the convenience of users in those languages

[1] https://pkg.go.dev/modernc.org/sqlite

reply
sgbeal
2 days ago
[-]
> It runs against the same test suite with no issues

It runs against the same public test suite. The proprietary test suite is much more intensive.

reply
sim7c00
2 days ago
[-]
> would probably introduce far more bugs than would be fixed

> It runs against the same test suite with no issues

- that proves nothing about bugs existing or not.

reply
ChrisRR
2 days ago
[-]
> It runs against the same test suite with no issues

That doesn't guarantee no bugs. It just means that the existing behaviour covered by the tests is still the same. It may introduce new issues in untested edge cases or performance issues

reply
pizlonator
2 days ago
[-]
SQLite works great in Fil-C with minimal changes.

So, the argument for keeping SQLite written in C is that it gives the user the choice to either:

- Build SQLite with Yolo-C, in which case you get excellent performance and lots of tooling. And it's boring in the way that SQLite devs like. But it's not "safe" in the sense of memory safe languages.

- Build SQLite with Fil-C, in which case you get worse (but still quite good) performance and memory safety that exceeds what you'd get with a Rust/Go/Java/whatever rewrite.

Recompiling with Fil-C is safer than a rewrite into other memory safe languages because Fil-C is safe through all dependencies, including the syscall layer. Like, making a syscall in Rust means writing some unsafe code where you could screw up buffer sizes or whatnot, while making a syscall in Fil-C means going through the Fil-C runtime.

reply
pm2222
2 days ago
[-]
These points strike me:

  Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy.

  Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.

  Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.
reply
steveklabnik
2 days ago
[-]
If the branch is never taken, and the optimizer can prove it, it will remove the check. Sometimes if it can’t actually prove it there’s ways to help it understand, or, in the almost extreme case, you do what I commented below.
reply
sedatk
2 days ago
[-]
Yeah I don't understand the argument. If you can't convince the compiler that that branch will never be taken, then I strongly suspect that it may be taken.
reply
compiler-guy
2 days ago
[-]
A program can have many properties that the compiler cannot prove statically. To take a very basic case, the halting problem.
reply
unclad5968
2 days ago
[-]
That's not the point. The point is that if it is never taken, you can't test it. They don't care that it inserts a conditional OP to check, they care that they can't test the conditional path.
reply
sedatk
2 days ago
[-]
But, there is no conditional path when the type system can assure the compiler that there is nothing to be conditional about. Do they mean that it's impossible to be 100% sure about if there's a conditional path or not?
reply
rstuart4133
2 days ago
[-]
> Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy.

This is annoying in Rust. To me array accesses aren't the most annoying, it's match{} branches that will never be invoked.

There is unreachable!() for such situations, and you would hope that:

    if array_access_out_of_bounds { unreachable!(); }
is recognised by the Rust tooling and just ignored. That's effectively the same as what SQLite is doing now by not doing the check. But it isn't ignored by the tooling: unreachable!() is reported as a missed line. Then test code coverage includes the standard library by default, and you have to use regexes on path names to remove it.
reply
steveklabnik
2 days ago
[-]
A more direct translation of the sqlite strategy here is to use get_unchecked instead of [], and then you get the same behaviors.

Your example does what [] does already, it’s just a more verbose way of writing the same thing. It’s not the same behavior as sqlite.

reply
pella
2 days ago
[-]
Turso:

https://algora.io/challenges/turso "Turso is rewriting SQLite in Rust ; Find a bug to win $1,000"

------

- Dec 10, 2024 : "Introducing Limbo: A complete rewrite of SQLite in Rust"

https://turso.tech/blog/introducing-limbo-a-complete-rewrite...

- Jan 21, 2025 - "We will rewrite SQLite. And we are going all-in"

https://turso.tech/blog/we-will-rewrite-sqlite-and-we-are-go...

- Project: https://github.com/tursodatabase/turso

Status: "Turso Database is currently under heavy development and is not ready for production use."

reply
a-dub
2 days ago
[-]
sqlite3 has one (apparently this is called "the amalgamation") c source file that is ~265 kloc (!) long with external dependencies on zlib, readline and ncurses. built binaries are libsqlite3.so at 4.8M and sqlite3 at 6.1M.

turso has 341 rust source files spread across tens of directories and 514 (!) external dependencies that produce (in release mode) 16 libraries and 7 binaries with tursodb at 48M and libturso_sqlite3.so at 36M.

looks roughly an order of magnitude larger to me. it would be interesting to understand the memory usage characteristics in real-world workloads. these numbers also sort of capture the character of the languages. for extreme portability and memory efficiency, probably hard to beat c and autotools though.

reply
csande17
2 days ago
[-]
I don't think the SQLite authors actually edit the single giant source file directly. Their source control repository has the code split up into many separate files, which are combined into "the amalgamation" by a build script: https://github.com/sqlite/sqlite/tree/master/src
reply
a-dub
2 days ago
[-]
yeah i saw that afterwards. they do it to squeeze more optimization out of the compiler by putting everything in one compilation unit. given the prominence of the library i have to wonder if this was an input to zig's behind-the-scenes single compilation unit design choice...
reply
01HNNWZ0MV43FF
2 days ago
[-]
But if you don't have the bounds checks in machine code, then you don't have bounds checks.

I suppose SQLite might use a C linter tool that can prove the bounds checks happen at a higher layer, and then elide redundant ones in lower layers, but... C compilers won't do that by default, they'll just write memory-unsafe machine code. Right?

reply
belter
2 days ago
[-]
Some of the most interesting comments are out of: "3. Why Isn't SQLite Coded In A "Safe" Language?"

"....Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy..."

"...Safe languages usually want to abort if they encounter an out-of-memory (OOM) situation. SQLite is designed to recover gracefully from an OOM. It is unclear how this could be accomplished in the current crop of safe languages..."

reply
dgfitz
2 days ago
[-]
> Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.

C99 or C++11 on one hand, and “oh, you need the nightly build of Rust” on the other, were juxtaposed in such a way that I never felt comfortable banging out “yum install rust” and giving it a go.

reply
steveklabnik
2 days ago
[-]
Other than some operating systems projects, I haven’t run into a “requires nightly” in the wild for years. Most users use the stable releases.

(There are some decent reasons to use the nightly toolchain in development even if you don’t rely on any unfinished features in your codebase, but that means they build on stable anyway just fine if you prefer.)

reply
dgfitz
2 days ago
[-]
Good to know, maybe I’ll give it a whirl. I’d been under the (mistaken, apparently) impression that if one didn’t update monthly they were going to have a bad time.
reply
steveklabnik
2 days ago
[-]
You may be running into forwards compatibility issues, not backwards compatibility issues, which is what nightly is about.

The Rust Project releases a new stable compiler every six weeks. Because it is backwards compatible, most people update fairly quickly, as it is virtually always painless. So this may mean, if you don’t update your compiler, you may try out a new package version and it may use features or standard library calls that don’t exist in the version you’re using, because the authors updated regularly. There’s been some developments in Cargo to try and mitigate some of this, but since it’s not what the majority of users do, it’s taken a while and those features landed relatively recently, so they’re not widely adopted yet.

Nightly features are ones that aren’t properly accepted into the language yet, and so are allowed to break in backwards incompatible ways at any time.

reply
uecker
2 days ago
[-]
But the original point "C99 vs something later" is also about forward compatibility issues.
reply
steveklabnik
2 days ago
[-]
Sure, I had originally responded to the "needs nightly Rust part" only.
reply
1vuio0pswjnm7
2 days ago
[-]
Why doesn't ON CONFLICT(column_name) accept multiple arguments, i.e., multiple columns?

One stupid workaround is combining multiple columns into one, with values separated by a space, for example. This works when each column value is always a string containing no spaces

Another stupid workaround, probably slower, might be to hash the multiple columns into a new column and use ON CONFLICT(newcolumn_name)

reply
Havoc
2 days ago
[-]
For a project that is functionally “done”, switching doesn’t make sense. For something like kernel code, where you know it’ll continue to evolve, going through the pain may be worth it
reply
firesteelrain
2 days ago
[-]
One thing I found especially interesting is the section at the end about why Rust isn’t used. It leaves open the door and at least is constructive feedback to the Rust community
reply
jokoon
2 days ago
[-]
I wonder if the hype helps Rust become a better language

At this point I wish the creators of the language could talk about what rust is bad at.

reply
steveklabnik
2 days ago
[-]
Folks involved often do! Talking about what’s not great is the only path towards getting better, because you have to identify pain points in order to fix them.
reply
estebank
2 days ago
[-]
I would go as far as saying that 90% of managing the project is properly communicating, discussing and addressing the ways in which Rust sucks. The all-hands in NL earlier this year was wall to wall meetings about how much things suck and what to do about them! I mean this in the best possible way. ^_^
reply
6r17
2 days ago
[-]
If I remember correctly, most of SQLite's "closed-source" leverage comes from the test suite, which probably cannot be transposed to another language as easily. Ultimately there are already other solutions coming up, rewriting it in Rust or Go.
reply
deanebarker
2 days ago
[-]
It's hard to argue with success. SQLite's pervasiveness is kind of a royal flush.
reply
dusted
2 days ago
[-]
In my opinion, you don't get to ask "why is X done by Y" before you've done X yourself by something not Y and not Y by proxy either.
reply
next_xibalba
2 days ago
[-]
Aren't SQLite’s bottlenecks primarily I/O-bound (not CPU)? If so, fopen, fread, or syscalls matter most to performance, and pure language efficiency wouldn't be the limiter.
reply
morshu9001
2 days ago
[-]
This is what I expected. Rust is the first thing that has been worth considering as a C replacement. C++ wasn't.
reply
system2
2 days ago
[-]
What's up with SQLite news lately? I feel like I see at least 1-2 posts about it per day.
reply
MomsAVoxell
2 days ago
[-]
Back in the good ol'/bad ol' days of the very early web/Internet, I had the fortune of working with someone who, let's say, has kind of a background in certain operating systems circles.

Not only had this fellow built a functional ISP in one of the toughest markets (at that time), in the world - but he'd also managed to build the database engine and quite a few of the other tools that ran that ISP, and was in danger of setting a few standards for a few things which, since then, have long since settled out, but .. nevertheless .. it could've been.

Anyway, this fellow wrote everything in C. His web page, his TODO.h for the day .. he had C-based tools for managing his docs, for doing syncs between various systems under his command (often in very far-away locations, and even under water a couple times) .. everything, in C.

The database system he wrote in pure C was, at the time, quite a delight. It gave a few folks further up the road a bit of a tight neck.

He went on to do an OS, because of course he did.

Just sayin', SQLite devs aren't the only ones who got this right. ;)

reply
a-saleh
2 days ago
[-]
Ok, I didn't expect such a high praise for rust. I am not joking.
reply
coolThingsFirst
2 days ago
[-]
I don't want to sound cynical, but a lot of it has to do with the simplicity of the language. It's much harder to find a good Rust engineer than a C one. When all you have is pointers and structs, it's much easier to meet the requirements for the role.
reply
plainOldText
2 days ago
[-]
I’d be curious to know what the creators of SQLite would have to say about Zig.

Zig gives the programmer more control than Rust. I think this is one of the reasons why TigerBeetle is written in Zig.

reply
metaltyphoon
2 days ago
[-]
> Zig gives the programmer more control than Rust

More control over what exactly? Allocations? There is nothing Zig can do that Rust can’t.

reply
array_key_first
2 days ago
[-]
> More control over what exactly? Allocations? There is nothing Zig can do that Rust can’t.

I mean yeah, allocations. Allocations are always explicit. Which is not true in C++ or Rust.

Personally I don't think it's that big of a deal, but it's a thing and maybe some people care enough.

reply
aw1621107
2 days ago
[-]
> Which is not true in [] Rust.

...If you're using the alloc/std crates (which to be fair, is probably the vast majority of Rust devs). libcore and the Rust language itself do not allocate at all, so if you use appropriate crates and/or build on top of libcore yourself you too can have an explicit-allocation Rust (though perhaps not as ergonomic as Zig makes it).

reply
Cloudef
2 days ago
[-]
I think zig generally composes better than rust. With rust you pretty much have to start over if you want reusable / composable code, that is, not use the default std. Rust has small crates for every little thing because it doesn't compose well, as well as to improve compile times. libc in the default std is also a major L.
reply
metaltyphoon
2 days ago
[-]
> I think zig generally composes better than rust.

I read your response 3 times and I truly don't know what you mean. Mind explaining with a simple example?

reply
Cloudef
2 days ago
[-]
It mainly comes down how the std is designed. Zig has many good building blocks like allocators, and how every function that allocates something takes one. This allows you to reuse the same code for different kind of situations.

Hash maps in zig std are another great example, where you can use an adapter to completely change how the data is stored and accessed while keeping the same API [1]. For example, to have a map with a limited memory bound that automatically truncates itself, in rust you need to either write a completely new data structure for this or rely on someone's crate again (indexmap).

Errors in zig compose also better, in rust I find error handling really annoying. Anyhow makes it better for application development but you shouldn't use it if writing libraries.

When writing zig I always feel like I can reuse pieces of existing code by combining the building blocks at hand (including freestanding targets!). While in rust I always feel like you need to go for the fully tailored solution with its own gotchas, which is ironic considering how many crates there are and how many crates projects depend on vs. typical zig projects that often don't depend on lots of stuff.

1: https://zig.news/andrewrk/how-to-use-hash-map-contexts-to-sa...

reply
ginko
2 days ago
[-]
I'm generally a fan of Zig but it's in no way stable enough to write something like sqlite in it.
reply
Jtsummers
2 days ago
[-]
> Nearly all systems have the ability to call libraries written in C. This is not true of other implementation languages.

From section "1.2 Compatibility". How easy is it to embed a library written in Zig in, say, a small embedded system where you may not be using Zig for the rest of the work?

Also, since you're the submitter, why did you change the title? It's just "Why is SQLite Coded in C", you added the "and not Rust" part.

reply
plainOldText
2 days ago
[-]
The article devotes the last section to explaining why Rust is not a good fit (yet), so I wanted the title to cover that part of the conversation, since I believe it is meaningful. It illustrates the tradeoffs in software engineering.
reply
Jtsummers
2 days ago
[-]
> Otherwise please use the original title, unless it is misleading or linkbait; don't editorialize.

From the site guidelines: https://news.ycombinator.com/newsguidelines.html

reply
zenxyzzy
1 day ago
[-]
I'm really getting tired of resume-driven development. Choosing technology X because the coder wants it on their resume is a truly shitty reason. Rust is just the latest trendy bullshit that will make meh programmers into superstars.
reply
ternaryoperator
2 days ago
[-]
> Recoding SQLite in Go is unlikely since Go hates assert()

Any idea what this refers to? assert is a macro in C. Is the implication that OP wants the capability of testing conditions and then turning off the tests in a production release? If so, then I think the argument is more that go hates the idea of a preprocessor. Or have I misunderstood the point being made?

reply
steveklabnik
2 days ago
[-]
reply
ternaryoperator
2 days ago
[-]
Steve, thanks for taking the time to point me to this on-point passage.
reply
netrap
2 days ago
[-]
Why not just say "because I don't want to change it"? rustaceans will always say your argument isn't valid for some reason or another. IDGAF, it's written in C because it is -- that's it!
reply
tonyhart7
2 days ago
[-]
because Rust wasn't out back then????
reply
binary132
2 days ago
[-]
I love him so much.
reply
BiraIgnacio
2 days ago
[-]
"1. C Is Best"
reply
rednafi
2 days ago
[-]
Also, Rust needs a better stdlib. A crate for every little thing is kinda nuts.

One reason I enjoy Go is its pragmatic stdlib. In most cases, I can get away without pulling in any 3p deps.

Now of course Go doesn’t work where you can’t tolerate GC pauses and need some sort of FFI. But because of the stdlib and faster compilation, Go somehow feels lighter than Rust.

reply
firesteelrain
2 days ago
[-]
Rust doesn’t really need a better stdlib as much as a broader one, since it is intentionally narrow. Go’s stdlib includes opinions like net/http and templates that Rust leaves to crates. The trade-off is Rust favors stability and portability at the core, while Go favors out-of-the-box ergonomics. Both approaches work, just for different teams.
reply
afdbcreid
2 days ago
[-]
Is Rust's stdlib worse than C's? It's not an argument here.
reply
tonyhart7
2 days ago
[-]
me when I don't know ball:
reply