(code at https://github.com/accretional/collector - forgive the documentation. I'm working on a container-based agent project and also trialling using agents heavily to write the individual features. It's working pretty well but the agents have been very zealous at documenting things lol).
This is my first real project using SQLite and we've hit some similarly cool benchmarks:
* 5-15ms of downtime to back up a live SQLite DB with a realistic amount of data for a CRUD database
* Capable of properly queueing hundreds of read/write operations while temporarily unavailable during a backup
* e2e latency of basically 1ms for CRUD operations, including proto SerDe
* WAL lets us do continuous, streaming, chunked backups!
Previously I'd only worked with Postgres and Spanner. I absolutely love sqlite so far - would still use Spanner for some tasks with an infinite budget but once we get Collector to implement partitions I don't think I would ever use Postgres again.
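In case it's useful to anyone, a WAL-mode online backup can be driven entirely from the standard backup API; here is a minimal sketch using Python's built-in sqlite3 module (paths and chunk sizes are made up, and this is not Collector's actual implementation):

    import sqlite3

    # Hypothetical paths; the pages/sleep values are illustrative.
    src = sqlite3.connect("live.db")
    dst = sqlite3.connect("backup.db")

    # Copy the database in 256-page chunks, pausing 1 ms between chunks.
    # In WAL mode these reads don't block writers; note that a write from
    # another connection mid-backup restarts the copy, so chunk size is a
    # trade-off between backup progress and write activity.
    src.backup(dst, pages=256, sleep=0.001)

    dst.close()
    src.close()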
Did you consider using a filesystem with atomic snapshots? For example, SQLite with WAL on Btrfs. As far as I can tell, this should have decent mechanical sympathy.
edit: I didn't really explain myself. This is for zero-downtime backups. Snapshot, back up at your own pace, delete the snapshot.
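For anyone who wants to see that concretely, here is a rough sketch (assuming the database directory is a Btrfs subvolume; paths are hypothetical):

    import shutil
    import subprocess

    # Assumes /data/db is a Btrfs subvolume holding app.db plus its -wal/-shm files.
    DB_SUBVOL = "/data/db"
    SNAPSHOT = "/data/db-snap"

    # 1. Take an atomic, read-only snapshot (near-instant, no downtime).
    subprocess.run(["btrfs", "subvolume", "snapshot", "-r", DB_SUBVOL, SNAPSHOT], check=True)
    try:
        # 2. Back up from the snapshot at your own pace.
        shutil.copytree(SNAPSHOT, "/backups/db-copy")
    finally:
        # 3. Delete the snapshot.
        subprocess.run(["btrfs", "subvolume", "delete", SNAPSHOT], check=True)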
Also, the last time I checked, the Linux scheduling quantum was about 10ms, so it's not clear that backups are even going to be the maximum-duration downtime while the system is healthy.
[1] Various HN posts regarding Hetzner vs AWS in terms of costs and perf.
3x EX44 running Patroni + PostgreSQL would give you 64GB of working memory and at least 512 GB of NVMe for the dataset (configurable with more for a one-time fee), as HA plus one maintenance node. Practically speaking, that would have carried the first 5-10 years of production at the company I work at with ease, for 120 euros of hardware cost per month plus a decent sysadmin.
I also know quite a few companies that toss 3-4x 20k-30k at Dell every few years to get an on-prem database cluster so that database performance ceases to be a problem (unless the application has bad queries).
I want to like Hetzner, but the bureaucratic paperwork of interacting with them, and continuing to interact with them, is just... awful.
Not that the other clouds don't also have their own insane bureaucracies so I guess it's a wash.
I'm just saying, I want a provider that leaves me alone and lets me just throw money at them to do so.
Otherwise, I think I'd rather simply deploy my own oversized server in a colo, even with the insanely inflated hardware prices at the moment.
edit: And shortly after writing this comment I see: "Microsoft won't let me pay a $24 bill, blocking thousands in Azure spending" https://news.ycombinator.com/item?id=46124930
But basically after the initial paperwork I had some issues with my account getting flagged even though I wasn't using it 99.999% of the time. It's not a huge deal for me because I wasn't trying them out for anything serious. I just questioned how often that might happen if I was actually using it seriously and what kind of headaches it could cause me while re-verifying everything with them.
From people I know: if everything is going well, their service is great. Server performance is good, pricing is good, etc.
That’s been my experience with Hetzner.
A lot of people get butthurt that a business dares to verify who they're dealing with so as to filter out the worst of the worst (budget providers always attract those), but as long as you don't mind the reasonable requirement to verify your ID/passport, they're hands-off beyond that.
I guess my concern about the bureaucracy is that if you are unlucky enough to get flagged as a false positive, it can be an annoying experience. And I can't really blame them too hard for having to operate that way in an environment of bad actors.
You're definitely right that the budget providers do attract the types of people trying to do bad things/exploit them in some way.
> This particular upper bound is untested since the developers do not have access to hardware capable of reaching this limit.
> However, tests do verify that SQLite behaves correctly and sanely when a database reaches the maximum file size of the underlying filesystem (which is usually much less than the maximum theoretical database size) and when a database is unable to grow due to disk space exhaustion.
No one has solved this problem. Scale out is typically more elastic, at least for reads.
But it only works if workloads remain fixed. If workloads grow at similar rates you’re back to the same problem.
Does my data fit in RAM? https://yourdatafitsinram.net/
Not sure using EC2/AWS/Amazon is a good example here; if you're squeezing for large single-node performance you most certainly go for dedicated servers, or at least avoid vCPUs like the plague.
So yeah, easily.
No, it does not. This article first says that normally you would run the application and the database on separate servers, and then starts measuring the performance of a locally embedded database. If you have to keep that initial requirement for your software, then SQLite is completely out of the equation. If you can change the requirement, then you can achieve similar performance by tuning the local PGSQL instance -- and then it also becomes an evaluation of features and not just raw throughput. I'm not saying SQLite is not an option either, but this article seems confusing in that it compares two different problems/solutions.
As an industry, we seem to have settled on patterns that actually are quite inefficient. There's no problem that requires the solution of doing things inefficiently just because someone said databases should run on a different host.
As an industry, we've generally decided against "one big box", for reasons that aren't necessarily performance related.
No it isn't? You can run a thin SQLite-wrapping process on another server just fine. Ultimately, all any DB service is, PostgreSQL included, is a request handler plus a storage handler. SQLite is just the storage handler, but you can easily put it behind a request handler too.
Putting access to SQLite behind a serial request queue used to be the standard way of implementing multi-threaded writes. That's only spitting distance away from also putting it behind TCP.
I'm not saying that this is a good idea, and it could fail in a spectacular manner, but it can be done. DML over this is just asking for trouble.
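To illustrate the serial-queue pattern (a sketch only, not any particular library): a single writer thread owns the SQLite connection and everything else funnels writes through a queue.

    import queue
    import sqlite3
    import threading

    write_queue = queue.Queue()

    def writer_loop(db_path):
        # One thread owns the connection, so all writes are serialized by the queue.
        conn = sqlite3.connect(db_path)
        conn.execute("PRAGMA journal_mode=WAL")
        while True:
            sql, params, done = write_queue.get()
            with conn:                  # one transaction per queued write
                conn.execute(sql, params)
            done.set()                  # tell the caller the write has committed

    def submit_write(sql, params=()):
        done = threading.Event()
        write_queue.put((sql, params, done))
        done.wait()                     # block until the writer thread commits

    threading.Thread(target=writer_loop, args=("app.db",), daemon=True).start()
    submit_write("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    submit_write("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", ("greeting", "hello"))

Swapping submit_write for an HTTP or gRPC handler is the "behind TCP" step the parent describes.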
Far from it, as now you're not just dealing with the network but also with Raft consensus... So each write is not just a network trip, it's also 2x acknowledging. And your reads go through the leader, which can mean that if somebody hit the app on node 1 but node 2 is the leader, well, ...
It's slower on reads and writes than the plain replication that PostgreSQL does. And I don't just mean async; even sync PostgreSQL will be faster.
The reason dqlite exists is that Canonical needed something to synchronize their virtualization cluster (LXD), and they needed a DB with Raft consensus that is a lib (not a full-blown server install like Postgres). Performance was not the focus, and its use case is totally different from what most people here need.
Nit: dqlite is a library; it is not a network-exposed database like rqlite is. Sure, it requires connecting to other nodes over the network, but local access is in-process. In contrast, one connects to rqlite over the network - HTTP, specifically.
The other thing to point out in this article is that the PG network example CANNOT scale horizontally, due to the power law. You can throw a super cluster at the problem and still fundamentally do only around 1000 TPS.
And now that there are solid streaming backup systems, the only real issue is redundancy, not scaling.
There’s absolutely nothing paradoxical about any of this.
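Back-of-the-envelope version of that cap, assuming a single hot row and roughly 1 ms of lock hold time per transaction:

    # If every transaction must update the same hot row, throughput on that
    # row is capped at 1 / (time the row lock is held), no matter how many
    # nodes you add.
    lock_hold_seconds = 0.001            # ~1 ms: one round trip while the lock is held
    print(1 / lock_hold_seconds)         # 1000.0 TPS, the theoretical ceiling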
It'd be a very short article if so, don't you think? Full article would be something like: "Normally you'd have a remote connection to the database, and since we're supposed to test SQLite's performance, and SQLite is embedded, it doesn't compare. Fin"
Though I'd say it's for a broader set of applications than that (embedded apps, desktop apps, low-concurrency server apps etc).
Phone and mobile app installations of course outnumber web app deployments, and it doesn't say what you paraphrased about servers.
You mention setting the conn pool to 8 to match your # of cores. That would be fine if you didn't have any sleeps inside your txns... but the moment you added the sleeps inside the txns, your limit of 8 kills your throughput, because no other thread can access the DB once 8 of them grab connections and start the 20ms of total sleep. Imagine instead if you had 64 connections... you would 8x your throughput. What if you were to go even higher? At some point you might start overloading the DB... at that point, you could consider tuning the DB to accept more connections... or maybe you've truly reached the DB's peak performance limit.
I just don't think that 8 connections represents that limit... you need to do everything you can to open up your client config until you reach PG's limitations.
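Rough numbers behind that claim, assuming the ~20 ms of sleep dominates each transaction:

    # Upper bound: throughput ~ pool_size / time each connection is held.
    txn_seconds = 0.020                  # ~20 ms of sleep inside the transaction
    for pool_size in (8, 64, 256):
        print(pool_size, pool_size / txn_seconds, "TPS upper bound")
    # 8   ->   400 TPS
    # 64  ->  3200 TPS (the 8x mentioned above)
    # 256 -> 12800 TPS, until Postgres itself becomes the bottleneck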
Tangentially I also highly recommend this article on pool sizing.
https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-...
> The synchronous=NORMAL setting provides the best balance between performance and safety for most applications running in WAL mode. You lose durability across power loss with synchronous NORMAL in WAL mode, but that is not important for most applications. Transactions are still atomic, consistent, and isolated, which are the most important characteristics in most use cases.
Edit: Also, the example indicates financial transactions. Can you explain why you need serializability but not durability?
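For reference, the configuration being quoted is just two pragmas; a minimal sketch (the trade-off is that a power loss can drop the most recent commits, but never corrupts or half-applies one):

    import sqlite3

    conn = sqlite3.connect("app.db")
    # WAL mode: readers and the writer no longer block each other.
    conn.execute("PRAGMA journal_mode=WAL")
    # NORMAL: the WAL is only fsync'd at checkpoints, not on every commit,
    # so a power loss may lose the last few transactions, but whatever
    # survives is still atomic and consistent.
    conn.execute("PRAGMA synchronous=NORMAL")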
You're going to want persistent storage on your server, not ephemeral. You'll also want NVMe. A lot of the time you're going to end up on bare metal running a single server anyway.
You're going to have down time for migrations unless you're very clever with your schema and/or replicas.
Litestream, for me at least, is what makes SQLite viable for a web app, as prior to it there wasn't a good replication story.
With Litestream it's much easier to have a backup on standby. That being said, where I have used it in production some amount of downtime has been acceptable, so mileage may vary.
probably worth stating these kinds of design considerations/assumptions up-front
i'm sure lots of applications are fine with "downtime for [database] migrations" but lots more are definitely not, especially those interested in synthetic metrics like TPS
You can achieve zero downtime with Sqlite if you really need to.
TPS is not a synthetic metric when you cap out at 100 TPS because of Amdahl's law and your users following a power-law distribution.
A few projects:
* https://github.com/rqlite/rqlite Distributed, fault tolerant cluster
* https://litestream.io/ Replication to S3 (or compatible) - more disaster recovery than failover
* https://fly.io/docs/litefs/ (Same author as Litestream.) Distributed replication. Requires writes to be redirected to the primary.
I am debating Postgres vs SQLite (probably with Litestream) for a project right now. Other than HW redundancy, I can get pretty far by scaling vertically on a single box. And for my app, I (and my users!) could probably live with some occasional downtime (as long as the data is replicated/backed up).
If I get 20-50K users it'll be a successful venture, so I don't need much these days, and it will be cheaper and easier to run as well.
Linked at the bottom of the article is the most extreme I've seen.
I know that, due to some recent update, SQLite can support concurrent reads but still only a single writer. For which cases would this be a problem?
Some recommend it's better to start with Postgres anyway if you have any remote thoughts of scaling in mind....
What kind of data is so critical that the data from a quarter second before catastrophic destruction must be saved?
I guess weapons testing, at least... But that wouldn't be streaming data of that importance for a very large % of time.
It’s great that some people have workloads that this is a fit for. What’s more common is the use case managed databases like RDS etc solves for. You have some quantity of data you want to always be there, be available over a network for whatever app(s) need it and want backups, upgrades, access control etc solved for you.
I love SQLite and reach for it for hobby projects, but as a product for general business apps it is quite niche. It has the qualities that make for any popular product on HN: a great getting-started experience and a complex maintenance and operational experience.
TIL `SAVEPOINT` can occur inside a BEGIN ... END SQLite transaction, and that this works for optimizing batch size on a particular node under a given load.
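A minimal example of that nesting (table name and batch size are made up):

    import sqlite3

    # isolation_level=None means autocommit mode, so the explicit BEGIN/COMMIT
    # below is what controls the outer transaction.
    conn = sqlite3.connect("batch.db", isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)")

    conn.execute("BEGIN")
    for batch in range(3):
        conn.execute("SAVEPOINT batch_sp")       # nested inside the outer transaction
        try:
            for i in range(1000):                # hypothetical batch size
                conn.execute("INSERT INTO events (payload) VALUES (?)", (f"row {batch}-{i}",))
            conn.execute("RELEASE batch_sp")     # keep this batch
        except sqlite3.Error:
            conn.execute("ROLLBACK TO batch_sp") # undo only this batch
            conn.execute("RELEASE batch_sp")
    conn.execute("COMMIT")                       # END is an alias for COMMIT in SQLite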
Is there a solution for SQLite WAL corruption?
From https://news.ycombinator.com/item?id=45133444 :
> "PSA: SQLite WAL checksums fail silently and may lose data" https://news.ycombinator.com/item?id=44672902
> sqlite-parquet-vtable, [...]
And a working ECC or non-ECC RAM bus, and [...].
How bad is recovery from WAL checksum / journal corruption [in SQLite] [with batching at 100k TPS]?
And should WAL checksums be used for distributed replication "bolted onto" SQLite?
>> (How) Should merkle hashes be added to sqlite for consistency? How would merkle hashes in sqlite differ from WAL checksums?
SQLite would probably still be faster over the network with proper Merkleization.
An SQLite-related issue I ran into had to do with accessing the SQLite database from multiple threads. I found a solution easily: for multi-threaded use, SQLite needed to be initialized with the `SQLITE_OPEN_FULLMUTEX` flag. Since then, the website has been running seamlessly for about 3 weeks now.
The first connection, for writing, is opened with flags: SQLITE_OPEN_CREATE | SQLITE_OPEN_READWRITE | SQLITE_OPEN_FULLMUTEX
The second, for reading, with flags: SQLITE_OPEN_READONLY | SQLITE_OPEN_FULLMUTEX
As you can note, I have SQLITE_OPEN_FULLMUTEX on both of them. Should I only have it for the writing one?
Also: use WAL mode and enable mmap.
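As an aside, Python's sqlite3 module doesn't expose those C open flags directly, but here is roughly what the same writer/reader split plus the WAL and mmap suggestion looks like there (illustrative values; this doesn't answer the FULLMUTEX question):

    import sqlite3

    # Read-write connection (the C-level CREATE | READWRITE case).
    writer = sqlite3.connect("app.db")
    writer.execute("PRAGMA journal_mode=WAL")     # readers no longer block the writer
    writer.execute("PRAGMA mmap_size=268435456")  # 256 MB of memory-mapped I/O, illustrative

    # Read-only connection via a URI (the SQLITE_OPEN_READONLY case).
    reader = sqlite3.connect("file:app.db?mode=ro", uri=True)
    reader.execute("PRAGMA mmap_size=268435456")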
> But, wait our transactions are not serialisable, which they need to be if we want consistent transaction processing
You either don't know what serializable does or are trying to mislead the reader. There is zero reason to use serializable here.
> Let's say you have 5ms latency between your app server and your database.
5ms latency is unrealistic, unless you use wifi or your database is in another datacenter.
> I'm talking about transactions per second, specifically interactive transactions that are common when building web applications
No, they are not common at all. You probably invented them just to make PG look bad.
Interactive transactions are a well-defined concept in database technology, with exactly the semantics described by the author: transactions with multiple queries and application logic in between, for instance taking a result from one query, processing it, and running a second query with the outcome of that processing as input.
That said, the example in the blog post feels a bit contrived to me, the question being whether that transaction could be structured in a way such that both updates run right after each other towards the end of the transaction, thus significantly reducing the duration for which the row lock is held (which is not to say that the general problem described in the article doesn't exist; it does for sure).
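For readers who haven't met the term, an interactive transaction looks roughly like this in, say, Python against Postgres (hypothetical connection string and accounts schema; psycopg2 opens the transaction implicitly on the first statement):

    import psycopg2

    conn = psycopg2.connect("dbname=app")
    cur = conn.cursor()

    # Round trip 1: read and lock the row. From here until COMMIT the row
    # lock is held, so every extra round trip extends the lock window.
    cur.execute("SELECT balance FROM accounts WHERE id = %s FOR UPDATE", (1,))
    (balance,) = cur.fetchone()

    # Application logic runs in between; that's what makes it "interactive".
    if balance < 100:
        conn.rollback()
    else:
        # Round trips 2 and 3: running both updates back-to-back at the end
        # keeps the lock held for as little wall-clock time as possible.
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = %s", (1,))
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = %s", (2,))
        conn.commit()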
In practice I'd never implement a ledger like this these days; I'd much prefer an append-only model with some triggers. But, like you said, interactive transactions are very much a thing.
Out of curiosity do you have a better/less contrived example in mind?
If you're processing financial transactions you want your isolation level to be serialisable, as the order in which the transactions are processed matters.
> 5ms latency is unrealistic. Unless you use wifi or you database is in another datacenter.
Even with 1ms latency, Amdahl's law will still make you cap out at a theoretical 1000 TPS if you have 100% row lock contention.
> No they are not common at all. You probably invented them just to make pg look bad.
I'm confused. I invented transactions? Are you saying you don't use transactions with rollback when you use PG?
So you don't know what the serializable level is.
> Even with 1ms latency. Amdahl's law will still make you cap out at a theoretical 1000 TPS if you have 100% row lock contention.
So why use 5ms and 10ms for examples?
> I'm confused. I invented transactions?
"Interactive" transactions.