And if for some reason you refuse to, then this "barrier" or "hooks" approach to testing won't help in practice. It requires you to already know the potential race conditions, but if you're already aware of them, you'll already write your code to avoid them. It's the non-obvious race conditions that should scare you.
To find these, you should use randomized testing that runs many iterations of different interleavings of transaction steps. You can build such a framework that will hook directly into your individual DB query calls. Then you don't have to add any "hooks" at all.
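For what it's worth, the skeleton of such a harness is small. A purely in-memory sketch (no real DB here; the two-step "transactions" and the balance invariant are made up to show the shape of the idea):

```typescript
// Each "transaction" is a list of async steps that must run in order
// relative to each other; steps from different transactions may interleave.
type Step = () => Promise<void>;

// Merge two step lists into one random interleaving that preserves
// each transaction's internal order.
function randomInterleaving(a: Step[], b: Step[]): Step[] {
  const out: Step[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    const takeA = j >= b.length || (i < a.length && Math.random() < 0.5);
    out.push(takeA ? a[i++] : b[j++]);
  }
  return out;
}

// Run many iterations; each builds two read-modify-write "transactions"
// (the classic lost update) and checks the invariant afterwards.
async function fuzz(iterations: number): Promise<number> {
  let failures = 0;
  for (let k = 0; k < iterations; k++) {
    let balance = 100; // shared state standing in for a DB row
    let read1 = 0;
    let read2 = 0;
    const tx1: Step[] = [
      async () => { read1 = balance; },
      async () => { balance = read1 - 10; },
    ];
    const tx2: Step[] = [
      async () => { read2 = balance; },
      async () => { balance = read2 - 10; },
    ];
    for (const step of randomInterleaving(tx1, tx2)) await step();
    if (balance !== 80) failures++; // a lost update slipped through
  }
  return failures;
}

fuzz(1000).then((f) => console.log(`lost updates in ${f}/1000 runs`));
```

Hooking the step boundaries into real query calls instead of closures is the part that takes actual work, but the scheduling core doesn't get much bigger than this.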
But even that won't find all race condition bugs, because it is possible to have race conditions surface even within a single database query.
You really should just use SERIALIZABLE and save yourself all the hassle of spending hours writing all these tests.
This brought back awful memories of MS SQL Server and JDBC. Way back, around Java 1.5 or so, SQL Server would deadlock between connections that were all executing literally the exact same statement. Not the same general statement with different parameters: the same statement.
Serializable just means that within the transaction your logic can naively assume it’s single threaded. It doesn’t magically solve distributed system design for you.
“Just use random testing” isn’t really an answer. Some race conditions only show up with pathological delays on one thread.
Definitely recommend starting new codebases with it enabled everywhere.
That doesn't mean it won't happen, of course. The people who write databases are just programmers, too. And you can certainly imagine a situation where you get two (or more) "ad-hoc" transactions that can't necessarily progress when serializable but can with read committed (ad-hoc in the sense of the paper here: https://cacm.acm.org/research-highlights/technical-perspecti...).
Use transactions table (just a name, like orders)
On it have an Insert trigger.
It should make a single update with a simple “update … set balance = balance + amount where accountId = id”. This will be atomic thanks to the DB engine itself.
Also add a check constraint balance >= 0 so the balance can never go negative, even if you have thousands of simultaneous payments. If an update would make it negative, the constraint throws, the insert trigger aborts, no insert happens, and your backend code catches the error.
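A rough sketch of the setup being described (all table and column names invented, Postgres syntax assumed):

```sql
-- An append-only ledger table plus an accounts table whose balance
-- may never go negative.
CREATE TABLE accounts (
  id      bigint PRIMARY KEY,
  balance numeric NOT NULL DEFAULT 0 CHECK (balance >= 0)
);

CREATE TABLE ledger (
  id         bigserial PRIMARY KEY,
  account_id bigint NOT NULL REFERENCES accounts(id),
  amount     numeric NOT NULL
);

-- The insert trigger applies each ledger row atomically. If the CHECK
-- fails, the error aborts the whole statement, so the ledger row is
-- never inserted either; the backend just catches the error.
CREATE FUNCTION apply_ledger_row() RETURNS trigger AS $$
BEGIN
  UPDATE accounts
  SET    balance = balance + NEW.amount
  WHERE  id = NEW.account_id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ledger_insert
  AFTER INSERT ON ledger
  FOR EACH ROW EXECUTE FUNCTION apply_ledger_row();
```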
—
That’s it: insert-trigger and check constraint.
No need for explicit locking, no stored procedures, no locks in your backend either, nada. Just a simple row insert. No matter the load and concurrent users it will work like magic. Blazingly fast, too.
That’s why there is ACID in DBs.
—
Shameless plug: learn your tool. Don’t approach Postgresql/Mssql/whathaveyousql like you’re a backend engineer. DB is not a txt file.
The logic used for crediting amounts of money is not important.
Erm, knowing and understanding how to use your database is a bread and butter skill of a backend engineer.
PG is still handling the locks for you, so this isn’t a bulletproof solution; as always, depending on your use case, scale, etc., this may or may not work.
> No matter the load and concurrent users it will work like magic
Postgres will buckle updating a single row at a certain scale.
—
Regardless, this article was about testing a type of scenario that commonly goes untested. You don’t always have a great tool like PG on hand whose built-in solutions make this kind of testing unnecessary.
I think you can do better than explicit barrier() calls. My hunch is the test middleware layer can intercept calls and impose a deterministic ordering.
(There are a few papers around looking into more complex OS level frameworks to systematically search for concurrency bugs, but these would be tough to drop into the average web app.)
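To make the hunch concrete, here's one possible shape for that interception (everything below is invented; in a real harness the gate() call would sit inside the middleware's query wrapper rather than being written out by hand):

```typescript
// Releases gated calls in a prescribed order, regardless of which
// "transaction" happens to reach its gate first.
class Scheduler {
  private idx = 0;
  private parked = new Map<string, () => void>();

  constructor(private order: string[]) {}

  // A call parks here until its label is next in the prescribed order.
  gate(label: string): Promise<void> {
    return new Promise<void>((resolve) => {
      this.parked.set(label, () => resolve());
      this.release();
    });
  }

  private release(): void {
    while (this.idx < this.order.length) {
      const next = this.parked.get(this.order[this.idx]);
      if (!next) return; // the next call in the order hasn't arrived yet
      this.parked.delete(this.order[this.idx]);
      this.idx++;
      next();
    }
  }
}

// Demo: force B's read in between A's read and A's write, reproducing
// the lost update deterministically instead of hoping a race fires.
async function demo(): Promise<number> {
  const s = new Scheduler(["A.read", "B.read", "A.write", "B.write"]);
  let balance = 100;
  let a = 0;
  let b = 0;
  const txA = (async () => {
    await s.gate("A.read");  a = balance;
    await s.gate("A.write"); balance = a - 10;
  })();
  const txB = (async () => {
    await s.gate("B.read");  b = balance;
    await s.gate("B.write"); balance = b - 10;
  })();
  await Promise.all([txA, txB]);
  return balance; // 90, not 80: one decrement was lost
}

demo().then((x) => console.log(`final balance: ${x}`));
```

Swap the `order` array per test case and every problematic interleaving becomes a deterministic, repeatable test instead of a flaky one.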
There would be some challenges for sure. Likely optimistic concurrent patterns would require an equivalent of loom's `yield_now` [1] to avoid getting stuck. And you'd probably need a way to detect one transaction waiting for another's lock to get out of situations like your update lock vs barrier example. I vaguely recall PostgreSQL might have some system catalog table for that or something.
I've built something similar to Loom, except it's more focused on extensively modeling the C++11/Rust memory model (https://github.com/reitzensteinm/temper). My experience is that fairly shallow random concurrent fuzzing yields the vast majority of all concurrency bugs.
Antithesis (https://antithesis.com/) are probably the leaders of the pack in going deeper.
It uses generators and their yield as the yield point (and supports running arbitrary functions under a debugger).
Or to put it another way: as others have observed, this could be solved with atomic updates and in some cases SERIALIZABLE. Those are the right tools for balance operations, and if they’re used I’m not sure they need testing in this manner?
> The simplest case: no transaction, just a SELECT and an UPDATE with a barrier between them:
There is no context where you do not have a transaction. Postgres requires them.
It's likely that the library the author is using is doing automatic implicit transactions, but it seems like the author needs to understand their tools a bit better.
IMHO you should never write code like that. You can either do UPDATE employees SET salary = salary + 500 WHERE employee_id = 101;
Or, if it's more complex, just use a stored procedure; there is no point in using a database if you're going to do all the transactional things in JS.
There's essentially no difference between putting the logic in the app vs a stored procedure (other than round trip time)
    await db().transaction(async (tx) => {
      await hooks?.onTxBegin?.();
      const [order] = await tx.select().from(orders)
        .where(eq(orders.id, input.id))
        .for("update");
      const [status] = await tx.select().from(orderStatuses)
        .where(eq(orderStatuses.orderId, input.id))
        .orderBy(desc(orderStatuses.createdAt))
        .limit(1);
      if (input.status === status.code)
        throw new Error("Status already set");
      await tx.insert(orderStatuses).values({ ... });
    });

You need the transaction + SELECT FOR UPDATE because the validation depends on current state, and two concurrent requests could both pass the duplicate check. The hooks parameter is the barrier injection point from the article: that's how you test that the lock actually prevents the race.
Add a numeric version column to the table being updated, read and increment it in the application layer and use the value you saw as part of the where clause in the update statement. If you see ‘0 rows updated’ it means you were beaten in a race and should replay the operation.
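Sketched against an in-memory stand-in (the Table class and all names are invented; the update method mirrors an `UPDATE ... SET balance = $1, version = version + 1 WHERE id = $2 AND version = $3` returning its row count):

```typescript
// In-memory stand-in for a row with a numeric version column.
interface Row { balance: number; version: number }

class Table {
  private row: Row = { balance: 100, version: 1 };

  read(): Row { return { ...this.row }; }

  // Conditional write: succeeds only if nobody bumped the version since
  // we read it. Returns the number of rows updated (0 or 1).
  update(balance: number, expectedVersion: number): number {
    if (this.row.version !== expectedVersion) return 0;
    this.row = { balance, version: expectedVersion + 1 };
    return 1;
  }
}

// The retry loop: "0 rows updated" means we were beaten in a race,
// so replay the whole read-modify-write.
function credit(table: Table, amount: number): void {
  for (;;) {
    const seen = table.read();
    if (table.update(seen.balance + amount, seen.version) === 1) return;
  }
}

const t = new Table();
credit(t, 50);
credit(t, -30);
console.log(t.read()); // { balance: 120, version: 3 }
```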
Optimistic updates look great when there is no contention, and they will beat locking in a toy benchmark, but if you're not very careful they can cause insane amplification under load.
It's a similar trap as spinlocks. People keep re-discovering this great performance hack that avoids the standard library's slow locks. And some day the system has a spike that creates contention, and now you have 25 instances with 24 of them spinning like crazy, slowing to a crawl the only one that could be making progress.
It's possible to implement this pattern correctly, and it can be better in some specific situations. But a standard FOR UPDATE lock will beat the average badly implemented retry loop nine times out of ten.
WITH
o AS (
  SELECT FROM orders
  WHERE orders.id = $1
),
os AS (
  SELECT code FROM orderStatuses
  WHERE orderStatuses.orderId = $1
  ORDER BY orderStatuses.createdAt DESC
  LIMIT 1
)
INSERT INTO orderStatuses (...)
SELECT ...
WHERE EXISTS (SELECT 1 FROM os WHERE os.code != $2)
RETURNING ...something including the status-differ check...
Does something like this work with Postgres's default behavior?

Of course, the reasons not to use stored procedures still apply. They're logic, but they're versioned with the database schema, not with your application, which can be a pain.
* Good database drivers will let you pipeline multiple queries concurrently (esp. in languages with async support), effectively eliminating the _N_x roundtrip cost (you can even execute them in parallel if you use multiple connections, not that I recommend doing that). But obviously this is only doable where the queries are independent of one another; I use this mainly to perform query splitting efficiently if the join key is already known.
* These days databases are often effectively versioned alongside the code anyway, at least for smaller projects that "own" the database, eliminating the biggest issue with stored procedures.
Ended up using SELECT FOR UPDATE on the waitlist row before the count check. Simple but effective. The barrier testing approach described here would have caught this much earlier in development instead of discovering it under load.
One thing I'd add: in Go, it's tempting to handle this at the application level with mutexes, but that breaks the moment you have multiple instances. Pushing the serialization down to Postgres is almost always the right call for correctness.
> The transaction didn't help. Postgres's default isolation level is READ COMMITTED — each statement sees all data committed before that statement started.