This feature truly is a game-changer for adoption IMO.
Congrats on shipping to all involved!
Neon isn't in the same league at all so I think what they support doesn't matter.
Tell that to the people who have been successfully building massive databases in RDBMS for decades making extensive use of FKs and FK constraints.
Are there any plans to support recursive CTEs? What are the technical challenges there?
We recently started adding support for CTEs in Vitess! You can check out https://github.com/vitessio/vitess/pull/14321 if you want to see some technical details of the implementation.
For now, we have added preliminary support by converting them to derived tables internally, but we believe we need to make CTEs first-class citizens of query planning in their own right, specifically because recursive CTEs are very hard to model as derived tables. Once we make that change, we can look towards supporting recursive CTEs.
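As a rough illustration of that internal rewrite (just a sketch with made-up table names, not the actual Vitess plan), a non-recursive CTE like this:

    WITH recent_orders AS (
      SELECT customer_id, COUNT(*) AS cnt
      FROM orders
      WHERE created_at > '2023-01-01'
      GROUP BY customer_id
    )
    SELECT c.name, r.cnt
    FROM customers c
    JOIN recent_orders r ON r.customer_id = c.id;

can be folded into a derived table in the FROM clause:

    SELECT c.name, r.cnt
    FROM customers c
    JOIN (
      SELECT customer_id, COUNT(*) AS cnt
      FROM orders
      WHERE created_at > '2023-01-01'
      GROUP BY customer_id
    ) AS r ON r.customer_id = c.id;

A recursive CTE, which references itself, has no such straightforward derived-table equivalent, which is why it needs first-class support in the planner.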
This, however, will take some time, but then, all good things do!
It's worth noting that MySQL has no problem with this kind of situation. It never cares about the existence of orphaned rows. It only cares about not letting you create them in the first place, and it cares about cleaning up. But it doesn't blow up if orphaned rows do exist. They just become ghosts.
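A minimal MySQL sketch of what I mean (hypothetical table names):

    CREATE TABLE parent (id INT PRIMARY KEY);
    CREATE TABLE child (
      id INT PRIMARY KEY,
      parent_id INT,
      FOREIGN KEY (parent_id) REFERENCES parent (id)
    );

    SET foreign_key_checks = 0;
    INSERT INTO child VALUES (1, 42);  -- parent 42 doesn't exist: an orphan
    SET foreign_key_checks = 1;

    -- MySQL does not re-validate existing rows: the orphan stays put and
    -- reads against child keep working. Only new violations are rejected:
    INSERT INTO child VALUES (2, 43);  -- fails with an FK constraint error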
Also from my PostgreSQL-infused worldview, it seems to me you are making your life too difficult by requiring a migration to make "one change" to a table or view. The brute force idiom I've seen for schema migrations is to break it into phases:
1. drop departing foreign key constraints
2. restructure table columns/types and values
3. add new foreign key constraints
This is a bit like running with constraints deferred while making data changes that might look invalid until all data changes are done. But it also defers the expression of the new constraints until the table structures are in place to support their definitions, so it isn't just about deferring enforcement.
The same strategy can be used for import scenarios, to support schemas with circular foreign key reference constraints, i.e. where tables are not in a strict parent-child hierarchy.
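A sketch of what those phases look like in plain SQL (PostgreSQL flavor, hypothetical table and constraint names):

    -- phase 1: drop departing foreign key constraints
    ALTER TABLE orders DROP CONSTRAINT orders_customer_id_fkey;

    -- phase 2: restructure table columns/types and values
    ALTER TABLE customers ALTER COLUMN id TYPE BIGINT;
    ALTER TABLE orders ALTER COLUMN customer_id TYPE BIGINT;

    -- phase 3: add the new foreign key constraints
    ALTER TABLE orders
      ADD CONSTRAINT orders_customer_id_fkey
      FOREIGN KEY (customer_id) REFERENCES customers (id);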
This is why, in our design, PlanetScale will not take it upon itself to do this three-step change. The user is more than welcome to break this into three different schema changes (likely they'll be able to do it in just two). But then the user takes ownership of handling unchecked references.
> making data changes that might look invalid until all data changes are done
In effect, the data _will be_ invalid, and potentially for many hours.
Now, it's true that if the user messed up the data in between, then adding the foreign key constraint will fail, in both PostgreSQL and MySQL. To me, this signals more bad news, because now the user has to scramble to clean up whatever incorrect data they have before they're able to complete their schema change and unblock anyone else who might be interested in modifying the table.
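For example, before the constraint can be added back, the orphans typically have to be found and cleaned up with something like (hypothetical names):

    -- find child rows whose referenced parent no longer exists
    SELECT o.id
    FROM orders o
    LEFT JOIN customers c ON c.id = o.customer_id
    WHERE o.customer_id IS NOT NULL
      AND c.id IS NULL;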
Personally, my take is to not use foreign key constraints on large scale databases. It's nice to have, but comes at a great cost. IMHO referential data integrity should be handled, gracefully, by the app. Moreover, referential integrity is but one aspect of data integrity/consistency. There are many other forms of data integrity, which are commonly managed by the app, due to specific business logic. I think the app should own the data as much as it can. My 2c.
Is that mechanism unique to PostgreSQL? Or is it just that this transaction is impractical for you, due to the wall clock duration and how it impacts other use of the DB?
My main worry is that I'd end up with 3 classes of data: pre-migration, changed/added post-migration, and post-revert. It's probably fine in most cases, but that could take quite some unpicking.
As you complete the migration, PlanetScale keeps your old table and continues to sync any further incoming changes to the new table back to the old one. They're kept in sync, apart of course from whatever incompatible schema changes they may have. As you revert, we flip the two, placing your old table back, now not only with the old data but also with all the newly accumulated data.
I completely appreciate the roll-forward approach. That's something we do for code deployments. Except when we don't, when there's that particular change where the best approach is to revert a commit, revert a PR. I think of schema deployments in the same way. Reverts will be rare, hopefully, but they can save the day.
"Changing a table's schema is one of the most challenging problems in relational databases, and in MySQL in particular. In today's accelerated and rapid development cycles, engineers find that they need to make schema changes sometimes on a daily basis."
Okay, but is this really true? Is it really necessary to have no downtime except in a very few cases? It's honestly rare to have systems that absolutely must stay up all the time, but allowing fiddling with the schema as well just seems excessive.
I'd also say, as a software dev, I'd be very uncomfortable with a system like yours that claims to maintain consistency while running and changing the schema, short of a formal proof.
We aren't talking about zero downtime here, but continual, recurring downtime due to schema changes. Once you have beyond a few million rows in a normal RDBMS, schema changes can take minutes to hours depending on the type. Do this a few times per month and you now have 'lots' of downtime and you are blocking other engineering work from happening. It eventually becomes so much of a hassle that engineers don't want to do schema changes, blocking feature work. The more seamless and painless you can make them, the better.
How often do you expect to make the schema changes? I mean I quoted this bit "...make schema changes sometimes on a daily basis" – is this realistic, or a kind of business insanity typically caused by bad management? Ditto "...but continual, recurring downtime due to schema changes". This really looks like a failure of management rather than a technical problem to be solved.
Also, aren't you often likely to be doing something larger than just a schema change, in which case that would necessitate replacing your application, so the changes are not restricted to just the database? You now have the bigger problem of co-ordinating app and DB changes.
I also asked whether you need permanent uptime because in a lot of systems, especially smaller ones (and by the long tail most systems are going to be smallish), the users are very tolerant of an hour's downtime a month, for example.
"Once you have beyond a few million rows in a normal RDBMS, schema changes can take minutes to hours depending on the type"
That's a pretty strong claim; what kind of thing is going to take hours that your database can do consistently? Does it even take hours? I had a 100,000,000 row table of unique ints lying around so I put a foreign key from itself to itself (a bit daft, but just for timing purposes. DB is MS SQL, table is fully hot in memory)
alter table [tmp_ints_clustered] add constraint ffffkkkkk foreign key (x) references [tmp_ints_clustered](x);
21 seconds.
What you're doing (if you can get it correct! Which I have to wonder at) is doubtless excellent for some very large companies, but in general... I'm afraid I'm not so sure.
Edit: I feel I'm perhaps missing your bigger picture.
When you make schema evolution really easy to do with no downtime, you can also start doing phased deployments where you think in terms of backwards-compatibility. First you add some new/optional parts to the schema. Then you update the applications to use it, where the application has to tolerate data with or without the new bits.
Eventually, you might convert something from optional to required, first for new data entry and later as a conversion of existing data, where that makes sense. Then finally you might deprecate the application ability to handle the old missing bits, and remove that from your code maintenance problems.
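In SQL terms, a typical sequence might look something like this (PostgreSQL flavor, made-up names):

    -- 1. add the new column as optional
    ALTER TABLE users ADD COLUMN preferred_locale VARCHAR(10);

    -- 2. deploy application code that writes it for new rows and tolerates NULLs on read

    -- 3. backfill existing rows once the new code is everywhere
    UPDATE users SET preferred_locale = 'en' WHERE preferred_locale IS NULL;

    -- 4. only then make it required
    ALTER TABLE users ALTER COLUMN preferred_locale SET NOT NULL;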
The giant data lakes for vacuuming up tracking data generally never do schema migrations at all.
Surely that's a bad design, and capability to support it is enabling continued bad design.
I've been researching databases for several days straight now. The choices are overwhelming, but I've pretty much narrowed my use case down to an RDBMS.
I need to handle hundreds of millions of "leads" (and tens of millions per day), which can contain any number of user fields. Over 1B total.
I need to resolve duplicate leads either in real time or near real time. A duplication can occur across a combination of 1 or more fields, so basically OLTP-type operations (select, update, delete on single rows).
I do need to run large OLAP queries as well, across all the data.
I've looked at things like Scylla and whatnot, but they seem too heavy-duty for my volume. It's not like I need to store trillions of messages like Discord in some huge event log.
I was considering these 3 options...
1. planetscale
2. citus
3. cockroachdb
I haven't really narrowed it down further than this, but I liked the idea of still having RDBMS features without needing to worry about storage and scaling with just sheer write volume.
It seemed I could then do the basic OLTP stuff I need, and Citus had a cool demo where some OLAP query on 1B rows ran in 20s with 10 nodes, which also fits a reasonable time for queries (BI tools will be used for that).
OLAP works here, but is not great depending on how fast you need the info to be available. If you're generating reports, Postgres is fine as long as your queries are properly optimized, and you can get them within minutes for massive workloads. If you need near-instant (sub-1s) results, I would recommend you sync your RDBMS to a columnar database like ClickHouse and let the better data layout work in your favor, rather than trying to force a row-based DB to act like something it's not.
Otherwise, both are rock solid and simple to use. I've dealt with more intensive workloads than you mentioned, with the same use-case and Postgres worked very well. ClickHouse never had a problem.
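If it helps, a rough sketch of what the ClickHouse side might look like (hypothetical schema):

    CREATE TABLE leads_analytics (
      lead_id    UInt64,
      created_at DateTime,
      source     LowCardinality(String),
      country    LowCardinality(String)
    ) ENGINE = MergeTree
    ORDER BY (created_at, lead_id);

    -- the kind of aggregation that benefits from the columnar layout
    SELECT country, count() AS leads
    FROM leads_analytics
    WHERE created_at >= now() - INTERVAL 30 DAY
    GROUP BY country
    ORDER BY leads DESC;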
I was considering using something like Airbyte, but then I thought this may actually be complex: if PG rows are being updated/deleted, it means I also need to sync single rows (or groups of rows) to ClickHouse, and I wasn't sure how well that is supported.
It might be easier to find some "change data capture" product that will do that for you though (like Airbyte). I can't give any recommendations here, however.
What kind of aggregation queries? Can you do pre-aggregation?
Citus would probably be my pick if you want SQL.
Feel free to email.
Any time I am doing something that could violate some logical constraint, I am probably trying to fix something that is already fucked. The safety nannies simply add insult to injury at this stage. I will delete broken rows in whatever order I please.
If constraint violations are adding value to your solution, you probably have a far more severe problem repeatedly slamming into the wall somewhere.
Having to write code that can handle foreign key violations because the DB doesn't check them is a major pain. (We use Cassandra, for example, so there is a "foreign key" usually from a PG row to a Cassandra row; obviously that can't be enforced at the DB level, so application code has to do the work.)
As for deleting/updating data, FKs can be a bit annoying, but PostgreSQL, for example, has two (possibly more) options:
1) The (possibly dangerous) cascade delete, which traverses the FKs for you and deletes the dependent rows.
2) Checking FKs (and other constraints) on commit: instead of checking whether every delete/update statement causes FK violations, it checks at the end, after all the delete/update statements have run. These are called deferrable constraints.
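A minimal PostgreSQL sketch of both (hypothetical tables):

    -- option 1: cascade delete
    CREATE TABLE parent (id INT PRIMARY KEY);
    CREATE TABLE child_a (
      id INT PRIMARY KEY,
      parent_id INT REFERENCES parent (id) ON DELETE CASCADE
    );
    DELETE FROM parent WHERE id = 1;  -- matching child_a rows are deleted too

    -- option 2: deferrable constraint, checked only at commit
    CREATE TABLE child_b (
      id INT PRIMARY KEY,
      parent_id INT REFERENCES parent (id) DEFERRABLE INITIALLY DEFERRED
    );
    BEGIN;
    DELETE FROM parent WHERE id = 2;         -- temporarily violates the FK
    DELETE FROM child_b WHERE parent_id = 2; -- order doesn't matter within the transaction
    COMMIT;                                  -- the constraint is checked here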
If there are any downstream queries that assume the relationship between X and Y, and you accidentally violated that contract, wouldn't you WANT the database to tell you? Without a FK constraint in place, I would just have to know that I needed to update hundreds of lines of (incidentally) dependent code!
It's roughly analogous to the static vs dynamic typing debate. Do you want to discover FK violations at runtime in the future? Or head them off at insert time? Either way, you cannot sweep referential integrity under the rug... you only shift the burden from one place to another.
So I thank my database daily for its work enforcing FK constraints. Otherwise I'd have to write that code! You can't ignore the lifecycle of references.
If code has many maintainers, pinky promises to keep referential integrity are bound to be broken eventually. Humans like shortcuts.
https://www.youtube.com/@PlanetScale
Subscribed.