The Insights tab also surfaced missing indexes we added, which sped things up further. Early days, but so far so good.
Of course it's nicer if the database can handle it, but if you are doing hundreds of sequential non-pipelined writes per second, there is a good chance that there is something wrong with your application logic.
> To create a Postgres database, sign up or log in to your PlanetScale account, create a new database, and select Postgres.
It does mention the sign up option but doesn't really give me much context about pricing or what it is. I know a bit, but I get confused by different database offerings, so it seems like a missed opportunity to give me two more sentences of context and some basic pricing - what's the easiest way for me to try this if I'm curious?
On the pricing page I can start selecting regions and moving sliders to create a plan from $39/month and up, but I couldn't easily find an answer to whether there's a free trial or a cheaper way to 'give it a spin' without committing.
Notoriously
It's designed for businesses that need to haul ass
> It's designed for businesses that need to haul ass
Could you elaborate what you meant by this for my education?
> Benchmarks are done on a dual-core VM with "unlimited" IOPS
I'd be interested in a comparison with a pair of Beelink SER5 Pros ($300 each) in master-slave config.
Unlimited is a feature here, no need to be snarky. They famously went against the accepted practice of separating storage from compute, and as a result, you reduce latency by an order of magnitude and get unlimited IOPS.
That might not be 100% true, but I've never seen an RDBMS able to saturate IOPS on a local NVMe. It takes quite specialized software to leverage every ounce of IOPS without being CPU-bottlenecked first. Postgres and MySQL are not it.
Anyway, saying unlimited is absurd. If you think it's more than you need, say how much it is and say that's more than you need. If you have infinite IOPS why not do the benchmark on a dataset that fits in CPU cache?
Not all AWS instance types support NVMe drives. It's not the same as normal attached storage.
I'm not really sure your arguments are in good faith here tho.
This is just not a configuration you can trivially do while maintaining durability and HA.
There's a lot of hype going the exact opposite direction and more separation of storage and compute. This is our counter to that. We think even EBS is bad.
This isn't a setup that is naturally just going to beat a "real server" that also has local NVMes or whatever you'd do yourself. This is just not what things like RDS or Aurora do, etc. Most things rely on EBS, which is significantly worse than local storage. We aren't really claiming we've invented something new here. It's just unique in the managed database space.
I agree that EBS and the defaults RDS tries to push you into are awful for a database in any case. 3k IOPS or something absurd like that. But that's kind of the point: AWS sells that as "SSD" storage. Sure it's SSD, but it's also 100-1000x slower than the SSDs most devs would think of. Their local "NVMe" is AFAIK also way slower than what it's meant to evoke in your mind unless you're getting the largest instances.
Actually, showing scaling behavior with large instances might make Planetscale look even better than competitors in AWS if you can scale further vertically before needing to go horizontal.
Even if you can't saturate them, even with low CPU cores, latency is drastically better which is highly important for database performance.
Having low latency is tangibly more important than throughput or number of IOPS once your dataset is larger than RAM no matter how many CPU cores you have.
Chasing down p95s and above really benefits from NVMe, purely from having an order of magnitude less latency.
Lower latency also means less iowait time. All of this just leads to better CPU time utilization on your database.
Yes there are benefits like lower latency, which is often measured in terms of qd1 IOPS.
> Unlimited I/O — Metal's local NVMe drives offer higher I/O bandwidth than network-attached storage. You will run out of CPU long before you use all your I/O bandwidth.
Our uptime and reliability is also higher than what you might find elsewhere. It's not uncommon for companies paying lots of money to operate elsewhere to migrate to PlanetScale for that reason.
We're a serious database for serious businesses. If a business can't afford to spend $39/mo to try PlanetScale, they may be happier operating elsewhere until their business grows to a point where they are running into scaling and performance limits and can afford (or badly need, depending on the severity of those limits) to try us out.
Also totally OK if planetscale doesn't do this and that $39/month _is_ the best way to try them out, I just think it would be good for them to make explicit in the article what I should do if I think I might want it but want to try it.
If you do decide to operate on PlanetScale long-term, check out <https://planetscale.com/pricing> for consumption commitment discounting and other options that might make sense for your company.
That is rather uncommon for B2B.
Sure, you can click around to figure it out, but this always annoys me. It's as if everyone is supposed to already know what your product is and does, and all your service names. Put it front and center at the top!
> Our mission is simple: bring you the fastest and most reliable databases with the best developer experience. We have done this for 5 years now with our managed Vitess product, allowing companies like Cursor, Intercom, and Block to scale beyond previous limits.
> We are so excited to bring this to Postgres. Our proprietary operator allows us to bring the maturity of PlanetScale and the performance of Metal to an even wider audience. We bring you the best of Postgres and the best of PlanetScale in one product.
Seriously??
Did any of these companies reach out to them and say "you know, we wouldn't have been able to scale beyond our previous limits without you, thank you so much guys you saved us". If not, this is so insincere that it is cringe.
Are they implying these other companies lacked knowledge and expertise to put their databases on machines with NVMe storage? Or is it that they chose to use their product? If it is the latter, they should just say these companies chose us, instead of emphasizing how they just couldn't scale past their previous limits without PlanetScale's help.
"We chose PlanetScale to host our most demanding Vitess and Postgres workloads, doing millions of queries per second on hundreds of terabytes of data." – Sualeh Asif - Chief Product Officer @Anysphere (Cursor)
"Moving to PlanetScale added a 9 to our uptime." - Brian Scanlan @Intercom https://x.com/brian_scanlan/status/1963552743294967877
"In the past we've had issues when something unusual happens on a specific shard, resulting in spiked CPU and poor performance, and since migrating we haven't really seen instances of this, speaking to PlanetScale choosing the correct hardware for our existing load at the outset." - Aaron Young, Engineering Manager @block
It seems like you are reaching pretty hard to find an issue with this statement. Your comment seems to come from a lack of experience scaling databases and not understanding how difficult it is to do what we've done in partnership with our customers. Either that or a high level of insincerity.
Up until this, I was gonna say, fair enough, I appreciate the direct replies from the staff.
But this paragraph settles it for me: PlanetScale as a company has a narcissistic personality which is fine for some I guess. Hopefully one day you will have a product that justifies that huge ego.
Now I have a higher opinion of PS
It is wild that an employee (lmao CEO) posts this way and it is sanctioned by his employer. I guess you're used to talking this way to your employees. But I am not an employee, so I can't feel your fury.
I am glad it is taking place in public, I can only imagine how poorly you must treat people behind closed doors. At least here people can see it for themselves how unprofessionally this company is run. I wish nothing but patience to your employees, God knows what they must be saying once you're out of the room.
You had one job here, to represent your company in a professional level-headed manner and you couldn't even do that. Such a shame.
Sigh.
What you wrote earlier.
>Did any of these companies reach out to them and say "you know, we wouldn't have been able to scale beyond our previous limits without you, thank you so much guys you saved us". If not, this is so insincere that it is cringe.
I guess I will let the rest of HN be the judge.
We are not saying that our customers don't have the knowledge or expertise to do what we do. Many of our customers, including the ones mentioned above, have exceptionally high levels of expertise and talent.
Even so, it is not a contradiction to say that we allowed them to scale beyond their previous limits. In some cases those limits were that their previous DBaaS providers simply lacked the ability to scale horizontally or provide blazing fast reads and writes the way we do out of the box. In other cases, we offer a degree of reliability and uptime that exceeded what customers' previous DBaaS could provide. Just to name a couple of limits customers have run into before choosing PlanetScale.
Expertise and know-how, and actually doing the thing, are different. Many of our customers who are technically capable of doing what we do would simply prefer to focus their knowledge and expertise building their core product, and let the database experts (that's us) do the databasing.
Have you worked at any web dev companies? Of the ones I've been at, precisely one had any desire to run their own DBs, and that was more out of necessity due to poor schema design needing local NVMe just to stay afloat.
Yes, most web companies lack the experience to touch a server, because their staff are all cloud-native, and their CTOs have drank the Kool-Aid and are convinced that it’s dangerous and risky to manage a server.
What is PlanetScale for Postgres?
Our mission is simple: bring you the fastest and most reliable databases with the best developer experience. We have done this for 5 years now with our managed Vitess product, allowing companies like Cursor, Intercom, and Block to scale beyond previous limits.
> PlanetScale is the world’s fastest relational database platform. We offer PostgreSQL and Vitess databases that run on NVMe-backed nodes to bring you scale, performance, reliability, and cost-efficiencies — without sacrificing developer experience.
> PlanetScale is a relational database platform that brings you scale, performance, and reliability — without sacrificing developer experience.
> We offer both Vitess and PostgreSQL clusters, powered by locally-attached NVMe drives that deliver unlimited IOPS and ultra-low latency.
> PlanetScale Metal is the fastest way to run databases in AWS or GCP. With blazing fast NVMe drives, you can unlock unlimited IOPS, ultra-low latencies, and the highest throughput for your workloads.
> The world’s fastest and most scalable cloud databases PlanetScale brings you the fastest databases available in the cloud. Both our Postgres and Vitess databases deliver exceptional speed and reliability, with Vitess adding ultra scalability through horizontal sharding.
> Our blazing fast NVMe drives unlock unlimited IOPS, bringing data center performance to the cloud. We offer a range of deployment options to cover all of your security and compliance requirements — including bring your own cloud with PlanetScale Managed.
Ironically, the _how_ is a major topic of the very page you started on (the blog).
Have some agency.
How is this any different than RDS on NVMe disks?
With a name like planet scale i assumed it would be some multi-master setup?
The problem is, there are not a lot of solutions to scale Postgres beyond a single server. So if your DB grows to 100TB ... you have an issue, as AWS does not provide a 100TB local NVMe solution, only network storage.
Here comes Neki or whatever they named it: their own alternative to Vitess (see MySQL), which is a solution that allows MySQL to scale horizontally from one to thousands of servers, each with their own local storage.
So PlanetScale made their own solution, so they can horizontally scale across dozens or hundreds of AWS VPSes, each with its own local storage, to give you those 100, 200, 500TB of storage space, without the need for network-based storage.
There are other solutions like CockroachDB, YugabyteDB, TiDB that also allow for horizontal scaling, but none are 100% Postgres (and especially extensions) compatible.
Side note: The guy that wrote Vitess for MySQL is also working on Multigres (https://multigres.com/), a solution that does the same. Aka Vitess for Postgres.
So yeah, hope this helps a bit to explain it. If you're not into dealing with DB scaling, the way they wrote it is really not helpful.
And also was the founder of planetscale
“We handle FKs in the app for flexibility.”
“And how many orphaned rows do you have?”
“…”
Not with that attitude: https://docs.postgrest.org/en/v13/index.html
Orphaned rows can very much matter for data privacy concerns, which is also where I most frequently see this approach failing.
If you are interested in their new technology that extends on hosted postgres check out Neki https://www.neki.dev/
* Do you support something like Aurora Fast Cloning (whether a true CoW fast clone or detaching a replica _without_ promoting it into its own cluster / branch with its own replicas, incurring cost)?
* Can PlanetScale Postgres set `max_standby_streaming_delay` to an indefinite amount?
* The equivalent of Aurora blue/green would be to make a branch and then switch branches, right?
We have not made max_standby_streaming_delay configurable yet. What's your use case?
I don't fully parse your question about blue/green. Can you expand your question, please? Is this for online upgrades?
Online upgrade or migration DDL - both use cases. I think Amazon's blue/green is effectively the same thing as your "branch-and-commit" strategy for schema migration. I was just looking for whether there's a significant difference.
> We have not made max_standby_streaming_delay configurable yet. What's your use case?
This goes with
>> Do you support something like Aurora Fast Cloning (whether a true CoW fast clone or detaching a replica _without_ promoting it into its own cluster / branch with its own replicas, incurring cost)?
The use case is mixing transaction processing and long-running queries/analytics in the same database using read replicas. The easiest way to do this in a native Postgres cluster is by using a "soft-real-time" read-replica with max_standby_streaming_delay set to `-1`, which is allowed to fall behind worst-case by the duration of a dashboard query and then keep up again.
This doesn't work in environments with more esoteric SAN-based replication strategies like Aurora, where max_standby_streaming_delay can't go beyond 30 seconds. In this case we have to use some combination of strategies: making CoW clones for each analytics task, architecting data to avoid leader/replication conflicts, for example by using partitioning, retrying replica queries until they don't access hot rows, or falling back to traditional analytics/data warehouse solutions at the application or logical replication layer. Not having to do these things would be a nice benefit over Aurora.
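For reference, the "soft-real-time" replica setup described above boils down to a couple of settings on the standby. This is only a sketch of standard Postgres configuration; whether a managed provider exposes these knobs varies:

```ini
# postgresql.conf on the analytics/reporting standby (sketch)
hot_standby = on                    # allow read-only queries during recovery
max_standby_streaming_delay = -1    # never cancel replica queries; let replay lag instead
hot_standby_feedback = off          # don't hold back vacuum on the primary
```

The trade-off is exactly the one described: replay can fall behind for the duration of the longest dashboard query, then catch back up.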
For context we are on Aurora Postgres right now, with several read replicas.
I did an interview all about PlanetScale Metal a couple of months ago: <https://www.youtube.com/watch?v=3r9PsVwGkg4>
For example, MySQL was easier to get running and connect to. These cloud offerings (Planetscale, Supabase, Neon, even RDS) have solved that. MySQL was faster for read heavy loads. Also solved by the cloud vendors.
* Scale-out inertia: yes, cloud vendors provide similar sharding and clustering features for Postgres, but they're all a lot newer.
* Thus, hiring. It's easier to find extreme-scale MySQL experts (although this erodes year by year).
* Write amplification, index bloat, and tuple/page bloat for extremely UPDATE heavy workloads. It is what it is. Postgres continues to improve, but it is fundamentally an MVCC database. If your workload is mostly UPDATEs and simple SELECTs, Postgres will eventually fall behind MySQL.
* Replication. Postgres replication has matured a ridiculous amount in the last 5-10 years, and to your point, cloud hosting has somewhat reduced the need to care about it, but it's still different from MySQL in ways that can be annoying at scale. One of the biggest issues is performing hybrid OLAP+OLTP (think, a big database of Stuff with user-facing Dashboards of Stuff). In MySQL this is basically a non-event, but in Postgres this pattern requires careful planning to avoid falling afoul of max_standby_streaming_delay for example.
* Neutral but different: documentation. Postgres has better-written user-facing documentation, IMO. However, if you don't like reading source code, MySQL has better internals documentation and less magic. That said, Postgres is _very_ well written and commented, so if you're comfortable reading source, it's a joy. A _lot_ of Postgres work, in my experience, is reading somewhat vague documentation followed by digging into the source code to find a whole bunch of arbitrary magic numbers. If you don't believe me, as an exercise, try to figure out what `default_statistics_target` _actually_ does.
Anyway, I still would choose a managed Postgres solution almost universally for a new product. Unless I know _exactly_ what I'm going to be doing with a database up-front, Postgres will offer better flexibility, a nicer feature-set, and a completely acceptable scale story.
This is a really gnarly problem at scale that I've rarely seen anyone else bring up. Either you use max_standby_streaming_delay, and queries that conflict with replication cause replication to lag, or you use hot_standby_feedback, and long-running queries on the OLAP replica cause problems on the primary.
Logical decoding on a replica also needs hot_standby_feedback, which is a giant PITA for your ETL replica.
I am however highly amused that everyone in this thread defending MySQL ends with some form of "I'd still choose Postgres though!". :)
Because I have used MySQL for over 20 years and it is what I know!
If you're running it yourself I could see why you'd do that, but if you're mostly just using it now, Postgres can do all the same things in the database pretty much the same way, plus a whole lot more.
Additionally, almost all my workloads run in our own datacenters, so I haven't yet been able to offload the administration bits to the cloud.
MySQL pros:
The MySQL docs on how the default storage engine InnoDB locks rows to support transaction isolation levels is fantastic. [1] This can help you better architect your system to avoid lock contention or understand why existing queries may be contending for locks. As far as I know Postgres does not have docs like that.
MySQL uses direct I/O so it disables the OS page cache and uses its own buffer pool instead[2]. Whereas Postgres doesn't use direct I/O so the OS page cache will duplicate pages (called the "double buffering" problem). So it is harder to estimate how large of a dataset you can keep in memory in Postgres. They are working on it though [3]
If you delete a row in MySQL and then insert another row, MySQL will look through the page for empty slots and insert there. This keeps your pages more compact. Postgres will always insert at the bottom of the page. If you have a workload that deletes often, Postgres will not be using the memory as efficiently because the pages are fragmented. You will have to run the VACUUM command to compact pages. [4]
Vitess supports MySQL[5] and not Postgres. Vitess is a system for sharding MySQL that as I understand is much more mature than the sharding options for Postgres. Obviously this GA announcement may change that.
Uber switched from MySQL to Postgres only to switch back. It's a bit old but it's worth a read. [6]
Postgres pros:
Postgres supports 3rd party extensions which allow you to add features like columnar storage, geo-spatial data types, vector database search, proxies etc.[7]
You are more likely to find developers who have worked with Postgres.[8]
Many modern distributed database offerings target Postgres compatibility rather than MySQL compatibility (YugabyteDB[9], AWS Aurora DSQL[10], pgfdb[11]).
My take:
I would highly recommend you read the docs on InnoDB locking then pick Postgres.
[1] https://dev.mysql.com/doc/refman/8.4/en/innodb-locking.html
[2] https://dev.mysql.com/doc/refman/8.4/en/memory-use.html
[3] https://pganalyze.com/blog/postgres-18-async-io
[4] https://www.percona.com/blog/postgresql-vacuuming-to-optimiz...
[5] https://vitess.io/
[6] https://www.uber.com/blog/postgres-to-mysql-migration/
[7] https://www.tigerdata.com/blog/top-8-postgresql-extensions
[8] https://survey.stackoverflow.co/2024/technology#1-databases
This made me laugh pretty hard, but it's basically my take too.
I'd pretty much go with the same thing. It's interesting to me, though, that people see Postgres as the "big database" and MySQL as the "hobby database." I basically see things as the exact opposite - Postgres is incredibly flexible, very nice to use, and these days, has fewer foot guns at small scale (IMO) than MySQL. It's more academically correct and it generally tends to "work better" at almost any achievable "normal" scale.
On the other hand, Postgres is full of pitfalls and becomes very difficult at exceptionally large scale (no, not "your startup got traction" scale). Postgres also doesn't offer nearly the same quality of documentation or recipes for large scale optimization.
Almost everything in the 2016 Uber article you link, which is a _great_ read, is still true to some extent with vanilla Postgres, although there are more proprietary scale-out options available now. Postgres simply has not been "hyper-scaled" to the extent that MySQL has and most massive globally sharded/replicated systems started as MySQL at some point.
For this same reason, you are likely to be able to hire a MySQL-family DBA with more experience at hyper-scale than a Postgres one.
With all that said, I still agree - I'd almost universally start with Postgres, with MySQL as a back-pocket scale-up-and-out option for specific very large use-cases that don't demand complex query execution or transactional workload properties. Unless you have an incredibly specific workload which is a very specific combination of heavy UPDATE and `SELECT * FROM x WHERE id=y`, Postgres will do better at any achievable scale you will find today.
Haha glad you enjoyed it.
> It's interesting to me, though, that people see Postgres as the "big database" and MySQL as the "hobby database." I basically see things as the exact opposite
I agree. As I understand Postgres started as a challenger to SQL[1][2] with support for more complicated data types but then in the mid '90s they added SQL support and it was renamed PostgreSQL.
Anecdotally I have heard from people working in industry in the 2000s-2010s that Postgres was viewed as less mature so many of the large web applications were on MySQL. This is a bit confusing to me because MySQL was released around the same time Postgres added SQL support but maybe it was because MySQL had a company behind it.
Many large-scale applications of those days were using MySQL. Facebook developed RocksDB and then MyRocks[3] based on MySQL. YouTube built Vitess[4], which was sharded MySQL, later used by Slack[5], Square, Pinterest and others.
> It's more academically correct
I'm curious about this. I know that Postgres implements MVCC in a wasteful way and uses the OS page cache in addition to its buffer pool resulting in double buffering rather than direct I/O. I feel like the more I learn about database internals the more I learn about how MySQL did things the "right" way and Postgres's approach is a bit odd. But perhaps I'm missing something.
[1] https://en.wikipedia.org/wiki/PostgreSQL#History
[2] https://db.cs.cmu.edu/papers/2024/whatgoesaround-sigmodrec20...
[3] https://engineering.fb.com/2016/08/31/core-infra/myrocks-a-s...
[4] https://vitess.io/docs/22.0/overview/history/
[5] https://slack.engineering/scaling-datastores-at-slack-with-v...
This is a good distinction too; I was thinking from the end-user’s standpoint, where Postgres has historically been seen as more faithful to both SQL standards and consistency guarantees.
I think the main reason MySQL took off faster than Postgres originally is because it had better defaults. MySQL worked out of the box on modern hardware. Postgres assumed you only had 4MB of memory until well into the 2010s, in part to keep it running on everything it had ever run on in the past.
So when you first installed Postgres, it would perform terribly until you optimized it.
It's really a fantastic case study in setting good defaults.
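For anyone curious what "optimizing" meant in practice, the usual first-touch settings look something like this. The values here are purely illustrative assumptions for a mid-sized box, not recommendations:

```ini
# postgresql.conf — illustrative values only; size these to your hardware
shared_buffers = 8GB             # the stock default is only 128MB
effective_cache_size = 24GB      # planner hint, roughly the size of the OS cache
work_mem = 64MB                  # memory per sort/hash node, per query
maintenance_work_mem = 1GB       # used by vacuum and index builds
```

Out of the box, none of these scale up with the machine, which is a big part of why a fresh install used to feel so slow.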
Could you perhaps share some of those good/bad things InnoDB/MySQL does?
Note: some of this may have changed in the past 6+ years I've avoided looking at it again.
Close [0]: (VAR)CHAR BINARY, which is its own type, uses the `_bin` collation for the column or table character set. (VAR)BINARY stores binary strings, and uses the `binary` character set and collation.
In fairness, the amount of options, gotchas, and WTF involved with collations in any DB is mind-boggling.
[0]: https://dev.mysql.com/doc/refman/8.4/en/binary-varbinary.htm...
By "converted" I mean legacy records where the original int value was injected into a UUID format. This was then stored in a binary field as a primary key.
I've done extensive work on improving the Postgres B-Tree code, over quite a number of releases. I'm not aware of any problems with high-insert workloads in particular. I have personally fixed a number of subtle issues that could lead to lower space utilization with such workloads [1][2] in the past, though.
If there's a remaining problem in this area, then I'd very much like to know about it.
[1] https://www.youtube.com/watch?v=p5RaATILoiE [2] https://speakerdeck.com/peterg/nbtree-arch-pgcon
I've also heard of similar behavior from other folks who had similar high-write workloads on Postgres.
Sorry, I don't have anything super tangible to provide off the top of my head, or metrics/code I can share to recreate! It was also a project that required a lot of data to recreate the setup for.
It's possible to recycle pages within indexes that have some churn (e.g., with workloads that use bulk range deletions). But it's not possible for indexes to shrink on their own, in a way that can be observed by monitoring the output of psql's "\di+" command. For that you'd need to REINDEX or run VACUUM FULL.
Is there no way to automatically clean up indexes then?
That only happens when it is possible to give back space to the OS filesystem using relation truncation in the first place -- which isn't all that common (it's generally only seen when there are bulk range deletions that leave lots of contiguous empty space at the end of a table/heap structure). But you said that this is an append-only workload.
This behavior can be disabled by setting the vacuum_truncate table storage parameter to "off". This is useful with workloads where relation truncation is disruptive (truncation needs to acquire a very heavyweight table lock).
> Is there no way to automatically clean up indexes then?
What I meant was that indexes do not support relation truncation. It follows that the amount of space used for an index (from the point of view of the OS) cannot ever go down, barring a REINDEX or a VACUUM FULL.
This does not mean that we cannot reuse space for previously freed/deleted pages (as long as we're reusing that space for the same index). Nor does it mean that "clean up" isn't possible in any general sense.
Postgres is involved somehow. I get that.
the very first line:
> The world’s fastest and most scalable cloud databases
the second line:
> PlanetScale brings you the fastest databases available in the cloud. Both our Postgres and Vitess databases deliver exceptional speed and reliability, with Vitess adding ultra scalability through horizontal sharding.
i know exactly what they do. zero fluff. and, i'm now interested.
Also, the architecture of Aurora is very different from PlanetScale's:
* AWS Aurora uses storage-level replication, rather than traditional Postgres replication. This architecture has the benefit that a change made on an Aurora primary is visible very quickly on the read replicas.
* PlanetScale is a "shared nothing" architecture using what I would call traditional methods of data replication, where the primary and the replicas have independent copies of the data. This means that replication lag is a possibility customers must consider, whereas Aurora customers mostly ignore this.
* If you set up 3 AWS RDS Postgres instances in separate availability zones and set up replication between them, that would be roughly similar to PlanetScale's architecture.
Scaling postgres is not that informative. I am sorry if I annoyed the people working on it. I think the USP could be explained more obviously.
More features will come later on which I think will set us apart even more from RDS, Aurora and other providers, but too early to talk about those.
Beyond features, there are other reasons you might choose us. For example, we've built a reputation for excellent reliability/uptime, with exceptionally good support. These are harder to back up with hard data, but our customer testimonials are a good testament to this.
On the Postgres side: https://planetscale.com/blog/benchmarking-postgres
On the Vitess side, I would point to our customers, who, on individual databases, have achieved pretty high QPS (millions), on large datasets (100s of TiBs), at a latency that is lower than what other DBaaS providers can offer: https://planetscale.com/case-studies/cash-app
> Our mission is simple: bring you the fastest and most reliable databases with the best developer experience.
We're presently in a migration for our larger instances on Heroku, but were able to test on a new product (fairly high writes/IOPs) and it's been nice to have more control vs. Heroku (specifically, ability to just buy more IOPs or storage).
Had one incident during the beta which we believed we caused on our own, but within 5 minutes of pinging them they had thrown multiple engineers on it to debug and resolve quickly. For me, that's the main thing I care about with managed DB services, as most of the tech is commoditized at this point.
Just wish the migration path from Heroku was a tad easier (Heroku blocks logical replication on all instances) but pushing through anyway because I want to use the metal offering.
https://planetscale.com/benchmarks/aurora
Seems a bit better, but they benchmarked on a kind of small DB (500GB / db.r8g.xlarge).
The product we are GA'ing today has the option of PlanetScale Metal which is extremely fast and scales write QPS further than any of the other single-primary Postgres hosts.
So yes, the data per-node is ephemeral, but it is redundant and durable for the whole cluster.
While that's not impossible, the reality is that the risk is very low.
So simply restarting nodes wouldn't trigger restoring from backup, but yes, in our case, replacing nodes entirely does require that node to restore from a backup/WALs and catch back up in replication.
EBS doesn't entirely just solve this, you still have failures and still need/want to restore from backups. This is built into our product as a fundamental feature. It's transparent to users, but the upside is that restoring from backups and creating backups is tested every day multiple times per day for a database. We aren't afraid of restoring from backups and replacing nodes by choice or by failure. It's the same to us.
We do all of the same operations already on EBS. This magic is what enables us to use NVMes, since we treat EBS as ephemeral already.
I did an interview all about PlanetScale Metal a couple of months ago: <https://www.youtube.com/watch?v=3r9PsVwGkg4>
"We guarantee durability via replication". I've started noticing this pattern more, where distributed systems provide durability by replicating data rather than writing it to disk, aiming for the best of both worlds. I'm curious:
1. Is there a name for this technique?
2. How do you calculate your availability? This blog post[1] has some rough details but I'd love to see the math.
3. I'm guessing a key part of this is putting the replicas in different AZs and assuming failures aren't correlated so you can multiply the probabilities directly. How do you validate that failures across AZs are statistically independent?
Thanks!
[1] https://planetscale.com/blog/planetscale-metal-theres-no-rep...
2. The thinking laid out in the blog post you linked to is how we went about it. You can do the math with your own parameters by computing the probability of a second node failure within the time it takes to recover from a first node failure. These are independent failures, being on physically separate hardware in physically separate availability zones. It's only when they happen together that problems arise. The core is this: P(second node failure within MTTR for first node failure) = 1 - e^( -(MTTR node failure) / (MTBF for a node) )
3. This one's harder to test yourself. You can do all sorts of tests yourself (<https://rcrowley.org/2019/disasterpiece-theater.html>) and via AWS FIS but you kind of have to trust the cloud provider (or read their SOC 2 report) to learn how availability zones really work and really fail.
independence simplifies things
= P(one failure)P(second failure within MTTR of first node)
= P(one failure) * (1 - e^-λx)
where x = MTTR for first node
λ = 1/MTBF
plugging in the numbers from your blog post
P(one failure within 30 days) = 0.01 (not sure if this part is correct)
MTTR = 5 minutes + 5 hours =~ 5.083 hours
MTBF = 30 days / 0.01 = 3000 days = 72000 hours
0.01 * (1 - e^(-5.083 / 72000)) = 0.0000007 ~= 0.00007 %
I must be doing something wrong cuz I'm not getting the 0.000001% you have in the blog post. If there's some existing work on this I'd be stoked to read it, I can't quite find a source.
Also, there are two nodes that have the potential to fail while the first is down, but that would make my answer larger, not smaller.
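The arithmetic above can be checked directly. A minimal sketch, using the same parameters the parent comment assumes (a 1% chance of a node failure per 30 days, and ~5.08 hours to restore and catch up):

```python
import math

# Parameters as assumed in the parent comment (from the linked blog post):
p_first_failure = 0.01              # P(a given node fails within 30 days)
mttr_hours = 5 + 5 / 60             # 5 hours + 5 minutes to restore/catch up
mtbf_hours = (30 * 24) / p_first_failure  # 30 days / 0.01 = 72,000 hours

# P(first failure) * P(second independent failure within the MTTR window),
# modeling failures as a Poisson process with rate 1/MTBF:
p_both = p_first_failure * (1 - math.exp(-mttr_hours / mtbf_hours))
print(p_both)  # ~7.06e-07, i.e. roughly 0.00007%
```

This reproduces the ~0.00007% figure above, not the 0.000001% in the blog post, so the discrepancy is in the parameters or the model, not the arithmetic.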
Even writing to one disk, though, isn't good enough. So we write to three and wait until two have acknowledged before we acknowledge that write to the client.
Reboots typically don't otherwise do anything special unless they also trigger a host migration; GCP's live migration docs do mention some support, though.
GCP mentions data persists across reboots here https://cloud.google.com/compute/docs/disks/local-ssd#data_p...
Note that stop/terminate via cloud APIs usually releases host capacity for other customers and would trigger a data wipe; a guest-initiated reboot typically will not.
1. https://planetscale.com/pricing?architecture=x86-64&cluster=...
1. You say "ephemeral", but my understanding is that NVMe is non-volatile, so upon crash and restart we should be able to recover the state of the memory. Is it ephemeral because of how EC2 works, where you might not get that same physical box and memory addresses back?
2. Can you explain what "Semi-synchronous replication" is? Your docs say "This ensures every write has reached stable storage in two availability zones before it’s acknowledged to the client." but I would call that synchronous since the write is blocked until it is replicated.
Thanks!
When we say ephemeral we mean that if the host compute dies in a permanent way (which happens from time to time), the data on the NVMe drives attached to that host is not recoverable by us. AWS/GCP might have recovery mechanisms internally, but we don't have access to those APIs.
When we say "semi-synchronous replication" we mean it in the sense of MySQL semi-synchronous replication: https://dev.mysql.com/doc/refman/8.4/en/replication-semisync.... To be honest I'm not exactly sure where the "semi" comes from, but here are two possible reasons I can think of:
1. We actually only require that 1 of the 2 replicas sends an acknowledgement to the primary that it has durably stored the transaction to its relay log before the primary in turn sends an acknowledgement back to the client application.
2. The transaction is visible (can be read) on both the primary and the replica _before_ the primary sends the commit acknowledgement back to the client application.
For (2), semi-synchronous replication is a MySQL term, which we realize in Postgres by using synchronous replication with ANY one of the available replicas acknowledging the write. This allows us to guarantee durability in two availability zones before acknowledging writes to clients.
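For anyone curious what that looks like in stock Postgres: this is an illustrative sketch, not PlanetScale's actual configuration, and the standby names are hypothetical. Quorum commit with `ANY 1` means the primary waits until at least one of the listed standbys has flushed the WAL before acknowledging the commit:

```sql
-- Illustrative only; standby names (standby_az2, standby_az3) are made up.
-- Acknowledge commits once ANY 1 of the 2 standbys has flushed the WAL:
ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (standby_az2, standby_az3)';
ALTER SYSTEM SET synchronous_commit = 'on';
SELECT pg_reload_conf();
```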
In MySQL the _semi_ part of semi-synchronous replication refers to the write only needing to be written to the binary log on the replica and not (necessarily) applied to InnoDB. This is why a MySQL database might be both acknowledging semi-synchronous writes and reporting non-zero replication lag.
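On the MySQL side, roughly the equivalent behavior comes from the semi-sync plugin. A hedged sketch (variable names per MySQL 8.0.26+, which renamed the older `master`/`slave` variants; the plugin must be available on your build):

```sql
-- Illustrative only: enable semi-synchronous replication on the source.
INSTALL PLUGIN rpl_semi_sync_source SONAME 'semisync_source.so';
SET GLOBAL rpl_semi_sync_source_enabled = ON;
-- Wait for an ack from 1 replica (of however many are attached),
-- analogous to Postgres's "ANY 1" quorum above:
SET GLOBAL rpl_semi_sync_source_wait_for_replica_count = 1;
```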
Ah. I wonder whether writes that are in the relay log but not yet applied to InnoDB are available for reads. If not, your write may succeed but a subsequent read from a replica would not see it, so you lose read-after-write consistency. Perhaps that's another tradeoff.
I'll have to research a bit more but the MySQL docs [1] say "requires only an acknowledgement from the replicas, not that the events have been fully executed and committed on the replica side" which implies that it can't be read yet.
Thanks!
[1] https://dev.mysql.com/doc/refman/8.4/en/replication-semisync...
First and foremost, the extra copies of the data are for fault tolerance. In specific circumstances they may offer some slack capacity that you can use to serve (potentially stale) reads but they're never going to offer read-your-writes consistency.
The docs you quote are a bit obtuse but the "acknowledgement" is the row event being written to the binary log. "Fully executed and committed" is when it makes its way into InnoDB and becomes available for future reads.
Thanks!
I read the comments, and it seems one of them, comparing Supabase vs PlanetScale Postgres, suggests you could start with a project like Supabase and then move to PlanetScale when your project grows enough to justify that decision.
How would a migration from Supabase to PlanetScale even go, and at what scale would something like that actually be better?
Great project tho, and I hope PlanetScale's team doesn't get bored listening to all the requests for a free tier like mine; maybe I am addicted to those sweet freebies!
One day I will try to create a product that has a supabase -> planetscale migration, just to know that I have made it lol (jk)
have a nice day
Ah, I overlooked the first sentence and only read the headings, navigation, and footer:
> is now generally available and out of private preview