Ugh. I know this gives the illusion of fairness, but it's not how any self-respecting software engineer should approach benchmarks. You have hardware. Perhaps you have virtualized hardware. You tune to the hardware. There simply isn't another way, if you want to be taken seriously.
Some will say that in a container-orchestrated environment, tuning goes out the window since "you never know" where the orchestrator will schedule the service, but this is bogus. If you've got time to write a basic deployment config for the service on the orchestrator, you've also got time to at least size the memory configs for PostgreSQL and/or Redis. It's just that simple.
This is the kind of thing that is "hard and tedious" for only about five minutes of LLM query or web search time and then you don't need to revisit it again (unless you decide to change the orchestrator deployment config to give the service more/less resources). It doesn't invite controversy to right-size your persistence services, especially if you are going to publish the results.
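For PostgreSQL, a first right-sizing pass can be as small as this (values are purely illustrative, assuming a pod with ~4 GB of RAM; match them to your own limits):

    -- Illustrative sizing; align these with the pod's memory limit.
    ALTER SYSTEM SET shared_buffers = '1GB';        -- takes effect after a restart
    ALTER SYSTEM SET effective_cache_size = '3GB';
    ALTER SYSTEM SET work_mem = '32MB';

Redis needs the equivalent treatment via maxmemory and a matching maxmemory-policy.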
If the defaults are fine for a use case then, unless I want to tune it for personal interest, it's either a poor use of my fun time or a poor use of my clients' funds.
It doesn't matter if you've crippled the benchmark if the performance of both options still exceeds your expectations. Not all of us are trying to eek out every drop of performance.
And, well, if you are then you can ignore the entire post because Redis offers better perf than postgres and you'd use that. It's that simple.
You probably mean "eke out". Unless the performance is particularly scary :)
Amazon actually moved away from caches for some parts of its system because consistent behavior is a feature: what happens if your cache has problems and the interaction between it and your normal path is slow? What if your cache has bugs or edge-case behavior? If you don't need it, you're just doing a bunch of extra work to keep things in sync.
I don't think this holds true. Caches are used for reasons other than performance. For example, caches are used in some scenarios for stampede protection to mitigate DoS attacks.
Also, the impact of caches on performance is sometimes negative. With distributed caching, each get and put requires a network request. Even when those calls don't leave a data center, they cost far more than just reading a variable from memory. I've already had the displeasure of stumbling upon a few scenarios where a cache was prescribed in a cargo-cult way, without any data backing up the assertion, and when we took a look at the traces it was evident that the bottleneck was actually the cache itself.
Not really. Running out of computational resources to fulfill requests is not a performance issue. Think of things such as exhausting a connection pool. More often than not, some components of a system can't scale horizontally.
Is that production? When you bucket it into "low level" it sounds like a base case, but it really isn't.
In production you often don't have local storage; RAM is being used for all kinds of other things, your CPU is only available in small slices, there are network effects, and many others.
> If the defaults are fine for a use case
Which I hope isn't the developer's edition of "it works on my machine".
I think I have more trust in the PG defaults than in the output of an LLM or copy-pasting some configuration I might not really understand ...
Uh, yea... why would you? Do you do that for configurations you found that weren't from LLMs? I didn't think so.
I see takes like this all the time and I'm really just mind-boggled by it.
There are more than just the "prompt it and use what it gives me" use cases with the LLMs. You don't have to be that rigid. They're incredible learning and teaching tools. I'd argue that the single best use case for these things is as a research and learning tool for those who are curious.
Quite often I will query Claude about things I don't know and it will tell me things. Then I will dig deeper into those things myself. Then I will query further. Then I will ask it details where I'm curious. I won't blindly follow or trust it, just as I wouldn't a professor or anyone or anything else, for that matter. Just like I would when querying a human or the internet in general for information, I'll verify.
You don't have to trust its code, or its configurations. But you can sure learn a lot from them, particularly when you know how to ask the right questions. Which, hold onto your chairs, only takes some experience and language skills.
If you only have 5 minutes, then you can't do what you say:
> Then I will dig deeper into those things myself ...
So my point is, I don't care if it's coming from an LLM or a random blog: you won't have time to know if it's really working (ideally you would want to benchmark the change).
If you can't invest the time, better to stay with the defaults, which in most projects the maintainers have spent quite a bit of time making sensible.
I happen to have a good bit of experience with PostgreSQL, so that colored the "5 minutes" part of it. Still, most of the time, you "have" more than 5 minutes to create the orchestrator's deployment config for the service (which never exists by default on any k8s-based orchestrator). I'm simply saying to not be negligent of the service's own config, even though a default exists.
I've asked ChatGPT to summarize Go build constraints, especially in the context of CPU microarchitectures (e.g. mapping "amd64.v2" to GOARCH=amd64 GOAMD64=v2). It repeatedly smashed its head on GORISCV64, claiming all sorts of nonsense such as v1, v2; then G, IMAFD, Zicsr; only arriving at rva20u64 et al under hand-holding. Similar nonsense for GOARM64 and GOWASM. It was all right there in e.g. the docs for [cmd/go].
This is the future of computer engineering. Brace yourselves.
Remember, an LLM is a JPG of all the text of the internet.
Isn't that the whole point, to ask it specific tidbits of information? Are we to ask it large, generic pontifications and claim success when we get large, generic pontifications back?
The narrative around these things changes weekly.
You can see if it's used search in the interface, which helps evaluate how likely it is to get the right answer.
Stay brief. Do not use emoji.
Check primary sources, avoid speculation.
Do not suggest next steps.
Do I have to repeat this every time I suspect the answer will be incorrect?

You might think of it as a cache, worth checking first for speed reasons.
The big downside is not that they sometimes fail, it's that they give zero indication when they do.
You can put the relevant docs in your prompt, add them to a workspace/project, deploy a docs-focused MCP server, or even fine-tune a model for a specific tool or ecosystem.
> You can put the relevant docs in your prompt
I've done a lot of experimenting with these various options for how to get the LLM to reference docs. IMO it's almost always best to include them in the prompt where appropriate.
For a UI lib that I use that's rather new (specifically, there's a new version that the LLMs aren't aware of yet), I had the LLM write me a quick python script that crawls the docs site for the lib and feeds each page's content back into itself with a prompt describing what it's supposed to do: generate a .md document with the specifics about that thing, whether it's a component or whatever (ie: properties, variants, etc.), in an extremely brief manner, and also build an 'index.md' that includes a short paragraph about what the library is and a list of each component/page document generated. So in about 60 seconds it spits out a directory full of .md files. I then tell my project-specific LLM (ie: Claude Code or Opencode within the project) to review those files with the intention of updating the CLAUDE.md in the project to instruct that any time we're building UI elements we should refer to the index.md for the library to understand what components are available, and when appropriate to use one of them we _must_ review the correlating document first.
Works very very very well. Much better than an MCP server specifically built for that same lib. (Huge waste of tokens, LLM doesn't always use it, etc) Well enough that I just copy/paste this directory of docs into my active projects using that library - if I wasn't lazy I'd package it up but too busy building stuff.
Where's that productivity increase everyone's been talking about?
Don't ask an LLM how to do things with a tool without providing the docs for the version you're working with. These models are trained on a whole bunch of different versions of things, with different flags and options and parameters, plus a pile of Stack Overflow questions asked and answered by people who have no idea what they're doing, likely out of date or wrong in the first place. _Especially_ if it's the newest version, regardless of whether the model's cutoff date was after that version was released - you have no way to know if it was _included_. (Especially for something related to a programming language with ~2% market share.)
The contexts are so big now - feed it the docs. Just copy paste the whole damn thing into it when you prompt it.
That's what you'd do by hand if you were optimizing, so save some time and point Claude Code or Codex CLI or GitHub Copilot at it and see what happens.
Just like Mr Meeseeks, it’s only a matter of time before it realizes that deleting all the data will make the DB lightning fast.
I run a pricing calculator here - for 50,000 input tokens, 5,000 output tokens (which I estimate would be about right for a PostgreSQL optimization loop) GPT-5 would cost 11.25 cents: https://www.llm-prices.com/#it=50000&ot=5000&ic=1.25&oc=10
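(The arithmetic: 50,000 input tokens × $1.25 per million = $0.0625, plus 5,000 output tokens × $10 per million = $0.05, so $0.1125 total.)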
I use Codex CLI with my $20/month ChatGPT account and so far I've not hit the limit with it despite running things like this multiple times a day.
If that is true, in some months there will be no DBA jobs.
Funny that at the same time SQL is one of the most requested languages in job postings.
Knowing how to "run an agentic loop to optimize the config file" is meaningless techno-jabber to 99.99% of the world's population.
I am entirely unconcerned for my future career prospects.
I don't think end users want to "optimize their PostgreSQL servers" even if they DID know that's a thing they can do. They want to hire experts who know how to make "that tech stuff" work.
Saying that anybody can learn to unblock a sink by watching YouTube is your typical HN mentality of stating opinions as facts.
I don't understand what you mean. Are you saying that it's not true that anyone could learn to unblock a sink by watching YouTube videos?
It's not that hard to understand, mate. Maybe put my comment in the LLM so you can get it.
What is your point again?
If you don't like the sink analogy what analogy would you use instead for this? I'm confident there's a "people could learn X from YouTube but chose to pay someone else instead" that's more effective than the sink one.
It does make people FEEL more productive.
There are thousands (probably millions) of us walking around with anecdotal personal evidence at this point.
If the personal opinions on this site were true, half of the code in the world would be functional, Lisp would be one of the most-used languages, and Microsoft would not have bought Dropbox.
I really think the HN hive mind's opinions mean nothing. Too much money here to be real.
I hardly remember anyone on HN, a tech audience, saying they used blockchain every day. Why don't you go find some of that evidence?
You can't become a DB expert with a prompt.
I hope you make a lot of money with your lies and good luck.
These days you can replace those books and forums with a top tier LLM, but you still need to put in the practice yourself. Even with AI assistance that's still a lot of work.
You can replace books with your own time and research.
Again making statements that are just not true. Typical HN behavior.
As far as I know, in the field of logic the one making a statement, in this case you, is the one who has to prove it.
But in this case you make a statement and then ask ME to prove it wrong? Makes zero fucking sense.
As much as you don't appreciate it, that is how debate and logic work.
You buy a non-fiction book to learn something, or to act as a reference.
An LLM provides an alternative mechanism for learning that thing, or looking up those reference points.
What am I missing here?
Do you think a search engine could replace a book?
It is so self-evidently true that you don't even need to reason about it.
That LLMs can replace a book is a fundamental truth of the universe, like Euclid's postulates or like 1=1.
Well then there is no way to continue the conversation, because by definition axioms can't be false.
They charge per token, everyone charges per token.
Benchmarking the defaults and benchmarking a tuned setup will measure very different things, but both of them matter.
For example, if you keep adding data to a Redis server under default config, it will eat up all of your RAM and suddenly stop working. Postgres won't do the same, because its default buffer size is quite small by modern standards. It will happily accept INSERTs until you run out of disk, albeit more slowly as your index size grows.
The two programs behave differently because Redis was conceived as an in-memory database with optional persistence, whereas Postgres puts persistence first. When you use either of them with their default config, you are trusting that the developers' assumptions will match your expectations. If not, you're in for a nasty surprise.
Enough people use the default settings that benchmarking the default settings is very relevant.
It often isn't a good thing to rely on the defaults, but it's nevertheless the case that many do.
(Yes, it is also relevant to benchmark tuned versions, as I pointed out; my argument was against the claim that it is somehow unfair not to tune.)
> if you want to be taken seriously
For someone so enthusiastic about giving feedback, you don't seem to have invested a lot of effort into figuring out how to give it effectively. Your tone and demeanor diminish the value of your comment.
Postgres is a power tool usable for many many use cases - if you want performance it must be tuned.
If you judge Postgres without tuning it - that's not Postgres being slow, that's the developer being naive.
Didn't OP end by picking Postgres anyway?
It's the right answer even for a naive developer, perhaps even more so for a naive one.
At the end of the post it even says
>> Having an interface for your cache so you can easily switch out the underlying store is definitely something I’ll keep doing
IOW, he judged it fast enough.
Everyone was talking about C++ optimizations, mutex everywhere etc - which was in fact a problem.
However.. I seemed to be the first person to actually try to debug what the database was doing, and it was going to disk all the time with a very small cache.. weird..
I looked at the MySQL settings on a 1TB RAM machine and they were... out-of-the-box settings.
With small adjustments I improved the performance of this core system by an order of magnitude.
not even! if you don't need to go super deep with tablespace configs or advanced replication right away, pgtune will get you to a pretty good spot in the time it takes to fill out a form.
And TFA shows you that in this world Postgres is close enough to Redis.
Otherwise, the article does well to show that we can get a lot of baseline performance either way. Sometimes a cache is premature optimisation.
I sometimes read this stuff like people explaining how they replaced their spoon and fork with a spork and measured only a 50% decrease in food eating performance. And have you heard of the people with a $20,000 Parisian cutlery set to eat McDonalds? I just can't understand insane fork enjoyers with their over-engineered dining experience.
The fewer dependencies my project has, the better. If it is not needed, why use it?
Writes will go to RAM as well if you have synchronous_commit = off.
Your comments suggest that you are definitely missing some key insights into the topic.
If you, like the whole world, consume Redis through a network connection, it should be obvious to you that the network is in fact the bottleneck.
Furthermore, using an RDBMS like Postgres may indeed imply storing data on slower media. However, you are ignoring the obvious fact that a service such as Postgres also has its own memory cache, and some query results can be and indeed are fetched from RAM. Thus it's not like each and every query forces a disk read.
And at the end of the day, what exactly is the performance tradeoff? And does it pay off to spend more on an in-memory cache like Redis to buy you the performance delta?
That's why real world benchmarks like this one are important. They help people think through the problem and reassess their irrational beliefs. You may nitpick about setup and configuration and test patterns and choice of libraries. What you cannot refute are the real world numbers. You may argue they could be better if this and that, but the real world numbers are still there.
I think "you are definitely missing some key insights onto the topic". The whole world is a lot bigger than your anecdotes.
Not to be annoying - but... what?
I specifically _do not_ use Redis over a network. It's wildly fast. High volume data ingest use case - lots and lots of parallel queue workers. The database is over the network, Redis is local (socket). Yes, this means that each server running these workers has its own cache - that's fine, I'm using the cache for absolutely insane speed and I'm not caching huge objects of data. I don't persist it to disk, I don't care (well, it's not a big deal) if I lose the data - it'll rehydrate in such a case.
Try it some time, it's fun.
> And at the end of the day, what exactly is the performance tradeoff? And does it pay off to spend more on an in-memory cache like Redis to buy you the performance delta?
Yes, yes it is.
> That's why real world benchmarks like this one are important.
That's not what this is though. Just about nobody who has a clue is using default configurations for things like PG or Redis.
> They help people think through the problem and reassess their irrational beliefs.
Ok but... um... you just stated that "the whole world" consumes redis through a network connection. (Which, IMO, is wrong tool for the job - sure it will work, but that's not where/how Redis shines)
> What you cannot refute are the real world numbers.
Where? This article is not that.
Eh - while surely not everyone has the benefits of doing so, I'm running Laravel and using Redis is just _really_ simple and easy. To do something via memory mapped files I'd have to implement quite a bit of stuff I don't want/need to (locking, serialization, ttl/expiration, etc).
Redis just works. Disable persistence, choose the eviction policy that fits the use, config for unix socket connection and you're _flying_.
My use case is generally data ingest of some sort. In my largest projects I'm talking about 50-80 concurrent processes chewing through tasks from a queue (also backed by redis), which are likely to end up running the same queries against the database (mysql) to get 'parent' records (ie: user associated with object by username, post by slug, etc). There's no way to know if there will be multiples: if we're processing 100k objects there might be 1 from UserA or there might be 5000 by UserA, where each one being processed needs the object/record of UserA. In this project in particular there are ~40 million of these 'user' records and hundreds of millions of related objects - so I can't store/cache _all_ users locally, but I sure would benefit from not querying for the same record 5000 times in a 10 second period.
For the most part, when caching these records over the network, the performance benefits were negligible (depending on the table) compared to just querying mysql for them. They are just `select where id/slug =` queries. But when you lose that little bit of network latency and you can make _dozens_ of these calls to the cache in the time it would take to make a single networked call... it adds up real quick.
PHP has direct memory "shared memory" but again, it would require handling/implementing a bunch of stuff I just don't want to be responsible for - especially when it's so easy and performant to lean on Redis over a unix socket. If I needed to go faster than this I'd find another language and likely do something direct-to-memory style.
My own conclusions from your data:
- Under light workloads, you can get away with Postgres. 7k RPS is fine for a lot of stuff.
- Introducing Redis into the mix has to be carefully weighed against increased architectural complexity, and having a common interface allows us to change that decision down the road.
Yeah maybe that's not up to someone else's idea of a good synthetic benchmark. Do your load-testing against actual usage scenarios - spinning up an HTTP server to serve traffic is a step in the right direction. Kudos.
I mean what if an actual benchmark showed Redis is 100X as fast as postgres for a certain use case? What are the constraints you might be operating with? What are the characteristics of your workload? What are your budgetary constraints?
Why not just write a blog post saying "Unoptimized postgres vs redis for the lazy, running virtualized with a bottleneck at the networking level"
I even think that blog post would be interesting, and might be useful to someone choosing a stack for a proof of concept. For someone who needs to scale to large production workloads (~10,000 requests/second or more), this isn't a very useful article, so the criticism is fair, and I'm not sure why you're dismissing it offhand.
Would it bother you as well if the conclusion was rephrased as "based on my observations, I see no point in rearchitecting the system to improve the performance by this much"?
I think you are too tied to a template solution, to the point that you don't stop to think about why you're using it, or even whether it is justified at all. Then, when you are faced with observations that challenge your unfounded beliefs, you somehow opt to get defensive? That's not right.
Within the constraints of my setup, postgres came out slower but still fast enough. I don't think I can quantify what fast enough is though. Is it 1000 req/s? Is it 200? It all depends on what you're doing with it. For many of my hobby projects which see tens of requests per second it definitely is fast enough.
You could argue that caching is indeed redundant in such cases, but some of those have quite a lot of data that takes a while to query.
Add an app that actually uses postgres as a database and you will probably see its performance crumble, as the app will contend with the cache for resources.
Nobody asked for benchmarking as rigorous as you would have in a published paper. But toy examples are toy examples, be it in a publication or not.
Conclusions aren't incorrect either, so what's the problem?
A takeaway could be that you can dedicate a postgres instance for caching and have acceptable results. But who does that? Even for a relatively simple intranet app, your #1 cost when deploying in Google Cloud would probably be running Postgres. Redis OTOH is dirt cheap.
Maybe I'm reading the article wrong, but it is representative of any application that uses a PostgreSQL server for data, correct?
In what way is that not a real-life scenario? I've deployed Single monolith + PostgreSQL to about 8 different clients in the last 2.5 years. It's my largest source of income.
And... do you do that with the default configuration?
Yes. Internal apps/LoB apps for a large company might have, at most 5k users. PostgreSQL seems to manage it fine, none of my metrics are showing high latencies even when all employees log on in the morning during the same 30m period.
Kudos to you sir. Sincerely, I'm not hating, I'm actually jealous of the environment being that mellow.
If you don't mind overprovisioning your postgres, yes, I guess the presented benchmarks are kind of representative. But they also don't add anything that you didn't know without reading the article.
Why would I mind it? I'm not using overpriced hosted PostgreSQL, after all.
> The way it is presented, a casual reader would think Postgres is 2/3rds the performance of Redis.
If a reader cares about the technical choice, they'll probably at least read enough to learn of the benchmarks in this popular use case, or even just the conclusion:
> Redis is faster than postgres when it comes to caching, there’s no doubt about it. It conveniently comes with a bunch of other useful functionality that one would expect from a cache, such as TTLs. It was also bottlenecked by the hardware, my service or a combination of both and could definitely show better numbers. Surely, we should all use Redis for our caching needs then, right? Well, I think I’ll still use postgres. Almost always, my projects need a database. Not having to add another dependency comes with its own benefits. If I need my keys to expire, I’ll add a column for it, and a cron job to remove those keys from the table. As far as speed goes - 7425 requests per second is still a lot. That’s more than half a billion requests per day. All on hardware that’s 10 years old and using laptop CPUs. Not many projects will reach this scale and if they do I can just upgrade the postgres instance or if need be spin up a redis then. Having an interface for your cache so you can easily switch out the underlying store is definitely something I’ll keep doing exactly for this purpose.
I might take an issue with the first sentence (might add "...at least when it comes to my hardware and configuration."), but the rest seems largely okay.
As a casual reader, you more or less just get:
* Oh hey, someone's experience and data points. I won't base my entire opinion upon it, but it's cool that people are sharing their experiences.
* If I wanted to use either, I'd probably also need to look into bottlenecks, even the HTTP server, something you might not look into at first!
* Even without putting in a lot of work into tuning, both of the solutions process a lot of data and are within an order of magnitude when it comes to performance.
* So as a casual reader, for casual use cases, it seems like the answer is - just pick whatever feels the easiest.
If I wanted to read super serious benchmarks, I'd go looking for those (which would also have so many details that they would no longer be a casual read, short of just the abstract, but then I'm missing out on a lot anyways), or do them myself. This is more like your average pop-sci article, nothing wrong with that, unless you're looking for something else.

Eliminating the bottlenecks would be a cool followup post though!
A lot of us ate shit to stay in the Bay Area, to stay in computing. I have stories of great engineers doing really crappy jobs and "contracting" on the side.
I couldn't really have a 'startup' out of my house and a slice of rented hosting. Hardware was expensive and nothing was easy. Today I can set up a business and thrive on 1000 users at 10 bucks a month. That's a viable and easy-to-build business. It's an achievable metric.
But I'm not going to let Amazon, with its infinite "bill you for everything at 2012 prices so it can be profitable" hosting, be my first choice. Not when I can get fixed-cost hosting.
For me, all the interesting things going on in tech aren't coming out of FB, Google and the hyperscalers. They aren't AI or ML. We don't need another Kubernetes or Kafka or React (no more Conway's law projects). There is more interesting work going on down at the bottom, in small 2 and 3 man shops solving their problems on limited time and budget with creative "next step" solutions. Their work is likely more applicable to most people reading HN than another well-written engineering blog from Cloudflare about their latest massive Rust project.
What exactly is your point? That you can further optimize either option? Well yes, that comes as no surprise. I mean, the latencies alone are in the range of some transcontinental requests. Were you surprised that Redis outperformed Postgres? I hardly think so.
So what's the problem?
The main point that's proven is that there are indeed diminishing returns in terms of performance. For applications where you can afford an extra 20ms when hitting a cache, caching in a persistent database is an option. For some people, it seems this fact was very surprising. That's food for thought, isn't it?
Comes with TTL support (which isn't precise, so you still need to check expiration on read), and can support long TTLs, as there's essentially no limit to the storage.
All of this at a fraction of the cost of HA Redis. Only if you need that last millisecond of performance and have done all other optimizations should one consider Redis, imho.
This depends on your scale. Dynamodb is pay per request and the scaling isn’t as smooth. At certain scales Redis is cheaper.
Then if you don’t have high demand maybe it’s ok without HA for Redis and it can still be cheaper.
For HA redis you need at least 6 instances, 2 regions * 3 AZs. And you're paying for all of that 24/7.
And if you truly have 24/7 use then just 2 regions won't make sense as the latency to get to those regions from the other side of the globe easily removes any caching benefit.
It's $15/mo for 2x cache.t4g.micro nodes for ElastiCache Valkey with multi-az HA and a 1-year commitment. This gives you about 400 MB.
It very much depends on your use case though if you need multiple regions then I think DynamoDB might be better.
I prefer Redis over DynamoDB usually because it's a widely supported standard.
You need to be more specific with your scenario. Having to cache 100MB of anything is hardly a scenario that involves introducing a memory cache service such as Redis. This is well within the territory of just storing data in a dictionary. Whatever is driving the requirement for Redis in your scenario, performance and memory clearly isn't it.
If you're given the requirement of highly available, how do you not end up with at least 3 nodes? I wouldn't consider a single region to be HA but I could see that argument as being paranoid.
A cache is just a store for things that expire after a while and take load off your persistent store. It's inherently eventually consistent and supposed to help you scale reads. Whatever you use for storage is irrelevant to the concept of offloading reads.
Tell that to Github or HN or many other sites? So caching for them doesn't make sense?
Can you specify in which scenario you think Redis is cheaper than caching things in, say, dynamodb?
You posted a vague and meaningless assertion. If you do not have latency numbers and cost differences, you have absolutely nothing to show for it, and you failed to provide any rationale justifying whether any cache is required at all.
ElastiCache Serverless (Redis/Memcached): Typical latency is 300–500 microseconds (sub-millisecond response)
DynamoDB On-Demand: Typical latency is single-digit milliseconds (usually between 1–10 milliseconds for standard requests)
You would've used local memory first, at which point I can't see getting to those request levels anymore.
> ElastiCache Serverless (Redis/Memcached): Typical latency is 300–500 microseconds (sub-millisecond response)
Sure
> DynamoDB On-Demand: Typical latency is single-digit milliseconds (usually between 1–10 milliseconds for standard requests)
I know very few use cases where that difference is meaningful. Unless you have to do this many times sequentially, in which case optimizing that would be much more interesting than a single read being .5 ms versus the typical 3 to 4 for dynamo (that last number is based on experience).
You need to be more specific than that. Depending on your read/write patterns and how much memory you need to allocate to Redis, back-of-the-napkin calculations still point to Redis costing >$1k/month more than DynamoDB.
Did you actually do the math on what it costs to run Redis?
When not hosted on AWS? Who says we have to compare dynamodb to AWS managed Redis? Redis the company has paid hosted versions. You can run it as part of your k8s cluster too.
Exactly. I think nosql offerings from any cloud provider already supports both TTL and conditional requests out-of-the-box, and the performance of basic key-value CRUD operations is often <10ms.
I've seen some benchmarks advertise memory cache services as having latencies around 1ms. Yeah, this would mean the latency of a database is 10 times higher. But relative numbers matter little; what matters is absolute numbers, as they are the ones that drive tradeoff analysis. Does a feature afford an extra 10ms in latency, and is that performance improvement worth paying a premium?
I don't see any point to this blend of cynical contrarianism. If you feel you can do better, put your money where your mouth is. Lashing at others because they went through the trouble of sharing something they did is something that's absurd and creates no value.
Also, maintaining a blog doesn't make anyone an expert, but not maintaining a blog doesn't mean you are suddenly more competent than those who do.
This is totally misguided and incorrect.
Redis can be easily deployed such that any request returns in less than a millisecond, and this is where it's most useful. It's also consistent and stable as hell. There are many use-cases for Redis where Postgres is totally unsuitable and doesn't make sense, and vice versa.
Do yourself a favour and ignore this blog (again: inaccurate, poorly benchmarked, misleading) and do your own research and use better sources of information.
I have witnessed many incidents when the DB was considerably degrading. However, thanks to the cache in redis/memcache, a large part of the requests could still be processed with minimal increase in latency. If I were serving the cache from the same DB instance, I guess it would suffer degradation too whenever there are problems with the DB.
I don't think it is reasonable to assume or even believe that connection exhaustion is an issue specific to Postgres. If you take the time to learn about the topic, you won't need to spend too much time before stumbling upon Redis and connection pool exhaustion issues.
This was the very first time I heard anyone even suggest that storing data in Postgres was a concern in terms of reliability, and I doubt you are the only person in the whole world who has access to critical insight onto the matter.
Is it possible that your prior beliefs are unsound and unsubstantiated?
> I have witnessed many incidents when the DB was considerably degrading.
This vague anecdote is meaningless. Do you actually have any concrete scenario in mind? Because anyone can make any system "considerably degrading", even Redis, if they make enough mistakes.
Besides, having the cache on separate hardware can reduce the impact on the db on spikes, which can also factor into reliability.
Having more headroom for memory and CPU can mean that you never reach the load where it turns into service degradation on the same hardware.
Obviously a purpose-built tool can perform better for a specific use-case than the swiss army knife. Which is not to diss on the latter.
You're confusing being "combative" with being asked to substantiate your extraordinary claims. You opted to make some outlandish and very broad sweeping statements, and when asked to provide any degree of substance, you resorted to talking about "chill pills"? What does that say about the substance of your claims?
> If postgres has issues, it can affect the reliability of the service further if it's also running the cache.
That assertion is meaningless, isn't it? I mean, isn't that the basis of any distributed systems analysis? That if a component has issues, it can affect the reliability of the whole system? Whether the component in question is Redis, Postgres, doesn't that always hold true?
> Besides, having the cache on separate hardware can reduce the impact on the db on spikes, which can also factor into reliability.
Again, isn't this assertion pointless? I mean, it holds true whether it's Postgres or Redis, doesn't it?
> Having more headroom for memory and CPU can mean that you never reach the load where it turns into service degradation on the same hardware.
Again, this claim is not specific to any particular service. It's meaningless to make this sort of claim to single out either Redis or Postgres.
> Obviously a purpose-built tool can perform better for a specific use-case than the swiss army knife. Which is not to diss on the latter.
Is it obvious, though? There is far more to life than synthetic benchmarks. In fact, the whole point of this sort of comparison is that for some scenarios a dedicated memory cache does not offer any tangible advantage over just using a vanilla RDBMS.
This reads as some naive auto enthusiasts claiming that a Formula 1 car is obviously better than a Volkswagen Golf because they read somewhere they go way faster, but in reality what they use the car for is to drive to the supermarket.
You are not replying to the OP here. Maybe it's time for a little reflection?
what are these "extraordinary claims" you speak of? I believe it's you who are confusing me with someone else. I am not GP. You appear to be fighting windmills.
The claim that using postgres to store data, such as a cache, "is a bit concerning in terms of reliability".
This is a class of error a human is extremely unlikely to make.
You seem to be reading "reliability" as "durability", when I believe the parent post meant "availability" in this context
> Do you actually have any concrete scenario in mind? Because anyone can make any system "considerably degrading", even Redis
And even Postgres. It can also happen due to seemingly random events like unusual load or network issues. What do you find outlandish about the scenario of a database server being unavailable/degraded and the cache service not being?
I always find these "don't use redis" posts kind of strange. Redis is so simple to operate at any scale, I don't quite get why it is important to remove it.
It seems like the autovacuum could take care of these expired rows during its periodic vacuum. The query planner could automatically add a condition that excludes any expired rows, preventing expired rows from being visible before autovacuum cleans them up.
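Until Postgres grows something like that, you have to spell both halves out yourself. A minimal sketch of the usual pattern (table and key names are hypothetical):

    -- Expiry is just a column; readers must filter on it themselves.
    CREATE UNLOGGED TABLE cache (
        key        text PRIMARY KEY,
        value      jsonb NOT NULL,
        expires_at timestamptz NOT NULL
    );

    SELECT value FROM cache
    WHERE key = 'user:42' AND expires_at > now();

    -- Periodic cleanup via cron or pg_cron; the index keeps it cheap.
    CREATE INDEX cache_expires_at_idx ON cache (expires_at);
    DELETE FROM cache WHERE expires_at <= now();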
I'm a big "just use Postgres" fan but I think Redis is sufficiently simple and orthogonal to include in the stack.
Don't get me wrong, the idea that he wants to just use an RDBMS because his needs aren't great enough is a perfectly inoffensive conclusion. The path that led him there is very unpersuasive.
It's also dangerous. Ultimately the author is willing to do a bit more work rather than learn something new. This works because he's using a popular tool people like. But overall, he doesn't demonstrate he's even thought about any of the things I'd consider most important; he just sort of assumes running a Redis is going to be hard and he'd rather not mess with it.
To me, the real question is just cost vs. how much load the DB can even take. My most important Redis cluster basically exists to take load off the DB, which takes high load even by simple queries. Using the DB as a cache only works if your issue is expensive queries.
I think there's an appeal that this guy reaches the conclusion someone wants to hear, and it's not an unreasonable conclusion, but it creates the illusion the reasoning he used to get there was solid.
I mean, if you take the same logic, cross out the word Postgres, and write in "Elasticsearch," and now it's an article about a guy who wants to cache in Elasticsearch because it's good enough, and he uses the exact same arguments about how he'll just write some jobs to handle expiry--is this still sounding like solid, reasonable logic? No it's crazy.
What exactly is the challenge you're seeing? In the very least, you can save an expiry timestamp as part of the db entry. Your typical caching strategy already involves revalidating cache before it expires, and it's not as if returning stale while revalidating is something completely unheard of.
Maybe Postgres could use a caching feature. Until then, I'm gonna drop in Redis or memcached instead of reinventing the wheel.
Personally for a greenfield project, my thinking would be that I am paying for Postgres already. So I would want to avoid paying for Redis too. My Postgres database is likely to be underutilized until (and unless) I get any real scale. So adding caching to it is free in terms of dollars.
Usually Postgres costs a lot more than Redis if you're paying for a platform. Like a decent Redis or memcached in Heroku is free. And I don't want to waste precious Postgres connections or risk bogging down the whole DB if there's lots of cache usage, which actually happened last time I tried skipping Redis.
Postgres might cost more but I'm probably already paying. I agree that exhausting connections and writing at a high rate are easy ways to bring down Postgres, but I'm personally not going to worry about exhausting connections to Postgres until I have at least a thousand of them. Everything has to be considered within the actual problem you are solving, there are definitely situations to start out with a cache.
Edit: well a tiny bit, max $3/mo
You need to back up your unbelievable assertion with facts. A memory cache is typically far more expensive than a simple database, especially as provisioning the same capacity as RAM is orders of magnitude more expensive than storing the equivalent data in a database.
I have no idea where you got that from.
I'm not sure how else to interpret this
So be specific. What exactly did you want to say?
> But yeah a base tier Redis that will carry a small project tends to be a lot cheaper than the base tier Postgres.
This is patently false. I mean, some cloud providers offer NoSQL databases with sub-20ms performance as part of their free tier.
Just go ahead and provide any evidence, any at all, that supports the idea that Redis is cheaper than Postgres. Any concrete data will do.
You do not need cron jobs to do caching. Sometimes you don't even need a TTL. All you need is a way to save data so that it is easier and cheaper to retrieve. I feel these comments just misinterpret what a cache is by confusing it with what some specific implementation does. Perhaps that's why we see expensive and convoluted strategies using Redis and the like when they are absolutely not needed at all.
Do you have a bound? I mean, with Redis you do, but that's primarily a cost-driven bound.
Nevertheless, I think you're confusing the point of a TTL. TTLs are not used to limit how much data you cache. The whole point of a TTL is to be able to tell whether a cache entry is still fresh or it is stale and must be revalidated. Some cache strategies do use TTLs to determine which entry to evict, but that scenario only takes place when memory is at full capacity.
Non sequitur, and immaterial to the discussion.
> You should probably evict it when the write comes in.
No. This is only required if memory is maxed out and there is no more room to cache your entry. Otherwise you are risking cache misses by evicting entries that are still relatively hot.
You said:
> The whole point of a TTL is to be able to tell whether a cache entry is still fresh or it is stale and must be revalidated.
So I responded to it. I don't really understand why you think that's a non sequitur.
> No.
I'm a bit confused. We're not using TTLs and we're not evicting things when they become invalid. What is your suggestion?
1. I'm surprised the latencies for Redis are as high as they are; a single-key lookup is very commonly <= 2ms in my experience. 2. Your throughput looks pretty solid, which makes it even weirder that the latencies are relatively high.
Like I said, it's probably just hardware though. I've written a handful of Redis-backed services where the read APIs have a p99 of 10ms. In those instances the API infra wasn't anything too special, but it was a decent Redis setup (ElastiCache of some middling tier).
I think the gist of it is: you probably have sufficiently low requests/second (<1000) that using postgres as a cache is totally reasonable - which it is. If you're hitting your load-test targets within your hardware budget, there's no need to optimise more.
If you see yourself starting with a simple key/value setup, and then feature requests come in that make you consider having "references" to other keys, it is time to re-consider Redis, not double down on it. Even if you insist on continuing, at the very least add a service to manage it with some clean abstractions instead of raw-dogging it.
Why do people complicate things? We've solved caching ages ago.
I didn't measure setting keys or req/sec because for my use case keys were updated infrequently.
I generally find latency in ms to be a more useful metric than reqs/sec or latency at full load, as full load is not a typical load. Or at least it wasn't for my use case.
Of course all depends on your use case etc. etc. In some cases throughput does matter. I would encourage everyone to run their own benchmarks suited to their own use case to be sure – should be quick and easy.
As a rule I recommend starting with PostgreSQL and using something else only if you're heavily using the cache or you run into problems. Redis isn't too hard to run, but skipping it is still one less service to worry about. Or alternatively, just use an in-memory DB. Not always appropriate of course, but sometimes it is.
Of course such sensitive environments are easily imaginable but I wonder why you'd select either in that case.
Yes, that was my take-away.
I am just a little surprised at the relatively low write performance for both Postgres and Redis here; but as I can see, the tests were run on a machine with just 2 CPUs and 8 GB of RAM. In my experience, with 8 CPUs, Postgres can easily handle more than 15,000 writes per second using regular tables; I would imagine it can easily be 20,000+ for the unlogged variety - who needs more than that for a cache?
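If you want to sanity-check that on your own hardware, a very rough psql probe (table name is hypothetical, and a bulk insert like this gives an upper bound, not per-transaction throughput):

    CREATE UNLOGGED TABLE write_probe (k int PRIMARY KEY, v text);
    \timing on
    -- 100,000 rows in one statement; divide the row count by the elapsed time.
    INSERT INTO write_probe SELECT g, 'x' FROM generate_series(1, 100000) g;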
Comparing throttled pg vs non-throttled redis is not a benchmark.
Of course when pg is throttled you will see bad results and high latencies.
A correct performance benchmark would be to give all components unlimited resources and measure performance and how much they use without saturation. In this case, PG might use 3-4 CPUs and 8GB of RAM but have comparable latencies and throughput, which is the main idea behind the notion “pg for everything”.
In a real-world situation, when I see a problem with saturated CPU, I add one more CPU. For a service with 10k req/sec, it’s most likely a negligible price.
And their point is that it's good enough as is.
e.g. 4k/sec saturates PG CPU to 95%, you get only 20% on redis at this point. Now you can compare latencies and throughput per $.
In the article PG latencies are misleading.
Yeah ok, you have 30 million entries? Sure.
You need to sync something over multiple nodes? Not sure I would call that a cache.
If you actually need lower latency then great, design for it. But it should be a conscious decision, not a default one.
One thing that I think has gotten lost in the "I need redundant redundancy for my redundantly redundant replicas of my redundantly-distributed resources" world is that you really only need all that for super-real-time systems. Which a lot of things are: all user-facing websites, for example, need to be up the moment the user hits them and not 30 seconds later. But when you don't have that constraint, if things can take an extra few minutes or drop some requests and it's not a big deal, you can get away with something a lot cheaper, made even cheaper by the fact that running things on a single node gets you access to a lot of performance you simply cannot have in a distributed system, because nothing is as fast as the RAM bus being accessed by a single OS process. And sometimes you have enough flexibility to design your system to be that way in the first place instead of accidentally wiring it up to be dependent on complicated redundancy schemes.
(Next up after that, if that isn't enough, is the system where you have redundant nodes but you make sure they don't need to cross-talk at all with something like Redis. Observation: if you have two nodes for redundancy, and they are doing something with caching, and the cached values are generally stable for long periods of time, it is often not that big a deal to just let each node have its own in-memory cache, and if they happen to recreate a value twice, let them. If you work the math out carefully, depending on your cache utilization profile, you often lose less than you think here. If the modal result is that you never hit a given cached value again, it's cheap, especially if the ones you do hit you end up hitting a lot; and if on average you get cached values all the time, the amortized cost of the second computation is nearly nothing. It's only in the "almost always hit them 2 or 3 times" case that this incurs extra expense, and that's a very, very specific place in the caching landscape. The in-process caching is also faster on its own terms, which mitigates the problem, especially because you can set it up so you have no serialization costs, and the architectural simplicity can be very beneficial. No, by no means does this work with every system, and it is helpful to scan out into the future to be sure you probably won't ever need to upgrade to a more complicated setup, but there are a lot of redundantly redundant systems that really don't need to be written with such complication, because this would have been fine for them.)
Perhaps you could have a second cron job that runs to verify that the first one completed. It could look for a last-ran entry. You shouldn't put it in the same database, so perhaps you could use a key value store like redis for that.
I'd suggest using Redis pipelining -- or better, the excellent rueidis Redis client, which performs auto-pipelining. It wouldn't be surprising to see a 10x performance boost.
To this I would add that more often than not the extra cost and complexity of a memory cache does not justify shaving off a few hypothetical milliseconds from a fetch.
On top of that, some nosql offerings from popular cloud providers already have CRUD operations faster than 20ms.
Validating assumptions
Curiosity/learning
Enraging a bunch of HN readers who were apparently born with deep knowledge of PG and Redis tuning
Definitely a premature optimization on my part.
We can't get rid of Postgres, but since we run Postgres on GCP we really never even think about it.
Wherever you go, there you are.
Is redis not improving your latency? Is it adding complexity that isn’t worth it? Why bother removing it?
But when you have 0-10 users and 0-1000 requests per day, it can make more sense to write something more monolithic and with limited scalability, e.g. doing everything in Postgres. Caching is especially amenable to being added later. If you get too far into the weeds managing services and creating scalability, you might get bogged down and never get your application in front of potential users in the first place.
Eg, your UX sucks and key features aren't implemented, but you're tweaking TTLs and getting a Redis cluster to work inside Docker Compose. Is that a good use of your time? If your goal is to get a functional app in front of potential users, probably not.
But I agree that it would be appropriate to start out that way in some projects.
What I'd be interested to see is a benchmark that mixes lots of dumb cache queries with typically more complex business logic queries to see how much Postgres performance tanks during highly concurrent load.
When I worked at a bigger company in the past, we used postgres to store raw blobs for data transformation pipelines (+1T table), and queries to it were instant due to good index usage. Others might have gone to BigQuery or something else, making it more complex and definitely more expensive.
PS: I love tools like Temporal, Hatchet or Oban that rely on Postgres and can scale a ton too.
In that sense, seeing if the latency impact of postgres is tolerable is pretty reasonable. You may be able to get away with postgres putting things on disk (yes, redis can too), and only paying the overhead cost of allocating sufficient excess RAM to one pod rather than two.
But if making tradeoffs like that, for a low-traffic service in a small homelab, I do wonder if you even need a remote cache. It's always worth considering whether you can just have the web server keep its own cache in memory or even on disk. If using Go like in the article, you'd likely only need a map and a mutex. That'd be an order of magnitude faster, and be even less to manage... Of course it's not persistent, but then neither was Redis (excl. across web server restarts).
1. The use-case is super specific to a homelab where consistency doesn't matter. You didn't show us the Redis persistence setup. What is the persistence/durability setting? I bet you'd lose data the one day you forget and flip the breaker of your homelab.
2. What happens when the data is bigger than your 8GB of RAM on Redis?
3. You didn't show us the PG config either; it is possible to just use all of your RAM for buffers and caching.
4. Postgres has a lot of processes and you give it only 2 CPUs? Vanilla Redis is single-core, so this race is rigged to begin with. The UNLOGGED table evens things out a bit.
In general, what are you trying to achieve with this "benchmark"? What outcome would you like to learn? Because this "benchmark" will not tell you what you need to know in a production environment.
Side note for other HN readers: the UNLOGGED table is actually a very nifty trick for speeding up unit tests. Just ALTER the tables to UNLOGGED inside the PG instance that's dedicated to CI/CD:

    ALTER TABLE my_test_table SET UNLOGGED;
I wanted to compare how my http server would behave if I used postgres for caching, and what the difference would be if I used redis instead.
This benchmark is only here to drive the point that sometimes you might not even need a dedicated kv store. Maybe using postgres for this is good enough for your use case.
The term production environment might mean many things. Perhaps you're processing hundreds of thousands of requests per second; then you'll definitely need a different architecture with HA, scaling, dedicated shared caches, etc. However, not many applications reach such a point, and they often end up using more than necessary to serve their consumers.
So I guess I'm just trying to say keep it simple.
In this case, I would expect that a fairer comparison would be running Postgres on tmpfs. UNLOGGED only skips WAL writes, not all writes; if you do a clean shutdown, your data is still there. It's only lost on crash.
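One way to approximate that without relocating the whole data directory is a tablespace on a tmpfs mount (path and names are hypothetical; a tablespace on volatile storage is unrecoverable after a reboot, so only cache-style data belongs there):

    -- Assumes /mnt/pg_tmpfs is a tmpfs mount writable by the postgres user.
    CREATE TABLESPACE ramspace LOCATION '/mnt/pg_tmpfs';
    CREATE UNLOGGED TABLE cache_tmpfs (
        key   text PRIMARY KEY,
        value bytea
    ) TABLESPACE ramspace;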
[0]: https://dizzy.zone/2025/03/10/State-of-my-Homelab-2025/
To be fair, I asked the question and you found the answer - lol, my bad.
Yes, I agree using a NAS that adds latency would reduce the TPS and explain his results. "Little's law."
This is also why I rarely use redis - Postgres at 100k TPS is perfectly fine for all my use cases, including high usage apps.
[0]: https://www.cybertec-postgresql.com/en/unexpected-downsides-...
- Reduce the page size from 8KB to 4KB, great for write-heavy operations and indexed reads. Needs the source compiled with those flags; it can't be configured once installation is done.
- Increase the buffer cache (shared_buffers)
- Table partitioning for the UNLOGGED table the author is using
- At the session level, lower the transaction isolation level if it has been raised to SERIALIZABLE (the Postgres default is already READ COMMITTED)
- The new uuidv7() in PG 18 might also help as a primary key type, since it is time-ordered and supports range queries on the timestamp component (see the sketch below)
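A couple of these are one-liners. A minimal sketch (table and column names are hypothetical):

    -- Per-session isolation, in case your pooler or ORM raised it:
    SET default_transaction_isolation = 'read committed';

    -- PG 18's time-ordered UUIDs as primary keys:
    CREATE UNLOGGED TABLE cache_entries (
        id      uuid PRIMARY KEY DEFAULT uuidv7(),
        payload jsonb NOT NULL
    );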
It's a travesty to run it on default settings.
All it takes is 5 mins to do it.
Use pgtune - https://pgtune.leopard.in.ua/
I know the author concludes that he would still use Postgres for his projects.
But, he would get much better benchmark numbers if it was tuned.
And note that unlogged table contents are not crash-safe.
I believe Redis would have performed better with more allocated CPU.
It looks like it's just storing the session in postgres/redis.
Caching implies there's some slower/laggier/more remote primary storage, for which the cache provides faster/readier access to some data of the primary storage.
In Rails we just got database-backed everything, with the option to go to special backends if need be.
The only question I have is how I'd notice that my current backend doesn't scale anymore, and who or what would tell me to switch.
That still leaves you needing to hash your strings into those IDs performantly on top, but mostly it's the plateau of Postgres performance compared to purpose-built KV DBs.
There are async functions provided by the PostgreSQL client library (libpq). I've used them to process around 2000 queries per second on a single connection against a logged table.
Also, does anyone like memcached anymore? When I compared it with Redis in the past, it appeared simpler.
If your cache is so performance-critical that you can't lose the data, then it sounds like you need a (denormalized) database.
Also, no discussion of indexes or what the data looks like, so we must assume no attention has been paid to their critical factors either.
So, another case of lies, damned lies and benchmarks.
It seems strange to me that people are so willing to post such definitive and poorly researched/argued things - if you're going to take a public position, don't you want to be obviously right instead of so easy to discount?
(And to the people complaining about this benchmark not being extremely scientifically rigorous: Nobody cares.)
Doesn’t require SQLite.
Works with other DBs:
"i do not care. i am not adding another dependency"