The author is comparing "off-the-shelf" caching with custom caching. They start from the assumption that you must be caching somehow, and argue that the word "caching" should be understood to mean only particular approaches to the general idea of caching. And obviously the whole point of the general idea is to optimize things.
It's a rhetorical mess
The joke form of this quote goes along the lines of:
There are only two hard things in Computer Science: cache
invalidation, naming things, and off-by-one errors.
:-D There's two hard problems in computer science: we only have one joke and it's not funny.
Apparently by Philip Scott Bowden
Likewise, naming things is simple as long as you're alone, or in a small team. But as soon as there are multiple organizations, each with their own traditions, it gets tricky. Just witness the eternal flame wars about camelCase, PascalCase, snake_case, kebab-case, and UPPER_CASE. It's almost as hopeless a culture clash as Emacs vs Vi vs PowerPoint...
(I leave the off-by-one errors as an exercise for the reader)
- The language dimension - the choice of words that are good enough for the purpose and not confusing. For example, "Manager" is as ambiguous as it gets, it can mean many things, except we've been using it long enough that there's a more specific shape of meaning[0] for that word in code/program architecture contexts - so you'd still use it instead of, say, "Coordinator", which would raise all kinds of questions that "Manager" no longer does.
- The epistemological dimension - whether the word you chose correctly names the concept you meant, and whether the concept you meant is actually the right one to describe the thing you're trying to describe. Ultimately, this is the hard problem at the root of philosophy. In practice, it manifests as e.g. the choice between digging into some obscure branch of mathematics to correctly name the thing "endofunctor" or something, or calling it "Square" and saying "fuck it, we'll clarify the exceptions in the comments".
--
[0] - I mean "more specific" in the sense it's distinct from the other meanings and somewhat narrow - but still it's fuzzy as heck and you can't describe it fully in words; it's basically tacit knowledge.
I also avoid letting the reader make assumptions. HasPlayerJumpedRecently is bad. What does recently mean? HasPlayerJumpedInLastTenMs is better, even if it's a bit long... which highlights that it should probably be refactored into a more flexible value: MsSincePlayerLastJumped.
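For illustration, a minimal sketch of that refactor (hypothetical Player class, in Python like the thread's other snippets): expose the raw value and let each caller pick its own threshold.

    import time

    class Player:
        def __init__(self) -> None:
            self._last_jump_at = time.monotonic()

        def jump(self) -> None:
            self._last_jump_at = time.monotonic()

        # The flexible value: the MsSincePlayerLastJumped idea from above.
        def ms_since_last_jumped(self) -> float:
            return (time.monotonic() - self._last_jump_at) * 1000.0

        # The old boolean becomes a one-liner any caller can tune,
        # instead of a hard-coded HasPlayerJumpedInLastTenMs.
        def has_jumped_in_last_ms(self, ms: float = 10.0) -> bool:
            return self.ms_since_last_jumped() <= ms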
If you aren't assuming a time var with Ms is milliseconds, you aren't doing games dev, so that one slides with me.
I used to be quite fond of short identifiers, especially ones that make the signs "line up"… until I worked with code long enough that I forgot what I did and had to read it again.
If you have a system with "slow storage" and "fast storage", caching is a way to abstract that away to just "storage".
The author is arguing that the latter is the default way we should think about the concept of caching, which is a valid opinion to have.
All software has to name things, and count. Caching (including invalidation) is best understood as a liability. If you can foist it off on your CPU and OS and DB, good for you. Programming whatever you're actually trying to get done is already hard enough.
They also tend not to be very hard.
Really, that was the only time in my entire professional career that off-by-one errors have actually given me headaches.
We use caching a lot; anything that gets cached can only be written by one service each. The writing services emit cache invalidation messages via SNS, which cache users must listen to via SQS to clear/update their cache.
Alternatively we cache stuff with just a TTL, when immediate cache invalidation is not important.
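For concreteness, a minimal sketch of that fan-out with boto3 (the topic ARN, queue URL, and message shape are all hypothetical): the single writing service publishes to SNS, and each cache user drains its own SQS subscription.

    import json
    import boto3

    sns = boto3.client("sns")
    sqs = boto3.client("sqs")

    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:cache-invalidation"  # hypothetical
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-cache"  # hypothetical

    def publish_invalidation(key):
        # Called by the single writing service after it updates the record.
        sns.publish(TopicArn=TOPIC_ARN, Message=json.dumps({"key": key}))

    def drain_invalidations(local_cache):
        # Each cache user polls its own SQS subscription and drops stale keys.
        resp = sqs.receive_message(QueueUrl=QUEUE_URL,
                                   MaxNumberOfMessages=10,
                                   WaitTimeSeconds=5)
        for msg in resp.get("Messages", []):
            body = json.loads(msg["Body"])  # default (non-raw) SNS envelope
            key = json.loads(body["Message"])["key"]
            local_cache.pop(key, None)
            sqs.delete_message(QueueUrl=QUEUE_URL,
                               ReceiptHandle=msg["ReceiptHandle"])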
Where's the struggle?
If there are no real consequences for reading stale data, and your writes are infrequent enough, then indeed you're lucky and have a relatively simple problem.
If it doesn’t guarantee delivery, then I believe you will at some point have a client that reads a cached value thinking it’s still valid because the invalidation message got lost in the network.
If you don't understand how and why and when eventual consistency is a problem, you will never understand why cache invalidation is hard.
By the sound of your example, you only handle scenarios where naive approaches to cache invalidation serve your needs, and you don't even have to deal with problems caused by spikes to origin servers. That's perfectly fine.
Others do. They understand the meme. You can too, if you invest a few minutes reading up on the topic.
Both are not that difficult, honestly.
Aren't there a lot harder things out there?
Think about all those times your program isn't building and `make clean` fixes it.
I don't think that's a good example. I've worked with more than 20 different build tools by now, and I cannot recall a single instance where the problem actually came down to cache invalidation. Every time I dug into it, the real cause was something else: a mistake in the build script, an incorrectly declared dependency, or something similar.
So when you say "think about all those times", not a single such time comes to my mind!
You could group these two things into "getting the data model right" as the single hard thing, perhaps that rings more true to you :)
Heck you can probably prove that any system for naming things is either inconsistent or incomplete.
Well, I for one feel that "naming things" ultimately boils down to the latter, which may or may not be harder than the former.
It's also possible that these used to be the only two hard problems at the time the aphorism was first recorded, but the underlying state of the world has changed since then and the aphorism, as recorded, is no longer current.
It becomes way more difficult when you have N servers, all of which could potentially serve the same data. Local caches can then easily become stale, and even with a propagation mechanism you can't guarantee against network failures or latency issues.
But suppose you have an HA redis instance that you store your cached data in. Even with a write-through cache, you basically need to implement a 2-phase commit to ensure consistency.
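A minimal sketch of where the two phases come from (redis-py; the db handle is hypothetical): two sequential writes leave a window in which a crash desynchronizes cache and database.

    import redis

    r = redis.Redis()

    def write_through(db, key, value):
        db.update(key, value)     # phase 1: durable write (hypothetical API)
        # A crash right here leaves Redis serving the old value until the
        # TTL below expires: the gap a 2-phase commit is meant to close.
        r.set(key, value, ex=60)  # phase 2: cache write, TTL as a safety net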
Here's one for PHP: https://github.com/Qbix/Platform/blob/main/platform/classes/...
And here is for the Web: https://github.com/Qbix/Platform/blob/dc95bd85fa386e45546b0b...
I also discuss caching in the context of HTTP: https://community.qbix.com/t/caching-and-sessions/192
WRITING ABOUT CACHING AND INVALIDATION
You should especially start here: https://community.qbix.com/t/qbix-websites-loading-quickly/2...
And then you can read about incremental updates: https://community.qbix.com/t/streams-plugin-messages-subscri...
Caching becomes hard when such stores are used in distributed or concurrent contexts.
An example: imagine Qcache being used when fetching data from an SQL database. And data in the database changes with a direct SQL query from another process.
How will your key-value store know that the value in it is stale and needs refreshing?
Caching systems know when something needs updating in several ways. One is pubsub: when the information is loaded from the cache, you set up a realtime websocket or webhook, for example, to reflect any changes since the cached info was shown. If you can manage to have a server running, you can simply receive new data (deltas) and update your cached data; in that case you can even have it ready and never stale by the time it is shown.
If you can't run a server and don't want to receive pushes, then another approach involves storing the latest ordinal or state for EACH cached item locally (ETags do this) and then, before rendering (or right after you do), bulk-sending the tags and seeing what has been updated, then pulling just that. The downside is that you may have a flash of old content if you show it optimistically.
If you combine the two methods, you can easily get an up-to-date syndication of a remote mutable data store.
My whole framework is built around such abstractions, to take care of them for people.
Phrases like "simply" and "never stale" are doing a lot of heavy lifting. Yet you haven't answered a very simple question I posted above: how would Q_Cache structure handle a direct SQL write query from another process on the database it is being used as a cache for?
SQL databases don't run websockets to external servers, nor do they fire webhooks. If you are running a server which you are hitting with every update instead of doing direct SQL on the DB, there's always a small amount of time where a cache user can see stale data before this server manages to trigger the update of the cache.
The SQL server should have a way to do pubsub. It can then notify your PHP webhook via HTTP, or run a script on the command line, when a row changes. It should have an interface to subscribe to these changes.
If your SQL data store lacks the ability to notify others that a row changed, then there’s your main problem. Add that functionality. Make a TRIGGER in SQL for instance and use an extension.
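PostgreSQL, for one, ships this as LISTEN/NOTIFY; a minimal sketch with psycopg2 (the table, DSN, and cache layout are hypothetical) of a trigger feeding a cache invalidator:

    import select
    import psycopg2

    cache = {}  # the local cache to invalidate

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN
    conn.autocommit = True
    cur = conn.cursor()

    # One-time setup: fire a notification on every change to a
    # hypothetical "items" table.
    cur.execute("""
        CREATE OR REPLACE FUNCTION notify_row_change() RETURNS trigger AS $$
        BEGIN
            PERFORM pg_notify('row_changed', NEW.id::text);
            RETURN NEW;
        END $$ LANGUAGE plpgsql;
        CREATE TRIGGER items_changed AFTER INSERT OR UPDATE ON items
            FOR EACH ROW EXECUTE FUNCTION notify_row_change();
    """)

    cur.execute("LISTEN row_changed;")
    while True:
        if select.select([conn], [], [], 5.0)[0]:
            conn.poll()
            while conn.notifies:
                stale_id = conn.notifies.pop(0).payload
                cache.pop(stale_id, None)  # drop the stale entry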
If MySQL really lacks this ability then just put node.js middleware in front of it that will do this for you. And make sure that all MySQL transactions from all clients go through that middleware. Now your combined MySQL+Node server is adequate to help clients avoid stale data.
As I already said, if you for some reason refuse to handle push updates, then you have to store the latest row state (ETags) in your PHP cache (APCu). Then, when you access a bunch of cached items during a request, at the end collect all their ids and ETags and simply bulk-query node.js for any cached items that changed. You can use Bloom filters or Merkle trees or Prolly trees to optimize the query.
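A minimal sketch of that bulk check (the endpoint and response shape are hypothetical): send every (id, ETag) pair, get back only the entries that changed.

    import requests

    def revalidate(cache, endpoint):
        # cache maps item id -> {"etag": ..., "data": ...}
        tags = {item_id: entry["etag"] for item_id, entry in cache.items()}
        resp = requests.post(endpoint, json={"etags": tags})  # hypothetical bulk API
        for item_id, fresh in resp.json()["changed"].items():
            cache[item_id] = fresh  # only the stale entries come back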
Joel Gustafson had a great article on this and now uses it in gossiplog: https://joelgustafson.com/posts/2023-05-04/merklizing-the-ke...
A lot of people haven't caught on, and try to cache things using ambiguous names, hence the struggle to invalidate their caches when the meaning changes.
[1] This can be applied even if you don't know the content yet; you just have to unambiguously name the inputs to the function that produces it. You might not know what all the inputs are, and then you have to start adding stuff like "unknown-unknown-2025-07-03T16", but it'll still basically work.
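A minimal sketch of that naming discipline: hash every input that determines the content into the key, so when the meaning changes the name changes too, and stale entries are simply never looked up again.

    import hashlib
    import json

    def cache_key(producer, **inputs):
        # Unambiguous name: the producer plus a canonical encoding of its inputs.
        canonical = json.dumps(inputs, sort_keys=True)
        return producer + ":" + hashlib.sha256(canonical.encode()).hexdigest()

    # Changing any input yields a different key; no invalidation needed.
    cache_key("render_report", user_id=42, as_of="2025-07-03T16")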
and - as the OP suggests - it works best when the cache is a well-defined abstraction with properties and rules about how it works
just because "caching" is mentioned in a meme doesn't mean it can't be true that it can simplify software
I have to push back here; I think this is objectively untrue. By definition, a system or piece of code where you add a condition under which something else happens (a cache) that behaves differently from the uncached path increases complexity.
I'm not saying it's wrong to cache things or that they aren't useful, but I think they absolutely are an abstraction and an optimization at the cost of complexity. Good code bases hide complexity from the devs all the time, so it's not a question of whether you can code it away, but rather how difficult is it to troubleshoot the internals of the system.
The only scenario where it would simplify software is if a bunch of complex (non-cache) things are being done to improve perf, and a cache would be the simpler solution. But in that case the simplifying step is not adding a cache, it is removing complex things that aren't actually required. After that you add a cache to improve performance (which increases complexity but is worth it for this imagined use-case). But maybe you remove the complex perf shenanigans, and realize that perf is still "good enough" even without a cache, keeping your software even simpler.
Look at how CPU cache line behaviors radically change the performance of superficially similar algorithms.
Look at how query performance for a database server drops off a cliff the moment the working cache no longer fits in memory.
Hiding complexity can be a simplification, until you exceed the bounds of the simplification and the complexity you hid demands your attention anyway.
There's a long history in computer architecture of cores and accelerators that don't have a cache but instead rely on explicitly programmed local scratchpads. They are universally more difficult to program than general purpose CPUs because of that.
But getting them right can easily cross the boundary from purely optimizing performance towards simplifying the public API of something. I think this is true.
I'd imagine an involved example where semantics and caching really start to offer a trade-off.
Imagine that somehow querying the actual meteorological data is quite expensive, and consider this badly written pseudocode (equals sign denoting default parameters):
- measureCurrentTemperature()
- retrieveAccurateTemperatureForNanoSecond(momentInTime)
-> cached abstractions which would access cached data:
- getTemperature(moment = now(), tolerance = 1min)
- getCurrentTemperature(tolerance = MIN_TOLERANCE)
I know, reality is much more complicated, and using time (seeing it as quasi-continuous) as a caching parameter is already stretching it quite far.
Just a stupid example that came to my mind.
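To make the stupid example concrete, a minimal sketch (with a stand-in for the expensive backend) of getTemperature serving from the last measurement whenever it falls within the tolerance:

    import time

    def measure_current_temperature():
        return 21.5  # stand-in for the expensive meteorological query

    _last = {"at": float("-inf"), "value": None}

    def get_temperature(tolerance_s=60.0):
        now = time.time()
        if now - _last["at"] > tolerance_s:   # cached reading too old
            _last["value"] = measure_current_temperature()
            _last["at"] = now
        return _last["value"]                 # within tolerance: serve cached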
I've bitten myself in the ass with caching rasterized representations of images more than once, where the input was SVG images or limited formats that convert to SVG.
When code has abstract interfaces for data access, introducing caching can be simpler (but not simple), because you localize it in the implementation of the abstraction, which either has or doesn't have caching.
But it is not an abstraction (you can perfectly well do caching without any abstractions, and it's frequently done exactly that way).
The concern you're talking about is about the actual access to the data. My understanding of the article is that it's about how caching algorithms can abstract the concern of minimising retrieval cost.
So in some ways you're coming at it from opposite directions. You're talking about a prior of "disk by default" and saying that a good abstraction lets you insert cache layers above that, whereas for the author the base case is "manually managing the layers of storage".
Algorithms can't really abstract anything since they are, well, just algorithms (formal descriptions of how a computation should be done).
Looking at the author's examples again, I think most everybody would say that caching is used in both:
    if data_id in fast_storage:
        return fast_storage.get(data_id)
    else:
        data = slow_storage.get(data_id)
        fast_storage.set(data_id, data)
        return data

and:

    # Uses fast storage or slow storage just like above, but behind the get() method.
    return storage.get(data_id)
The first one does not make an abstraction on storage, the second one does, but they are both "caching" data internally. While there are generic implementations of caching algorithms, and we can consider those abstractions, "caching" is a wider term than those implementations, and is specifically not an abstraction (the fact that there is a caching implementation that abstracts something does not make all caching an abstraction).
Edit: Let me also point out that "abstract the concern of minimising retrieval cost" is not caching: I can say that e.g. a simple interface with method FastGet(id) does the former, and it need not use any caching if the underlying structure is fast enough and e.g. directly in memory.
What you describe as "caching algorithms" are not really caching algorithms, but cached object lifetime management algorithms (LRU, LFU...).
"Abstraction" is a higher level, simplified view of a set of concepts, yet caching is a single concept. See eg. https://en.wikipedia.org/wiki/Abstraction_(computer_science)
It sounds like you are both trying to redefine what "caching" means (tying it to implementations of particular algorithms), but also what "abstraction" means.
We should be very deliberate with the language we use, and our main goal should be to make it simpler to understand, not harder — I believe you are doing the latter here.
In this article, it's cache levels forcing you to separate different types of data because they're accessed at different frequencies. Another example is Rust's borrow checker, whose main purpose is arguably to facilitate both safe and efficient memory management, but which can also be used to enforce invariants that aren't clearly memory-related (e.g. builder pattern, temp files that auto-delete after they're dropped).
These aren't abstractions though. An abstraction is the opposite, hiding structure when it's noisy and making it easier to change. For example, if you already have an architecture in mind and don't want to manually determine how frequently each type of data is accessed, it's better to use a compiler or library that automatically determines what to cache with little to no code or thought on your end; that's abstraction. Similarly, the abstract analogue to Rust's borrow checker is garbage collection, which allows programmers to not think about their data-structures' lifetimes at all. The cost is usually performance and you understand your application less in some ways (although you understand it more in other ways; abstraction hides details but too many details make it hard to see the big picture. Ideally, with abstractions in the right places, you hide only the "unimportant" details in ways that insignificantly affect performance).
This was confusing to me – the most obvious way to judge the purpose of a system is to compare with the baseline of not having that system at all, especially in the case of caching where the program is functionally complete and correct without a cache. Anyway, there may not be a right or wrong here. Just tripped me up.
[0] https://brooker.co.za/blog/2021/08/27/caches.html
[1] https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s...
Which is why solving the "I want my data in fast storage as often as possible" problem may be counter-productive on the whole: you ain't the only client of the system; let it breathe and serve requests from others.
(Although a materialised view is more like an index than a cache. The view won't expire and require you to rebuild it.)
In RDBMS contexts, an index really is a caching mechanism (a cache) managed by the database system (the query planner needs to decide when it's best to use one index or another).
But as you note yourself, even in these cases where you've got cache management bundled with the database, having too many indexes can slow down (even deadlock) writes, as the database tries to ensure consistency between these redundant data storage elements.
Oof you're trying so hard you could cut diamond with that line.
Who told you that?
> you don't have to go all the way back to some backend database or API server or SSD [...] Caching is thus a tool to improve performance.
That's called "latency." This is not at all the same as "performance."
> My feelings now are that that perspective on caching is wrong
I agree.
If everything is caching, why even introduce the term: language should help us describe ideas, it should not be superfluous.
Caching is storing a copy of data in a place or way that it is faster to retrieve than it would be otherwise. Caching is not an abstraction; it is a computer science technique to achieve improved performance.
Caching does not make software simpler. In fact, it always, by necessity, makes software more complex. For example, there are:
- Routines to look up data in a fast storage medium
- Routines to retrieve data from a slow storage medium and store them in a fast storage medium
- Routines to remove the cache if an expiration is reached
- Routines to remove cache entries if we run out of cache storage
- Routines to remove the oldest unused cache entry
- Routines to remove the newest cache entry
- Routines to store the age of each cache entry access
- Routines to remove cache entries which have been used the least
- Routines to remove specific cache entries regardless of age
- Routines to store data in the cache at the same time as slow storage
- Routines to store data in cache and only write to slow storage occasionally
- Routines to clear out the data and get it again on-demand/as necessary
- Routines to inform other systems about the state of your cache
- ...and many, many more
Each routine involves a calculation that determines whether the cache will be beneficial. A hit or miss can lead to operations which may add or remove latency, may or may not run into consistency problems, may or may not require remediation. The cache may need to be warmed up, or it may be fine starting cold. Clearing the cache (ex. restarts) may cause such a drastic cascading failure that the system cannot be started again. And there is often a large amount of statistics and analysis needed to optimize a caching strategy.

These are just a few of the considerations of caching. Caching is famously one of the hardest problems in computer science. How caching is implemented, and what it affects, can be very complex, and needs to be considered carefully. If you try to abstract it away, it usually leads to problems. Though if you don't try to abstract it away, it also leads to problems. Because of all of that, abstracting caching away into a "general storage engine" is simply impossible in many cases.
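For a sense of scale: even the smallest off-the-shelf cache bundles several of the routines listed above (lookup, fill, least-recently-used eviction, statistics, clearing), e.g. Python's stdlib decorator:

    from functools import lru_cache

    @lru_cache(maxsize=1024)     # size limit plus LRU eviction policy
    def load_record(record_id):  # hypothetical slow lookup
        ...                      # fetch from slow storage here

    load_record.cache_info()     # hit/miss statistics for tuning
    load_record.cache_clear()    # the "remove everything" routine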
Caching also isn't just having data in fast storage. Caching is cheating. You want to provide your data faster than actually works with your normal data storage (or transfer mechanism, etc). So you cheat, by copying it somewhere faster. And you cheat again, by trying to figure out how to look it up fast. And cheat again, by trying to figure out how to deal with its state being ultimately separate from the state of the "real" data in storage.
Basically caching is us trying to be really clever and work around our inherent limitations. But often we're not as smart as we think we are, and our clever cheat can bite us. So my advice is to design your system to work well without caching. You will thank yourself later, when you finally are dealing with the bug bites, and realize you dodged a bullet before.