Scaling One Million Checkboxes to 650M checks
237 points
1 month ago
| 15 comments
| eieio.games
pizzafeelsright
1 month ago
[-]
Many lessons learned along with great historical knowledge of (distributed) systems.

I think you hit every type of interruption and point of failure, except storage space, and it is great to see your resolutions.

I wasn't aware Redis could do the Lua stuff, which makes me very interested in using it as an alternative state store.

As for the bandwidth - that's one of my biggest gripes with cloud services, as there is no hard limit to avoid billing overages.

reply
eieio
1 month ago
[-]
Thank you!

FWIW I certainly hit storage space in some boring ways - I didn't have a good logrotate setup so I almost ran out of disk, and I sent my box-check logs to Redis and had to set something up to offload old logs to disk to not break Redis. But neither of those were very big deals - pretty interesting to have a problem like this where storage just wasn't a meaningful problem! That's a new one for me.

And yeah, thinking about bandwidth was such a headache. I was on edge for like 2 days, constantly checking outbound bytes on my nic and redoing the math - not having a hard cap is just really scary. And that's with Digital Ocean, which has pretty sane pricing! I haven't used the popular serverless stuff at all, but my understanding is that you get really gouged on bandwidth there.

(also yes, lua-in-redis is really incredible and lets you skip sooo many hard/racey problems as long as you're ok with a little performance hit, it was a joy to work with)
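
A minimal sketch (not the author's actual script) of what Lua-in-Redis buys you, assuming the go-redis client and a hypothetical bitmap key named "checkboxes": the whole read-modify-write on a single bit runs atomically inside Redis, so there's no race between the GETBIT and the SETBIT.

    package main

    import (
        "context"
        "fmt"

        "github.com/redis/go-redis/v9"
    )

    // Atomic "toggle one bit" script: Redis runs the whole script without
    // interleaving other commands, so the GETBIT/SETBIT pair can't race.
    var toggleBit = redis.NewScript(`
    local current = redis.call("GETBIT", KEYS[1], ARGV[1])
    local flipped = 1 - current
    redis.call("SETBIT", KEYS[1], ARGV[1], flipped)
    return flipped
    `)

    func main() {
        ctx := context.Background()
        rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

        // Flip checkbox #42 in the bitmap stored under "checkboxes".
        state, err := toggleBit.Run(ctx, rdb, []string{"checkboxes"}, 42).Int()
        if err != nil {
            panic(err)
        }
        fmt.Println("checkbox 42 is now", state)
    }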

reply
jorl17
1 month ago
[-]
This was a fantastic writeup!!! Congratulations on the website. To me, though, the writeup is what you should be most proud of!!
reply
eieio
1 month ago
[-]
Thank you! I spent substantially more time on the writeup than I did on the site pre-launch, which is pretty funny to me
reply
adityaathalye
1 month ago
[-]
Hah! Hard relate... Once or twice I've gotten side-tracked by the writing and ended up shipping a blog post about the code I was working on before I finished the code itself.
reply
xnx
1 month ago
[-]
> Building the site in two days with little regard for scale was a good choice.

Probably the key takeaway that many early-career engineers need to learn. Scaling's not a problem until it's a problem. At that point, it's a good problem to have, and it's not as hard to fix as you might think anyway.

reply
hobs
1 month ago
[-]
As long as you also took "so keep the system simple and basic" to heart - I have seen many systems where the obvious choice was microservices, not for scaling or team separation, mind you, but because the devs felt like it.

Scaling those systems is a total bitch.

reply
dang
1 month ago
[-]
Recent and related:

One Million Checkboxes - https://news.ycombinator.com/item?id=40800869 - June 2024 (305 comments)

reply
winrid
1 month ago
[-]
These are fun projects. About six years ago I launched Pixmap on Android, which is a little collaborative pixel editing app supporting larger images (like 1024x1024 grids etc). I had a queue that would apply each event to PNG images; clients would load the initial PNG on connect, and then each pixel draw event was just one small object sent to the client. This way I could take advantage of image compression on the initial load, and the change sets stayed very small. Also, since each event is stored in a log, you can "rewind" the images [0].

[0] 22mb: https://blog.winricklabs.com/images/pixmap-rewind-demo.gif
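
Not Pixmap's actual code, but a rough sketch of that snapshot-plus-delta idea using only Go's standard library (the PixelEvent type and file names are made up): replay the append-only draw log onto an in-memory image, then encode it as the compressed PNG that new clients fetch on connect. Replaying only a prefix of the log is also what makes "rewind" possible.

    package main

    import (
        "image"
        "image/color"
        "image/png"
        "os"
    )

    // PixelEvent is one entry in the append-only draw log.
    type PixelEvent struct {
        X, Y    int
        R, G, B uint8
    }

    func main() {
        canvas := image.NewRGBA(image.Rect(0, 0, 1024, 1024))

        // Replaying the log in order reproduces the image at any point in time.
        log := []PixelEvent{{X: 10, Y: 20, R: 255}, {X: 11, Y: 20, G: 255}}
        for _, ev := range log {
            canvas.Set(ev.X, ev.Y, color.RGBA{ev.R, ev.G, ev.B, 255})
        }

        // Write the compressed snapshot new clients download on connect;
        // after that they only receive small per-pixel events.
        f, err := os.Create("snapshot.png")
        if err != nil {
            panic(err)
        }
        defer f.Close()
        if err := png.Encode(f, canvas); err != nil {
            panic(err)
        }
    }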

reply
usernamed7
1 month ago
[-]
Neat! I was exploring a similar idea about per-pixel updates to many web clients, but I found it would be way too bandwidth/storage intensive to do what I wanted to do. So I've been tinkering with canvases that can be addressed by API calls.

https://x.com/RussTheMagic/status/1816749136487588311

reply
winrid
1 month ago
[-]
I see that what you're doing is submitting shapes over the wire. That's a bit different from what a pixel art application has to do, unless we were doing color fill and the like, which it doesn't support. For every pixel you draw, it shows your name next to the square, too, so it's not a "draw a shape and submit" kind of thing.

regarding storage of the log, compression is a thing :)

reply
usernamed7
1 month ago
[-]
For sure, different projects - when I started, expressing individual pixel updates was a huge storage hog, as trying to draw even the most rudimentary shapes resulted in quite large storage (even after gzip) - and that would translate to large bandwidth requirements (as I was going for realtime). I moved over to canvas drawings because I could express rendering with much more expressive syntax.
reply
winrid
1 month ago
[-]
Yeah, my goal at some point was to add animations and stuff, and then I'd do something like you're doing. But I moved onto other projects :)
reply
xivzgrev
1 month ago
[-]
Nice write up - curious how much it ended up costing?
reply
eieio
1 month ago
[-]
Ah I should have included this (I'll edit the post shortly)

Think the total cost was about $850, which was (almost) matched by donations.

I made a mistake and never really spun down any infrastructure after moving to Go, and I also could have retired the second Redis replica I spun up; I think I could have cut those costs in half if I had been focused on it. But given that donations were matching costs and there was so much else going on, I wasn't super focused on that.

I kept the infra up for a while after I shut down the site (to prepare graphs etc) which burned a little more money, so I'm slightly in the hole at this point but not in a major way.

reply
isoprophlex
1 month ago
[-]
I wonder what that would have cost you on Hetzner, for example. I have a dedicated 20 vCPU box with 64 GB of RAM and a terabyte of SSD. Bandwidth is free... and all this for 40 EUR/month.

This should, especially after your Go rewrite, be enough to host everything?

reply
eieio
1 month ago
[-]
You're right, I probably could have saved some money by using Hetzner! But I'm used to Digital Ocean, most of my projects haven't had these scaling problems, and I think changing my stack ahead of launching the project would have been a mistake.

If I was planning to keep the site up for long I would have moved, but in this case I knew it was temporary and so toughing it out on DO seemed like a better choice.

reply
isoprophlex
1 month ago
[-]
Haha yeah of course, the click a button thing is super convenient.

I recently had to rebuild my box because I left a postgres instance open on 5432 with admin:admin credentials, and without default firewalls in place it got owned immediately.

That would have been less painful on DO for sure.

reply
eieio
1 month ago
[-]
Yeah I think this was particularly handy for Redis - being able to click a button to upgrade my Redis instance (or add a replica) while maintaining data with full uptime was really, really nice.

Painful to use managed Redis when I was debugging (let me log in! let me change stuff! give me the IP!! ahhhhh!!!!) but really nice otherwise. A little painful to think about giving that up, although I coulda run a really hefty Redis setup on Hetzner for very little money!

reply
wonger_
1 month ago
[-]
As someone new to backend - is there a simple alternative architecture for this project? I hope there's an easier way to host a million bits of state and sync with clients. Some of the solutions in the post went over my head.

Kudos to the author - your projects are great.

reply
eieio
1 month ago
[-]
Author here!

Sorry that some of the stuff went over your head! I wanted to include longer descriptions of the tech I was using but the post was already suuuper long and I felt like I couldn't add more.

Very happy to answer any questions that you've got here!

I'm not sure how you'd simplify the architecture in a major way to be honest. There are certainly services you could use for stuff like this, but I think there you're offloading the complexity to someone else? Ultimately you need:

    * A database that tracks which boxes are checked (that's Redis)
    * A choice about how to put your data in your database (I chose to just store 1 million bits for simplicity)
    * A way to tell your clients what the current state is (I chose to send them all 1 million bits - it's nice that this is not that much data)
    * A way for clients to tell you when they check a box + update your state (that's Flask + the websocket)
    * A way to tell your clients when a box is checked/unchecked (that's also Flask + websockets. I chose to send both updates about individual boxes and also updates about all 1 million boxes)
    * A way to avoid rendering 1 million dom elements all the time (react-window)
The other stuff (nginx for static content + a reverse proxy) is mostly just to make things easier to scale; you could implement this solution without those details and the site would work fine; it just wouldn't be able to handle the same load.
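
For the curious, here is a deliberately stripped-down, in-memory sketch of the pieces in that list (no Redis, no nginx, no persistence), written in Go with the gorilla/websocket package - not the site's actual code: each client gets the full 125 KB bitmap on connect, sends a 4-byte checkbox index to toggle, and receives every toggle anyone makes.

    package main

    import (
        "encoding/binary"
        "log"
        "net/http"
        "sync"

        "github.com/gorilla/websocket"
    )

    const numBoxes = 1_000_000

    var (
        mu       sync.Mutex
        bits     = make([]byte, numBoxes/8)           // 1M bits = 125 KB
        clients  = make(map[*websocket.Conn]struct{}) // connected sockets
        upgrader = websocket.Upgrader{CheckOrigin: func(*http.Request) bool { return true }}
    )

    func handle(w http.ResponseWriter, r *http.Request) {
        conn, err := upgrader.Upgrade(w, r, nil)
        if err != nil {
            return
        }
        mu.Lock()
        clients[conn] = struct{}{}
        // Send the full current state (all one million bits) to the new client.
        conn.WriteMessage(websocket.BinaryMessage, bits)
        mu.Unlock()

        defer func() {
            mu.Lock()
            delete(clients, conn)
            mu.Unlock()
            conn.Close()
        }()

        for {
            // Each client message is a 4-byte big-endian checkbox index.
            _, msg, err := conn.ReadMessage()
            if err != nil || len(msg) != 4 {
                return
            }
            idx := binary.BigEndian.Uint32(msg)
            if idx >= numBoxes {
                continue
            }
            mu.Lock()
            bits[idx/8] ^= 1 << (idx % 8) // toggle the bit in the shared state
            for c := range clients {      // broadcast the toggled index to everyone
                c.WriteMessage(websocket.BinaryMessage, msg)
            }
            mu.Unlock()
        }
    }

    func main() {
        http.HandleFunc("/ws", handle)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }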
reply
sa46
1 month ago
[-]
Just spitballing: could you change the database to a bool array? Guard it with a RWMutex and persist on server shutdown. The bottleneck probably moves to pushing updates from a single server, but Go can probably handle a few tens of thousands of goroutines.
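
A quick sketch of that suggestion, with made-up names: the whole state is an array of bools behind a sync.RWMutex, packed into bits and flushed to disk when the process catches SIGINT/SIGTERM. (As noted downthread, persisting only on shutdown is fragile - a crash loses everything.)

    package main

    import (
        "os"
        "os/signal"
        "sync"
        "syscall"
    )

    // BoolStore guards one million booleans with a single reader/writer lock.
    type BoolStore struct {
        mu    sync.RWMutex
        boxes [1_000_000]bool
    }

    func (s *BoolStore) Get(i int) bool {
        s.mu.RLock()
        defer s.mu.RUnlock()
        return s.boxes[i]
    }

    func (s *BoolStore) Toggle(i int) bool {
        s.mu.Lock()
        defer s.mu.Unlock()
        s.boxes[i] = !s.boxes[i]
        return s.boxes[i]
    }

    // Persist packs the bools into bits and writes them out (125 KB on disk).
    func (s *BoolStore) Persist(path string) error {
        s.mu.RLock()
        defer s.mu.RUnlock()
        buf := make([]byte, len(s.boxes)/8)
        for i, checked := range s.boxes[:] {
            if checked {
                buf[i/8] |= 1 << (i % 8)
            }
        }
        return os.WriteFile(path, buf, 0o644)
    }

    func main() {
        store := &BoolStore{}
        store.Toggle(42)

        // Block until SIGINT/SIGTERM, then flush state once on the way out.
        sig := make(chan os.Signal, 1)
        signal.Notify(sig, syscall.SIGINT, syscall.SIGTERM)
        <-sig
        if err := store.Persist("checkboxes.bin"); err != nil {
            panic(err)
        }
    }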
reply
stefs
1 month ago
[-]
one RW mutex would mean you'd lock the whole array; that way data access becomes pretty much single-threaded. simplest solution that comes to mind: AtomicIntegerArray (or whatever it is in your language of choice).

you could also implement a bitset over AtomicLongArray.

more complicated: partition into x*x chunks and rw-lock those. this could be backed by an mmap'ed million bytes for persistence, but no idea if that'd make the app disk-IO bound or something.
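
Translated to Go (which has no AtomicLongArray, but the same idea works with a plain slice of 64-bit words and compare-and-swap), a lock-free bitset might look roughly like this - a sketch, not production code:

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // AtomicBitset packs bits into 64-bit words; toggles use a CAS loop,
    // so no single lock serializes every writer.
    type AtomicBitset struct {
        words []uint64
    }

    func NewAtomicBitset(nbits int) *AtomicBitset {
        return &AtomicBitset{words: make([]uint64, (nbits+63)/64)}
    }

    // Toggle flips bit i and reports whether it is now set.
    func (b *AtomicBitset) Toggle(i int) bool {
        word := &b.words[i/64]
        mask := uint64(1) << (i % 64)
        for {
            old := atomic.LoadUint64(word)
            if atomic.CompareAndSwapUint64(word, old, old^mask) {
                return old&mask == 0 // the bit was 0, so it is now 1
            }
        }
    }

    func (b *AtomicBitset) Get(i int) bool {
        return atomic.LoadUint64(&b.words[i/64])&(uint64(1)<<(i%64)) != 0
    }

    func main() {
        bs := NewAtomicBitset(1_000_000)
        fmt.Println(bs.Toggle(42), bs.Get(42)) // true true
    }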

reply
summerlight
1 month ago
[-]
> persist on server shutdown

Probably this is not the simplest thing to do if you want a certain degree of reliability. It should definitely be easier than writing an entire storage engine, but it's likely overkill for this kind of overnight hobby project.

reply
10000truths
1 month ago
[-]
Sure. Everything described in the article could be crammed into a single process. Instead of using a database, you could store the bitset in a file and mmap it. And instead of using a reverse proxy, you could handle the HTTP requests and WebSocket connections directly from the application.
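
A sketch of that single-process, mmap-backed variant (Linux/Unix only, using golang.org/x/sys/unix; the file name is made up): the 125 KB bitmap lives in a file mapped into memory, so a toggle is just a byte write and the kernel handles persistence.

    package main

    import (
        "fmt"
        "os"

        "golang.org/x/sys/unix"
    )

    const numBoxes = 1_000_000

    func main() {
        f, err := os.OpenFile("checkboxes.bin", os.O_RDWR|os.O_CREATE, 0o644)
        if err != nil {
            panic(err)
        }
        defer f.Close()
        if err := f.Truncate(numBoxes / 8); err != nil { // 125 KB backing file
            panic(err)
        }

        // Map the file into memory; writes to `bits` land in the page cache
        // and reach disk via the kernel (or explicitly via Msync below).
        bits, err := unix.Mmap(int(f.Fd()), 0, numBoxes/8,
            unix.PROT_READ|unix.PROT_WRITE, unix.MAP_SHARED)
        if err != nil {
            panic(err)
        }
        defer unix.Munmap(bits)

        const idx = 42
        bits[idx/8] ^= 1 << (idx % 8) // toggle checkbox #42 in place
        if err := unix.Msync(bits, unix.MS_SYNC); err != nil {
            panic(err)
        }
        fmt.Println("bit 42 set:", bits[idx/8]&(1<<(idx%8)) != 0)
    }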
reply
intelVISA
1 month ago
[-]
All that cloud infra and Very Smart system design for what could be served by a single dedi & AF_XDP in C++/Rust.

Always wonder why clearly smart people put themselves through these heroic gymnastics just to avoid learning a bit of <sys/socket.h>

Still a cool write-up B).

reply
eieio
1 month ago
[-]
FWIW I think it'd be a huge mistake for me to spend a bunch of time learning to write performant code!

My takeaway from this experience is that I have no idea which of my sites is going to be popular - most of my projects are not nearly this successful - and that I should be optimizing for speed of creation, not speed of an individual project. My comparative advantage is "writing ok code and making ok decisions very quickly" and I'm going to continue to lean into that.

reply
paxys
1 month ago
[-]
This is pretty much as simple as it gets tbh. A couple of web servers backed by a cache/pubsub queue.

It may have been possible to do it all in-memory on a single large host, but then if it's unable to meet the demand or fails for whatever reason, you're completely out of luck.

reply
isoprophlex
1 month ago
[-]
I don't think it gets much simpler than this to be honest... except for un-scalable things like keeping a single global list of a million booleans in the same process as your backend api.
reply
wild_egg
1 month ago
[-]
Maybe dumb question but why is keeping a million booleans in one process un-scalable? That's only 125KB of memory which can easily live inside CPU cache and be operated on atomically.
reply
eieio
1 month ago
[-]
(I'm the author)

I don't think it's unscalable at all - it's just that if you do this there's not a great story for adding a second machine if the first one can't handle the load (but ofc a beefy machine with a fast implementation could handle a lot of load).

When we did the Go rewrite we considered just getting one beefier box and doing everything there (and maybe moving Redis state into memory), but it felt easier and safer to do a drop-in rewrite.

reply
Boxxed
1 month ago
[-]
But it's all going through one redis box, isn't it? That feels like you're still limited by your one beefy machine.
reply
eieio
1 month ago
[-]
Eh, like I said below, I think this gets into "do I think I can write something that scales substantially better than Redis" / "was I at risk of generating load that Redis couldn't handle" and I think the answer to both of those questions is no.

But that's based on my skillset and the timeframe of the project; I'm not disputing that the project is bounded by Redis's performance.

reply
alright2565
1 month ago
[-]
> there's not a great story for adding a second machine if the first one can't handle the load

I mean this is basically your redis situation right? Just with a very specialized "redis".

You could scale this out, even after the pretty massive ability to scale up is exceeded. Have some front-end servers that act as a connection pooler to your datastore. Or shard the datastore, and have clients only request from the shards that they are currently looking at.

reply
eieio
1 month ago
[-]
Right, and then the question is "is my specialized datastore gonna be faster than Redis", right? And it seems totally reasonable that you could make something faster eventually - but I think it's not a reasonable goal within the timeframe of the Go rewrite (one Sunday afternoon and evening). Especially if you want to extend that system so that other services can talk to it!

The entire timeframe of this project was 2 weeks, and the critical period (most activity / new eyes) was a couple of days.

reply
alright2565
1 month ago
[-]
Sorry, I'm talking hypothetically about how this would be designed, not in the context of your specific timeframe!

> "is my specialized datastore gonna be faster than Redis"

Absolutely! With how efficient this code would be, you'd likely never need to scale horizontally, and in that case an in-process read (L1 cache, <50ns) easily beats a network hop (at least 1ms of latency).

The comparison with Redis otherwise only applies once you do need to scale horizontally.

There's also the fact that Redis must be designed to support any query pattern efficiently; that's a much harder problem than supporting just 1-2 queries efficiently.

reply
eieio
1 month ago
[-]
> Sorry, I'm talking hypothetically about how this would be designed, not in the context of your specific timeframe!

Oh sure, yeah, I think we just agree then!

reply
isoprophlex
1 month ago
[-]
Oh no that's absolutely fine. I wasn't thorough. As commented by the author, that probably gets you very far.

However... You'd need to persist those booleans somewhere eventually, of course, if you want the state to survive a process restart. And if you want multiple concurrent connections from the same box, you have to somehow allow multiple writers to the same object. And if you want multiple boxes (for redundancy, load spreading, geo distribution...), you need a way to do the writing and reading from several different boxes...

By this time you're basically building redis.

reply
imtringued
1 month ago
[-]
Use a quad tree to summarize entire blocks of checkboxes with the same state as a tuple (checked, start_x, start_y, end_x, end_y). You know, the blatantly obvious things.
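
For what it's worth, here's a rough Go sketch of that summarization (the grid shape and names are assumptions): walk the 1000x1000 grid recursively, emit one (checked, x0, y0, x1, y1) region per uniform block, and split a block into four quadrants only when it mixes checked and unchecked boxes.

    package main

    import "fmt"

    const size = 1000 // 1,000 x 1,000 = one million checkboxes

    // Region summarizes a uniform block: every box in [X0,X1) x [Y0,Y1)
    // shares the same Checked state.
    type Region struct {
        Checked        bool
        X0, Y0, X1, Y1 int
    }

    // summarize appends uniform regions covering grid within the given bounds,
    // splitting into four quadrants whenever a block is mixed.
    func summarize(grid *[size][size]bool, x0, y0, x1, y1 int, out []Region) []Region {
        if x0 >= x1 || y0 >= y1 {
            return out
        }
        first, uniform := grid[y0][x0], true
        for y := y0; y < y1 && uniform; y++ {
            for x := x0; x < x1; x++ {
                if grid[y][x] != first {
                    uniform = false
                    break
                }
            }
        }
        if uniform {
            return append(out, Region{first, x0, y0, x1, y1})
        }
        mx, my := (x0+x1)/2, (y0+y1)/2
        out = summarize(grid, x0, y0, mx, my, out)
        out = summarize(grid, mx, y0, x1, my, out)
        out = summarize(grid, x0, my, mx, y1, out)
        out = summarize(grid, mx, my, x1, y1, out)
        return out
    }

    func main() {
        var grid [size][size]bool
        grid[0][0] = true // one checked box forces a handful of splits
        regions := summarize(&grid, 0, 0, size, size, nil)
        fmt.Println("regions:", len(regions))
    }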
reply
whynotmaybe
1 month ago
[-]
Awesome !

Will your next post be a statistical analysis of which checkboxes were the least/most checked?

I remember scrolling way down and being kind of sad that the one I chose was almost instantly unchecked.

reply
eieio
1 month ago
[-]
I’m gonna share the raw data soon! Just have one more story about the site I need to tell first.
reply
geek_at
1 month ago
[-]
Is the game still live?

When I go to https://onemillioncheckboxes.com/ nothing is checked and in the JS console I just see

{"total":0,"totalGold":0,"totalRed":0,"totalGreen":0,"totalPurple":0,"totalOrange":0,"recentlyChecked":false}

reply
kube-system
1 month ago
[-]
From TFA:

> We passed 650 million before I sunset the site 2 weeks later.

reply
geek_at
1 month ago
[-]
Ah thanks. Makes sense, since the server costs were $800+.
reply
huem0n
1 month ago
[-]
Here's the opposite of your scalable implementation: 1mil checkboxes in under 1000 characters! (Deno)

https://gist.github.com/jeff-hykin/4cdebafd8698298d021f103e2...

reply
creativenolo
1 month ago
[-]
> This also validated my belief that people are hungry for constrained anonymous interactions with strangers.
reply
butz
1 month ago
[-]
This project won't be complete until it supports the elusive "indeterminate" state of the checkbox.
reply
layer8
1 month ago
[-]
It should go indeterminate whenever clicks from two users on the same checkbox are detected where the second click occurred before that copy of the checkbox received the result of the first click.
reply
isoprophlex
1 month ago
[-]
TIL you can run Lua scripts on your Redis server. That's so nice, I never knew!
reply
junon
1 month ago
[-]
Really cool, awesome followup to the original project.
reply
j0hnyl
1 month ago
[-]
Would you be willing to share your nginx config?
reply