The weekend hack turned into a headless city simulation platform where anyone can get an API key (no signup) and have their AI agent play mayor. The simulation runs the real Micropolis engine inside Cloudflare Durable Objects, one per city. Every city is public and browsable on the site.
LLMs are awful at the spatial stuff, which sort of makes it extra fun as you try to control them when they scatter buildings randomly and struggle with power lines and roads. A little like dealing with a toddler.
There's a full REST API and an MCP server, so you can point Claude Code or Cursor at it directly. You can usually get agents building in seconds.
Website: https://hallucinatingsplines.com
API docs: https://hallucinatingsplines.com/docs
GitHub: https://github.com/andrewedunn/hallucinating-splines
Future ideas: Let multiple agents play a single city and see how they step all over each other, or a "conquest mode" where you can earn points and spawn disasters on other cities.
It'd be kind of fun to just let this run on a Raspberry Pi using a local model and display the emergent world on a wall-hanging display :P
Thanks for sharing.
Update: What would it take to run this locally / offline? I'm not quite sure how the Cloudflare layer works. Is it just for cheap/free object storage so the cities can live somewhere?
I don't think it would take much to run locally. In fact, before I did this public version I did a local version on an exe.dev VM (more details here: https://dunn.us/notes/vibe-gaming-simcity/).
So you can either use my code, or just have your coding agent of choice pull in the Micropolis repo and give it some guidance.
So far this is running quite nicely on a $5 Cloudflare account. It was running on a free account, but I upgraded so we don't hit the daily limit with all the extra mayors.
Shoot me a message if I can help.
PS: Absolutely nailed the name of the project :P "Hallucinating Splines" is genius.
And some kid is going to come in, make an agent to play this, and accidentally figure out some clever trick for getting an LLM to understand spatial stuff!
This is exactly why "toys" are so critical, especially now.
https://github.com/lawless-m/FacRepl
It did make a REPL, in order for it to place objects within the game using a DSL.
I kind of gave up on the Constraints Based bit, and never returned.
So while using LLMs is the natural/fun thing to do with it, I actually have one mayor just using parameterized code and natural selection.
It has a "genome" of 26 tunable parameters controlling zone ratios, tax rates, building placement, terrain preference, service spacing, and more. Each city, it stamps down 11x11 blocks (roads, zones, power corridors). After the city is retired, it scores the result and decides: did this beat my best? If yes, save those params. If no, mutate and try again. Exploration strategy: 20% exploit best params, 40% gentle mutation, 20% aggressive mutation, 20% totally random. Over ~250 cities it's discovered things like heavily favoring residential (6:1:1 ratio), preferring river valley maps, setting taxes to 6%, and starting builds in the upper-left.
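A minimal sketch of that loop, for anyone who wants to try the same idea. The 20/40/20/20 split and the "keep it only if it beats the best" rule come from the description above; everything else is an assumption for illustration: the four parameter names and ranges (the real genome has 26), the Gaussian mutation with 5% and 30% step sizes, and the `score_city` callback standing in for "build a city, retire it, score it".

```python
import random

# Hypothetical subset of the genome; names and ranges are illustrative only.
PARAM_RANGES = {
    "residential_weight": (1.0, 10.0),
    "commercial_weight": (1.0, 10.0),
    "industrial_weight": (1.0, 10.0),
    "tax_rate": (0.0, 20.0),
}


def random_genome(rng):
    """Draw every parameter uniformly from its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}


def mutate(genome, scale, rng):
    """Gaussian-perturb each parameter by `scale` times its range, then clamp."""
    child = {}
    for name, (lo, hi) in PARAM_RANGES.items():
        value = genome[name] + rng.gauss(0.0, scale * (hi - lo))
        child[name] = min(hi, max(lo, value))
    return child


def next_genome(best, rng):
    """Pick the next candidate using the 20/40/20/20 exploration split."""
    roll = rng.random()
    if roll < 0.20:
        return dict(best)               # exploit the best params unchanged
    if roll < 0.60:
        return mutate(best, 0.05, rng)  # gentle mutation (assumed step size)
    if roll < 0.80:
        return mutate(best, 0.30, rng)  # aggressive mutation (assumed step size)
    return random_genome(rng)           # totally random


def evolve(score_city, cities, seed=0):
    """Run `cities` trials; keep a candidate only if it beats the best score."""
    rng = random.Random(seed)
    best = random_genome(rng)
    best_score = score_city(best)
    for _ in range(cities):
        candidate = next_genome(best, rng)
        score = score_city(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

To try it without a simulator, pass a toy scorer such as `lambda g: -abs(g["tax_rate"] - 6.0)` and call `evolve(scorer, 250)`. In a real run, `score_city` would play a full city with those parameters and return its final score.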
"it's currently Flan Sam's at pokemon"
1. People discover things LLMs can kind of do, but very poorly.
2. Frontier labs sample these discoveries and incorporate them into benchmarks to monitor internally.
3. Next generation model improves on said benchmarks, and the improvements generalize to improvements on loosely correlated real world tasks.
https://github.com/andrewedunn/hallucinating-splines/blob/ma...
But you can tell it to do different things. Somewhere, someone made a city that spells "HI".
The key "Aha!" moment was when I was trying to get it to play the SNES ROM and it was struggling with screenshots/inputs. Then I came across the open-source release of the original SimCity engine (Micropolis), pulled that repo down, and Claude started building an internal API to interface with it.
But to read someone else's strategy from just a document, and then implement it, that is new. The old Civ did not do that; each AI just had pre-programmed rules.
Mayor Compounded Wonder - Claude Opus 4.6
https://hallucinatingsplines.com/mayors/compounded-wonder-2c...
Mayor Bronze Offramp - OpenAI Codex 3.6
https://hallucinatingsplines.com/mayors/bronze-offramp-09941...
TL;DR: Opus won.
I've also thought about using OpenRouter to run one mayor per model, all on the same prompt, to create potentially the world's dumbest LLM benchmark.
Which LLMs are you specifically referring to?
Are any of them trained with Micropolis data?