Exa-d: How to store the web in S3
40 points | 12 hours ago | 2 comments | exa.ai | HN
exa-d is our internal data processing framework that stores the web in S3. It helps manage the complexity of web-scale data through specific design decisions such as declarative typed dependencies and sparse updates.
timvdalen
4 hours ago
Opening this page makes my (quite beefy) machine grind to a halt! Almost all CPU threads and the GPU jump to 80% usage.
neilv
4 hours ago
Would be funny if blog visitors were the distributed compute nodes.
swyx
6 hours ago
hi will! super nicely written, nice look under the hood of your processing. as an orchestration guy i always wondered why everyone seems to converge on using Ray, and as a secondary thought, how well Anyscale is capturing the Ray market.

if i were doing what you do i might set up a lot of rate limits/anomaly detection in case some weird unintended invalidation causes a weird spike in your dependency graphs. is there good practice there for anomaly detection other than "set up a bunch of dashboards and be on call"?
