Show HN: Archil's one-click infinite, S3-backed local disks now available
Hey everyone, I’m Hunter, the founder of Archil. Archil is transforming object storage, like Amazon S3, into infinite, local file systems that provide instant access to massive data sets.

Last year, we launched Archil’s NFS-based product publicly on Hacker News (https://news.ycombinator.com/item?id=42174204), and we were absolutely thrilled to see the response from this community.

Since our last launch, we took a 10-month, company-wide bet to build our own custom storage protocol that delivers true, local-like performance to cloud instances by behaving more like a block storage product. Today, we are excited to announce the public availability of this high-performance protocol. You can try it for free today with a single click at https://disk.new (docs at https://docs.archil.com).

After spending 10 years in the storage industry building products like Amazon’s Elastic File System and buying AWS storage on Netflix’s core storage team, I knew that developers faced tremendous problems using storage. Getting easy-to-use persistent storage on a Kubernetes cluster is non-trivial, determining how much storage an application needs can be impossible if it’s bursty, and companies are tired of overpaying for either very expensive storage (like EFS) or unused capacity (like with EBS).

From my time on EFS, I also knew that NFS was not the solution. Users were disappointed to see a 10x performance drop when they moved data from EBS to EFS to share it across multiple instances.

This is why we built Archil.

Archil is a pay-as-you-go disk that automatically grows with your application. Because it synchronizes your data bidirectionally to sources like your Amazon S3 buckets (and other S3-compatible locations), you can connect it instantly to existing data sets and use your file data directly from S3 itself.

When you mount an Archil disk, your instance connects to our managed caching fleet of instances with NVMe devices that provide read-through and write-back caching to your disk’s data sources, like S3. Because we designed a distributed, replicated, highly durable cache, it’s a safe place for data to be stored before being written back to S3. We only charge users for data that’s actively in the cache, and when you aren’t using your disk, you don’t pay Archil anything.
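To make the read-through and write-back terms concrete, here is a toy sketch in Python. It is not Archil's implementation or protocol: the `WriteBackCache` class, its method names, and the use of plain dictionaries in place of S3 and the NVMe fleet are all assumptions made for illustration, and it ignores replication, eviction, and concurrency entirely.

```python
class WriteBackCache:
    """Toy read-through / write-back cache in front of a slower backing store."""

    def __init__(self, backing_store):
        self.backing = backing_store  # hypothetical stand-in for an S3 bucket
        self.cache = {}               # hypothetical stand-in for the cache fleet
        self.dirty = set()            # keys written to the cache but not yet flushed

    def read(self, key):
        # Read-through: on a miss, fetch from the backing store and keep a copy.
        if key not in self.cache:
            self.cache[key] = self.backing[key]
        return self.cache[key]

    def write(self, key, value):
        # Write-back: acknowledge once the cache holds the data; flush later.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Push every pending write to the backing store; return how many were sent.
        for key in sorted(self.dirty):
            self.backing[key] = self.cache[key]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed


bucket = {"a.txt": b"hello"}
disk = WriteBackCache(bucket)
first = disk.read("a.txt")            # miss, so the value is pulled from the bucket
disk.write("b.txt", b"world")         # lands in the cache only
pending_before = "b.txt" in bucket    # the bucket has not seen the write yet
flushed = disk.flush()                # now the bucket receives b.txt
```

The window between `write` and `flush` is why the durability of the cache itself matters: until the flush happens, the cache holds the only copy of the new data.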

We’re really excited about the early companies that we’re working with who are building cutting edge CI/CD workers, performing satellite image processing, building serverless Jupyter Notebooks, creating AI-native code sandboxes (have you ever tried to run git directly on an NFS share?), building best-in-class AI agents that use file systems instead of MCP tools, and helping to deliver core AI infrastructure like gateways and compute.

I’m excited to give you all access today, with no charges in the month of September, and I’d love any feedback on the product that you may have. If you’re interested in working on deep, technical problems in the core infrastructure space, we’re also hiring in San Francisco and would love to hear from you (see https://www.ycombinator.com/companies/archil/jobs/svfkDVv-fo...).

huntaub
1 hour ago
Here are answers to some quick questions that came up on the last post:

How are you different from existing products like S3 Mountpoint, S3FS, ZeroFS, ObjectiveFS, JuiceFS, and cunoFS?

Archil is designed to be a general-purpose storage system to replace networked block storage like EBS or Hyperdisk with something that scales infinitely, can be shared across multiple instances, and synchronizes to S3. Existing adapters that turn S3 into a file system are either not POSIX-compliant (such as Mountpoint for S3, S3FS, or Goofys), do not write data to the S3 bucket in its native format (such as JuiceFS and ObjectiveFS, which prevents using that data directly from S3), or are not designed for a fully managed, one-click setup (such as cunoFS). We have massive respect for folks who build these tools, and we’re excited that the data market is large enough for all of us to see success by picking different tradeoffs.

What regions can I launch an Archil disk in?

We’re live in 3 regions in AWS (us-east-1, us-west-2, and eu-west-1) and 1 region in GCP (us-central1). Today, we’re also able to deploy directly into on-premises environments and smaller GPU clouds. Reach out if you’re interested in an on-premises deployment (hleath [at] archil.com).

Can I mount Archil from a Kubernetes cluster?

Yes! We have a CSI driver that you can use to get ReadWriteOnce and ReadWriteMany volumes into your Kubernetes cluster.

What performance benchmarks can you share?

We generally don’t publish specific performance benchmarks, since they are easy to over-index on and often don’t reflect how real-world applications run on a storage system. In general, Archil disks provide ~1ms latency for hot data and can, by default, scale up to 10 GiB/s and tens of thousands of IOPS. Contact me at hleath [at] archil.com if you have needs that exceed these numbers.
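For a rough sense of what those default limits mean in practice, here is a back-of-the-envelope calculation in Python. It is not a benchmark: the 1 TiB data set and the 20,000 IOPS target are hypothetical numbers picked for illustration, and the concurrency estimate uses Little's law (requests in flight = request rate × latency) with the ~1ms hot-data latency quoted above.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def seconds_to_stream(size_bytes, throughput_bytes_per_s):
    """Best-case time to read a data set sequentially at a given throughput."""
    return size_bytes / throughput_bytes_per_s


def requests_in_flight(target_iops, latency_ms):
    """Little's law: concurrency needed to sustain a request rate at a given latency."""
    return target_iops * latency_ms / 1000


# Hypothetical 1 TiB data set at the 10 GiB/s default ceiling.
stream_time = seconds_to_stream(1 * TIB, 10 * GIB)
print(f"{stream_time:.1f} s")  # 102.4 s

# Hypothetical 20,000 IOPS target at ~1 ms per request.
concurrency = requests_in_flight(20_000, 1)
print(concurrency)  # 20.0
```

In other words, a single thread issuing one request at a time at ~1ms each tops out near 1,000 operations per second, so an application needs parallel or asynchronous I/O to approach the higher IOPS figures.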

What happens if your caching layer goes down before a write is synchronized to S3?

Our caching layer is, itself, highly durable (~5 9s). This means that once a write is accepted into our layer, no failure of an individual component (such as an instance or an AZ) would cause us to lose data.

What are you planning next for Archil?

By moving away from NFS and using our new, custom protocol, we have a great foundation for the performance work that we’re looking to accomplish in the next 6 months. In the short term, we plan to launch: one-click Lustre-like scale-out performance (run hundreds of GiB/s of throughput and millions of IOPS without provisioning), the ability to synchronize data from non-object storage sources (such as HuggingFace), and the ability to use multiple data sources on a single disk.

How can I learn more about how the new protocol works?

We’re planning on publishing a bunch more on the protocol in the coming weeks. Stay tuned!
