Show HN: Jbofs – explicit file placement across independent disks
4 points
1 hour ago
| 2 comments
| github.com
Hey all,

I created `jbofs` ("Just Bunch of File Systems") as an experiment to work around some recurring issues I have run into with surprising filesystem performance.

My use case is storing large media (think pcaps, or movie files), where some hierarchical organization is still useful.

Whether it’s NFS mounts (and caches), ZFS, or RAID (mis)configurations, I’ve run into surprising(ly bad) performance on many occasions. Doubtless this is largely user error, but it can be hard to diagnose what went wrong, and I’ve resorted to things like copying a file to `/tmp` or some other local mount with a simple ext4/XFS filesystem that I understand. When I see r/w happening at 200MB/s but know that 66GB/s is possible[1], it can be quite disheartening.

I’ve wanted something dead simple which peels back the curtains and provides minimal abstraction (and overhead) atop raw block devices. I’ve messed around with FUSE a bit, and did some simple configuration experiments on my machine (a workstation), but came back to wanting less, not more. I did do some RTFM with immediate alternatives[2], but could have missed something obvious -- let me know!

As a compromise to avoid implementing my own filesystems, I built this atop existing filesystems.

The idea is pretty simple -- copy files to separate disks/filesystems, and maintain a unified “symlink” view to the various underlying disks. Avoid magic and complication where possible. Keep strict filesystem conventions.
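A toy sketch of the idea in Python (hypothetical paths and layout for illustration; not jbofs's actual CLI or on-disk format):

```python
import os
import tempfile

# Hypothetical layout: two "disks" (here just temp dirs) plus a
# unified view directory that contains only symlinks into them.
disk1 = tempfile.mkdtemp(prefix="disk1-")
disk2 = tempfile.mkdtemp(prefix="disk2-")
view = tempfile.mkdtemp(prefix="view-")

# Place the file explicitly on disk1 -- the caller decides, no magic.
target = os.path.join(disk1, "movies", "big.mkv")
os.makedirs(os.path.dirname(target))
with open(target, "wb") as f:
    f.write(b"media bytes")

# Mirror the hierarchy in the unified view with a symlink.
link = os.path.join(view, "movies", "big.mkv")
os.makedirs(os.path.dirname(link))
os.symlink(target, link)

# Reads through the view hit the underlying disk's filesystem directly.
with open(link, "rb") as f:
    assert f.read() == b"media bytes"
```

The point is that I/O through the view goes straight to whichever plain ext4/XFS disk holds the file, with no extra layer in the data path.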

Of course, this is not a filesystem. Maybe that’s a bad thing -- which is one thing I’m trying to figure out with this experiment.

If you have experience with any of the alternatives[2] or similar tools, I'd love to get your feedback, and especially your thoughts on why this symlink approach is not the way to go!

Lastly, thanks for taking the time to look at this!

[1] https://tanelpoder.com/posts/11m-iops-with-10-ssds-on-amd-th...

[2] https://github.com/aozgaa/jbofs/blob/main/docs/comparison.md

silentvoice
1 hour ago
[-]
what are some bad behaviors you've seen with NFS/ZFS/RAID? how did you diagnose them, and how did that lead you to this solution?
reply
aozgaa
1 hour ago
[-]
NFS -- very slow reads, much slower than `cp /nfs/path/to/file.txt ~/file.txt`. I generally suspect these have to do with some pathological behavior in the app reading the file (eg: doing 1-byte reads when linearly scanning through the file). I diagnose with simple `iotop`, by timing the application doing the reads vs `cp`, and by looking at some plethora of random networking tools (eg: tcptop, ...). I've also very crudely looked at `top`/`htop` output to check that an app is not CPU-bound as a first guideline.
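To illustrate the 1-byte-read pathology (on a local temp file here; the effect only really hurts over NFS, where each `read()` can cost a network round trip):

```python
import os
import tempfile

# Write a 1 MiB test file (contents are arbitrary).
path = os.path.join(tempfile.mkdtemp(), "scan.bin")
data = os.urandom(1 << 20)
with open(path, "wb") as f:
    f.write(data)

def scan(chunk_size):
    """Linearly scan the file, counting read() calls."""
    calls, out = 0, bytearray()
    # buffering=0: each .read() maps to a read() syscall.
    with open(path, "rb", buffering=0) as f:
        while chunk := f.read(chunk_size):
            calls += 1
            out += chunk
    return calls, bytes(out)

byte_calls, a = scan(1)         # pathological: one syscall per byte
block_calls, b = scan(1 << 20)  # sane: a handful of large reads
assert a == b == data
assert byte_calls == 1 << 20
assert block_calls < byte_calls
```

Same bytes either way, but over NFS the first version turns one file scan into a million round trips.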

ZFS -- slow reads due to pool-level decompression. ZFS has its own utilities; iirc it's something like `zpool iostat` to see raw disk vs filesystem IO.

RAID -- with heterogeneous disks in something like RAID 6, you get the minimum disk's speed. This shows up when doing fio benchmarking (the first thing I do after setting up a new filesystem/mount). It could be that better software has ameliorated this since (I last checked something like 5ish years ago).
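For reference, the kind of minimal fio job file I mean (paths and sizes are placeholders; adjust per setup):

```ini
[seqread]
rw=read
bs=1M
size=4G
directory=/mnt/disk1
ioengine=libaio
direct=1
iodepth=32
```

Running the same job against each disk individually vs the array makes the minimum-disk-speed effect easy to spot.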

reply
emanuele-em
1 hour ago
[-]
Honestly this is way more appealing than fighting mergerfs when you just want explicit disk placement. Doctor + prune for orphaned symlinks is exactly what you'd need to keep things sane over time. Saw it's written in Zig, how's that been for this kind of systems tooling?
reply
aozgaa
47 minutes ago
[-]
> Honestly this is way more appealing than fighting mergerfs when you just want explicit disk placement. Doctor + prune for orphaned symlinks is exactly what you'd need to keep things sane over time.

That's the hope!

> Saw it's written in Zig, how's that been for this kind of systems tooling?

Zig has been pretty fine. It could just as well have been done in C/C++, but as a hobby thing I value (a) fast compilation (which rules out building stuff in C++ without jumping through hoops like avoiding the STL altogether) and (b) slightly fewer footguns than C.

The source code itself is largely written with LLMs (alternating between a couple of models/providers) and has a bit of cruft as a result. I've had to intervene on occasion and babysit small diffs to maintain some structural coherence; I think this is pretty par for the course. But I think having inline unit tests and instant compilation helps the models a lot. The line noise from `defer file.close();` or whatever seems pretty minor.

Zig makes build/distribution pretty easy since the resulting executable depends only on libc. I haven't really looked into packaging yet but imagine it will be pretty straightforward.

My one gripe would be that the stdlib behavior is a bit rough around the edges. I ran into an issue where a dir was open(2)'d with `O_PATH` by default, which makes basically all operations on it fail with `EBADF`. And the Zig stdlib convention is to panic on `EBADF`, which took a bit of reading Zulip/Ziggit to understand is a deliberate-ish convention.
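For anyone curious, the `EBADF` behavior is easy to reproduce outside Zig; a Linux-only sketch in Python (the semantics come from open(2)'s `O_PATH` flag, not anything Zig-specific):

```python
import errno
import os
import tempfile

# O_PATH (Linux) opens a descriptor that identifies the file but
# cannot be used for I/O, so read() on it fails with EBADF.
d = tempfile.mkdtemp()
fd = os.open(d, os.O_PATH | os.O_DIRECTORY)
err = None
try:
    os.read(fd, 16)
except OSError as e:
    err = e.errno
finally:
    os.close(fd)
assert err == errno.EBADF
```

Zig's choice to panic there (rather than surface an error) is what caught me off guard.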

All this to say, it's pretty reasonable: the language mostly gets out of the way and lets me make direct libc/syscalls where I want.

reply