I created `jbofs` ("Just Bunch of File Systems") as an experiment to work around some recurring issues I've run into with surprising filesystem performance.
My use case is storing large media (think pcaps, or movie files), where some hierarchical organization is still useful.
Whether it's NFS mounts (and caches), ZFS, or RAID (mis)configurations, I've run into surprising(ly bad) performance on many occasions. Doubtless this is largely user error, but it can be hard to diagnose what went wrong, and I've resorted to things like copying a file to `/tmp` or some other local mount with a simple ext4/XFS filesystem that I understand. When I see reads/writes happening at 200MB/s but know that 66GB/s is possible[1], it can be quite disheartening.
I've wanted something dead simple that peels back the curtain and provides minimal abstraction (and overhead) atop raw block devices. I've messed around with FUSE a bit, and did some simple configuration experiments on my machine (a workstation), but came back to wanting less, not more. I did do some RTFM on the immediate alternatives[2], but could have missed something obvious -- let me know!
As a compromise, to avoid implementing my own filesystem, I built this atop existing filesystems.
The idea is pretty simple -- copy files to separate disks/filesystems, and maintain a unified “symlink” view to the various underlying disks. Avoid magic and complication where possible. Keep strict filesystem conventions.
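To make that concrete, here's a minimal sketch (not jbofs's actual code; the paths are invented for illustration) of what "add a file" boils down to, assuming the standard Zig filesystem APIs:

```zig
const std = @import("std");

// Hypothetical sketch of the core idea: put the bytes on one backing
// disk, then expose them in the unified view via a symlink.
pub fn main() !void {
    const cwd = std.fs.cwd();

    // 1. Copy the file onto a specific backing filesystem.
    try cwd.copyFile(
        "incoming/capture.pcap",
        cwd,
        "/mnt/disk2/media/capture.pcap",
        .{},
    );

    // 2. Point the unified "view" tree at the file's real location.
    try cwd.symLink(
        "/mnt/disk2/media/capture.pcap",
        "view/media/capture.pcap",
        .{},
    );
}
```

Opening a file through the view then just resolves the symlink and goes straight to the underlying ext4/XFS mount, so there's no extra layer in the data path.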
Of course, this is not a filesystem. Maybe that’s a bad thing -- which is one thing I’m trying to figure out with this experiment.
If you have experience with these or similar filesystems/tools, I would love to get your feedback, and especially your thoughts on why this symlink approach is not the way to go!
Lastly, thanks for taking the time to look at this!
[1] https://tanelpoder.com/posts/11m-iops-with-10-ssds-on-amd-th...
[2] https://github.com/aozgaa/jbofs/blob/main/docs/comparison.md
ZFS -- slow reads due to pool-level decompression. ZFS has its own utilities; IIRC it's something like `zpool iostat` to see raw disk vs. filesystem IO.
RAID -- with heterogeneous disks in something like RAID 6, you're limited to the slowest disk's speed. This shows up when doing fio benchmarking (the first thing I do after setting up a new filesystem/mount). It could be that better software has ameliorated this since (I last checked something like 5 years ago).
That's the hope!
> Saw it's written in Zig, how's that been for this kind of systems tooling?
Zig has been pretty fine. It could just as well have been done in C/C++, but as a hobby thing I value (a) fast compilation (which rules out building in C++ without jumping through hoops like avoiding the STL altogether) and (b) slightly fewer footguns than C.
The source code itself is largely written with LLMs (alternating between a couple of models/providers) and has a bit of cruft as a result. I've had to intervene on occasion and babysit small diffs to maintain some structural coherence; I think that's pretty par for the course. But I think having inline unit tests and instant compilation helps the models a lot. The line noise from `defer file.close();` or whatever seems pretty minor.
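To illustrate what those inline tests look like (a toy example, not code from the repo):

```zig
const std = @import("std");

fn viewName(path: []const u8) []const u8 {
    // Toy helper: the name a file would get in the unified view.
    return std.fs.path.basename(path);
}

// Tests live right next to the code they cover and run with
// `zig test file.zig` (or a `zig build` test step) -- no framework
// needed, which makes a tight loop for checking LLM-written diffs.
test "viewName strips the backing-disk prefix" {
    try std.testing.expectEqualStrings(
        "capture.pcap",
        viewName("/mnt/disk2/media/capture.pcap"),
    );
}
```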
Zig makes build/distribution pretty easy since the resulting executable depends only on libc. I haven't really looked into packaging yet, but I imagine it will be pretty straightforward.
My one gripe would be that the stdlib behavior is a bit rough around the edges. I ran into an issue where a dir was open(2)'d with `O_PATH` by default, which then makes basically all operations on it fail with `EBADF`. And the Zig stdlib convention is to panic on `EBADF`, which took a bit of reading Zulip+Ziggit threads to understand is a deliberate-ish convention.
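For the curious, here's a rough sketch of that failure mode using raw `std.posix` so the `O_PATH` part is explicit (flag spellings and error mappings shift a bit between Zig versions, so treat this as illustrative):

```zig
const std = @import("std");
const posix = std.posix;

test "most operations on an O_PATH fd fail with EBADF" {
    // Open a directory with O_PATH: the fd identifies a location but
    // permits almost no operations on the contents (see open(2)).
    const fd = try posix.open(".", .{ .PATH = true, .DIRECTORY = true }, 0);
    defer posix.close(fd);

    // read(2) on an O_PATH fd fails with EBADF. std.posix.read surfaces
    // that as an error here, but higher-level std.fs code often treats
    // EBADF as unreachable -- i.e. a panic in safe builds.
    var buf: [16]u8 = undefined;
    try std.testing.expectError(error.NotOpenForReading, posix.read(fd, &buf));
}
```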
All this to say: it's pretty reasonable, the language mostly gets out of the way, and it let me make direct libc calls/syscalls where I wanted.