Mounting Git commits as folders with NFS
211 points
2 years ago
| 14 comments
| jvns.ca
| HN
mbakke
2 years ago
[-]
For those who resonate with "why might this be useful", here are "plain git" alternatives to this tool:

> searching for a function I deleted

    git log -G someFunc
> quickly looking at a file on another branch to copy a line from it

I use `git worktree` to "mount" long-running branches to much the same effect as Julia's tool. To quickly look at a file from a non-mounted branch/commit, I use:

    git show $REF:$FILENAME
> searching every branch for a function

    git log --all -G someFunc
Note that -G can be replaced with -F for a speedup if the pattern you are searching for is a fixed string.
reply
masklinn
2 years ago
[-]
> searching for a function I deleted

> git log -G someFunc

This will look for all changes mentioning someFunc throughout the history of the project.

Usually -S is more valuable, as it will look for changes in occurrence counts. So if you moved a call in a commit -G will flag it, but -S will ignore it (+1-1 = 0).

-S also defaults to fixed-string matching, so no need for -F. Instead you need --pickaxe-regex to switch it to regex search.
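A throwaway demo of the difference (the temp-dir paths, file name, and identity config are made up for the sketch):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
printf 'someFunc()\nother()\n' > a.txt
git add a.txt && git commit -qm 'add someFunc'
# Move the call without changing how often it occurs:
printf 'other()\nsomeFunc()\n' > a.txt
git commit -qam 'move someFunc'
git log --oneline -G someFunc | wc -l   # 2: -G flags the moved line
git log --oneline -S someFunc | wc -l   # 1: -S ignores it (+1-1 = 0)
```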

reply
globular-toast
2 years ago
[-]
Magit has a really easy to use way to "step" through previous versions of files. It's usually bound to something like "C-f p". You get a read only buffer of the previous version open in the best text editor (emacs). You can then press n and p to step through next and previous versions of that file. Can be pretty useful!

It's kind of funny, I think, how most git users don't seem to know how to access any version other than the current one. So many people think of it simply as the annoying tool you have to use to make code changes but don't really know what version control is.

reply
divbzero
2 years ago
[-]
That’s a pretty cool feature of Magit.

I was inspired to look for something similar for the next best text editor (vim) and came across this: https://salferrarello.com/using-vim-view-git-commits/

    git log | vim -R -
Placing your cursor over a commit hash and pressing K displays git show for that commit.
reply
sourcegrift
2 years ago
[-]
I've been told by my elders that when a vim user encounters an emacs supremacist, they must fight back. You can't just call it "the next best text editor".

Jokes aside, as a vim user of 6 years, I did learn just enough emacs for magit (TM) and have also been making quick bucks on the side teaching it to my friends, so I guess I can't help with the "fight back" part :-)

reply
cnity
2 years ago
[-]
If you're not using neovim you're really missing out right now IMO. It's a renaissance for a hackable text editor because it uses a sensible modern programming language (Lua) rather than vimscript (yikes) or elisp (eh).
reply
klibertp
2 years ago
[-]
Having worked with Lua and Elisp extensively, they have very different pros/cons profiles. In my experience, Lua is great as a scripting language - tiny, speedy, and completely dynamic. For "programming in the large(r)," Lua is just a little better than early JavaScript (i.e., tragic). The purity of the design - tables, metatables, closures, coroutines, and that's it - necessitates reinvention of tens of wheels (either in Lua or in the host app) when your codebase grows and complexity increases. Elisp provides two orders of magnitude more "bells and whistles" than Lua out of the box. Additionally, while Lua's primitives are extremely powerful, they are all strictly run-time constructs. Elisp has macros, so many abstractions can be (and are) shifted to compile time.

I use Awesome WM. It's essentially an "Emacs of Window Managers," and the codebase is very well written, with a small C core and everything else implemented in Lua. It's even very well documented. Yet, writing a nontrivial program (call it an "applet" or something) for Awesome is a nightmare compared to doing the same in Emacs.

LuaJIT is an excellent runtime, and Lua is a great IR, but writing it by hand for anything that's not strictly scripting within a previously established framework is challenging. It's to the point where I'm using Haxe to produce Lua for my Awesome scripts. I know a few people who use Haxe to script NeoVim, too. Really, having to reinvent inheritance and method resolution order every time you start writing Lua in a new project gets old fast.

I genuinely like Lua as a language - the same way I like Tcl, Scheme, and Io. They are all beautiful and powerful and perform very well in some scenarios. Elisp is ugly in comparison, but it's way more practical for medium-sized codebases. Being tied to Emacs is a considerable downside which limits its applicability, but focusing on language features alone, larger codebases are more practical to write in Elisp than in Lua. Plus, there's an escape hatch - Common Lisp or Clojure, pick your poison - for cases where Elisp actually doesn't cut it. There's no such easy way out for Lua.

reply
vfclists
2 years ago
[-]
> There's no such easy way out for Lua.

I think there are some Lisp/Clojure-inspired languages developed for Neovim which compile to Lua.

https://github.com/Olical/conjure

https://github.com/Olical/aniseed

https://github.com/Olical/nfnl

reply
globular-toast
2 years ago
[-]
You may be missing the point of Elisp. Elisp isn't an "extension language". It's the language Emacs is built with. When you run Emacs you're actually running a Lisp interpreter with a load of text editing features pre-loaded. When you eval some Lisp you're modifying the runtime, essentially live patching your editor in real time. So really any comparisons with Elisp are irrelevant unless you can do what it can do. Common Lisp and Scheme (Guile) are real contenders but the challenge is not giving up the enormous amount of useful code that is already written in Elisp.
reply
galangalalgol
2 years ago
[-]
Does the elisp interpreter that runs emacs have jit?
reply
fourthark
2 years ago
[-]
Only with v28, a year or two ago.
reply
klibertp
2 years ago
[-]
That's not a JIT. It uses `libgccjit` (IIRC the name), but the native code is produced ahead of time. JITs compile using info available at runtime, and native-comp doesn't do that. LuaJIT, by contrast, is a "real" JIT. Still, native-comp does speed things up considerably.
reply
cnity
2 years ago
[-]
I'm a big lisp fan. I know about all of this, I used emacs for maybe a decade, and I still don't like elisp. I love the hackability of Emacs, but it's OK to dislike the language itself. Disliking semantic choices of the language doesn't mean I'm missing the point either!

And on "it's not just an extensibility language": in my experience this doesn't matter. I get that "well the editor itself is half written in elisp" and so vaguely that is superior, but it is only so in an academic sense.

Expose the primitives for the editor in some API in _any_ language and you can basically achieve the same thing anyway, so pick a language that doesn't make me want to poke my eyeballs out with a hot skewer.

Sorry, rant over.

reply
globular-toast
2 years ago
[-]
I think if you're just talking about writing extensions/packages (like magit) then the difference is not as big. But for using Emacs it makes a big difference. I can just start hacking on package code by redefining functions etc. and using/testing them straight away. The power of Emacs is not about being able to write extensions (most editors can do that), it's about being able to write tiny little bits of code to change your editing experience as you go. There are specific things in Elisp that make it good for this, like dynamically scoped variables. Writing extensions for other editors is always a "thing", a project. Writing Elisp to change how Emacs works is just using Emacs.
reply
cnity
2 years ago
[-]
I will concede that you are probably a very different kind of Emacs user than I, since I pretty much exclusively used Elisp to set up and tune my editor to my liking as a totally independent act from actually using my editor to write programs.
reply
fiddlerwoaroof
2 years ago
[-]
For me, even when working on something besides my editor configuration, having access to the “editor primitives” makes for a lot of powerful one-off editing tools. It was relatively easy, for example, for me to use lsp features to get a list of undefined JS variables in the current scope and add them to the function argument list. And, since lsp and the other bits I put together are all in elisp, I could use jump-to-definition to quickly find the internals I need to make the change.
reply
klibertp
2 years ago
[-]
What you're describing is a feature of the system as a whole. You can have the same workflow with any dynamic, reflective environment. All Smalltalks give you the same ability to "jump to definition" of anything, turtles all the way down, and fiddle with those definitions. You could have the same ability in a system written in Lua - you just generally don't, because it requires designing the system as a whole specifically to allow it.
reply
fiddlerwoaroof
2 years ago
[-]
Sure, but I’m not particularly defending emacs lisp (would prefer Common Lisp). It’s not exactly true that any language can have it, though: the language has to be designed to handle redefinition correctly.
reply
klibertp
2 years ago
[-]
> the language has to be designed to handle redefinition correctly.

I believe it's more a question of how easy the redefinition is to implement. You can live-update a running Java or C program; it's just less convenient/harder to pull off than with Forth, Smalltalk, Lisp, Prolog, and the like. So I think that yes, in principle, every language could have it - it's just that you'd need a huge pile of hacks for some and a few simple instructions for others to get it.

reply
fiddlerwoaroof
2 years ago
[-]
I think it’s basically impossible to safely live-patch part of a compilation unit in most programming languages: you’d have to account for inlining and other optimizations to do this correctly. You _can_ patch at linkage seams and other places, but this is a fraction of the sorts of redefinitions that you get easily in systems designed for it. (And I’ve spent a lot of time trying to make various programming languages more Lispy so I can get stuff done: you always discover there are static presumptions that make it impossible to get the full experience)
reply
klibertp
2 years ago
[-]
> I think it’s basically impossible to safely live-patch part of a compilation unit in most programming languages

There's no argument there; you're right. That's why V and Nim, for example, put reloadable things in a separate compilation unit and handle some things (global state at least) specially upon reload (if I understand what they do correctly.)

My point was that you can get quite close (sometimes with a massive pile of hacks and/or developer inconvenience), not that you can get the full experience (as in Smalltalk or Lisp) everywhere. Especially since the reloading being convenient is a large part of the experience, I think.

reply
fiddlerwoaroof
2 years ago
[-]
Or I could use magit inside the best implementation of the vi standard.
reply
masklinn
2 years ago
[-]
My bread and butter is jumping via blame, though I don’t really like emacs’ blame view so I generally use intellij or git gui.

e.g. see something odd / interesting, activate blame view, and “show diff” / “annotate previous revision” to see how the hunk evolved. Often this provides fast-tracking through the file’s history as the hunk goes through inconsequential changes (e.g. reformatting, minor updates) without changing the gist of what sparked your interest.

reply
fiddlerwoaroof
2 years ago
[-]
vc-annotate in emacs is my favorite blame view I’ve seen, if you haven’t tried it.
reply
cnity
2 years ago
[-]
Similarly with fugitive in vim, which is fantastic. Diffing, resolving conflicts, and moving through file revisions (and a lot more).
reply
tambourine_man
2 years ago
[-]
If you’re a Vim user, fugitive by tpope is a great tool
reply
levidos
2 years ago
[-]
Extremely handy, saving these. Thank you.
reply
masklinn
2 years ago
[-]
> Git repositories sometimes have submodules. I don’t understand anything about submodules so right now I’m just ignoring them.

Submodules are interesting, because they’re next to unusable from a user perspective (they’re a pain to maintain and interact with unless you never ever update them), but they’re ridiculously simple technically, which I assume is what made them attractive.

A submodule is an entry in “.gitmodules” mapping a path to a repository URL (and branch), then at the specified path in the repository is a tree entry of mode 160000 (S_IFDIR + S_IFLNK), whose oid is the commit to check out (in the submodule-linked repository).
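A minimal sketch of those two pieces using throwaway temp repos (the paths and identity config are made up; recent git additionally needs `protocol.file.allow` for local-path submodules):

```shell
set -e
work=$(mktemp -d) && cd "$work"
# A tiny repository to link to:
git init -q dep
git -C dep -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m 'dep init'
# The superproject:
git init -q super && cd super
git config user.email demo@example.com
git config user.name demo
git -c protocol.file.allow=always submodule add "$work/dep" dep
git commit -qm 'add submodule'
cat .gitmodules        # the path -> URL mapping
git ls-tree HEAD dep   # the mode-160000 "commit" tree entry (gitlink)
```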

reply
AceJohnny2
2 years ago
[-]
To add to this, Submodules are a hack on Git's data model.

Git's data model, put simply, is this:

* Branches are pointers to Commit objects

* Commit objects are a composite of {Commit_Comment, Tree, Parent Commit(s)}, referenced by the hash of that set

* a Tree (like a directory) is a list of Blobs and/or Trees (associating filenames with them) referenced by the hash of its contents.

* a Blob is a file, referenced by the hash of its contents.

So the set of types of objects in Git are: Commits, Trees, and Blobs.

Note that I said that a Tree can contain other Trees or Blobs... but what if... you put a Commit in it!?

That's a submodule!

Now if the Commit you reference doesn't exist in the current repo, Git can't do anything with it. That's where the .gitmodules file comes in, to associate a given path with a repo, so that Git can look up the Commit object in that repo.

reply
masklinn
2 years ago
[-]
> Branches are pointers to Commit objects

Nit: the data model has refs, which are pointers to objects.

Branches are the subset of refs in the special-cased heads/ namespace which should be pointing to commit objects.

And there’s also tags, which are the subset of refs in the special-cased tags/ namespace, which should be pointing to commit (“lightweight”) or tag (“annotated”) objects.

reply
AceJohnny2
2 years ago
[-]
Absolutely.

Going further, the little-known git-notes [1] feature also uses its own reference namespace, `refs/notes/`

Going even further, Gerrit [2] leverages the wide-open reference namespace/directories to create its own. For example, pushing under the `refs/for/` namespace creates a new review, and specific reviews can be looked up under the `refs/changes/` namespace.

Even even further, Gerrit's special repos All-Users.git and All-Projects.git are "databases" for project configuration and user configuration, where for example external IDs (like usernames) are stored under the special `refs/meta/external-ids` ref/branch. This has the notable benefit that all configuration changes are tracked and auditable.

I believe git-appraise [3] also leverages special reference namespaces in Git for performing its review duties (but I don't know the details). Edit: actually no, it "just" leverages git-notes.

[1] https://git-scm.com/docs/git-notes

[2] https://gerrit-review.googlesource.com/Documentation/note-db...

[3] https://github.com/google/git-appraise

reply
masklinn
2 years ago
[-]
GitHub also leverages refs, for each PR there are two magical refs in refs/pull/<number>/ which point to the PR’s head and its prospective merged head
reply
colejohnson66
2 years ago
[-]
So, one could theoretically "embed" an older commit into their repository as a pointer (submodule folder)? And because Git knows what that commit ID means it will show it fine?
reply
AceJohnny2
2 years ago
[-]
Unfortunately no, because of how it's a hack, and so extra logic had to be tacked on to support it (logic which the `git submodule` command implements).

I tried it out, and the local commit object just appears as an empty folder.

To try for yourself, do this:

    git init recursive-submodule && cd recursive-submodule

    echo "foo" > file1 && git add file1 && git commit -m "Add file1"

    # this will have created an initial commit whose hash starts with 2d49d729 (yours will differ). Then:

    git update-index --add --cacheinfo 160000 2d49d729fe39d1def8ce537d7efeeabbf3efb4f2 submodule && git commit -m "Add submodule"
The `update-index` command is the plumbing command for adding an arbitrary object, which I used to add the previous commit object. Since we only updated the index and not the workspace, git will note that the submodule is missing. You can then run `git restore .` to set the workspace to the state the history says (ie with the missing submodule)... but that just creates an empty directory.

Vanilla git won't try much further. To actually populate the submodule requires running `git submodule update --init`, and that requires a `.gitmodules` file, even for a local commit object.

(To learn more about plumbing commands like update-index, look at the Git Book chapter on Internal Git Objects: https://git-scm.com/book/en/v2/Git-Internals-Git-Objects )

reply
nrclark
2 years ago
[-]
I feel like submodules are one of Git's most misunderstood features. I agree that they're really not great for the use-case of "I have to work in a bunch of repos at the same time", but they're also not designed for that.

Submodules are a really good solution for problems that look like "this repo depends on some upstream repos that I don't control", and a bad solution for any other problem. They do what they were designed to do.

Imagine that your build-script needs to clone a bunch of third-party dependencies. So maybe you write some kind of clone.sh that loops through a bunch of Git repo URLs. Then later you want to also specify specific commit hashes, so you add a commit-ID field. Then you write a tool that makes it easy to update the fields in your clone.sh file. Guess what you've got? Git submodules.
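A dry-run sketch of what such a clone.sh might look like (the URLs, hashes, and paths are all made up); submodules are essentially this manifest plus the tooling around it:

```shell
# Hypothetical pinned-dependency manifest: url, commit, checkout path.
manifest='
https://example.com/libfoo.git 1111111 third_party/libfoo
https://example.com/libbar.git 2222222 third_party/libbar
'
echo "$manifest" | while read -r url commit path; do
  [ -n "$url" ] || continue               # skip blank lines
  echo "git clone $url $path"             # dry run: print instead of running
  echo "git -C $path checkout $commit"    # pin to the recorded commit
done
```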

reply
darkwater
2 years ago
[-]
Git submodules can be useful for vendoring internal parts without code duplication. They can help if the language you're writing that repo in doesn't have a dedicated/advanced tool for dependency management. By using `git checkout --recurse-submodules` you get a poor man's version of a package system.

I'm not endorsing it as the best feature ever or as the way to do dependency management but it can be used in certain situations.

reply
okl
2 years ago
[-]
Git already has a similar feature. It's called worktree https://git-scm.com/docs/git-worktree

For example: `git worktree add <folder> <commit-hash>` checks out that commit in that folder.

reply
masklinn
2 years ago
[-]
Worktrees require checking out each branch individually.

And they’re something of a pain in the ass to manage, as you can’t have two worktrees on the same branch.

So for temporary querying across a few branches, doing clones is a lot easier (git will hardlink when cloning on the same FS so performance is not an issue, and you can just delete the scratch clones afterwards without needing to prune the worktrees), and if you regularly need to query things across a lot of branches, worktrees are unusable.
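A sketch of that scratch-clone workflow on a throwaway repo (paths and identity config are made up):

```shell
set -e
src=$(mktemp -d) && cd "$src"
git init -q
git config user.email demo@example.com
git config user.name demo
echo hi > f && git add f && git commit -qm init
# A scratch clone for cross-branch queries: on the same filesystem git
# hardlinks the object store, so this is cheap even for big repos...
scratch=$(mktemp -d)
git clone -q "$src" "$scratch"
git -C "$scratch" log --all --oneline
# ...and cleanup is a plain rm -rf, no `git worktree prune` needed:
rm -rf "$scratch"
```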

reply
Vohlenzer
2 years ago
[-]
I find it helpful to keep a few worktrees around for common tasks.

This saves me from throwing away my working copy when I switch to a different task:

  /source/repos/AcmeCorp            2fddd74f9a [bug/CurrentWork]
  /source/repos/AcmeCorp-hotfix     27175cf6c5 [hotfix/2023-11-27.1]
  /source/repos/AcmeCorp-master     016ca20b75 [master]
  /source/repos/AcmeCorp-reference  454be5348d [feature/RecentReviewedWork]
  /source/repos/AcmeCorp-release    95027177d7 [release9.12.0]
reply
vnorilo
2 years ago
[-]
TFA is for the actual data in the commit, not a parallel worktree.
reply
skadamat
2 years ago
[-]
We're big fans of NFS: we built a Rust library that implements NFS for a similar use case (mounting Git repos): https://about.xethub.com/blog/nfs-fuse-why-we-built-nfs-serv...
reply
evgpbfhnr
2 years ago
[-]
> stale file handles... This is still a problem and I’m not sure how to fix it. I don’t understand how real NFS servers do this, maybe they just have a really big cache?

While some implementations probably cache things, most filesystems have an underlying file handle which allows opening the file directly (see name_to_handle_at(2)/open_by_handle_at(2) on Linux; other systems also have something similar, it's just not necessarily exposed to userspace)

NFS servers can then just expose this along with some internal stuff (like export id for internal permissions checks etc)

This can lead to an interesting misfeature if a user can guess the handles: open_by_handle_at won't check that the user has access to the full path, so a malicious user could open some/subdir/file directly even if subdir itself isn't accessible to them. On most filesystems the handle is just something like the inode number plus some extra bits, and inodes can be contiguously allocated, so +1 can work. Thankfully, most recent filesystems allocate inodes pseudo-randomly, so this won't work well for most people.

(In this particular case, you could use the commit id plus a big hash table of paths inside the repo? Most large repos would fit in memory for a single revision, and the layout doesn't change all that much between commits, so that could work fairly well a bit past Linux-kernel size. For humongous repos some more tricks might be needed though.)

reply
EdSchouten
2 years ago
[-]
Worth mentioning that if you use NFSv4, file handles can be up to 128 bytes in size. That’s big enough that you can sometimes get away with dumping the entire file state in it.

For Buildbarn (a distributed build cluster for Bazel) I also wrote an NFSv4 server in Go to act as a lazy-loading file system for input files. As the remote execution protocol uses Content Addressable Storage, I ended up dumping the entire SHA-256, size and permission bits of files in the NFSv4 file handle. This allows me to reconstruct files as needed.
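A hypothetical sketch of that idea in shell (not Buildbarn's actual encoding): concatenate a file's content hash, size, and mode into a self-describing handle. In binary form that's 32 + 8 + ~2 bytes, comfortably under NFSv4's 128-byte limit, so the server needs no per-file state:

```shell
set -e
f=$(mktemp)
printf 'hello' > "$f"
hash=$(sha256sum "$f" | cut -d' ' -f1)   # 32-byte digest (64 hex chars)
size=$(stat -c%s "$f")                   # file size in bytes
mode=$(stat -c%a "$f")                   # permission bits
handle="$hash:$size:$mode"               # text form of the "handle"
echo "$handle"
```

Given only such a handle, the server can look the content up in its content-addressable store and reconstruct the file on demand.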

Some notes I made at the time (which are written with Buildbarn knowledge in mind):

https://github.com/buildbarn/bb-adrs/blob/master/0009-nfsv4....

reply
cesarb
2 years ago
[-]
> most filesystems have an underlying file handle which allow opening the file directly (see name_to_handle_at(2)/open_by_handle_at(2) on linux, but other systems also have something similar it's just not necessarily exposed to userspace)

On Windows, access to this is controlled by the confusingly named "Bypass traverse checking" (aka SeChangeNotifyPrivilege) permission (see for instance https://techcommunity.microsoft.com/t5/windows-blog-archive/... for some information about it; I recall once reading an article from either Raymond Chen or Larry Osterman explaining the naming of that permission, but can't find it at the moment).

reply
trollied
2 years ago
[-]
Reminds me of Rational ClearCase, which probably inspired the idea. You could specify a "view" using tags, and it'd present it as a filesystem to remote machines.

I think IBM now own them.

reply
bigstrat2003
2 years ago
[-]
It also reminds me of ClearCase, which is why I kind of hate this, lol. ClearCase is fucking awful to work with, the last thing I want is to turn git into it.
reply
codewiz
2 years ago
[-]
Several years ago, one of my customers deployed Rational ClearCase across the company. Every employee and consultant, me included, was signed up for a week-long course on how to use the "Rational Unified Process" and other nonsense like that.

By day 2, it was clear that they were selling us snake oil, but my customer was paying for my time, so I sat through it to the end. A few engineers in the room were genuinely hooked by the promise of generating 90% of the code from UML, integration streams, automatic "un-branching" and all that.

I wonder what they spent overall in licensing fees, training and lost productivity? And that's probably a fraction of the long-term damage dealt by this absurd process to a large C++ codebase.

Somewhere, I should still have a "degree" issued by Rational University :-)

reply
jerf
2 years ago
[-]
Wow. Circa 1999 my software engineering class, the college class that has the distinction of being the only class I've ever taken that I'm pretty sure I now disagree with literally everything that was taught in it, had us using that software suite to try to do that.

I recall it made a very uncompelling case, on the grounds that it was putatively the future of software engineering, and you could hardly right-click on anything without the stupid thing hard crashing. Pro-tip: If your software is the future of software engineering, an engineering student using your software to do exactly what it was designed to do should not be able to crash it in under two minutes. And then keep crashing it.

But even when it was working, it was virtually impossible to contort even a student assignment into that model. I can't imagine working somewhere that insisted on building production software that way, and I'm shocked they're still finding enough chumps to stay in business with that complete and utter pipe dream.

(In a nutshell, the primary problem with such systems is that they do not account for the fact that every entity in a diagram is a cost. Every entity in a diagram needs to carry a value in excess of that cost, preferably comfortably so. Any methodology that insists on a totalizing view of the world in which everything must be in a diagram will produce diagrams so cognitively expensive that they are just a blurred mass of diagram entities no easier to understand than the underlying code. The people pushing these systems build nice little demo diagrams with 10-15 entities in them that are easy to understand, and then incorrectly attribute the ease of understanding to the fact that it is a diagram, rather than the fact the diagram only contains a handful of entities! Then they build systems based on the utterly incorrect belief that all diagrams are simple. The results are entirely predictable when you see it from this point of view. The part that's mindblowing is just how hard some people need to be beaten over their skull with the fact that diagrams aren't necessarily simple once they've ingested this idea, no matter how many hundreds of insanely complicated diagrams they stare at over the years....)

reply
donaldihunter
2 years ago
[-]
Yeah, that's mostly nothing to do with ClearCase tho. Raw ClearCase was just a VCS. It was Unified Change Management (UCM) that brought in the Rational Unified Process garbage.
reply
donaldihunter
2 years ago
[-]
Clearcase also had version extended naming so you could access a specific revision as sort.c@@/main/bugfix/4 or tag as sort.c@@/RLS_1.3
reply
trollied
2 years ago
[-]
That was great for diffing/creating patches etc. A gitfs that exposed a repo's files like that would be great, but I suspect there are patents.
reply
roca
2 years ago
[-]
IBM bought them in 2003.
reply
franky47
2 years ago
[-]
Is there a project that does the reverse of this, namely mounting a filesystem where every write to a file in the repository becomes a Git commit?
reply
extraduder_ire
2 years ago
[-]
You mean like when nothing has the file open any more, or doing a commit with every flush to disk? I think the latter is a bit excessive.

One of the FUSE implementations mentioned in this post may do something like that.

reply
dekhn
2 years ago
[-]
I tried this with mod_dav_svn back in the day and never really managed to convince myself that commit-on-write is a good idea.
reply
franky47
2 years ago
[-]
File handle refcounting would be nice to avoid spamming Git with commits, indeed.
reply
eterps
2 years ago
[-]
I have been looking for that as well. Some sort of 'implicit versioning', a bit like how version history works for Google Docs [1], but instead for the filesystem.

[1] https://support.google.com/docs/answer/190843

reply
pacifika
2 years ago
[-]
reply
joh6nn
2 years ago
[-]
doesn't seem like the project is in active development anymore, but there's gitFS: https://github.com/presslabs/gitfs
reply
orf
2 years ago
[-]
It’s a shame this is NFSv3. It’s my understanding that NFSv4 is significantly faster to access due to pipelining requests?
reply
reverius42
2 years ago
[-]
Very nice! I really like the design choices here. Though I (edit: am biased and) would personally have used Rust and https://github.com/xetdata/nfsserve.
reply
eru
2 years ago
[-]
I've built something like this as well, but in FUSE for Linux, not via NFS.

It really brings home how close git's design already is to being a file system.

reply
bmacho
2 years ago
[-]
A virtual file system? How usable are those? I want something that can store slightly different data (modifies data on the fly), and one that I can use as a tagged file system (generates folder structure on the fly). Is that viable with FUSE/NFS? Do programs play nice with them (no caching, no recursive lookahead and such)?

Is it hard to write one? Which language did you use?

reply
eru
2 years ago
[-]
I wrote versions in Python and Rust. (And ended up contributing to libfuse and its Rust equivalent along the way.)

> Do programs play nice with them (no caching, no recursive lookahead and such)?

Other programs play nice with my peculiar file systems as far as I can tell: they just look like ordinary files and directories and symlinks etc to them.

I don't know how well that would work for your usecase. Most of the caching etc that I am aware of happens at the layer of the Linux kernel, and your FUSE daemon can configure what caching it wants or doesn't want.

I don't know much about NFS. I only used FUSE directly.

For me, presenting git as a filesystem was relatively easy to write. I think that comes down to two main factors:

* Git's internal structure is very close to how Linux (and especially fuse) do filesystems. That's probably not a big surprise, given that git and the Linux kernel are both projects started by the same guy.

* I exposed the git data as a read-only filesystem. In general, if you can avoid mutation, you can avoid a lot of complexity. (The main source of complexity in the Python version I wrote first was that we wanted changes in the contents of git branches to be reflected on the filesystem side. That's mutability flowing in only one direction (from git to the filesystem), but it was already annoying enough to deal with.)

On a more general note: I see filesystems mostly as one peculiar interface you can put in front of your data. So on some level git has an underlying database, and we just expose it. The same could be done for e.g. postgres or mongodb with a suitable fuse daemon.

But you can also think of ext4 being such a database with a weird UI layer on top. The main difference in practice is that they run their 'fuse daemon' in the kernel.

From what I've read the new bcachefs explicitly embraces this view of 'filesystems are just databases', and uses techniques borrowed from relational databases for its internal datastructures.

reply
rapnie
2 years ago
[-]
Functional but not complete. The project is looking for maintainers. Overall it is a pity how so many potentially great Rust projects are inactive and left in incomplete state.
reply
reverius42
2 years ago
[-]
It's not complete in its current state with respect to the NFS protocol, but is very usable as a base for things that work (for instance, https://github.com/xetdata/xet-core uses it to mount git(-xet) repos as directories, similarly to the linked post but with some different design choices). It is maintained for this use case and contributors are welcome for other use cases that aren't covered yet.
reply
CodeCompost
2 years ago
[-]
IMO one of the major flaws of Subversion is that branches are folders.
reply
masklinn
2 years ago
[-]
OTOH it’s also a major advantage of subversion: if you have long-running branches and need to fix something in all of them, all the fixes can be in the same commit across every branch, instead of needing a commit in each branch informally linked to the previous one (or formally so via a merge commit if every fix in an older branch has to get ported forwards, as an empty commit if not applicable).
reply
tambourine_man
2 years ago
[-]
I don’t understand why Apple doesn’t provide a rock solid first party FUSE implementation. This and a container story are the main things holding macOS back.
reply
mdaniel
2 years ago
[-]
I believe both of your complaints have the same underlying cause: they are Linux-centric (that goes extra for the "container story" part since I doubt very seriously that you have a use case for running containerized Darwin executables)

I believe Apple's version of FUSE is https://developer.apple.com/documentation/fileprovider/nonre... and, while I am absolutely certain someone will point out things that FUSE has that the macOS version doesn't, I'm only saying that, just like Windows, macOS has its own way of doing things and thus very little incentive to implement someone else's vision.

reply
tambourine_man
2 years ago
[-]
Apple goes to great lengths to increase security while annoying its users. Containers could help in that regard. Having an app (and GUI too) run in a container of sorts could avoid the countless dialog boxes we’ve grown accustomed to.

Regarding FUSE, there have been many implementations for macOS, just not very good ones. I don’t see a fundamental limitation on XNU, Darwin, Mach.

I haven’t thought of File Provider in this way, but maybe it’s a good solution. Apple seems reasonably committed to it, so that’s encouraging.

reply
callalex
2 years ago
[-]
If it doesn’t increase revenue that they can tax at 30% they won’t bother to touch it.
reply
avgcorrection
2 years ago
[-]
> The main reason I wanted to make this was to give folks some intuition for how git works under the hood. After all, git commits really are very similar to folders – every Git commit contains a directory listing of the files in it, and that directory can have subdirectories, etc.

> It’s just that git commits aren’t actually implemented as folders to save disk space.

Okay. So what. People are used to archive formats.
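For anyone who wants to check the quoted claim rather than take it on faith: a git tree object really is a binary directory listing. Each entry is "<mode> <name>", a NUL byte, then a raw 20-byte SHA-1 of the blob or subtree. A minimal parser over synthetic data (a sketch, not git's own code):

```python
def parse_tree(data):
    """Yield (mode, name, hex_sha) entries from a raw tree object payload."""
    i = 0
    while i < len(data):
        nul = data.index(b"\0", i)
        mode, name = data[i:nul].split(b" ", 1)
        sha = data[nul + 1:nul + 21]  # raw 20-byte SHA-1 follows the NUL
        yield mode.decode(), name.decode(), sha.hex()
        i = nul + 21

# Synthetic tree with one file and one subdirectory (fake SHAs):
fake_tree = (b"100644 README\0" + bytes(range(20)) +
             b"40000 src\0" + bytes(range(20)))

for mode, name, sha in parse_tree(fake_tree):
    print(mode, name, sha[:8])
```

Mode 100644 marks a regular file and 40000 a subtree, which is exactly the "directory that can have subdirectories" structure the post describes.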

> git worktree also lets you have multiple branches checked out at the same time, but to me it feels weird to set up an entire worktree just to look at 1 file.

But one directory per commit is less weird.

reply
kristopolous
2 years ago
[-]
This reminds me of the SVN days.
reply
assimpleaspossi
2 years ago
[-]
I just hate these things where people use the terms "folder" and "directory" as if they were the same thing. They are not and I wish some people would wise up to that.

Yes. This is a pet peeve of mine.

reply
ongy
2 years ago
[-]
Please don't just rant about it, but explain, or link to an explanation, the difference so others can see if there's actually a substantial difference.
reply
arbitrandomuser
2 years ago
[-]
> Please don't just rant about it, but explain, or link to an explanation, the difference so others can see if there's actually a substantial difference.

Fairly easy to look up, and I just did: https://stackoverflow.com/questions/5078676/what-is-the-diff... But I agree, comments like this come across as smug and unhelpful. A few short lines of explanation and a link would have been much more helpful and would greatly improve the quality of discussion on this site.

reply
avgcorrection
2 years ago
[-]
The top answer

> > There is a difference between a directory, which is a file system concept, and the graphical user interface metaphor that is used to represent it (a folder).

So there really is no difference in a technical (programmers talking) context. :)

reply
crazygringo
2 years ago
[-]
Yeah, 99.9% of the time they are interchangeable.

There are exceptions where a GUI folder doesn't have a corresponding directory (e.g. "Control Panel") or when a directory doesn't appear as a GUI folder (a mounted volume, maybe?).

But if you're just talking about regular everyday use cases for organizing files, folders are directories and directories are folders. In those cases, there's zero reason to distinguish the two, no matter whether you happen to be using a GUI or terminal at the moment.

reply
assimpleaspossi
2 years ago
[-]
And yet you listed several reasons why folders are not directories and vice-versa.

No computer system has a "make_folder" command on the command line, but every system has a "make_directory" (mkdir) or "change_directory" (cd) and so on. If there is little to no difference, then why is this so?

Why are there no folders from the command line?

reply
crazygringo
2 years ago
[-]
> And yet you listed several reasons why folders are not directories and vice-versa.

No I didn't. I listed the exceptions where they are not the same thing. The fact that they're exceptions implies the existence of a rule, which is that they're usually the same.

Generally speaking, on every Mac and Windows system I use, when I create a directory, I see a folder created in the GUI. And when I create a folder in the GUI, I see a directory created in the terminal.

So pretty sure that, aside from rare exceptions, they're identical and synonymous. So as long as you're not dealing with an exceptional situation, the terminology is literally interchangeable.

> Why are there no folders from the command line?

For historical terminology reasons, mostly. By the time GUIs introduced a new terminology, all the terminal commands had long since been named.

I really don't understand what overall point you're trying to make. But if you create a directory in the terminal that shows up in the GUI as a folder, it's 100% perfectly correct to say you created a folder in the terminal. Because you did.

reply
assimpleaspossi
2 years ago
[-]
And what happens when you are not using a GUI? Aren't you incapable of creating folders? If a folder is the same thing, why can't you create one then?

A folder does not exist without a GUI. A folder "might" represent a clickable interface that shows the same contents as a directory but a browser will read the contents of that directory using pathnames and not folders.

And mounting a disk drive onto a folder makes no sense at all.

Your "for historical reasons" is a made up reason unless you can point to an authoritative source for that. If nothing else, the source to Wikipedia in this thread says what I say and not what you wrote.

reply
crazygringo
2 years ago
[-]
Sorry, but this is just being overly pedantic without any practical value at all.

Of course you can create a folder without a GUI, because in all cases but a few exceptions, a folder is a directory. "mkdir" creates a folder, regardless of the name of the command. When I load my GUI later, I see the folder.

Until now, I have never knowingly met anybody who thought there was any value in drawing a distinction between the two for the basic purposes of routine file organization. So I think you're fighting a lost cause here.

Good luck with your crusade though! Sorry our usage of the term "folder" bothers you so much.

reply
assimpleaspossi
2 years ago
[-]
What you call pedantic I call technically correct. Nowadays too many people make up words that have no meaning and misuse words but think that's OK because "you know what I mean". No, it is not clear what you mean when you use your Windows terminology while I'm working on my FreeBSD server attempting to mount hardware.
reply
dag11
2 years ago
[-]
> No computer system, from the command line, has a command that allows you to "make_folder" but every system has a "make_directory" (mkdir) or "change_directory" (cd) and so on. If there is little to no difference, then why is this so?

But doesn't that prove they're the same thing? If they were different things, surely there would be different commands for creating a folder versus creating a directory. But there's just one command, and nobody is going to make a command called `mkdirakafol` ("make directory a.k.a. folder"). Sometimes multiple names can mean the same thing.

Like, a "trunk" and a "boot" can refer to the same thing. So what you just said is like "my car has an Open Trunk button, not an Open Boot button, so surely those are different concepts."

I've fallen for xkcd 386...

reply
boffinAudio
2 years ago
[-]
There is no difference between a directory and a folder except the term used to refer to the special kind of file that can hold links to other files. :)

Bonus points if you can find any kind of modern user that knows the difference between a FILE and a FOLDER (directory) ..

reply
assimpleaspossi
2 years ago
[-]
I thought technical people on HN would know the difference. Of course, my complaint is about technical people not knowing the difference and more evidence is here in the comments.
reply
dag11
2 years ago
[-]
What's the difference? When I select "New Folder" on my computer's file browser, it creates a directory. Your comment is the first time I'm hearing they're not interchangeable.
reply
assimpleaspossi
2 years ago
[-]
What do you do when you don't have a GUI--you are on the command line--and there is no "New Folder" to select?

Do you mount a disk drive onto a folder? How does that make sense?

reply
creatonez
2 years ago
[-]
> Do you mount a disk drive onto a folder?

Yes? It looks exactly the same as a folder in the UI either way.

reply
assimpleaspossi
2 years ago
[-]
What it looks like and what it is are two different things. A folder is a graphical representation. How do you stuff a hard drive into a manila paper folder? How do I create a folder on a BSD server when there is no such term on a BSD server (but there is on a Windows machine)?

Why does even Windows use both terms if, as some want to claim, they have the same meaning? Why does Wikipedia state they are not the same thing?

reply
emsimot
2 years ago
[-]
What's the difference?
reply
DonHopkins
2 years ago
[-]
On the Mac, a folder can have a "/" in its name, but a directory can't. Don't believe me? I bet you $10. Paypal my winnings to don@donhopkins.com please!
reply