> searching for a function I deleted
git log -G someFunc
> quickly looking at a file on another branch to copy a line from it
I use `git worktree` to "mount" long-running branches, much to the same effect as Julia's tool. To quickly look at a file from a non-mounted branch/commit, I use:
git show $REF:$FILENAME
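For example, to grab a file from another branch or an old commit without checking anything out (the branch, tag, and paths here are hypothetical):
git show origin/main:src/config.yaml
git show v1.2.0:README.md
git show HEAD~3:docs/notes.txt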
> searching every branch for a function
git log --all -G someFunc
Note that -G can be replaced with -F for a speedup if the pattern you are searching for is a fixed string.
> git log -G someFunc
This will look for all changes mentioning someFunc throughout the history of the project.
Usually -S is more valuable, as it looks for changes in occurrence counts. So if you moved a call in a commit, -G will flag it but -S will ignore it (+1-1 = 0).
-S also defaults to fixed string, so no need for -F. Instead you need --pickaxe-regex to switch it to regex search.
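A quick sketch of the difference (function name hypothetical):
git log -G someFunc                         # regex; flags any commit whose diff has a line matching someFunc, even a pure move
git log -S someFunc                         # fixed string; only commits where the occurrence count of someFunc changes
git log --pickaxe-regex -S 'someFunc[0-9]*' # -S, but treating its argument as a regex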
It's kind of funny, I think, how most git users don't seem to know how to access any version other than the current one. So many people think of it simply as the annoying tool you have to use to make code changes but don't really know what version control is.
I was inspired to look for something similar for the next best text editor (vim) and came across this: https://salferrarello.com/using-vim-view-git-commits/
git log | vim -R -
Placing your cursor over a commit hash and entering K displays git show for that commit.
Jokes aside, as a vim user of 6 years, I did learn just enough emacs for magit (TM) and have also been making quick bucks on the side teaching it to my friends, so I guess I can't help with the "fight back" part :-)
I use Awesome WM. It's essentially an "Emacs of Window Managers," and the codebase is very well written, with a small C core and everything else implemented in Lua. It's even very well documented. Yet, writing a nontrivial program (call it an "applet" or something) for Awesome is a nightmare compared to doing the same in Emacs.
LuaJIT is an excellent runtime, and Lua is a great IR, but writing it by hand for anything that's not strictly scripting within a previously established framework is challenging. It's to the point where I'm using Haxe to produce Lua for my Awesome scripts. I know a few people who use Haxe to script NeoVim, too. Really, having to reinvent inheritance and method resolution order every time you start writing Lua in a new project gets old fast.
I genuinely like Lua as a language - the same way I like Tcl, Scheme, and Io. They are all beautiful and powerful and perform very well in some scenarios. Elisp is ugly in comparison, but it's way more practical for medium-sized codebases. Being tied to Emacs is a considerable downside which limits its applicability, but focusing on language features alone, larger codebases are more practical to write in Elisp than in Lua. Plus, there's an escape hatch - Common Lisp or Clojure, pick your poison - for cases where Elisp actually doesn't cut it. There's no such easy way out for Lua.
I think there are some Lisp/Clojure-inspired languages developed for Neovim that compile to Lua.
https://github.com/Olical/conjure
And on "it's not just an extensibility language": in my experience this doesn't matter. I get that "well the editor itself is half written in elisp" and so vaguely that is superior, but it is only so in an academic sense.
Expose the primitives for the editor in some API in _any_ language and you can basically achieve the same thing anyway, so pick a language that doesn't make me want to poke my eyeballs out with a hot skewer.
Sorry, rant over.
I believe it's more a question of how easy the redefinition is to implement. You can live-update a running Java or C program, it's just less convenient/harder to pull off than with Forth, Smalltalk, Lisp, Prolog, and the like. So I think that yes, in principle, every language could have it - it's just that you'd need a huge pile of hacks for some and a few simple instructions for others to get it.
There's no argument there; you're right. That's why V and Nim, for example, put reloadable things in a separate compilation unit and handle some things (global state at least) specially upon reload (if I understand what they do correctly).
My point was that you can get quite close (sometimes with a massive pile of hacks and/or developer inconvenience), not that you can get the full experience (as in Smalltalk or Lisp) everywhere. Especially since the reloading being convenient is a large part of the experience, I think.
e.g. see something odd / interesting, activate blame view, and “show diff” / “annotate previous revision” to see how the hunk evolved. Often this provides fast-tracking through the file’s history as the hunk goes through inconsequential changes (e.g. reformatting, minor updates) without changing the gist of what sparked your interest.
Submodules are interesting, because they’re next to unusable from a user perspective (they’re a pain to maintain and interact with unless you never ever update them) but they’re ridiculously simple technically which I assume is what made them attractive.
A submodule is an entry in “.gitmodules” mapping a path to a repository URL (and branch), then at the specified path in the repository is a tree entry of mode 160000 (S_IFDIR + S_IFLNK), whose oid is the commit to check out (in the submodule-linked repository).
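Concretely, for a hypothetical submodule at lib/foo, the two halves look like this (URL and hash illustrative):
$ cat .gitmodules
[submodule "lib/foo"]
	path = lib/foo
	url = https://example.com/foo.git
$ git ls-tree HEAD lib/foo
160000 commit 2d49d729fe39d1def8ce537d7efeeabbf3efb4f2	lib/foo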
Git's data model, put simply, is this:
* Branches are pointers to Commit objects
* Commit objects are a composite of {Commit_Comment, Tree, Parent Commit(s)}, referenced by the hash of that set
* A Tree (like a directory) is a list of Blobs and/or Trees (associating filenames with them), referenced by the hash of its contents.
* A Blob is a file, referenced by the hash of its contents.
So the types of objects in Git are: Commits, Trees, and Blobs.
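You can poke at all three with plumbing commands (paths hypothetical):
git cat-file -p HEAD            # a Commit: prints its tree, parent(s), author, message
git cat-file -p 'HEAD^{tree}'   # a Tree: one mode/type/hash/name entry per file or subtree
git cat-file -p HEAD:file1      # a Blob: prints the file's contents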
Note that I said that a Tree can contain other Trees or Blobs... but what if... you put a Commit in it!?
That's a submodule!
Now if the Commit you reference doesn't exist in the current repo, Git can't do anything with it. That's where the .gitmodules file comes in, to associate a given path with a repo, so that Git can look up the Commit object in that repo.
Nit: the data model has refs, which are pointers to objects.
Branches are the subset of refs in the special-cased heads/ namespace which should be pointing to commit objects.
And there’s also tags, which are the subset of refs in the special-cased tags/ namespace, which should be pointing to commit (“lightweight”) or tag (“annotated”) objects.
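This is easy to see on any repo (tag name hypothetical):
git for-each-ref        # lists every ref: refs/heads/*, refs/tags/*, refs/remotes/*, ...
git cat-file -t v1.2.0  # prints 'tag' for an annotated tag, 'commit' for a lightweight one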
Going further, the little-known git-notes [1] feature also uses its own reference namespace, `refs/notes/`
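e.g. (note text hypothetical):
git notes add -m "benchmarked: no regression" HEAD   # stored under refs/notes/commits by default
git notes show HEAD                                  # read the note back
git log --notes                                      # show notes alongside the log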
Going even further, Gerrit [2] leverages the wide-open reference namespace/directories to create its own. For example, pushing under the `refs/for/` namespace creates a new review, and specific reviews can be looked up under the `refs/changes/` namespace.
Even even further, Gerrit's special repos All-Users.git and All-Projects.git are "databases" for project configuration and user configuration, where for example external IDs (like usernames) are stored under the special `refs/meta/external-ids` ref/branch. This has the notable benefit that all configuration changes are tracked and auditable.
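Day to day that looks like this (branch and change number hypothetical):
git push origin HEAD:refs/for/main          # push to the magic ref to open a review
git fetch origin refs/changes/45/12345/1    # fetch patchset 1 of change 12345
git checkout FETCH_HEAD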
I believe git-appraise [3] also leverages special reference namespaces in Git for performing its review duties (but I don't know details). Edit: actually no, it "just" leverages git-notes.
[1] https://git-scm.com/docs/git-notes
[2] https://gerrit-review.googlesource.com/Documentation/note-db...
I tried it out, and the local commit object just appears as an empty folder.
To try for yourself, do this:
git init recursive-submodule && cd recursive-submodule
echo "foo" > file1 && git add file1 && git commit -m "Add file1"
# this will have created an initial commit with hash 2d49d729. Then:
git update-index --add --cacheinfo 160000 2d49d729fe39d1def8ce537d7efeeabbf3efb4f2 submodule && git commit -m "Add submodule"
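At this point the gitlink is visible in the tree (the hashes will differ on your machine):
git ls-tree HEAD
# 100644 blob 257cc564...	file1
# 160000 commit 2d49d729...	submodule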
The `update-index` command is the plumbing command for adding an arbitrary object, which I used to add the previous commit object. Since we only updated the index and not the workspace, git will note that the submodule is missing. You can then run `git restore .` to set the workspace to the state the history says (i.e. with the missing submodule)... but that just creates an empty directory.
Vanilla git won't try much further. To actually populate the submodule requires running `git submodule update --init`, and that requires a `.gitmodules` file, even for a local commit object.
(To learn more about plumbing commands like update-index, look at the Git Book chapter on Internal Git Objects: https://git-scm.com/book/en/v2/Git-Internals-Git-Objects )
Submodules are a really good solution for problems that look like "this repo depends on some upstream repos that I don't control", and a bad solution for any other problem. They do what they were designed to do.
Imagine that your build-script needs to clone a bunch of third-party dependencies. So maybe you write some kind of clone.sh that loops through a bunch of Git repo URLs. Then later you want to also specify specific commit hashes, so you add a commit-ID field. Then you write a tool that makes it easy to update the fields in your clone.sh file. Guess what you've got? Git submodules.
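In other words, roughly this, with the bookkeeping done for you (URL and pinned hash hypothetical):
# the hand-rolled version
git clone https://example.com/libfoo.git third_party/libfoo
git -C third_party/libfoo checkout 2d49d729
# the submodule equivalent: URL and pinned commit recorded in the repo itself
git submodule add https://example.com/libfoo.git third_party/libfoo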
I'm not endorsing it as the best feature ever or as the way to do dependency management but it can be used in certain situations.
For example: `git worktree add <folder> <commit-hash>` checks out that commit in that folder.
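So a throwaway inspection can look like this (path and tag hypothetical):
git worktree add ../repo-v1.2 v1.2.0   # detached checkout of the tag in a sibling directory
# ...poke around, diff, copy what you need...
git worktree remove ../repo-v1.2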
And they’re something of a pain in the ass to manage, as you can’t have two worktrees on the same branch.
So for temporary querying across a few branches, doing clones is a lot easier (git will hardlink when cloning on the same FS, so performance is not an issue, and you can just delete the scratch clones afterwards without needing to prune the worktrees), and if you regularly need to query things across a lot of branches, worktrees are unusable.
This saves me from throwing away my working copy when I switch to a different task:
/source/repos/AcmeCorp 2fddd74f9a [bug/CurrentWork]
/source/repos/AcmeCorp-hotfix 27175cf6c5 [hotfix/2023-11-27.1]
/source/repos/AcmeCorp-master 016ca20b75 [master]
/source/repos/AcmeCorp-reference 454be5348d [feature/RecentReviewedWork]
/source/repos/AcmeCorp-release 95027177d7 [release9.12.0]
While some implementations probably cache things, most filesystems have an underlying file handle which allows opening the file directly (see name_to_handle_at(2)/open_by_handle_at(2) on Linux; other systems also have something similar, it's just not necessarily exposed to userspace).
NFS servers can then just expose this along with some internal stuff (like export id for internal permissions checks etc)
This can lead to an interesting misfeature if a user can guess the handles: on most filesystems a handle is just something like the inode number plus some extra bits that, depending on the filesystem, can be contiguously allocated, so +1 can work. And open_by_handle_at won't check that the user has access to the full path, so a malicious user could access some/subdir/file if the file itself is readable to them, even if subdir isn't accessible. Thankfully, most recent filesystems allocate inodes pseudo-randomly, so this won't work well for most people.
(In this particular case, you could use commit id + a big hash table of paths inside the repo? Most large repos would fit in memory for a single revision, and the layout doesn't change all that much between commits, so that could work fairly well a bit past Linux-kernel size. For humongous repos some more tricks might be needed though.)
For Buildbarn (a distributed build cluster for Bazel) I also wrote an NFSv4 server in Go to act as a lazy-loading file system for input files. As the remote execution protocol uses Content Addressable Storage, I ended up dumping the entire SHA-256, size and permission bits of files in the NFSv4 file handle. This allows me to reconstruct files as needed.
Some notes I made at the time (which are written with Buildbarn knowledge in mind):
https://github.com/buildbarn/bb-adrs/blob/master/0009-nfsv4....
On Windows, access to this is controlled by the confusingly named "Bypass traverse checking" (aka SeChangeNotifyPrivilege) permission (see for instance https://techcommunity.microsoft.com/t5/windows-blog-archive/... for some information about it; I recall once reading an article from either Raymond Chen or Larry Osterman explaining the naming of that permission, but can't find it at the moment).
I think IBM now own them.
By day 2, it was clear that they were selling us snake oil, but my customer was paying for my time, so I sat through it to the end. A few engineers in the room were genuinely hooked by the promise of generating 90% of the code from UML, integration streams, automatic "un-branching" and all that.
I wonder how much they spent overall on licensing fees, training, and lost productivity? And that's probably a fraction of the long-term damage dealt by this absurd process to a large C++ codebase.
Somewhere, I should still have a "degree" issued by Rational University :-)
I recall it made a very uncompelling case, on the grounds that it was putatively the future of software engineering, and you could hardly right-click on anything without the stupid thing hard crashing. Pro-tip: If your software is the future of software engineering, an engineering student using your software to do exactly what it was designed to do should not be able to crash it in under two minutes. And then keep crashing it.
But even when it was working, it was virtually impossible to so much as contort a student assignment into that model. I can't imagine working somewhere that insisted on building production software that way, and I'm shocked they're still finding enough chumps to stay in business with that complete and utter pipe dream.
(In a nutshell, the primary problem with such systems is that they do not account for the fact that every entity in a diagram is a cost. Every entity in a diagram needs to carry a value in excess of that cost, preferably comfortably so. Any methodology that insists on a totalizing view of the world in which everything must be in a diagram will produce diagrams so cognitively expensive that they are just a blurred mass of diagram entities no easier to understand than the underlying code. The people pushing these systems build nice little demo diagrams with 10-15 entities in them that are easy to understand, and then incorrectly attribute the ease of understanding to the fact that it is a diagram, rather than the fact the diagram only contains a handful of entities! Then they build systems based on the utterly incorrect belief that all diagrams are simple. The results are entirely predictable when you see it from this point of view. The part that's mindblowing is just how hard some people need to be beaten over their skull with the fact that diagrams aren't necessarily simple once they've ingested this idea, no matter how many hundreds of insanely complicated diagrams they stare at over the years....)
One of the FUSE implementations mentioned in this post may do something like that.
It really brings home how close git's design already is to being a file system.
Is it hard to write one? Which language did you use?
> Do programs play nice with them (no caching, no recursive lookahead and such)?
Other programs play nice with my peculiar file systems as far as I can tell: they just look like ordinary files and directories and symlinks etc to them.
I don't know how well that would work for your usecase. Most of the caching etc that I am aware of happens at the layer of the Linux kernel, and your FUSE daemon can configure what caching it wants or doesn't want.
I don't know much about NFS. I only used FUSE directly.
For me, a filesystem presenting git was relatively easy to write. I think that comes down to two main factors:
* Git's internal structure is very close to how Linux (and especially FUSE) does filesystems. That's probably not a big surprise, given that git and the Linux kernel are both projects started by the same guy.
* I exposed the git data as a read-only filesystem. In general, if you can avoid mutation, you can avoid a lot of complexity. (The main source of complexity in the Python version I wrote first was that we wanted changes in the contents of git branches to be reflected on the filesystem side. That's mutability flowing in only one direction (from git to the filesystem), but it was already annoying enough to deal with.)
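You can see the correspondence from the shell (an illustration of the mapping, not the daemon itself; paths hypothetical):
git ls-tree HEAD                    # ~ readdir() on the mount root
git ls-tree HEAD:src                # ~ readdir() on a subdirectory
git cat-file blob HEAD:src/main.c   # ~ read() on a file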
On a more general note: I see filesystems mostly as one peculiar interface you can put in front of your data. So on some level git has an underlying database, and we just expose it. The same could be done for e.g. postgres or mongodb with a suitable fuse daemon.
But you can also think of ext4 being such a database with a weird UI layer on top. The main difference in practice is that they run their 'fuse daemon' in the kernel.
From what I've read, the new bcachefs explicitly embraces this view of 'filesystems are just databases', and uses techniques borrowed from relational databases for its internal data structures.
I believe Apple's version of FUSE is https://developer.apple.com/documentation/fileprovider/nonre... and, while I am absolutely certain someone will point out things that FUSE has that the macOS version doesn't, I'm only saying that, just like Windows, macOS has its own way of doing things and thus very little incentive to implement someone else's vision
Regarding FUSE, there have been many implementations for macOS, just not very good ones. I don’t see a fundamental limitation on XNU, Darwin, Mach.
I haven’t thought of File Provider in this way, but maybe it’s a good solution. Apple seems reasonably committed to it, so that’s encouraging.
> It’s just that git commits aren’t actually implemented as folders to save disk space.
Okay. So what. People are used to archive formats.
> git worktree also lets you have multiple branches checked out at the same time, but to me it feels weird to set up an entire worktree just to look at 1 file.
But one directory per commit is less weird.
Yes. This is a pet peeve of mine.
Fairly easy to look up, and I just did: https://stackoverflow.com/questions/5078676/what-is-the-diff... but I agree, comments like this come across as smug and unhelpful. A few short lines of explanation and a link would have been much more helpful and would greatly improve the quality of discussion on this site.
> > There is a difference between a directory, which is a file system concept, and the graphical user interface metaphor that is used to represent it (a folder).
So there really is no difference in a technical (programmers talking) context. :)
There are exceptions where a GUI folder doesn't have a corresponding directory (e.g. "Control Panel") or when a directory doesn't appear as a GUI folder (a mounted volume, maybe?).
But if you're just talking about regular everyday use cases for organizing files, folders are directories and directories are folders. In those cases, there's zero reason to distinguish the two, no matter whether you happen to be using a GUI or terminal at the moment.
No computer system, from the command line, has a command that allows you to "make_folder" but every system has a "make_directory" (mkdir) or "change_directory" (cd) and so on. If there is little to no difference, then why is this so?
Why are there no folders from the command line?
No I didn't. I listed the exceptions where they are not the same thing. The fact that they're exceptions implies the existence of a rule, which is that they're usually the same.
Generally speaking, every Mac and Windows system I use, when I create a directory, I see a folder created in the GUI. And when I create a folder in the GUI, I see a directory created in the terminal.
So I'm pretty sure that, aside from rare exceptions, they're identical and synonymous. So as long as you're not dealing with an exceptional situation, the terminology is literally interchangeable.
> Why are there no folders from the command line?
For historical terminology reasons, mostly. By the time GUIs introduced a new terminology, all the terminal commands had long since been named.
I really don't understand what overall point you're trying to make. But if you create a directory in the terminal that shows up in the GUI as a folder, it's 100% perfectly correct to say you created a folder in the terminal. Because you did.
A folder does not exist without a GUI. A folder "might" represent a clickable interface that shows the same contents as a directory but a browser will read the contents of that directory using pathnames and not folders.
And mounting a disk drive onto a folder makes no sense at all.
Your "for historical reasons" is a made up reason unless you can point to an authoritative source for that. If nothing else, the source to Wikipedia in this thread says what I say and not what you wrote.
Of course you can create a folder without a GUI, because in all cases but a few exceptions, a folder is a directory. "mkdir" creates a folder, regardless of the name of the command. When I load my GUI later, I see the folder.
Until now, I have never knowingly met anybody who thought there was any value in drawing a distinction between the two for the basic purposes of routine file organization. So I think you're fighting a lost cause here.
Good luck with your crusade though! Sorry our usage of the term "folder" bothers you so much.
But doesn't that prove the concept that they're the same thing? If they were different things, surely there would be a different command for creating a folder and versus creating a directory. But there's just one command, and nobody is going to make a command called `mkdirakafol` ("make directory a.k.a. folder"). Sometimes multiple names can mean the same thing.
Like, a "trunk" and a "boot" can refer to the same thing. So what you just said is like "my car has an Open Trunk button, not an Open Boot button, so surely those are different concepts."
I've fallen for xkcd 386...
Bonus points if you can find any kind of modern user that knows the difference between a FILE and a FOLDER (directory) ..
Do you mount a disk drive onto a folder? How does that make sense?
Yes? It looks exactly the same as a folder in the UI either way.
Why does even Windows use both terms if, as some want to claim, they have the same meaning? Why does Wikipedia state they are not the same thing?