The future of code search is not regex – 100x faster than ripgrep
24 points | 3 hours ago | 8 comments | fff.dmtrkovalenko.dev | HN
kristopolous
1 hour ago
I ran across this fascinating tool a few days ago while researching embedding models on Hugging Face.

It's advertised as "ColGREP Semantic code search for your terminal and your coding agents".

I haven't put it in any harness yet but I probably should.

https://github.com/lightonai/next-plaid/tree/main/colgrep

I've also tried ast-grep (also known as sg), but LLMs really mess up with it. I think you'd need to fine-tune.

If anyone has cracked that case, I'd love to hear about it.

swiftcoder
13 minutes ago
Is there a write-up of the underlying approach? The summary on the repo mentions SIMD, but not a whole lot else.
genewitch
40 minutes ago
Considering that ripgrep has only marginal overhead over just reading the files to /dev/null, how exactly does this achieve a 100x speedup?

I have a lot of use for something that can search ~1GB of text "instantly", but so far nothing beats rg/ag after the data has been moved into RAM.

anilakar
35 minutes ago
The trick to optimization is not "doing faster" but "doing less". I already feel rg is missing a ton of results I want to see because it has a very large ignore list by default.
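
To make the "doing less" point concrete, here is a minimal Python sketch. It is not how rg is implemented; it only shows the same substring search run twice over a tree, once pruning ignored directories and once reading everything. The `IGNORED_DIRS` set and the `search` helper are made up for illustration.

```python
import os

# Hypothetical ignore list for illustration only; rg's real defaults come from
# .gitignore/.ignore rules plus hidden-file and binary-file filtering.
IGNORED_DIRS = {".git", "node_modules", "target"}


def search(root, needle, respect_ignores=True):
    """Return sorted (relative path, line number) pairs whose line contains needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        if respect_ignores:
            # Pruning in place stops os.walk from descending at all, so the
            # skipped files are never opened: that is the "doing less" part.
            dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            hits.append((os.path.relpath(path, root), lineno))
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)
```

Calling `search(root, "needle", respect_ignores=False)` returns the extra hits that the pruned run never sees. The rough rg equivalent is `rg -uuu`, which turns off the ignore rules and the hidden and binary file filters.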
neogoose
3 hours ago
I have open-sourced the fastest code search implementation: a comprehensive SDK for both file finding and grep-style file search that is over 100x faster than ripgrep.
siva7
1 hour ago
I don't get this submission title. Your tool uses regex, but the title claims the future is not about regex.
molszanski
19 minutes ago
I think it is about input. Before, I had to type a regex; now I just type text and the fuzzy search finds more, regex-style. Awkward wording, but the code seems cool.
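
For anyone wondering what "fuzzy" usually means in finders like this, a common approach is subsequence matching: every character of the query must appear in the candidate in order, but not necessarily adjacent. Below is a minimal Python sketch of that general technique. It is an assumption about the idea, not fff's actual matcher, and `fuzzy_match`, `rank`, and the scoring rule are made up for illustration.

```python
def fuzzy_match(query, candidate):
    """Return a score if query is a case-insensitive subsequence of candidate, else None.

    Illustrative scoring: 1 point per matched character, or 2 points when it
    directly follows the previous match, so contiguous runs rank higher.
    """
    q = query.lower()
    c = candidate.lower()
    score = 0
    pos = 0      # where to resume scanning in the candidate
    prev = -2    # index of the previous matched character
    for ch in q:
        idx = c.find(ch, pos)
        if idx == -1:
            return None  # a query character is missing: no match
        score += 2 if idx == prev + 1 else 1
        prev = idx
        pos = idx + 1
    return score


def rank(query, candidates):
    """Return the matching candidates, best score first (ties keep input order)."""
    scored = [(fuzzy_match(query, c), c) for c in candidates]
    matched = [(s, c) for s, c in scored if s is not None]
    return [c for s, c in sorted(matched, key=lambda sc: -sc[0])]
```

With this scoring, `rank("abc", ["a_b_c", "xabc", "acb"])` puts `"xabc"` ahead of `"a_b_c"` because its match is contiguous, and drops `"acb"` because the characters are out of order.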
MaxMonteil
2 hours ago
This looks cool!

You should add a link to the GitHub repo for the project itself; at first I wasn't even sure what it was called.

I found this link: https://github.com/dmtrKovalenko/fff.nvim

dig1
17 minutes ago
ctags, GNU Global and even "ugrep -Q" would like to have a few words with you ;)
asdfadsfaf
21 minutes ago
I don't get it. How can I search anything but the file name?
schrodinger
1 hour ago
How does it work? Embed tokens and use Euclidean distance or something?
globular-toast
48 minutes ago
Why is it "for neovim"? Surely such a thing would be useful in many applications?
ramon156
33 minutes ago
Because it's being dishonest from multiple angles.

- It has regex, so the title is weird.
- It definitely wouldn't be 100x faster than rg.
- It's an SDK, so it's apples to oranges anyway.
