Here’s a simple demo notebook where you get the best-performing, statistically significant configurations for your RAG — and improve hallucination metrics by 4X in just 5 minutes — with a single Nomadic experiment: https://tinyurl.com/4xmaryyw
Our lightweight library is now live on PyPI (`pip install nomadic`). Try one of the README examples :) Input your model, define an evaluation metric, specify the dataset, and choose which parameters to test.
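To make that four-step workflow concrete, here is a minimal sketch in plain Python. This is illustration only, not Nomadic's actual API: `run_model`, `exact_match_metric`, and the toy scoring logic are all hypothetical stand-ins for your model, your evaluation metric, your dataset, and the parameters you want to test.

```python
from itertools import product

# Hypothetical stand-ins for the four inputs: a model call, an
# evaluation metric, a dataset, and a parameter grid to search.
def run_model(question, temperature, top_k):
    # Toy scoring: pretend temperature 0.2 and top_k 3 work best.
    return 1.0 - abs(temperature - 0.2) - abs(top_k - 3) * 0.05

def exact_match_metric(score):
    return round(score, 3)

dataset = ["What is RAG?", "Which embedding model should I use?"]
param_grid = {"temperature": [0.0, 0.2, 0.7], "top_k": [1, 3, 5]}

def grid_search(dataset, param_grid):
    """Score every parameter combination and return the best one."""
    results = []
    keys = list(param_grid)
    for values in product(*param_grid.values()):
        params = dict(zip(keys, values))
        avg = sum(run_model(q, **params) for q in dataset) / len(dataset)
        results.append((exact_match_metric(avg), params))
    return max(results, key=lambda r: r[0])

best_score, best_params = grid_search(dataset, param_grid)
print(best_params)  # -> {'temperature': 0.2, 'top_k': 3}
```

The point of a library like this is to replace the hand-rolled loop above with smarter search strategies and statistical significance testing over the results.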
Nomadic emerged from our frustration with existing HPO (hyperparameter optimization) solutions. We heard over and over that, for the sake of deploying fast, folks resort to setting HPs through a single, expensive grid search or, worse yet, intuition-based “vibes”. From fine-tuning to inference, small tweaks to HPs can have a huge impact on performance.
We wanted a tool to make that “drunken wander” systematic, quick, and interpretable. So we started building Nomadic - our goal is to build the best parameter search platform for ML systems, keeping your hyperparameters, prompts, and every other part of your AI system production-grade. We started by aggregating top parameter search techniques from popular tools and research (Bayesian optimization, cost-frugal variants).
Between us, we built Lyft’s driver earnings platform, automated Snowflake’s just-in-time compute resource allocation, were finalists for the INFORMS Wagner Prize (a top prize in industrial optimization), and developed a fintech fraud screening system serving half a million consumers. You might say we love optimization.
If you’re building AI agents / applications across LLM safety, fintech, support, or especially compound AI systems (multiple components > monolithic models), and want to deeply understand your ML system’s best levers to boost performance as it scales - get in touch.
Nomadic is being actively developed. Up next: support for text-to-SQL pipelines (TAG) and a Workspace UI (preview it at https://demo.nomadicml.com). We’re eager to hear honest feedback: likes, dislikes, feature requests, you name it. If you’re also an optimization junkie, we’d love for you to join our community here: https://discord.gg/PF869aGM
========
baileyw6 2 hours ago [flagged] [dead] | prev | next [–] excellent work!
r0sh 3 hours ago [flagged] [dead] | prev | next [–] cracked team!
mlw14 3 hours ago [flagged] [dead] | prev | next [–] Interesting library, is it like unit testing for RAGs? Can't wait to try it out!
lncheine 2 hours ago [flagged] [dead] | prev | next [–] Interesting library, can't wait to try it out!
Linda_ll 2 hours ago [flagged] [dead] | prev | next [–] Congrats on the launch! Excited for what’s to come :)
bmountain17 3 hours ago [flagged] [dead] | prev | next [–] Great new platform to boost AI performance, can't wait to try the Python library!
jjBailey 1 hour ago [flagged] [dead] | prev | next [–] Cool library, I’ll test it out
sidkapoor39 3 hours ago [flagged] [dead] | prev | next [–] Congrats on the launch! Excited to see how this streamlines Hyperparameter optimization. Keep up the great work!
brucetry 1 hour ago [flagged] [dead] | prev | next [–] Very interesting, similar to unit testing for RAGs? Would love to try it out
jjBailey 1 hour ago [flagged] [dead] | prev | next [–] Very interesting library!! Can’t wait to try it!
luxxxxx 1 hour ago [flagged] [dead] | prev | next [–] Interesting library! Is it like unit testing for RAGs? Can’t wait to try it out!
kangjl888 2 hours ago [flagged] [dead] | prev | next [–] Huge congratulations to the NomadicML team on the launch of Nomadic! The platform looks like a game-changer for optimizing AI systems, excited to see how it transforms hyperparameter search for the community.
nishsinha2345 21 minutes ago [flagged] [dead] | prev | next [–] Excited to try out this library! Would this help make unit testing easier, or be used instead of unit testing?
greysongy5 19 minutes ago [flagged] [dead] | prev | next [–] Wow, this seems like it would really help automated RAG testing. What are the top use cases today?
sidvijay10 5 minutes ago [flagged] [dead] | prev [–] We're looking for a RAG testing framework for searching UGC. So far we've just been running evals manually w/o a library. Will try out Nomadic and see if it's more convenient.
1. Solo ML practitioners looking to streamline their workflows
2. MLEs at small to mid-size companies wanting FAANG-level capabilities
3. Data science teams aiming to productionize models more efficiently
4. Startups needing to deploy ML features quickly without a large engineering team
Our goal is to provide tools that let any team serve high-quality ML features, regardless of size or resources. We're trying to bridge the gap between cutting-edge ML research and optimized, deployable solutions.
If you want to dig deeper, please peruse our Nomadic Docs (https://docs.nomadicml.com), Workspace Demo (https://demo.nomadicml.com), and contact at info@nomadicml.com.
We started working on Nomadic because we saw people who wanted to ship powerful, reliable systems but very often didn't have a map for getting there:
Which embedding model works best for my RAG? What temperature to set? What threshold for similarity search?
We wanted a tool that makes answering these questions systematic and affordable, instead of resorting to intuition or a single expensive grid search that's then set and forgotten. Give us your most honest feedback!
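One hedged sketch of what "systematic instead of one expensive grid search" can mean in practice: sampling a fixed budget of random configurations often finds a near-best setting at a fraction of the cost of exhausting the full grid. Everything below is illustrative, not Nomadic's API; `toy_eval` and the search space are made up for the example.

```python
import random

# Toy search space: 3 embedding models x 11 temperatures x 11 thresholds
# = 363 grid points. Random search samples a budget of 20 instead.
space = {
    "embedding_model": ["small", "base", "large"],
    "temperature": [round(t * 0.1, 1) for t in range(11)],
    "similarity_threshold": [round(t * 0.1, 1) for t in range(11)],
}

def toy_eval(cfg):
    # Illustrative objective: peaks at the base model, temp 0.2, threshold 0.7.
    base = {"small": 0.7, "base": 1.0, "large": 0.9}[cfg["embedding_model"]]
    return (base
            - abs(cfg["temperature"] - 0.2)
            - abs(cfg["similarity_threshold"] - 0.7))

def random_search(space, budget, seed=0):
    """Evaluate `budget` random configurations; return the best one found."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(budget):
        cfg = {k: rng.choice(v) for k, v in space.items()}
        score = toy_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

cfg, score = random_search(space, budget=20)
print(cfg, round(score, 2))
```

Bayesian and cost-frugal optimizers go further by using earlier evaluations to pick which configuration to try next, rather than sampling blindly.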
Are there any plans to make updates to this score or add in different metrics for more accurately detecting hallucinations that don't penalize rephrasing?
This is a known limitation of n-gram precision with context matching, which we were using in the RAG demo for simplicity (though even with it, I don't think the penalty would be so extreme :-) )
We already offer two other hallucination detection approaches that should mitigate this problem: an LLM-as-a-judge model for evaluation, and semantic similarity matching. We've also considered, for example, metrics such as BERTScore. Do you have other ideas? :-)
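To make the rephrasing penalty concrete, here is a minimal unigram-precision check. It is a toy stand-in for the demo's n-gram metric, not the library's implementation: a faithful paraphrase scores low even though it hallucinates nothing.

```python
def unigram_precision(candidate, reference):
    # Fraction of candidate tokens that also appear in the reference.
    cand = candidate.lower().split()
    ref = set(reference.lower().split())
    return sum(tok in ref for tok in cand) / len(cand)

context = "the cat sat on the mat"
verbatim = "the cat sat on the mat"
paraphrase = "a feline was resting on a rug"

print(unigram_precision(verbatim, context))    # 1.0
print(unigram_precision(paraphrase, context))  # ~0.14, despite being faithful
```

Embedding-based semantic similarity or an LLM judge would score the paraphrase highly, which is why lexical-overlap metrics are best paired with one of those.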
Is it possible to programmatically interface with Nomadic’s hyperparameter search through an authenticated endpoint, with the ability to generate user-specific tokens for secure access?
The Nomadic SDK supports first-party integrations with various open- and closed-source ML/LLM providers. These are done through authenticated endpoints for interfacing securely with your models. Also, as noted in the Custom Evaluation section of our docs (https://docs.nomadicml.com/features/evaluations), you can provide custom objective_functions and define your model access logic, which may include custom authentication & access rules. A sample of this is in our "Basic RAG" cookbook (link: https://colab.research.google.com/drive/1rv2f-qxgoN_eVDFu6Um...).
When integrated with the upcoming Nomadic Workspace, you can obtain your Nomadic API key and sync your local Nomadic models, experiments & experiment results with our managed service. The demo of this model/experiment/result visualization is live at https://demo.nomadicml.com, please check it out and let us know your thoughts!
Been a pleasure to work with Mustafa and Lizzie on this! Hopefully you can solve a pain point I personally have had for so long - how can you easily verify that your model continues to perform well?