Currently interested in inference optimization, speculative decoding, and small-model routing for latency-bound systems.
Writing → medium.com/@OmsharmaOfficial · Code → github.com/justomsharma