Show HN: Real-time AI (audio/video in, voice out) on an M3 Pro with Gemma E2B
87 points | 16 hours ago | 4 comments | github.com
Related: https://news.ycombinator.com/item?id=47653752
jwr
33 minutes ago
That is very, very interesting. I've been hoping to have an assistant in the workshop (hands-free!) that I could talk to and have it help me with simple tasks: timers, calculating, digging up notes, etc. — basically, what the phone assistants were supposed to be, but aren't.

"You will have to unlock your iPhone first" is kind of a deal-breaker when you are in the middle of mixing polyurethane resin and have gloves and a mask on.

More and more I find that we have the technology, but the supposedly "tech" companies are the gatekeepers, preventing us from using the technological advances and keeping us years behind the state of the art.

I'll be trying this out on my MacBook, looks very promising!

zerop
53 minutes ago
I have been looking forward to building something like this using open models: a voice assistant I can talk to while I am driving, as I have a long commute. I do use ChatGPT voice mode, and it works great for looking up information or having discussions. But I want it to do tasks like browsing the web, acting as a social media manager for my business, etc.
dvt
4 hours ago
Solid work and great showcase. I've done a bunch of stuff with Kokoro and the latency is incredible. So crazy how badly Apple dropped the ball... feels like your demo should be a Siri demo (I mean that in the most complimentary way possible).
karimf
4 hours ago
Thank you. This reminds me of a paragraph from the LatentSpace newsletter [0]

> The excellent on device capabilities makes one wonder if these are the basis for the models that will be deployed in New Siri under the deal with Apple….

[0] https://www.latent.space/p/ainews-gemma-4-the-best-small-mul...

k-almuraee
1 hour ago
Amazing, love your work.