Show HN: OpenAI/reflect – Physical AI Assistant that illuminates your life

89 points | 3 days ago | 10 comments

I have been working on making WebRTC + Embedded Devices easier for a few years. This is a hackathon project that pulled some of that together. I hope others build on it/it inspires them to play with hardware. I worked on it with two other people and I had a lot of fun with some of the ideas that came out of it.

* Extendable/hackable - I tried to keep the code as simple as possible so others can fork/modify easily.

* Communicate with light. With function calling it changes the light bulb, so it can match your mood or feelings.

* Populate info from clients you control. I wanted to experiment with having it guide you through yesterday/today.

* Phone as control. Setting up new devices can be frustrating. I liked that this didn't require any WiFi setup, it just routed everything through your phone. Also cool that the device doesn't actually have any sensitive data on it.
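A minimal sketch of the "communicate with light" bullet, for anyone wondering what the function-calling step could look like. This is not the project's actual code: the tool name `set_light_color`, its fields, and the `set_bulb` callback are all hypothetical, and the schema just follows the JSON-Schema style that function-calling APIs generally accept.

```python
import json

# Hypothetical tool description handed to the model; name and fields are
# illustrative assumptions, not taken from the reflect repository.
SET_LIGHT_TOOL = {
    "type": "function",
    "name": "set_light_color",
    "description": "Set the bulb color to match the mood of the conversation.",
    "parameters": {
        "type": "object",
        "properties": {
            "red": {"type": "integer", "minimum": 0, "maximum": 255},
            "green": {"type": "integer", "minimum": 0, "maximum": 255},
            "blue": {"type": "integer", "minimum": 0, "maximum": 255},
        },
        "required": ["red", "green", "blue"],
    },
}


def clamp(value):
    """Keep a channel value inside the 0..255 range a bulb can display."""
    return max(0, min(255, int(value)))


def handle_tool_call(name, arguments_json, set_bulb):
    """Dispatch one tool call emitted by the model to the bulb driver.

    `set_bulb` is a hypothetical callback taking (red, green, blue).
    """
    if name != SET_LIGHT_TOOL["name"]:
        raise ValueError(f"unknown tool: {name}")
    args = json.loads(arguments_json)
    rgb = tuple(clamp(args[channel]) for channel in ("red", "green", "blue"))
    set_bulb(*rgb)
    return rgb
```

The model never touches the bulb directly: it only emits a name plus JSON arguments, and the device decides what to do with them, which is why out-of-range values are clamped before they reach the hardware.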

Sean-Der

3 days ago

I also have been working with Daily on https://github.com/pipecat-ai/pipecat-esp32

I see so much potential if I can make hardware hacking + WebRTC easy. Not just for AI assistants but security cameras + robotics. If anyone has questions/ideas/feedback, I'm here to help :)

joshu

3 days ago

what is Daily?

Sean-Der

3 days ago

https://www.daily.co/

You can use it to build lots of different real-time communication projects: conferencing, sending your audio/video to GPU servers for AI, broadcasting, and lots more.

It’s a super fun space to be in

3 days ago

Are there any cool demos that use Daily I can explore?

Sean-Der

3 days ago

https://www.linkedin.com/posts/thorwebdev_esp32-webrtc-activ...

https://m.youtube.com/watch?v=HbO18Elw9WY

Are two that I know of. Try it out, and if you hit any roadblocks, @ me on the pipecat Discord and I would love to help.

joshu

2 days ago

this is very relevant to my interests. currently building a robot teleoperation project. i will have to investigate further.

baxtr

3 days ago

If you want to know what this is about, here’s the video they provided:

https://www.youtube.com/watch?v=G5OUnpPAyCg

voxelizer

3 days ago

I love seeing that hackathons are encouraged inside OpenAI and most importantly, that their outcome is also shared :)

tesch1

1 day ago

A cynic might wonder if this is just another way for a corporation selling advertising to get more of "your data". Who is sharing more? :)

kelseydh

3 days ago

It annoys me a lot that the current devices for controlling smart homes, such as Amazon Alexa or Google Home, lack the ability for lovely conversations the way OpenAI has.

crimsoneer

2 days ago

The way the Gemini Google Assistant rollout has been SO SLOW is utterly baffling to me.

godelski

2 days ago

I honestly can't tell if this comment is joking, serious, or AI lol

HPsquared

2 days ago

LLMs have a lot of advantages over humans for making conversation.

Even forgetting the main advantages (24x7 availability, and ability to talk about basically any topic for as much or little time as you want), they also get basically every obscure reference/analogy/metaphor and how it ties in to the topic at hand.

Usually when you're talking to another person, the intersection of obscure references you can confidently make (with the assumption your partner will understand them) is much more limited. I enjoy making those random connections so it's a real luxury to have a conversation partner that gets them all.

Another one is simply the time and attention they can bring to things a normal person would never have the time for. I'd not want to talk someone's ear off, unless I was paying them and even then, I don't want to subject someone to topics of only interest to myself.

(Edit: I suppose it's the final apotheosis of the atomised individual leaving all traces of community behind)

eloisius

2 days ago

> final apotheosis of the atomised individual leaving all traces of community behind

It's not. In 10 years this is going to look as dumb as the biohacker wetware bros surgically embedding RFID chips in their hands. There's much more to communication (and life) than receiving pantomimed validation for your obscure references. You could be throwing away opportunities to connect with another person who would genuinely share your interests and help you grow as a person. Having a useless magnet in your fingertip is going to seem brilliant compared to ending up socially withdrawn and mentally unwell because you traded human companionship for a chat bot.

HPsquared

2 days ago

I think it's a much bigger social phenomenon already. Social talk will become even more a matter of performance, positioning and signalling, rather than something pursued for enjoyment of the thing itself.

Maybe I'm just weird but LLM conversations seem generally more interesting and enlightening, even in these early iterations of the technology.

igleria

2 days ago

> LLMs have a lot of advantages over humans for making conversation.

A lot of those advantages seem to be what enables an LLM to keep pushing people into delusion or triggering latent mental issues : https://www.psychologytoday.com/ie/blog/urban-survival/20250...

kelseydh

2 days ago

Yes, but when I ask it to answer a speculative question it doesn't just respond like Siri does with "Sorry, I can't answer that!"

HPsquared

2 days ago

Of course, every technology comes with risks.

godelski

2 days ago

Honestly, it sounds like you need a therapist, not an LLM. I'm not saying this as some quip, I'm saying this because what you wrote is that concerning.

HPsquared

2 days ago

Nobody is talking about my car collection and maintenance plans for half an hour, for instance. Literally. I can expound on the topic, get ideas, ponder things. Much better than the old ways.

eloisius

1 day ago

Car talk is one of the most broadly relatable topics available for men to talk about, man. It’s right up there with sports, stuff blowing up, and attractive women. There’s something else to it if you can’t find someone that will chitchat about cars. I’m not even a car guy but I’d happily talk to someone passionate about restoring classics or something. I mean this sincerely and not to be snide, you might find that some personal growth unlocks a lot for you when it comes to socializing and community. Don’t throw your humanity away to become some chat bot gargoyle.

HPsquared

1 day ago

It's better than posting into the void (i.e. one-way sending) and lurking on forums (one-way receiving). That's the usual way to approach deep subjects online, there's no continuity. And you can't go into depth on a subject of choice with an actual associate, because there's always mismatch of interests.

tesch1

1 day ago

And here we all are enjoyably sharing with humans our interest and experiences in using LLMs!

HPsquared

1 day ago

I disagree! This is arguing on the internet aka struggle for dominance. (See what I did there)

godelski

2 days ago

Really? I mean this might just be a function of your friend group. I have some long time friends and we'll talk about that weird stuff for hours. Probably going to be hard with new friends or people you don't have a good relationship with but these are very human things. We all have cars so those conversations come up. Similarly things like dealing with investments, 401ks, health plans, relationships, or fucking petty shit like if In-N-Out is better than Five Guys or not (it is, and I'll fight you over this).

Not every friend group is like that. Different friends offer different things. We're all people though. I mean you can't have a relationship around just talking to people over something like that because that is dehumanizing, but I disagree that you can't talk about that stuff with people. Frankly, I think what's more common is that people are just afraid of opening those subjects. Like we've all had that experience in school where the teacher is talking about something and everybody is confused but nobody speaks up because they're afraid to look dumb or feel like they're disrupting. That class of situations doesn't just go away lol.

I'm not saying don't use the LLMs. Hell, it's like saying don't use Reddit. Reddit will often give you bad advice too, but that doesn't mean it can't also be helpful. But my concern is really that you feel like you can't talk to people about those types of things. We're social creatures. There's no "atomised individual". You can't make it through this world without reliance upon others. Unless you're doing literally everything by yourself from scratch, you're an active part of society. The only difference is your interpretation.

So with that, let me say too that there's a lot of advantages to humans and conversations with them. Not being available 24/7 can be advantageous. Forces you to think on your own and helps facilitate better conversations when they do become available. Sometimes you need answers immediately, but that's rare and often decisions are better informed after thinking rather than through reliance on others. The latter only works if there's non-nuanced objective answers. I'm also someone who makes a lot of obscure references. But that's okay if others don't pick up on them. The rarity of that happening actually is a good thing and helps form relationships, as it is a strong signal we have some common ground. When it is rare it is special. And when not recognized I have the opportunity to share things I enjoy with others. Then they also have the opportunity to share things with me! I don't want clones of me, that provides no opportunity for growth, it only leads to a narrower view (which is the opposite of growth!). Also, humans can be rude, mean, push back, and be confrontational. It always sucks but it isn't always wrong either. I'm assuming you're human (hell, it'll be true even if you're machine lol), so you're not perfect. Sometimes our own egos get in the way and that confrontation is necessary. In fact, this is an aspect that is related to the psychosis/danger igleria mentioned. You can't just wrap yourself up in safety blankets and avoid confrontation. As much as we should try to make life more enjoyable and better the unfortunate truth is that often that requires temporary discomfort. If everything was easy everyone would do it. So even in the times where that discomfort doesn't directly lead to future "rewards" it still provides learning experiences that allow you to be more equipped for the (more frequently occurring) times where it does. It is only your perspective/interpretation.

OJFord

2 days ago

Why does this need hardware, other than the phone? Could just be an app on the phone couldn't it?

Sean-Der

2 days ago

I was interested in the ‘hands-free’ idea.

If I put these devices throughout my house it would allow me to switch AI personalities by proximity.

You can also use the device without your phone. These devices are also very cheap. I think you could do audio only for around $5.
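One way the "switch personalities by proximity" idea could be sketched, purely as an illustration. It assumes the phone can read a signal strength (RSSI, in dBm, where values closer to 0 mean stronger) for each nearby device. That assumption, the device names, and the personality labels are all hypothetical and not from the project.

```python
def pick_personality(rssi_by_device, personality_by_device, default="default"):
    """Return the personality mapped to the device with the strongest signal.

    rssi_by_device: hypothetical mapping of device name -> RSSI in dBm.
    personality_by_device: hypothetical mapping of device name -> personality.
    """
    # Ignore devices that have no personality assigned.
    known = {
        device: rssi
        for device, rssi in rssi_by_device.items()
        if device in personality_by_device
    }
    if not known:
        return default
    # RSSI is negative; the largest value is the nearest device.
    nearest = max(known, key=known.get)
    return personality_by_device[nearest]


personalities = {"kitchen": "recipe helper", "office": "focused coworker"}
print(pick_personality({"kitchen": -70, "office": -48}, personalities))
```

Signal strength is a noisy stand-in for distance, so a real version would probably want some smoothing before switching.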

Telemakhos

3 days ago

Somewhere in here there's a joke about how many tokens it takes to turn on a lightbulb.

throwup238

3 days ago

It deserves a minor rewrite of the Black Mirror episode Fifteen Million Merits where people do menial labor like folding laundry and washing dishes to earn tokens so that their LLM will dispense their toothpaste and send Studio Ghibli stylized birthday cards to their friends.

mrbungie

3 days ago

inb4: When sama and co talk about UBI, they mean a variation of it based around a memecoin tethered/indexed on (tik)tokens.

toomuchtodo

3 days ago

https://en.wikipedia.org/wiki/World_(blockchain)

a2128

3 days ago

Probably 1,000 for the system prompt, 400 for the audio speech-to-text, 8 for the query, 180 for the thinking, 12 for the tool call, 33 for the response with a useless follow-up question
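Adding up the guesses in the comment above, as a quick worked sum. The numbers are the commenter's estimates, not measurements, and treating the query plus tool call as the "useful" part is an assumption made here for illustration.

```python
# Token estimates quoted in the comment above (guesses, not measurements).
estimates = {
    "system prompt": 1000,
    "audio speech-to-text": 400,
    "query": 8,
    "thinking": 180,
    "tool call": 12,
    "response with follow-up question": 33,
}

total = sum(estimates.values())
# Assumption for illustration: only the query and the tool call do the work.
useful = estimates["query"] + estimates["tool call"]

print(f"total tokens: {total}")                # total tokens: 1633
print(f"useful share: {useful / total:.1%}")   # useful share: 1.2%
```

So by that estimate it takes roughly 1,600 tokens to turn on a lightbulb, of which about 20 carry the actual request.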

godelski

3 days ago

All to achieve something that could be done with a Raspberry Pi. You could do all this locally too.

https://gist.github.com/mgarratt/afb3b57a08e2eb2479eb6083a86...

https://www.xda-developers.com/ollama-ai-comparison-raspberr...

https://www.xda-developers.com/raspberry-pi-voice-assistant-...

https://www.youtube.com/watch?v=o1sN1lB76EA

Sean-Der

3 days ago

This project isn’t tightly coupled with anything. Any service that supports WebRTC should work!

Also I was hoping to push people toward an RTOS. Better experience than a Raspberry Pi, I can cycle power and be back way quicker. Also cheaper/more power efficient.

I also think I unfairly like ESP because it’s an excuse to write C :)

godelski

2 days ago

Home Assistant integrates with WebRTC btw[0].

Also, why make the ESP32 the hotspot? Why not just connect to the same network? Then you're not really range limited.

  > I also think I unfairly like ESP because it’s an excuse to write C :)

Is the comment about Home Assistant being Python? Yeah, I can get that. Feels weird to be using slow scripting languages on lean hardware. Though of course you can write whatever routines in C and just patch them into the interface.

The ESPs are cheaper (here's the non-dev kit which has WiFi[1]), but way less powerful. I don't think you could get away with doing things on device. Though I wouldn't call that dev kit cheaper, and that price point was the context of my comment.

FWIW, I don't think there's really anything wrong with the project other than just that it comes off as doing things that have already been done before but presenting as if something novel was done. I'm all for reinventing the wheel. For fun, education, or even to improve. Just if I'm being honest, it came off with some weird vibes because of that. I imagine that's how some people are responding as well.

[0] https://www.home-assistant.io/integrations/homeassistant/

[1] https://shop.m5stack.com/products/m5stamp-esp32s3-module

regularfry

2 days ago

> Also, why make the ESP32 the hotspot? Why not just connect to the same network? Then you're not really range limited.

Because then they don't have to include the ability to configure wifi, which (while not that hard) is one more thing to do and for a hackathon that's not really contributing to the end goal.

Sean-Der

2 days ago

I couldn't get on WiFi at the office at all. Corporate WiFi had a bunch of hoops to jump through that made ESP32 hard.

Once I got it working it felt really cool though. As a user I don't want to configure WiFi on the microcontroller at all. It would be really cool if I could walk up to a 'smart device' and set my phone next to it and do zero configuration.

regularfry

2 days ago

Hah, that doesn't surprise me either. You're really hoping for guest WiFi in that situation.

godelski

2 days ago

That makes a lot more sense now. Thanks

Sean-Der

2 days ago

I thought about your comment a lot. I worry that most people just say nice things (but think the opposite) so I appreciate you being direct.

-----

I don't expect you to know anything about me. It made me feel like you have written me off/dismissed me when you mention HomeAssistant + WebRTC. HomeAssistant uses Go2RTC and the WebRTC library it uses is Pion[0]. I created that and maintain it. Getting WebRTC on tiny devices is something I have been working on for years and always doing it Open Source/making it best for users.

-----

> comes off as doing things that have already been done before but presenting as if something novel was done.

I don't think 'Hardware AI Assistant' is a novel idea. What I hoped was a novel idea was putting it in an easy to use package for others to build upon and use. WebRTC + Hardware is something I have been trying to make easier for a long time https://github.com/awslabs/amazon-kinesis-video-streams-webr... [1] I wrote this code, but things still felt too difficult.

When ESP32s got powerful enough to do WebRTC I wrote [2]. Reflect inherits from that. So I am proud of that. I have been on this journey to make RTC + cheap hardware possible so that others can build upon that.

-----

Again I really appreciate your comment, and sorry to be so defensive. Someone I really respected (and I thought they respected me) said the same thing about my work not being novel. They said people have been building security cameras for years that use WebRTC, you are over inflating what you are doing. That has stuck with me. So part of me does fear that I am wasting my time trying to solve these problems.

I don't think what I am doing is novel. I do think that I am solving it differently because I make it accessible/Open Source. Most people solving these problems/building it just keep their code at work and don't try to help others use it.

If you are up for it shoot me an email sean@pion.ly and https://www.linkedin.com/in/sean-dubois/ I would love to have a friend that calls me out/is honest about what's good work and what is just BS :)

-----

[0] https://github.com/pion/webrtc

[1] https://github.com/awslabs/amazon-kinesis-video-streams-webr...

[2] https://github.com/Sean-Der/embedded-sdk

godelski

2 days ago

Maybe you thought about my comment too much. You're right, I don't know you. That also means it is difficult to interpret intent and perspective, right? Not an easy task on the internet.

I don't think you should apologize for being defensive. I'm not upset and what you said does change how I interpret things. Hell, I now know you have way more experience than me in the domain! I'm still reserved on novelty but that doesn't mean you don't know something I don't.

Also, I want to be clear, not everything needs to be novel. There's tons of utility in things not being novel. I want to stress that because we both are in groups which overemphasize novelty (sometimes to a toxic level). Sometimes you should rebuild the wheel for the learning experience/fun. Hell, sometimes you should rebuild the wheel because it can be better, and one of the best ways to figure that out is just by rebuilding that wheel. Or maybe you just rebuild the wheel and show others how to do so (though I'd say there's novelty in that ;). My critique was not due to your work, but messaging. I think it is a fun project, but it came off as if pitching something like an Alexa but as if those devices didn't exist[0]. There's an entirely different take with "this is a fun thing I did (and here are some unique things)" vs "look at this novel thing".

I'll send an invite, but only on one condition: you also call me out on my BS lol. Friends aren't sycophants. Sycophants don't care outside themselves. Human relationships come with conflict (are we even nerds if we don't argue about stupid nuances?). What matters is if at the end of the day friends can grab a beer together and be friends lol

[0] Back to the first paragraph. There's probably things I incorrectly assumed, especially due to the connection with OpenAI. I'm just getting a small glimpse of this, right? Sounds now like you weren't intending that message. It also sounds to me like that person was telling you something similar. "Over inflating" is "over selling" not "your work is meaningless." Pitch differently. If "for fun", just say that. Don't let anyone tell you there's something wrong with that. If there is something novel, then it's not coming across in your pitch. It's entirely possible you know something we don't. Though the reverse is true too lol.

lagrange77

3 days ago

Is it my browser, or does the video in the readme not have sound?

Sean-Der

3 days ago

No sound! YouTube video in README does.

I was tempted to put Erik Satie in the README video. Didn’t want to risk copyright issues

countfeng

2 days ago

It would be perfect if it could intelligently link any device under authorization

Sean-Der

2 days ago

Can you describe more? I would love to build/try it!

TZubiri

3 days ago

I get that this is as-is, but I wonder if so many ultra-alpha products don't dilute the OpenAI brand and create redundancy in the product line. It feels like the opposite of Apple's well thought out planned product design and product line.

Let's see if it pays out.

Sean-Der

3 days ago

This is just a hackathon project. Not a product in any way.

My primary work is on backend WebRTC servers. This was just an outlet/fun side thing to do client and embedded work. I love writing C and doing microcontrollers. I just can’t seem to find a way to do it full time :(

dasickis

3 days ago

We could help you find a pathway there :)

tuckerman

3 days ago

For a developer platform having examples is useful as a starting point for new projects.

Also, I’m not sure if it’s similar at OpenAI, but when I was at Google it was much easier to get approval to put an open source project under the Google GitHub org than my personal user.

jgalt212

3 days ago

They're selling shares at a $500B valuation. The market is telling them everything they are doing is amazing.

TZubiri

3 days ago

Is it possible to differentiate the feedback of the initial success of chatgpt from whatever came after it?

It's possible those investments are just the oai owners selling their 2023 chatgpt success and its profit share.

orliesaurus

3 days ago

Philips Hue is about to start a riot