How the Heck does Shazam work?
86 points | 2 days ago | 12 comments | perthirtysix.com | HN
swyx
35 minutes ago
Related comments from Shazamers:

- OG Shazam paper: https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf (the author also has a talk on YouTube, by the way; look it up if you really care)

- Shazam employee blog post: https://news.ycombinator.com/item?id=18069968

- Shazam cofounder-endorsed explainer: https://news.ycombinator.com/item?id=38538996

- Go reproduction of the algorithm: https://news.ycombinator.com/item?id=41127726

As with all ML things, the code is a much smaller share of the value than the data.

krishna_dam
3 minutes ago
Surprised to see how they got it to work without all the "AI" bluff.
thakoppno
12 minutes ago
Perhaps obviously this is the same technique that enables ACR on TVs.

It occurs to me that Shazam has such a better reputation online because the intent and consent of the user is honored.

It makes me wonder if there couldn’t be an implementation on TVs that is similar and actually is a net positive for consumers. Basically, would customers actually like TV ACR if the data wasn’t just going to sell more ads?

krustyburger
5 minutes ago
So the value-add would be that the consumer gets to find out the name of the show or movie that’s playing, the same info that also pops up if they hit the pause button?
yawpitch
1 minute ago
This has been explained so many times… a wizard imbued the kid with the powers of Solomon, Hercules, Atlas, Zeus, Achilles, and Mercury.
G_o_D
6 minutes ago
Out of curiosity, is it possible to prevent a Shazam-like app from detecting a track, maybe by adding noise or some other technique?
Animats
35 minutes ago
Recognizing a recording isn't hard, because within the same recording the chords follow each other with precisely repeatable timing. That's been around for well over a decade. Recognizing a different recording of the same song, say a cover version, is much more work.
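
To make the "repeatable timing" point concrete, here is a minimal sketch of offset voting over pairs of spectral peaks, roughly in the spirit of the paper linked elsewhere in this thread. It is not Shazam's actual code. It assumes the peaks have already been extracted from a spectrogram as (time, frequency_bin) tuples, and the names `FAN_OUT`, `fingerprints`, `build_index`, and `best_match` are made up for illustration.

```python
from collections import Counter, defaultdict

FAN_OUT = 3  # assumed value: how many later peaks each anchor peak is paired with


def fingerprints(peaks):
    """Yield ((f1, f2, dt), t1) for each anchor peak paired with a few later peaks.

    `peaks` is a list of (time, freq_bin) tuples, assumed already extracted.
    """
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + FAN_OUT]:
            yield (f1, f2, t2 - t1), t1


def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) places where it occurs."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t1 in fingerprints(peaks):
            index[h].append((track_id, t1))
    return index


def best_match(index, query_peaks):
    """Return (track_id, offset, votes) for the most consistent alignment, or None."""
    votes = Counter()
    for h, tq in fingerprints(query_peaks):
        for track_id, tdb in index.get(h, ()):
            # A true match puts many hashes at the same track-vs-query offset.
            votes[(track_id, tdb - tq)] += 1
    if not votes:
        return None
    (track_id, offset), count = votes.most_common(1)[0]
    return track_id, offset, count


# Toy data: the query is a slice of track "a" starting 2 time steps in.
tracks = {
    "a": [(0, 10), (1, 20), (2, 15), (3, 30), (4, 25), (5, 40), (6, 35)],
    "b": [(0, 11), (1, 21), (2, 16), (3, 31), (4, 26)],
}
query = [(0, 15), (1, 30), (2, 25), (3, 40)]
result = best_match(build_index(tracks), query)
```

Because each hash stores only frequency pairs and their time gap, a clip taken from anywhere in the same recording produces the same hashes, all at one consistent offset. A cover version shifts both the frequencies and the gaps, so the votes scatter, which is why that case is so much harder.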

Audible Magic claims to be able to recognize multiple performances of the same songs, and even parodies.[1] Using, of course, "AI technology" and much more compute.

[1] https://www.audiblemagic.com/2024/02/07/identifying-cover-so...

andai
23 minutes ago
Why is this harder than "delete timing information"?
bitexploder
33 minutes ago
20 years at least. I remember seeing how Gracenote worked back in the day when I was consulting for them.
dataviz1000
24 minutes ago
Add to my list of projects. Dinosaur game but with audible clucks to jump.
blackjackfoe
9 minutes ago
No "AI" required!
gnabgib
56 minutes ago
Again? Oh, I see... SCP (this domain is sus)

From CameronMacLeod (2022), a much more complete analysis (587 points, 2023, 155 comments): https://news.ycombinator.com/item?id=38531428

Or Slate (2009) (50 points, 16 comments) https://news.ycombinator.com/item?id=893353

BLKNSLVR
45 minutes ago
Forgive my ignorance, but what does SCP mean in this context? (my normal go-to of 'secure copy' doesn't fit).

Thanks for the other links; the question in this title is one I've day-dreamily thought about on occasion but never dug into. Will have a read of all three.

downboots
1 minute ago
embodied strange occurrence
Animats
43 minutes ago
Vaguely relevant pop-culture reference.[1]

[1] https://scp-wiki.wikidot.com/glossary-of-terms

BLKNSLVR
24 minutes ago
I seem to have wandered into a parallel universe.

I think it'll take me longer to understand WTF SCP is than it will to understand how Shazam works.

cyral
51 minutes ago
The interactive parts of this post are very cool though
cellular
1 hour ago
I did this for a science project in 1986 on an Apple ][c computer!
wood_spirit
10 minutes ago
Reminds me of Roy Van Rijn’s prototype that got a cease and desist letter! Lots of community disappointment at the time!

https://hn.algolia.com/?q=royvanrijn

dackdel
22 minutes ago
voodoo