JimBrowning


Reverse engineering sounds from visuals

OK. This doesn't really have much to do with scams, but I did find it very impressive. Someone commented on my latest video that they could work out the sound from my graph even though I had muted the audio.

I challenged the author to do this and I'm mighty impressed with the result. It's not quite the right surname, but very close.

Well done Rebane… I'll be blurring the graphs next time!


Comments

Nerds......all of you......

The low quality is because you used pure sines. If you used filtered noise of the right bandwidth, it'd sound much, much better. More mathy, though. This is what is used in part in voice synthesis. The solution here is to fit a voice synthesis model's settings (noise and pauses included) to the resulting frequency response. What you did is part of it, but a real synthesis model would have various "excitation" waveforms that are not sines, filtered by a set of passive filters simulating the vocal tract, teeth, tongue, nasal cavity, etc. Since this recording is short and the data is sparse, a similarly sparse model would be used, say ACELP. (Which additionally likely matches, at least somewhat, the digital codec used by those SIP gateways.) With in-depth data and advanced models you can do full voice clones. (Like Google's DeepMind did from some 20 minutes of recording.)
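For anyone curious what the "pure sines" approach mentioned above might look like, here is a minimal sketch. It is not Rebane's actual code; it assumes the graph has already been read off into a list of frames, each holding one magnitude per frequency bin. The function name, the 8000 Hz sample rate and the 10 ms frame length are all made up for illustration.

```python
import math

def synthesize_from_spectrogram(frames, bin_freqs_hz,
                                sample_rate=8000, frame_seconds=0.01):
    """Additive (pure-sine) resynthesis.

    frames        -- list of frames; each frame is a list of magnitudes,
                     one per frequency bin (hypothetical input format)
    bin_freqs_hz  -- centre frequency of each bin, in Hz
    Returns a list of float samples normalised to the range [-1, 1].
    """
    samples_per_frame = round(sample_rate * frame_seconds)
    # Keep a running phase per oscillator so sines stay continuous
    # across frame boundaries instead of clicking.
    phases = [0.0] * len(bin_freqs_hz)
    out = []
    for magnitudes in frames:
        for _ in range(samples_per_frame):
            value = 0.0
            for i, (freq, mag) in enumerate(zip(bin_freqs_hz, magnitudes)):
                value += mag * math.sin(phases[i])
                phases[i] += 2.0 * math.pi * freq / sample_rate
            out.append(value)
    # Normalise so the loudest sample has magnitude 1 (skip if silent).
    peak = max((abs(s) for s in out), default=0.0)
    if peak > 0.0:
        out = [s / peak for s in out]
    return out

# Two frames, two bins: 440 Hz sounding first, then a quieter 880 Hz.
samples = synthesize_from_spectrogram([[1.0, 0.0], [0.0, 0.5]], [440.0, 880.0])
```

Swapping each sine for band-limited noise centred on the same bin would be one way to move towards the filtered-noise idea in the comment, at the cost of a little more maths.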

That's not half scary but yeah, sound is vibration.

I'm impressed! Reminds me of this Passive Recovery of Sound from Video project <a href="http://people.csail.mit.edu/mrub/VisualMic/" rel="nofollow noopener" target="_blank">http://people.csail.mit.edu/mrub/VisualMic/</a>

Hey Jim, I reside in Chennai. I have major pull here with regard to political connections, and it hurts me a lot to know that there are people in my country harming our name as Indians. I have many foreign companies, in Colorado etc. Feel free to send me any information you have at admin@ujiwater.io and I will make sure that these scammers get arrested. Regards, Babu Rishikesh. CEO Racehorse Startup’s LLC

I think the NSA or FBI should be rewarding this guy somehow :)

Very interesting and nice work. However, our brains do a lot of clever filling in gaps with information from our eyes. Come back to this video a few days later, skip straight to 1:53, close your eyes so you don't see the text and listen. Can you really make much out? Once you know what is being said, the problem is that it's difficult to forget it. Try it on somebody who does not see the text while listening and who doesn't know what is being said.

Syd

I probably didn't explain the background to this... In my most recent video, there is a point where I silence the audio to protect the privacy of the victim... it's just at this point: <a href="https://www.youtube.com/watch?v=hRLoGSmuWXs&amp;t=747s" rel="nofollow noopener" target="_blank">https://www.youtube.com/watch?v=hRLoGSmuWXs&amp;t=747s</a>. I mute the audio so that her surname can't be heard. This very clever chap was able to show me that there's no point in muting the audio if someone is clever enough to reproduce it using just the on-screen graph. I'll be moving the "*Muted*" text to prevent this in future!

Jim Browning

Very impressive

This is more than amazing

EpicLPer

I have to admit, I'm fairly confused.

I'm just picturing what he can be capable of as he hones that raw intellect.

Carol Chapman

Nice!!! I especially liked that he used Python to reverse it.

I think I should be rewarding this guy somehow.

Jim Browning

Wow...

