
Say hello to AI. It sounds just like you.

Netflix and No Chill 

Picture this: You’re curled up, bingeing the latest Netflix series. Your phone rings. You let it go to voicemail (because who answers). In between episodes, you check the voicemail. It is from you.

“Hi, it’s me, your inner voice,” it says. “Taxes are due. You’re running out of time. Fortunately, TurboTax can help. Stop delaying and go to turbotax.com for a discount right now. Listen to your inner voice – TurboTax today, total relaxation tomorrow!”

Say hello to AI. Your own voice just shamed you and sold you a service. 

Think this is the future? Nope. This could happen tonight. OpenAI, the company behind the large language model ChatGPT, can now mimic and replicate your voice, complete with your own unique intonation and pronunciation habits. OpenAI’s voice cloning program, called Voice Engine, needs only 15 seconds of audio of your natural speech to robo-parrot a fully formed voice clone that can read any text it is given. The strangest part: it *sounds* exactly like you. Not even your closest friends and family would be able to tell the difference between your voice and the machine’s clone.

At Least OpenAI is Open about it

OpenAI isn’t the first software company to clone voices, but Voice Engine needs only 15 seconds of speech, and the advanced mimicry is beyond impressive. They laid it all out pretty plainly in a recent blog post (see below for links). 

OpenAI’s teams are the same folks who brought us ChatGPT, a chatbot that can write Shakespearean sonnets about spaghetti. OpenAI has now turned its sights on perfecting voice replication. This technology may revolutionize everything from personal assistants to prank calling.


A Sonnet to Spaghetti, by ChatGPT-4

But while Voice Engine promises to make voice-overs and personalized messages more accessible, it also opens a can of ethical worms. Imagine receiving a heartfelt apology from a friend, only to discover it was AI-generated. That would make the next happy hour pretty awkward, huh? Sadly, a bit of cringe between friends is the least of our worries. 

Fake kidnapping schemes and wire fraud scams have already begun. When that AI-generated apology is followed up by a phone call from that same friend, now stranded and desperately asking for $1,000, odds are you’ll be in the mood to help them out.

More than one parent has already fallen victim to fake-kidnapping calls. You can see how this gets potentially devastating. We’ll need several blog posts to talk about the legal, ethical and moral considerations of voice clones. For now, just know it’s already way too messy.

Strategize against the machine

Now that this technology exists, what’s a regular old Netflix fan supposed to do? For starters, don’t trust voice alone. If anyone calls you in distress asking for money, hang up and call them back. Caller ID can be spoofed, so even if the call seems legit, make the hang-up-and-call-back move your standard emergency protocol. Once you re-establish contact, the whole thing can probably be cleared up. If you’re still in doubt, pull out a secret codeword you’ve previously established.

(When I told my two BFFs from college about this latest development, we established a codeword right away. Yay, 12-year-old-boy humor.)

If you don’t have a codeword, another verification idea comes from the world of science-fiction: shared memory verification. Whenever the aliens body snatch, we humans resort to asking who threw up in second grade or what people food our first cat craved. If the person can’t answer, then they’ve been body snatched. Kill them immediately. Or, in this case, just hang up. 

Some Cool Bots

For creators, this technology is a game-changer. Podcasters can produce content in multiple languages without ever learning them, and authors can narrate their audiobooks without stepping into a recording booth. 

The ability to preserve the voices of actors or artists could bring comfort to many. In my opinion, there cannot be too much Mr. Rogers content. The world lost a kind soul when he died. But now I could hear his voice reading children’s stories, or perhaps one of his own articles.

Most importantly, people with communication challenges will have a whole new world opened to them. Many people with speech impairments have long relied on synthetic voices, but those voices are usually tinny and lack personality. Now a person’s own voice can be constructed or reconstructed from a few vowel sounds.

OK, Corral your thoughts

It’s worth repeating here that legal, ethical and moral issues surround these good possibilities. Right now, AI is *quite* messy. It’s the Wild West; basically zero legislation exists to protect any of us. But let’s not throw the AI baby out with the electronics recycling bathwater just yet.

Indeed, we must say bye-bye to the days of blindly trusting our ears – and eyes (e.g., deepfakes). But the good news is – now you know. Keep your eyes and ears open to new possibilities of how AI can help you. Hey, maybe once your Netflix series is kicked, you can get AI-you to read you a bedtime story.

After you do your taxes.

Artists ask for AI ethics: https://artistrightsnow.medium.com/200-artists-urge-tech-platforms-stop-devaluing-music-559fb109bbac

Fake kidnapping: https://www.cnn.com/2023/04/29/us/ai-scam-calls-kidnapping-cec/index.html

Mutism, paralysis solved with AI and voice: https://www.smithsonianmag.com/smart-news/woman-with-paralysis-can-speak-by-thinking-with-a-brain-implant-and-ai-180982797/

OpenAI’s announcement of Voice Engine, with mind-blowing examples: https://openai.com/blog/navigating-the-challenges-and-opportunities-of-synthetic-voices/