When I saw The Matrix way back in 1999 and watched Neo being uploaded with martial arts skills, all I could think was “I would upload every language on the planet.” It would be its own superpower. Every African dialect? Cantonese? All 800 languages they speak in Papua New Guinea? You could travel anywhere and have in-depth conversations with the residents.
If you are someone who is dreading the new AI age, you need to consider what AI is about to do for us with languages, including instant translation and dubbing that uses your own voice.
We’ll begin with the new abilities of text-to-speech generation. We all remember the weird robot voices that these apps used to create: they were grating and unpleasant to listen to. I just uploaded the paragraph above to Eleven Labs and gave it a sample of my voice. Here is the result (again, this is not me but an AI version of my voice).
Whoa. Pretty impressive.
I then tried the video translation software HeyGen. I was interested in the technology, but also I was really curious to hear myself speaking in French. My husband and I live part time in southern France (we have a Substack about it here) so I hear myself speaking halting, mediocre French on a regular basis, but what would I sound like if I were fluent? Would listening to my own voice speak the language help with my accent and confidence? I hoped so.
I took a 30-second video as I walked through the streets of Montpellier.
This YouTube video walked me step by step through the translation process using either Eleven Labs or HeyGen. Eleven Labs translates your voice, but HeyGen will take a video of you and not only dub your words, it will alter your lips to match! Bonkers. Here is my first attempt:
You will see that there are glitches. At the 00:24 mark my voice cuts out completely. Perhaps because I turned away and my mouth was not visible to the camera? Also, the lip syncing is okay, not great. I asked a native French speaker to watch it and she said that the beginning sounds like “French French” but after the glitch it sounds like “Canadian French,” and then it switches back to “French French.”
All of these things aside, it is still F***ING UNBELIEVABLE. The voice sounds very much like mine. It will certainly only be a matter of months before the software is nearly perfect.
Because of the glitches, I decided to try filming at home, staring straight at the camera without any wind or background noise. In English:
This time I chose to speak in Italian:
Much better than the one I did outside! Unfortunately, I don’t have quick access to a native Italian speaker to check the grammar and accent. But it’s impressive to me!
In France, Roberto and I work on our French daily, and I am determined to be fluent regardless of what AI can do. It’s good for my brain and I enjoy speaking it. However, if I could wear a pair of glasses that would instantaneously translate for me, I would start going to French plays and films immediately instead of, say, in five years (thanks to @PRANATHFERNANDO for sharing the glasses news).
I suspect instant translation will be a game changer for immigrants and refugees in new countries. Think how much more quickly they will be able to find employment. When people were fleeing Ukraine, relief workers in Poland and Germany were desperate for translators to help people get resettled. Translators’ lives were endangered when they helped American soldiers in Afghanistan. I also believe that if soldiers could understand the residents of a foreign country, it might not be so easy to see those people as “other.”
So I say bring on the Matrix!
NAILED IT
This image is terrific with its lovely light streaming through the windows of a Spanish cafe. But for some reason Dall-E 3 decided to give me a full beard.
Maybe the AI knows that I am a middle-aged woman and that if I don’t stay on top of things a beard is kind of inevitable?