Grandma on Alexa – Online Defense

I am personally fascinated by artificial intelligence, technology and how it is gradually but inevitably entering our lives in a pervasive way.

It improves our lives, of course, sometimes in a very subversive way, such as … reviving feelings associated with people who are no longer with us.

In these cases, critical thinking is important: it is precisely this thinking that leads us to evaluate these impacts and to decide whether we want to be part of them, while also weighing the possible negative aspects. This explains the fascination that technology holds for me, as a source of frequent surprises and cathartic moments.

One of these moments happened to me a few days ago, after the re:MARS 2022 event: an artificial intelligence showcase in which the giant Amazon presents to the world its studies, experiments and innovations in machine learning, automation, robotics and space… applied to present and future business.

The event itself is exciting, full of ideas and challenges, with many distinguished guests, including some from outside Amazon. The speeches of these guests, known in jargon as keynotes, are also available on YouTube, so the public can watch them again at any time.

In the second day’s keynote, available at https://www.youtube.com/watch?v=22cb24-sGhg, just over an hour and two minutes in, I was struck by a quote from Rohit Prasad, chief scientist at Alexa AI.

I watched his very well-prepared presentation: talking about empathy between humans and machines as the emotion on which a relationship of trust is built, he focused on the fact that for many of us the recent pandemic meant the loss of a loved one.

Alexa, often a symbol of this technological presence even for simple conversations, has developed skills over time that literally shocked me: obviously not ones that will eliminate the pain of these losses, but enough to provide a further way to make the memory of the people we care about last longer.

Within seconds, the video changes to show a child asking Alexa to have his late grandmother read him a passage from The Wizard of Oz, as she did when she was alive.

Alexa responds to the request with an “OK”, moving immediately to a perfect simulation of the beloved grandmother’s voice, to the visible delight of her tech-savvy grandson.

The video then cuts back to Rohit, who immediately explains two things whose innovative power really struck me. The first is that the capability I had just seen comes from a change of perspective in how voice is analyzed: more specifically, from reframing the task, no longer as a speech generation problem, i.e. producing a spoken phrase, but as a voice conversion problem.

The second thing, closely related to the first, is how this change in perspective made it possible to work with just one minute of existing recording, compared with the hours of studio recording that the previous approach would have required!

(Rohit Prasad on stage at re:MARS 2022)

But I mentioned earlier the need for critical thinking: once that wow moment had passed, in fact, I began to reflect on certain aspects.

There are many online services in which one of the authentication methods consists of saying a sentence to prove that you are the person you claim to be. It is clear that with such technology, which is not exclusive to Amazon, it becomes important to entrust the strength of the authentication process to other mechanisms, ones that are certainly harder to simulate.

Biometric voice authentication is certainly a new technology, which, however, has seen quite immediate popularity due to its ease of use and the absence of accessory tools. Banking and insurance services in particular have embraced it in a significant way, albeit recently in combination with other mechanisms to increase the overall strength of the process. In a 2020 article, the combination of multi-frequency sounds of a telephone keypad was hypothesized as a strong authentication process, combined with a phrase recorded in the customer’s voice that had to be played back. It goes without saying that Amazon’s innovation would nullify this strength, putting the authentication process at risk of impersonation attacks: attacks in which the attacker has all the tools needed to act as if they were the victim, deceiving the system that is supposed to authenticate them.
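To make the idea of combining mechanisms a little more concrete, here is a minimal sketch in Python. It is only an illustration and not any real bank's process: the `LoginAttempt` fields, the `voice_score` value (assumed to come from some voice-matching engine) and the `VOICE_THRESHOLD` are all hypothetical. The point is simply that a voice match alone is never accepted; a one-time code delivered through another channel must also be correct.

```python
import hmac
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    voice_score: float    # hypothetical similarity score in [0, 1] from a voice-matching engine
    submitted_code: str   # one-time code typed by the user
    expected_code: str    # one-time code the service issued through another channel


VOICE_THRESHOLD = 0.90  # assumed value, for illustration only


def is_authenticated(attempt: LoginAttempt) -> bool:
    """Accept only if the voice matches AND the one-time code is correct."""
    voice_ok = attempt.voice_score >= VOICE_THRESHOLD
    # compare_digest avoids leaking information through comparison timing
    code_ok = hmac.compare_digest(
        attempt.submitted_code.encode(), attempt.expected_code.encode()
    )
    return voice_ok and code_ok
```

With a rule like this, even a perfectly cloned voice (a very high `voice_score`) is rejected if the attacker does not also hold the one-time code.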

Another process at risk is the one that verifies access, physical or virtual, to the workplace for billing purposes or for tracking working hours. This is another area where attackers will be greatly helped by these speech synthesis technologies.

The second consideration is a little more emotional: as I said at the beginning, I find this possibility absolutely fascinating and innovative. I realize, however, that for some people this may not be a source of positive feelings at all, and may instead prolong the pain of losing a loved one in a significant and difficult way.

Well, if there is one thing that distinguishes man from technology it is consciousness, from which free will springs. History teaches us the futility of rejecting progress, science and technology in favor of personal feelings and beliefs, right or wrong. So I simply suggest that these people choose: they can ignore this possibility that technology offers, while leaving everyone else the same right to choose.
