
The Current Challenges of Voice-Assistant A.I.

The Five Key Takeaways of This Blog Post

How Far Are We from Ubiquitous Voice-Assistant A.I.?

Yes, a large share of smartphone owners carry an iPhone, but how many of us are really using Siri? 

Some analysts may point to Siri's lack of popularity, relative to simply typing a query into a browser's search bar, as evidence that this type of A.I. may never hold the public's attention for a sustained period of time. 

But when the Siris of tomorrow start to sound more and more like actual human beings, and can fetch (and create) data with a speed and efficiency unprecedented for an A.I. voice assistant, then we will see whether this technology becomes a major part of smartphone users' daily lives. 

However, there are indeed some big hurdles to clear before this technology can truly capture that attention. We enumerated them in the Key Takeaways section that opened this blog post, and we will go into more detail below. 

Variations in Accents and Languages

Most of us Americans are familiar with how the diversity of accents, to say nothing of regional dialects, can lead to significant variation in language use.

In the context of voice-assistant A.I., this means that an A.I. will need to be adaptable to a wide range of accents that speakers of a particular natural language may have. Otherwise, the likelihood of frustrating misunderstandings will linger. 

Likewise, the sheer number of languages on Earth makes adapting this technology across the globe particularly challenging. You can certainly expect the most widely spoken languages, like English and Mandarin Chinese, to be well supported, but the A.I. for smaller languages may lag behind in accuracy or even availability. 
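To make the accent-and-language point concrete, here is a minimal sketch in Python, assuming the third-party speech_recognition package and a local sample.wav recording (both are illustrative choices on our part, not anything tied to Siri or any particular vendor). Running the same audio through the recognizer under different locale hints shows how much today's speech-to-text pipelines lean on being told which variant of a language they are hearing.

```python
import speech_recognition as sr

recognizer = sr.Recognizer()

# Load a short utterance from disk (hypothetical file for illustration).
with sr.AudioFile("sample.wav") as source:
    audio = recognizer.record(source)

# The same clip, transcribed under several English locale hints.
# A mismatched hint often degrades the transcription noticeably.
for locale in ["en-US", "en-GB", "en-IN", "en-AU"]:
    try:
        text = recognizer.recognize_google(audio, language=locale)
        print(f"{locale}: {text}")
    except sr.UnknownValueError:
        print(f"{locale}: could not understand the audio")
    except sr.RequestError as err:
        print(f"{locale}: recognition service error ({err})")
```

Production assistants handle accents inside the acoustic model itself rather than through locale tags, but tags like these are the main lever that public speech-to-text APIs expose today, which is part of why smaller languages and less common accents lag behind.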

OpenAI’s Emotional Reliance Issue

Buried far down in a report with a decidedly boring-sounding (for some readers, at least) name, “GPT-4o System Card,” is a section titled “Societal impacts,” whose subsection “Anthropomorphization and emotional reliance” describes the two risks that its title names. 

Anthropomorphization is attributing human-like features to non-human things. If you have seen and remember enough of the movie Her (the movie referenced in the Key Takeaways section that opened this blog post, and creepily echoed in OpenAI’s now-recalled Scarlett Johansson soundalike voice), then you will have a clear picture of what this looks like in relation to voice-assistant A.I. 

Basically, you will have people who come to feel that the A.I. is something more than an assistant, such as a friend, or even something more than a friend (again, recall or look up a clip from Her).

Emotional reliance is also listed as one of the risks of OpenAI’s in-development voice-assistant A.I. platform.

The issue becomes even more serious when one considers the risk of a voice-assistant A.I. “hallucinating,” that is, giving erroneous, perhaps even harmful, advice to a user who has both anthropomorphized it and formed an emotional attachment to it. 

Giving your attention to an A.I. platform is one thing, but giving it your trust is another thing entirely. For business owners looking to avoid major liabilities with their own future voice-assistant A.I.s, it is worth considering disclaimers in, and conversational limits on, the interactions between customers and the A.I., as sketched below. 
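Here is a minimal sketch of what that could look like, using OpenAI’s Python chat API. The company name, prompt wording, and disclaimer text are all hypothetical placeholders; the point is the pattern of constraining the assistant’s role up front and attaching the disclaimer in application code, rather than trusting the model to remember to include it.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical system prompt that imposes conversational limits up front.
SYSTEM_PROMPT = (
    "You are a customer-support assistant for Acme Co. "
    "Only answer questions about Acme products, orders, and returns. "
    "If the user asks for medical, legal, or financial advice, or tries to "
    "treat you as a friend or companion, politely decline and restate your role."
)

# Hypothetical disclaimer, appended to every reply in application code.
DISCLAIMER = (
    "\n\n(Automated assistant: responses may contain errors. "
    "For binding answers, please contact a human agent.)"
)

def answer(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any chat-capable model would work here
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    # Attach the disclaimer outside the model call so it appears on every reply.
    return response.choices[0].message.content + DISCLAIMER

print(answer("Can I return a blender I bought last week?"))
```

Appending the disclaimer in application code, rather than asking for it in the prompt, guarantees that it appears on every reply even if the model drifts from its instructions.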

Something that OpenAI points to as a potential agitating societal influence is the deferential nature of voice-assistant A.I., which allows users to interrupt and take charge of the conversation at any time. 

Given that the more advanced voice-assistant A.I. models are essentially built to mimic human conversation in certain superficial respects (namely, spoken back-and-forth in an artificial human-like voice), this could negatively condition people to bristle at the lack of such control in human-to-human conversations. 
