What is Generative AI and How is it Changing Voice Assistants?
What is a virtual assistant?
A virtual assistant is a software agent that can perform a range of tasks or services for a user based on input such as a command or question. Interaction between the assistant and user may be via text, graphical interface, or voice.
What is a voice assistant?
A voice assistant is a type of virtual assistant that can interpret human speech and respond with a synthesized voice. Voice assistants run on Internet-connected devices, such as smartphones and smart speakers, and use speech recognition and language processing algorithms to listen for specific voice commands and then return relevant information or perform the function the user requested.
Voice assistants were originally designed for the following kinds of tasks:
- reading text or email messages aloud
- looking up phone numbers
- scheduling appointments
- placing phone calls
- reminding the user about appointments
Today, voice assistants are integrated into many of the devices we use every day, such as smartphones, computers, and smart speakers. Depending on the integration, a voice assistant may offer a narrow, specific feature set or be open-ended enough to help with almost any task. Several of the top technology companies have built smart voice assistants into their speaker products, combining high-quality music playback with smart home integration.
Acer Smart Speakers
A colorful addition to a room of any size, the Acer Halo sits on a base lit up by RGB lighting you can customize. This smart speaker is powered by Google Assistant.
Halo HSP3100G specs:
- Compact, stylish smart speaker featuring Google Assistant
- Professionally engineered DTS Sound projecting 360° audio
- Google Assistant
- Two far-field omnidirectional mic arrays; 1 x 3.5-mm audio jack
- Customizable LED display; RGB-lit base that reacts to music
The Acer Halo Swing is a chic yet unusual Bluetooth and Wi-Fi speaker that delivers a lot of sound despite its compact dimensions, making it a great choice for those who need music on the go.
Halo Swing HSP5100G specs:
- Omnidirectional DTS audio
- Google Assistant, Bluetooth 5.2, and Wi-Fi 6
- LED display that reacts to music and can be customized with the Acer Halo app
- Portable IPX5 water-resistant design
- USB port & dock charger deliver up to 8 hours of portable music
Most popular voice assistant technologies
Among the products on the market, the following have achieved the most success:
- Google Assistant. This virtual voice assistant software application was developed by Google for Android devices. Google Assistant can perform a variety of tasks including answering questions, adjusting hardware settings on the user's device, scheduling events and alarms, and playing games.
- Siri. Apple's built-in, voice-controlled personal virtual assistant is available on devices using iOS, iPadOS, watchOS, macOS, and tvOS. Siri uses voice recognition technology that's powered by AI.
- Alexa. Used mainly through Amazon's line of hands-free speakers known as Echo, Alexa is a cloud-based voice service that responds to simple language requests, such as "what is the weather today?" or "play pop music on the dining room speaker."
- Bixby. Samsung's virtual AI assistant runs primarily on mobile devices but also some smart refrigerators. Bixby can be used for various tasks including texting, retrieving location-specific weather information, setting meeting reminders, and reading news articles.
- Mycroft AI. Mycroft is an open source voice assistant that can be run on any platform, including desktops, automobiles, and Raspberry Pis. It focuses on voice-enabling any device to turn it into a smart virtual assistant.
What is generative AI?
Generative AI is a type of artificial intelligence that learns from existing artifacts to generate realistic, new artifacts (at scale) that reflect the characteristics of the training data. It can produce a variety of novel content, such as images, video, music, speech, text, software code, and product designs.
Generative AI is capable of producing highly realistic and complex content. Generative AI most commonly creates content in response to natural language requests—coding knowledge is not required. The enterprise use cases are numerous: the technology has potential applications in gaming, entertainment, customer service, content creation, product design, software development, and much more.
Generative AI hit mainstream headlines in late 2022 with the launch of OpenAI's ChatGPT, a chatbot capable of very human-seeming interactions. OpenAI's DALL·E 2, a related generative AI tool, generates images from text descriptions. The number of use cases for generative AI is likely to grow as people and enterprises discover more applications for the technology in daily work and life.
State of the voice assistant market
According to Business Insider, in the first quarter of 2022, Amazon's "Worldwide Digital" unit had an operating loss of more than $3 billion, most of which was due to its Echo smart speakers and Alexa voice technology. The loss was by far the largest among all of Amazon's business units. Media reports that Alphabet has cut back its investment in Google Assistant suggest that losses are a wider problem in this market.
But why are these voice assistant divisions not profitable? It can’t be from lack of adoption; Siri and Google Assistant are installed on hundreds of millions of smartphones. More than 100 million Echo devices with Alexa have been sold, and Alexa is also installed on a similar number of non-Echo devices. It also can’t be for lack of usage; users interact with these voice assistants billions of times every week.
It turns out that it’s really hard to build monetization scenarios around voice assistants. The primary method of monetizing these technologies has so far been with royalties from third-party manufacturers who integrate these assistants into their products. The following channels, initially anticipated to be revenue generators, have seen limited success:
- Voice-based commerce has not taken off. Unlike mobile apps and websites, voice assistants are unable to display product images or provide detailed product descriptions. Users' inability to read product reviews is another limiting factor.
- Advertising-based monetization strategies have not proven viable either. Compared with ads in other digital channels, voice ads in the middle of an audio interaction feel more intrusive and detract from the experience.
- Third-party app development, such as Alexa Skills, has had limited success. Despite the 150,000 Skills in the Alexa catalog, the typical Alexa user has not been installing, using, or subscribing to them. This means limited revenue for Alexa Skill developers as well as for the Alexa Skills store.
Perhaps the biggest challenge of all is that consumers regard voice assistants, which are integrated into their smartphones, smart devices, home automation systems, and cars, as features of those products rather than as products worth paying for in their own right.
Generative AI and the next iteration of voice assistants
Generative AI is the natural next step for voice assistant technologies, enabling assistants to deliver more intelligent answers than are possible with current command-and-response models. Generative AI built on the latest large language models can understand user queries more effectively than search algorithms or older models that also use natural language processing. It can also respond to questions with more accurate and personalized information. As a result, voice assistants that incorporate generative AI may better understand user prompts and be more useful.
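To make the contrast concrete, here is a minimal Python sketch of the older command-and-response approach, in which an utterance is matched against a fixed set of patterns. The intent names, the patterns, and the `match_intent` and `respond` functions are hypothetical illustrations, not the implementation used by any real assistant.

```python
import re
from typing import Optional

# Hypothetical intent patterns for illustration only; production
# assistants rely on far richer speech and language models.
INTENT_PATTERNS = {
    "get_weather": re.compile(r"\bweather\b"),
    "play_music": re.compile(r"\bplay\b.*\bmusic\b"),
    "set_reminder": re.compile(r"\bremind me\b"),
}


def match_intent(utterance: str) -> Optional[str]:
    """Return the first intent whose pattern matches, or None."""
    text = utterance.lower()
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            return intent
    return None


def respond(utterance: str) -> str:
    """Map an utterance to a canned reply, with a fallback for no match."""
    intent = match_intent(utterance)
    if intent is None:
        return "Sorry, I don't know how to help with that."
    return f"Handling intent: {intent}"
```

A request phrased outside the fixed patterns, such as "Should I bring an umbrella tomorrow?", falls through to the fallback reply even though it is really a weather question. This is the kind of gap that a large language model is expected to close.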
Personalized voice assistants
Can AI voice assistants have a personality? The answer is yes. Generative AI can generate output in a way that mimics fictional characters and even real-life individuals. Imagine being able to chat with your favorite historical figure or celebrity and have that character remember your conversations. This brings new opportunities for voice assistants to provide personalized services. One company, Character.AI, has developed a chatbot platform that's making this idea a reality.
Character.AI is a web application that lets users create and chat with personalized chatbots. These chatbots, called Characters, can be original creations or imitations of famous figures, fictional characters, or specialists in certain fields. The platform was released to the public in September 2022 and has become highly popular among users.
Alexa and Google Assistant developments
Amazon's in-house large language model, the Alexa Teacher Model, is being used to add conversational capabilities and improved functionality to Alexa. Likewise, Google has begun developing a revamped Google Assistant, starting with the mobile version of the product. The new Google Assistant will use technology akin to ChatGPT to improve its assistance capabilities, natural language comprehension, and overall range of functions.
We can expect voice assistants of the future to have a more proactive role in interactions. Rather than just waiting for user commands, assistants will collect context-specific information and then take the initiative by making helpful suggestions to the user.
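One way to picture this proactive behavior is a routine that inspects the user's context and offers suggestions without being asked. The short Python sketch below is purely illustrative: the `Context` fields, the 10-minute threshold, and the `suggest` function are all assumptions, and a real assistant would likely rely on a language model rather than hand-written rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Context:
    """Hypothetical snapshot of what the assistant knows right now."""
    minutes_to_next_meeting: Optional[int]  # None if no meeting is scheduled
    travel_minutes: int                     # estimated travel time to the meeting
    rain_expected: bool


def suggest(ctx: Context) -> List[str]:
    """Return unprompted suggestions based on the current context."""
    suggestions = []
    if ctx.minutes_to_next_meeting is not None:
        # Spare time left after accounting for travel (assumed threshold: 10 min).
        slack = ctx.minutes_to_next_meeting - ctx.travel_minutes
        if slack <= 10:
            suggestions.append("You should leave soon to make your next meeting.")
    if ctx.rain_expected:
        suggestions.append("Rain is expected; consider taking an umbrella.")
    return suggestions
```

For example, `suggest(Context(30, 25, True))` would produce both suggestions, while a context with no meeting and no rain produces none.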
Ashley is a technology writer who is interested in computers and software development. He is also a fintech researcher and is fascinated with emerging trends in DeFi, blockchain, and bitcoin. He has been writing, editing, and creating content for the ESL industry in Asia for eight years, with a special focus on interactive, digital learning.