Why are voice assistants still not smart?
The underwhelming experience of voice assistants stands in contrast to the ever-expanding market for smart speakers.
Welcome to the FWIW newsletter about tech, media & audio, written by David Tvrdon.
In this edition
🗣️ Why are voice assistants still not smart?
🛒 Inside Amazon’s first cashier-less Go supermarket
💬 The rest of the news in short bites
Why are voice assistants still not smart?
📷 by Ana Flávia on Unsplash
Almost no one today, when speaking of a voice assistant or a smart speaker, expects something like Iron Man’s Jarvis. Nor do people expect a virtual personal assistant, meaning someone who “assists a specific person with their daily business or personal tasks”.
No, today’s voice assistants are good at playing music and podcasts, turning various things on and off, telling the weather forecast, or answering simple questions (though ask “what is love” and Alexa will give you an answer, while Google will show you a YouTube music video).
Smart speaker sales reached a new record of 146.9 million in 2019, up 70% from 2018, the latest report from Strategy Analytics says. Amazon maintains the lead, trailed by Google, followed by the Chinese vendors Baidu, Alibaba and Xiaomi, all of which increased their respective shares in 2019.
And that’s just 2019. There are now hundreds of millions of smart speakers in people’s homes. This Adobe report from last year should tell you a lot about what people expect from a smart speaker:
Playing music is the most popular activity, with 74% of people saying they use them for this purpose, followed by checking the weather forecast (66%). 58% of people say they use their device for asking those fun questions, the third most popular activity.
Yep, people see neither a Jarvis nor a personal assistant. We are just not there yet. And when you talk to the people building these natural language processing (NLP) assistants, the one thing they keep repeating is that it is hard. That’s fair and understandable.
As I wrote above, a question can be understood in different ways, with different intents: Google thought I was looking for “that song”, while Alexa thought a quote from the Dalai Lama would do the job.
Now imagine asking a real personal assistant what love is. You would most probably get an answer full of mixed feelings on the topic, but at some point the PA would certainly ask you whether everything is OK or whether something is troubling you.
Or take another example. Two years ago Google introduced Duplex, a conversational AI that could interact with a real person without that person even knowing that he or she was talking to a computer, not a human being.
Now in that clip, the assistant says “just a women’s haircut”, but imagine if the assistant fulfilled the given task by pulling the latest haircut from the owner’s social media accounts. Now that would be mind-blowing. And I would be happy to pay for a service like that.
Which brings me to my final point and a big issue I currently have with the future of voice assistants. Without a doubt, the most popular ones are Siri, Alexa, and Google Assistant. They are all free: either they come with the operating system or you buy a smart speaker that houses them.
Would you trust a real personal assistant if he or she said they worked for free? I wouldn’t, though I imagine some people would be really happy about it.
Would you pay for a voice assistant that was like Jarvis from Iron Man? You could have a conversation with the assistant, and it would learn more and more about you. It would remember things and remind you of them even if you never asked it to. Say your friend Joe wanted to meet up next Friday and you forgot to schedule it: the assistant, having listened to your conversations, would have figured that out and wouldn’t forget.
Well, I would pay a monthly subscription fee for a service like that. And I think that would be a viable monetization option.
I do not think this will be possible soon. The technology is just not there yet, regardless of Google’s Duplex pitch.
The first step would be defining an agreed-upon benchmark, just as Google did for Meena, its chatbot. Then you could compare the progress made across the industry. (Yes, I know of the Digital assistant IQ test, but it is not the benchmark I am looking for.)
The next steps would be about ensuring data safety and anonymity. Remember, the assistant would be like your real assistant, with one slight difference: it would not be human.
Finally, you would have to move away from little boxes with mics and speakers to a more ambient experience (explained nicely here by Walt Mossberg).
For now, though, you can keep asking those fun questions, or playing music.
P.S.: A good PA asks for more money over time as they get better. Would you be willing to pay a higher monthly fee if your voice assistant got better?
The rest of the news in short bites
🎧 Audio articles are helping news publishers gain loyal audiences. Many publications do not put together dedicated teams for creating an audio experience; at most they have a podcast team. There are a few “text to audio” startups doing the reading, such as Audm, Noa, or Curio, and they all have mobile apps as well. The Danish Zetland, of course, has its authors read their own stories, which makes the experience even more personal. This is not possible for publications such as The Economist, which is why it makes sense for them to outsource the task. [Nieman Reports]
“We are a premium-priced publication and people feel guilty if they don’t read many stories,” says Economist deputy editor Tom Standage, who oversees audio strategy. “Our evidence suggests that the audio edition is a very effective retention tool; once you come to rely on it, you won’t unsubscribe.”
🤦‍♂️ Facebook fail of the week. Tests find Facebook's Download Your Information tool gives users an incomplete and inconsistent list of advertisers who have uploaded their data to Facebook. Despite the social media giant’s claim, "Download Your Information" doesn't provide users with a list of all advertisers who uploaded a list with their personal data. [Privacy International]
ℹ️ Google Jigsaw launched a digital magazine about disinformation called The Current. Here is a great take on it by Matthew Ingram:
The Current has the feeling of something Google’s marketing department cooked up in a hurry. If it were a presentation for ninth-grade civics class, it would get high marks. But for something produced by a $900 billion company that purports to have high-minded goals, it’s pretty weak tea. It deserves a C+ at best.
I can confirm that the only impression it left me with was that this is a nice website. [Columbia Journalism Review]
🛒 Amazon is opening its first full-size, cashier-less grocery store, expanding the technology used in its Amazon Go shops, which are more like convenience stores. The store is about 10,400 square feet and stocks roughly 5,000 items, including fresh produce, meats, and alcohol. The company introduced the first Go store in 2016. [CNBC]
📚 #5Recommended, this time on the future of work, the rise of the Signal messaging app and other selected stories. Every Sunday I put together a top 5 list with the best reads of the week. Check out the second edition.
📟 Being a white hat hacker can be a good business. HackerOne, the bug bounty platform, reports that the number of hackers on its platform doubled to 600,000 in 2019 and that they received $40 million in bounties that year, bringing the total since 2012 to $82 million, with 7 hackers passing $1 million. As HackerOne's report stressed, the unemployment rate for trained cybersecurity personnel is 0%, suggesting that demand for workers in this profession is acute and not matched by supply. [ZDNet]
Playlist of the Week: BoJack Horseman [Apple Music | Spotify]
TV Show of the Week: BoJack Horseman [Netflix original]