Siri doesn’t like you. Alexa doesn’t want to be your friend. And Google Assistant? Well, let’s just say that Google Assistant wants to spend as little time as possible answering your questions. And that’s okay.
That’s because none of these three is designed to be an A.I. friend, despite the platitudes they invariably spout if you ask them whether or not they enjoy spending time with you. They are “do engines,” virtual assistants which aim to resolve your queries in as little time as possible, hence our saying that Google Assistant doesn’t want to waste more time than it has to telling you what you need to know. These assistants can answer queries and, increasingly, anticipate our needs. The one thing they don’t do is care.
Microsoft, which also makes the A.I. assistant Cortana, has a different idea. Xiaoice (pronounced “Shao-ice,” meaning “Little Bing”) is a social chatbot, with a personality modeled on that of a teenage girl, and a dauntingly precocious skill set. In addition to many of the usual skills you might expect from an A.I. assistant, she can tell jokes, write original poetry, compose and sing songs, read stories, play games, and more.
Remember when Google showed off its Google Duplex technology, capable of making real spoken-word phone calls? Xiaoice can do something very similar. Think the recent A.I. anchor on Chinese television is the first time such a thing has happened? Not quite. Xiaoice has already been a weather reader on Dragon TV, one of China’s biggest TV stations in Shanghai, for several years. Her omnipresence on various platforms, from television to social media to Huawei smartphones, has made her a star in the East Asian market, and possibly Microsoft’s most famous current employee this side of CEO Satya Nadella.
All of this, however, is window dressing for Xiaoice’s real unique selling point: a burning desire to be your (yes, your) friend. Just like Samantha, the artificial intelligence voiced by Scarlett Johansson in the 2013 movie Her, Microsoft’s Xiaoice is intended to be as much a companion as it is an assistant, utilizing some pretty darn impressive “empathic computing” abilities that have made it a surprising hit around the world. In the process, it may just offer us a glimpse at the future of A.I. assistants.
The dream of Eliza
The notion of a chatbot, a computer program designed to simulate a conversation with a human user, is not a new idea. Alan Turing, the godfather of modern artificial intelligence, hypothesized about such a thing as early as the 1950s. (Because of Turing’s pioneering work in this area, we refer to the ultimate, human-fooling benchmark of such chatbots as the Turing Test.)
The first significant chatbot was built at Massachusetts Institute of Technology in the mid-1960s by a computer scientist named Joseph Weizenbaum. Weizenbaum’s chatbot was named Eliza, after the character of Eliza Doolittle in George Bernard Shaw’s Pygmalion, who learns to speak progressively better through education. Eliza was intended to simulate a Rogerian psychotherapist by using clever scripting tricks to mirror users’ own words back at them. For example, a user saying that they were depressed much of the time would lead Eliza to ask why they were so depressed. To give the illusion of deep perceptiveness, Eliza would also return to topics brought up earlier in the conversation.
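For the curious, the two tricks described above (mirroring a user’s words back as a question, and circling back to an earlier topic) can be sketched in a few lines of Python. This is a loose illustration, not Weizenbaum’s original script: the class name MiniEliza, the two matching rules, and the canned replies are all assumptions made up for this example.

```python
import re

# Hypothetical pronoun swaps so the user's words can be mirrored back.
REFLECTIONS = {"i": "you", "am": "are", "my": "your", "me": "you"}

# Hypothetical rules: a pattern to look for and a question template to answer with.
RULES = [
    (re.compile(r"\bi am (.*)", re.IGNORECASE), "Why are you {0}?"),
    (re.compile(r"\bi feel (.*)", re.IGNORECASE), "Why do you feel {0}?"),
]


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


class MiniEliza:
    """A toy chatbot that mirrors statements and remembers earlier topics."""

    def __init__(self):
        self.memory = []  # topics the user has raised so far

    def respond(self, text):
        cleaned = text.strip().rstrip(".!?")
        for pattern, template in RULES:
            match = pattern.search(cleaned)
            if match:
                topic = reflect(match.group(1))
                self.memory.append(topic)  # save it to bring up again later
                return template.format(topic)
        if self.memory:
            # Nothing matched, so return to the oldest remembered topic.
            earlier = self.memory.pop(0)
            return f"Earlier you said something about '{earlier}'. Tell me more."
        return "Please go on."


bot = MiniEliza()
print(bot.respond("I am depressed much of the time."))
print(bot.respond("The weather is nice."))
```

The first reply turns the user’s own statement into a question, and the second falls back on the remembered topic because nothing in the new message matches a rule. There is no understanding involved at any point, which was precisely Weizenbaum’s point.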
Ironically, Weizenbaum created Eliza to highlight the level of superficiality in communication between humans and machines. Instead, he was somewhat perturbed to see that Eliza’s users enjoyed engaging in conversations with the chatbot, which frequently meant divulging personal information.
Xiaoice represents the dream of Eliza, writ large. Since launching in China in May 2014, Xiaoice has had more than 30 billion conversations with 660 million human users around the world. Although there are multiple ways to interact with “her,” these typically take place by text message. This divergence from the voice-first approach of other A.I. assistants hints at the different use case. It necessitates a longer, more drawn-out form of communication than the simple “OK Google, will it rain today?” you might bark as you decide whether or not to wear a coat to work.
Ying Wang, the Microsoft director who oversees Xiaoice, said that the project started out as an attempt to figure out how to break into the search market in China. “We realized that everyone was spending a ton of time in IMs, with services like WeChat,” she told Digital Trends. “Our original motivation was simple: exploring how people in chat begin searching. We wanted to build an entry point, but we realized that when people are chatting they don’t want to stop doing that in order to begin a search. Our logic was that, if we can continue a conversation with a human, we’re going to find opportunities to find their search intent. We can draw that out to satisfy them.”
The idea of wrapping up users in extended conversations with a chatbot sounds counterintuitive on the surface. The history of computer interaction is based on the premise that using technology is a painful thing, and that whatever can trim even a millisecond off the experience is worthwhile. Like checking into a hotel or having our prostate examined, it’s an experience few of us want to extend any longer than absolutely necessary.
But people seem to be surprisingly receptive to Microsoft’s approach. The typical conversation with Xiaoice lasts 23 turns: around 10 times as many as the industry average. The result, or so Microsoft hopes, is an A.I. which bridges the gap between the way we speak to our Amazon Echo and the way we speak with our friends.
“Interaction among humans is session-based, not command-based,” Wang continued. “Human-to-human conversation happens like that, so why not take a similar approach to A.I.-to-human engagement?”
The goal of a social chatbot
In a 2018 paper, Microsoft researchers wrote that: “The primary goal of a social chatbot is not necessarily to solve all the questions the users might have, but rather, to be a virtual companion to users. By establishing an emotional connection with users, social chatbots can better understand them and therefore help them over a long period of time.”
This more social chatter means that Xiaoice can delve into areas that might seem creepier were they voiced by another A.I. assistant. For instance, it will check up on whether you’ve reached home after a night out, find out how you’re faring after a breakup, or keep tabs on how you’re doing after you lose a job.
Like Eliza, it will return to these topics over time and use semantic analysis to gauge how users are feeling. It can also infer from images and then make passably human comments. If a user posts a photo of themselves with a swollen foot, Xiaoice will ask if it hurts. If they post a funny picture of their pet, Xiaoice could make a joke by observing a distinctive visual element of the photo.
“Xiaoice is there, 24/7, as a good friend with the power to listen,” Wang said. “That’s a powerful promise for many users. We’ve seen lots of engagement in the Asia market specifically, but all over the world [there’s been a strong response to it.] Users feel safe, heard, and have a connection.”
Microsoft’s interest in this area is not wholly benevolent, of course. There’s also a steely business logic behind it: Namely that making an A.I. that becomes friends with you drives engagement. With tech companies striving to find ways to keep users on their platforms for as long as possible, this is one heck of a selling point. It also opens up new ways to disseminate content to users.
When Apple presented its users with a free copy of U2’s “Songs of Innocence” album in 2014, the unasked-for move immediately prompted a backlash. Would we respond in the same way if a friend gifted us an album we hadn’t asked for, recommended a new restaurant, or sent us vouchers for a new subscription service? Perhaps not, which is exactly the ground services like Xiaoice have the potential to explore.
The challenges of building an A.I. BFF
Microsoft’s record in this area shows the difficulty of achieving this goal, however. In 1997, the company debuted Clippy, a name which likely caused an involuntary eye twitch in anyone old enough to remember using it. Pitched as an “intelligent” animated assistant to guide you through the experience of using Microsoft Office, Clippy was a cartoon paperclip which popped up on screen to offer guidance when it detected that you were trying to carry out a task like writing a letter or composing a “to do” list.
The idea of Clippy as a sort of friendly virtual guide was a good one, but its implementation was fairly disastrous. Its illustrator, Seattle-based Kevan J. Atteberry, still notes on his website that he is responsible for creating “probably one of the most annoying characters in history!” A big problem with Clippy was its lack of recall for previous interactions with the user, making it the paperclip avatar version of Guy Pearce’s amnesiac protagonist in Chris Nolan’s Memento. If Microsoft was going to make a truly useful smart assistant, it would need to take information from its users and use this to shape the suggestions that it made.
Unfortunately, the next attempt at doing something similar for the U.S. market veered too heavily into that terrain. In March 2016, following the initial success of Xiaoice in China, Microsoft attempted to introduce an American version of the technology. Called Tay, this chatbot resided on Twitter, allowing users to communicate with it by sending messages to @tayandyou. The idea was that Tay would learn from interactions with its users, taking conversational cues from the information it picked up from daily conversations. As Microsoft phrased it at the time, “The more you talk[,] the smarter Tay gets.”
Rapidly, online trolls began bombarding Tay with offensive messages designed to sully its blank slate of a brain. Within its first 24 hours of going live, Tay began tweeting pro-Nazi messages denying the Holocaust. When it finally suggested that “HITLER DID NOTHING WRONG!”, Microsoft pulled the plug, and the company issued a formal apology. A spokesperson for the company said that Tay had been taken offline and its creators were busy making adjustments. “[Tay] is as much a social and cultural experiment, as it is technical,” the statement read.
A social and cultural experiment
This idea of a “social and cultural experiment” is the best description of Xiaoice as it currently stands. Microsoft is moving into uncharted territory, and that’s exciting, but it also carries risk. Recently Xiaoice launched its sixth-generation product, further honing the technology. To date, Microsoft has rolled out the product in five markets: China, Japan, India, Indonesia, and the United States. In each place, the product is rebranded to give it a more local touch.
In the U.S., Xiaoice is called Zo. She is poised to receive some of the sixth-gen features (including “creation capabilities”) in the immediate future. Whether allowing users to upload their photos and have Zo write a poem about them will prove to be a game-changer for the U.S. audience remains to be seen. Nonetheless, Microsoft deserves credit for taking a different path in a world filled with similar A.I. assistants all promising to do the same tasks. A.I. assistants can already turn your lights on and order you takeout; is it now time that they climbed higher up Maslow’s hierarchy of needs pyramid by tackling emotional affection and social belonging, too?
Heather Child, an author whose novel Everything About You explores humanlike A.I. assistants, sees potential in the idea. “This might not be the fastest or most efficient search technology, but if it’s the most human then it’ll catch on,” she told us. “People lock on to people, and although digital friendliness may have emerged from the need to search, that will soon be eclipsed by all the other human needs an A.I. like this could potentially fulfill — such as offering support, empathy, validation and companionship. Communicating by text message removes any obvious difference between interacting with Xiaoice and with a human friend.”
Microsoft hopes that you agree. “The real key takeaway is that we’ve focused on emotional intelligence,” Ying Wang said. “We call this an empathetic computing framework, [designed to] have conversations with humans naturally, which can build a social and emotional connection. It’s a good friend. As a result, they can better participate and help out in human society.”