Google recently announced “Duplex”, an artificially intelligent system that can realistically mimic a human during phone conversations. Google has presented the technology primarily as a personal assistant, with demonstrations of it reserving a table at a restaurant and booking a hairdressing appointment. All of this can be achieved without the person on the other end knowing that they have just spoken to a robot.
The most captivating facet of this technology is how closely it resembles human patterns of speech. This includes fillers like “hmm” and “um”, which Google describes as synthetic sounds intended to signal, in a natural and human-like way, that the system is processing information.
The developers have also tuned response latency, recognizing that some responses, such as general greetings, should come quickly, while others actually benefit from a longer pause. As they put it:
Interestingly, in some situations, we found it was actually helpful to introduce more latency to make the conversation feel more natural — for example, when replying to a really complex sentence.
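The idea of matching the pause to the kind of reply can be illustrated with a small sketch. This is not Google's implementation: the function name `response_delay_seconds`, the greeting list and every number in it are hypothetical, chosen only to show a quick reply to a greeting and a longer pause after a longer sentence.

```python
def response_delay_seconds(utterance: str) -> float:
    """Return a pause (in seconds) to wait before replying.

    All thresholds here are illustrative assumptions, not published values.
    """
    words = utterance.split()
    greetings = {"hello", "hi", "hey"}  # hypothetical greeting list

    # Short greetings get an almost immediate reply.
    if words and len(words) <= 3:
        first = words[0].lower().strip(",.!?")
        if first in greetings:
            return 0.2

    # Otherwise the pause grows with sentence length, capped at 1.5 seconds.
    return min(0.4 + 0.05 * len(words), 1.5)


if __name__ == "__main__":
    print(response_delay_seconds("Hello?"))
    print(response_delay_seconds(
        "We only take bookings for parties of five or more on weekends"
    ))
```

Word count is only a crude stand-in for complexity here; a real system would presumably rely on a richer signal.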
The overall result is a hyper-realistic-sounding piece of technology that can not only find the right words but also deliver them in a natural, human-like way.
Replicating human speech is simpler on paper than in practice, mostly because general conversation does not always follow a specific set of rules. Instead, humans tend to rely on inference, colloquialisms and even silence to express particular points.
Humans also tend to use single words to express a number of different things. Even something as simple as the word “yes” can carry several meanings depending on the tone and context.
This poses an interesting challenge from a machine learning standpoint, since it requires making sense of large streams of seemingly similar data that have to be understood in context and then applied.
Google Duplex’s conversations sound natural thanks to advances in understanding, interacting, timing, and speaking.
Duplex uses a recurrent neural network (RNN) trained on phone conversation data, so that it can learn the patterns of speech and dialogue involved in specific tasks, such as reserving a table at a restaurant or booking an appointment. This also means that the system has a single, narrow aim on each call and, as it stands, cannot engage in ordinary open-ended conversation.
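To make the term concrete, here is a minimal sketch of the recurrence at the heart of any simple RNN, written in plain Python. It is not Duplex's architecture: the weights are random and untrained, and the names `make_rnn` and `run_rnn` and the tiny vocabulary are assumptions made purely for illustration. What it shows is that the hidden state carries earlier words forward, so the same final word ends in a different state depending on what preceded it.

```python
import math
import random


def make_rnn(vocab, hidden_size=4, seed=0):
    """Create random (untrained) weights for a toy Elman-style RNN."""
    rng = random.Random(seed)
    # One input vector per word, plus a hidden-to-hidden weight matrix.
    w_in = {w: [rng.uniform(-1, 1) for _ in range(hidden_size)] for w in vocab}
    w_rec = [[rng.uniform(-1, 1) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    return w_in, w_rec


def run_rnn(words, w_in, w_rec):
    """Feed words one at a time and return the final hidden state."""
    size = len(w_rec)
    hidden = [0.0] * size
    for word in words:
        # New state = tanh(input contribution + recurrent contribution).
        hidden = [
            math.tanh(w_in[word][i]
                      + sum(w_rec[i][j] * hidden[j] for j in range(size)))
            for i in range(size)
        ]
    return hidden


if __name__ == "__main__":
    w_in, w_rec = make_rnn(["book", "cancel", "yes"])
    # Same last word, different context, different final state.
    print(run_rnn(["book", "yes"], w_in, w_rec))
    print(run_rnn(["cancel", "yes"], w_in, w_rec))
```

In a trained network these weights would be learned from conversation data rather than drawn at random, and the final state would feed a layer that predicts the next response.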
The technology also has a self-monitoring capability: the system assesses in real time whether it can handle the task it is faced with, and hands the call to a human operator if it cannot. However, Google claims that the majority of interactions are fully autonomous and have not required this additional step thus far.
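One common way to build this kind of self-monitoring is a confidence threshold, sketched below. Google has not described its mechanism in this detail, so the `TurnEstimate` type, the `route_turn` function, the route names and the 0.75 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TurnEstimate:
    """The system's guess about the current turn (hypothetical structure)."""
    intent: str
    confidence: float  # between 0.0 and 1.0


def route_turn(estimate: TurnEstimate, threshold: float = 0.75) -> str:
    """Continue alone when confident enough, otherwise hand off.

    The threshold value is an illustrative assumption.
    """
    if estimate.confidence >= threshold:
        return "continue_autonomously"
    return "hand_off_to_operator"


if __name__ == "__main__":
    print(route_turn(TurnEstimate("book_table", 0.93)))
    print(route_turn(TurnEstimate("unknown_request", 0.41)))
```

Raising the threshold sends more calls to operators, and lowering it makes the system more autonomous at the cost of more mistakes.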
Google is also using real-time supervised training to educate the system, in which experienced operators can adjust the machine’s behavior during a call if necessary, as a means of training it until it is able to handle all calls autonomously.
Humans are already accustomed to talking to robots at call centres, and Google claims that this technology will ultimately improve that experience and remove some of the frustrations associated with it.
However, it remains to be seen how people will feel about interacting with these machines. In some ways, talking to an obvious robot at a call centre can be reassuring, simply because we are at least aware that this is the case. This poses an additional challenge for Google in deciding how to roll out the technology. Will recipients of phone calls be told that they are speaking to a robot? Whether or not this proves to be an important factor to people is certainly a fascinating question.
From a business perspective, there are certainly a lot of exciting potential applications for this technology. It could mean the introduction of round-the-clock customer service and a larger number of phone calls being taken. Businesses may also be more willing to let robots handle the bulk of their incoming calls if they know that customers are less likely to be put off by the experience.
This technology has sparked a lot of debate, as good innovation generally does. Whilst its results are still to be seen, it certainly represents an intriguing move in the technological sphere and displays one of the many amazing things that can be achieved using machine learning.