Humans lose at everything else, so why is language different?
This past year we have seen two startling achievements from the AI community. The first was Google’s AlphaGo’s match victory over Lee Sedol at Go. The second was the thorough rout of professional poker players by an AI built on similar technology.
These two achievements are significant because they demonstrate two capabilities previously thought to be far beyond the reach of machines: pattern matching and dealing with uncertainty.
On the other hand, two domains have remained closed to AI. The first is muscle memory and acrobatics, though recent advances in robotics suggest that barrier is already falling. The second is natural language. While natural language remains the holy grail of AI goal-setting, it has time and again proven the most difficult problem for our little machine friends.
Deep learning may change some of this. Recently, machines have become competent at both transcribing speech and voicing text. The transfer of meaning, however, is still limited to translation.
So, why is meaningful conversation such a hard problem? The answer may lie in the fact that language is intimately tied to consciousness. Though we still lack a robust theory of consciousness, we have observed that holding a conversation lights up many parts of the human brain at once. In other words, speech may be one of the highest faculties of our evolution, something that separates us from other animals, and from machines for that matter.
So what do we have right now? Aside from the deep-learning approaches used in the technologies mentioned above, we have also tried probabilistic grammars and LSTMs to simulate human speech and to translate text to text. Both of these technologies are fundamentally limited in their ability to extract meaning from speech. So we are still at a point where every technique we can think of fares poorly in conversation.
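To make the limitation concrete, here is a minimal sketch of the probabilistic-grammar idea: a toy grammar that samples grammatically plausible sentences at random. The grammar, vocabulary, and function names are invented for illustration; they do not come from any system mentioned above.

```python
import random

# Toy probabilistic grammar: each nonterminal maps to a list of possible
# expansions, sampled uniformly. All rules here are illustrative only.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N":  [["machine"], ["human"], ["language"]],
    "V":  [["parses"], ["learns"], ["speaks"]],
}

def generate(symbol="S"):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word, emit as-is
    expansion = random.choice(GRAMMAR[symbol])
    words = []
    for sym in expansion:
        words.extend(generate(sym))
    return words

print(" ".join(generate()))  # e.g. "the machine speaks a language"
```

The output is always well-formed by construction, which is exactly the point: the sampler manipulates symbols without ever representing what the sentence means.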
Further research will surely continue along the deep-learning route. Deep learning, and the broader revival of neural networks, has been one of the most exciting developments of our time. After a long time on the shelf, the technology is creating new ways of thinking about what it means for a machine to interact with speech and other domains of information.
In the next 12 months, I expect we will see much better speech recognition and text-to-speech across a larger variety of human languages. I also expect tooling to be built for interacting with computers more meaningfully. For now, the best we can do is write a bunch of scripted dialogue of the kind that would otherwise be fed to a robo-caller.
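What does such scripted dialogue look like in practice? A minimal sketch, assuming a keyword-rule design: hand-written rules mapped to canned replies, with a fallback when nothing matches. The intents and responses are made up for illustration.

```python
import re

# Hand-scripted dialogue rules: (keywords, canned reply), checked in order.
# All intents and replies are invented examples, not a real product's script.
RULES = [
    (("hello", "hi"), "Hello! How can I help you today?"),
    (("hours", "open"), "We are open from 9am to 5pm, Monday to Friday."),
    (("bye", "goodbye"), "Goodbye, have a nice day!"),
]

FALLBACK = "Sorry, I didn't understand that. Could you rephrase?"

def respond(utterance):
    """Return the first canned reply whose keywords appear in the input."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    for keywords, reply in RULES:
        if any(word in words for word in keywords):
            return reply
    return FALLBACK

print(respond("Hi there"))            # Hello! How can I help you today?
print(respond("When are you open?"))  # We are open from 9am to 5pm, ...
```

Everything the bot can say is enumerated up front, which is why such systems feel brittle the moment a caller strays from the script.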
It still seems unfair to run a full Turing test against these machines, but I hope there is much to be found along these paths. Did you hear the one about deep learning detecting cancer? …
This post was originally published on medium.com