The Most Important Technology You've Never Heard Of

Technology is more important to our lives than ever before, but how many of us understand the technology behind the screen? Here, Jeremy Reffin introduces you to the most important tech you’ve never heard of: natural language processing.

In his 1950 paper “Computing Machinery and Intelligence” Alan Turing proposed a test – he called it an “imitation game” – to determine if a computer could be considered intelligent. A human experimenter interviews a computer and another human via a text interface: the computer succeeds if the interviewer cannot reliably distinguish the two.[1] For Turing, intelligence meant the ability to communicate through language: both comprehending another’s ideas and transmitting your own. If you could do this, then you were intelligent, whether human or machine. The famous Turing test was born.

Turing predicted computers would pass this test by the turn of the century. Popular culture – from The Imitation Game and Ex Machina to The Big Bang Theory – is fascinated by Turing’s vision, and many envisage that an era of clever (or sometimes deviant) robots wandering the streets is just around the corner. Turing’s goal remains, but the reality has been less exciting (or scary) than Hollywood movies and less successful than Turing’s predictions. No computer has yet passed Turing’s test, but interest in creating machines that can understand and use language is intense. The aim hasn’t been to create legions of killer robots but to make computers more useful to us, and the technology being used to do it is called Natural Language Processing, or NLP.

NLP isn’t a new field. (In fact, many consider NLP to have started with that famous Turing paper.) It is a particular application of an artificial intelligence perspective to the interpretation of ‘natural’ language: words, phrases and sentences that convey meaning from one person to another. Researchers try to understand how language works to convey meaning, approaching the problem mechanistically. If we can distil that understanding into a mechanistic description, then we can program a machine to imitate the process, feed it a large volume of text to allow it to learn the underlying structural patterns – and the Turing test is back on the cards.

Interest in the field leapt from the 2000s onward when, thanks to the growth of social and other digital media, marketing and advertising companies realised they were collecting vast volumes of potentially valuable text created and shared by millions of users. NLP was suddenly relevant, indeed essential, for analysing this data – searching it efficiently, identifying key pieces of information, detecting trends in language use and analysing people’s attitudinal responses.

Over the last two decades, devices that exploit NLP technology have been introduced and are working their way into our lives. Voice recognition systems are installed on smartphones and are sufficiently robust to be of some use for organising the minutiae of our lives. Search engines are starting to identify the things that we are seeking rather than just looking for text that matches our search terms. Rough-and-ready automatic translation from one language to another helps us get by. But these changes are going to accelerate dramatically over the next 20 years, thanks to the powerful combination of four separate trends:

First, an explosion in demand. As our working and social lives have moved online, the volume of digitally stored natural language has reached extraordinary levels. Its growth shows no sign of abating. Humans are themselves very slow at processing natural language, so there is strong demand for systems that can automatically index, annotate, classify or otherwise process it. Google, Facebook, Amazon and the rest are now all leading experts in NLP.

Second, computing power is accelerating. For the past 50 years, computing power and memory capacity have doubled approximately every 2 years, driven by both component miniaturisation and design innovation. Having learnt to program in the 1980s, I still carry a mental model of 64KB (roughly 64,000 bytes of information) as a reasonably generous RAM space for a computer program and its running storage – a figure that is 4 million times too low for the box on which we currently run our NLP research experiments. These changes make it feasible to implement ideas for processing language that were previously impossible.
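To put rough numbers on those figures, the short Python sketch below works through the arithmetic. It assumes an idealised, perfectly regular doubling every two years and reads 64KB as 64 × 1,024 bytes; the function and variable names are illustrative only and do not come from any real system.

```python
import math


def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Multiplicative growth after `years`, assuming one doubling per period."""
    return 2.0 ** (years / doubling_period_years)


# Fifty years of doubling every two years is 25 doublings.
factor_50_years = growth_factor(50)

# 64KB taken as 64 * 1,024 bytes (an assumption; the text rounds to 64,000).
old_ram_bytes = 64 * 1024

# "4 million times too low" implies a machine with this much memory.
implied_ram_bytes = old_ram_bytes * 4_000_000
implied_ram_gib = implied_ram_bytes / 1024 ** 3

# Years of steady doubling needed to grow by a factor of 4 million.
years_for_4_million = 2.0 * math.log2(4_000_000)

print(f"Growth over 50 years: about {factor_50_years:,.0f}x")
print(f"Implied RAM today: about {implied_ram_gib:.0f} GiB")
print(f"Years of doubling for a 4,000,000x increase: about {years_for_4_million:.0f}")
```

On these assumptions, a factor of 4 million corresponds to a little over four decades of doubling, which is broadly in line with the gap between the 1980s and a present-day research machine.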

Third, we are living in a time when exciting new ideas in NLP are opening up amazing new possibilities. Much of the current excitement is centred on the concept of distributional semantics (the idea that the meaning or other characteristics of words and phrases can be captured purely by attending to the words that surround them) as implemented by multi-layer neural networks, a family of techniques recently popularised as “deep learning”. Deep learning has applications well beyond NLP, and it gained a whole new level of attention in the business world in 2014 after Google bought the product-less London-based start-up DeepMind for £400 million. (More recently, deep learning has entered the popular imagination with the crowning of DeepMind’s AlphaGo program as the world’s best player of the fiendishly complex board game Go.)
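The core idea of distributional semantics can be illustrated in a few lines of Python. The sketch below is a deliberately simple count-based version, not the multi-layer neural networks described above: it represents each word by the counts of the words that appear near it, then compares words by the cosine similarity of those counts. The toy corpus, the window size of two words and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# A tiny made-up corpus; real systems learn from millions of sentences.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

vectors = context_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])

print(f"cat vs dog:  {sim_cat_dog:.2f}")
print(f"cat vs fuel: {sim_cat_fuel:.2f}")
```

Because “cat” and “dog” turn up next to the same neighbours (“the”, “drinks”, “chases”), their vectors come out far more similar than those of “cat” and “fuel”, even though the program has been told nothing about animals. Neural approaches learn denser, more compact representations, but they rest on the same intuition.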

Finally, NLP relies and thrives on data. Machines need to see large volumes of natural language at work in order to learn, and access to that information is key. Every tweet, blog and post we share – and the number grows enormously every year – can be fed into the training process, making NLP models better still. All of this digitised natural language is both driving demand for solutions and providing the raw material that makes them possible.

Forecasting is a fool’s errand, but in many respects the trends are fairly clear. So I think that over the next 15 years we can expect some exciting developments. Voice recognition systems will improve to the point at which the decision whether to speak to a machine or use a keyboard becomes purely a matter of convenience. Automated dialogue systems will expand gradually in usability and scope. Personal assistants (like Siri) will reach a tipping point where they are sufficiently useful that they will be adopted by most people. Automated systems will become sufficiently sophisticated to manage things like routine call-centre enquiries. Search engines will continue to get better and better at guessing what you mean, not just what you type. At some point in the next decade I think we’ll switch from keyboard to voice as the primary way we produce words on the screen. As increasingly powerful techniques sift, sort, categorise and locate pieces of information, businesses, governments and individuals will be able to filter, access and assess natural-language-based information much more easily and accurately. This will help shift business advertising, marketing and customer interaction even further online, as well as cementing our social lives there.

NLP might not have passed the Turing test, but it is changing the way in which we process information, the way in which we communicate with machines, and the way in which machines communicate with us. It is one of the most important ways that humans and machines connect with each other, and in terms of the impact on our day-to-day lives, the significance of NLP is only going to increase.