Artificial intelligence. Deep learning. Neural networks. Think you can ignore this stuff? Think again. Because machine learning is far too important to be left to the ComScis.
Nearly six decades have passed since Arthur Samuel, a researcher at IBM, began programming computers to play draughts. He went on to create a program that could challenge a respectable amateur. Last year, DeepMind, founded by Demis Hassabis (Queens’ 1994), unveiled AlphaZero – a computer program that had ‘taught’ itself to play chess from scratch in just four hours, with no initial input of human knowledge. AlphaZero went on to beat the recognised leading chess program Stockfish – by sacrificing a bishop in an apparently reckless move. The machine was learning to do the unexpected, by itself.
Machine learning would be impossible without the astonishing increase in computational power (an estimated 1 trillionfold increase in performance over 60 years) which has enabled computers to examine big – often massive – datasets. This ‘big data’, whether of heart scans or Amazon purchasing records, provides the source material from which the program learns.
Professor of Computational Biology Pietro Lio’ is using big data and machine learning to provide clinicians with better information by combining data of different types – medical imagery (MRI scans, x-rays), electronic health records (blood and glucose tests) and audiovisual data – and from different sources, such as (for their current work with cancer patients) radiology, epidemiology and inflammology. “The tools biologists use to put together theory and application in medicine need to be made easier for physicians to access, and machine learning enables the transfer of learning across disciplines. By creating communities and bridging data gaps, we can receive new data to build back into our theories.”
However, Lio’ warns that data alone is not enough. As he says, “there is a widespread belief that you can do hypothesis-free research because, having enough data, machine learning will provide the answers in a semiautomatic way”. For a system to be intelligent, it must be able to learn from experience: to process information and perform tasks by considering examples without having been given any task-specific rules. This is where virtual neural networks come in, taking the principles of maths as their blueprint.
At the base of the architecture of a machine learning system are the ‘neurons’: mathematical functions containing ‘weights’ and ‘biases’. Weights are the numbers the neural network learns in order to generalise a problem. Biases are the numbers the network concludes it should add after multiplying the weights with the data. When a neuron receives data, it multiplies each input by its weight, sums the results and adds the bias. Usually, these values are initially wrong, but the network then trains itself to learn the optimal weights and biases, gaining complexity.
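The weights and biases described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any system mentioned in this article: the function names, the toy data and the choice of gradient descent on squared error as the training rule are all assumptions made for the example.

```python
def neuron(inputs, weights, bias):
    """Multiply each input by its weight, sum the results and add the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def train(examples, steps=2000, learning_rate=0.05):
    """Fit one weight and one bias by gradient descent on squared error.

    The training rule is an assumption for illustration; the article does
    not say which rule any particular system uses.
    """
    weight, bias = 0.0, 0.0  # deliberately wrong starting values
    for _ in range(steps):
        for x, target in examples:
            error = neuron([x], [weight], bias) - target
            # Nudge each number in the direction that shrinks the error.
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


# Hypothetical toy data generated by the rule y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train(examples)
```

After training, `weight` and `bias` should sit very close to 2 and 1, the values behind the toy data, even though the program was never told the rule.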
However, while the earliest systems had just two layers – input and output – most current systems have more layers and so the network is referred to as ‘deep’. And as deep learning effectively takes place in a multi-layered black box, where algorithms evolve and self-learn, scientists often do not know how the system arrives at its results. So, while virtual neural networks have been around for a long time, combine them with deep learning and you get a game-changer that can still baffle scientists.
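To make the idea of layers concrete, here is a minimal sketch of a network with one extra ‘hidden’ layer between input and output. The weights are chosen by hand rather than learned, and the function names and the ReLU activation are assumptions for illustration; the article does not specify them. The network computes ‘exclusive or’, a function that a network with only an input and an output layer cannot represent.

```python
def relu(value):
    """A common activation function: pass positives through, clip negatives to zero."""
    return max(0.0, value)


def identity(value):
    return value


def layer(inputs, weights, biases, activation):
    """One layer: every neuron takes a weighted sum of the inputs plus its bias."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def deep_network(inputs, layers):
    """Feed the data forward through each layer in turn."""
    values = inputs
    for weights, biases, activation in layers:
        values = layer(values, weights, biases, activation)
    return values


# Hand-picked (not learned) weights, purely for illustration.
LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),  # hidden layer, two neurons
    ([[1.0, -2.0]], [0.0], identity),               # output layer, one neuron
]


def predict(inputs):
    return deep_network(inputs, LAYERS)[0]
```

Calling `predict([1.0, 0.0])` gives 1.0 while `predict([1.0, 1.0])` gives 0.0. Real deep networks stack many more layers and learn their weights from data, which is part of why their reasoning is hard to inspect.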
In fact, deep learning systems can confidently give wrong answers while providing limited insight into why they have come to particular decisions. As Dr Richard Turner, Reader in Machine Learning in the Department of Engineering, explains: “Machine learning methods use hierarchical [knowledge] representations and in conventional machine learning, most approaches are super-confident. Traditionally, the data is fed-forward. The issue today is to develop algorithms that represent uncertainty, so that the machine will not only recognise and flag up when it makes a mistake, but actually learn from it”.
“Effectively, the system will train itself from the data it receives, including from other data sources, reducing the probability of making a mistake in the future.”
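One simple way to picture a machine that represents uncertainty is to ask several models the same question and measure how much they disagree. The sketch below is an illustration of that general idea only, not Turner’s method: the three models, the function names and the threshold are all hypothetical.

```python
from statistics import mean, pstdev


def predict_with_uncertainty(models, x):
    """Return the average prediction and the spread across the models."""
    predictions = [model(x) for model in models]
    return mean(predictions), pstdev(predictions)


def flag_if_unsure(models, x, threshold=0.5):
    """Flag the answer when the models disagree by more than the threshold."""
    estimate, spread = predict_with_uncertainty(models, x)
    return estimate, spread, spread > threshold


# Three hypothetical models that agree at x = 1 and diverge further away.
models = [
    lambda x: 2.0 * x,
    lambda x: x + 1.0,
    lambda x: x * x + 1.0,
]

confident = flag_if_unsure(models, 1.0)  # all three models predict 2.0
unsure = flag_if_unsure(models, 3.0)     # predictions are 6.0, 4.0 and 10.0
```

At `x = 1.0` the models agree, so nothing is flagged. At `x = 3.0` they disagree widely and the answer is flagged as uncertain rather than reported with false confidence.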
This is important in Turner’s work on climate science, where he and his team are working with well-calibrated uncertainty. He says: “At the moment, we definitely can’t predict the weather but, over time, the system can test whether, for example, its prediction of how many days will be over 27°C was correct and what it should then infer from that information.”
Turner points out that any system operating in the real world needs to learn to adapt rather than remain static. Take the classification of self-driving cars; every year there will be new models, with new performance parameters. When new tasks are added, deep neural networks are prone to what is known as ‘catastrophically forgetting’ previous tasks. So, Turner and his colleagues are working to build networks that are capable of assimilating new information incrementally. For the system to be effective, it must undergo continuous layered learning – as circumstances change, the algorithms constantly update – just as children learn by experience.
And then there is the problem of language. “For most applications, artificial intelligence has to be able to access human knowledge, and of the whole range of cognitive abilities, language is probably the most important,” says Ann Copestake, Professor of Computational Linguistics and Head of the Computer Science and Technology Department. “If AI is intended to mimic human intelligence, then the ability for machines to understand and communicate language is vital. Yet, for many years, Natural Language Processing (NLP) and AI were largely separate subjects. Only now are we finding commonality.”
Copestake and her team are using techniques such as supervised learning to ‘teach’ the system how to answer visual questions. Until recently, “the only way to do this was to cheat – the system would identify a visual image because the experimenter had already placed a symbol that allows the system to ‘see’ that image”.
By scraping multiple images from the web to show to the system, researchers in companies and universities have been building a shared visual library. Deep learning enables a computer system to work out how to identify a cat – without any human input about cat features – after ‘seeing’ millions of random images. But then, as Copestake points out, “you do get a lot of images of cats on the internet, so perhaps it is not surprising that this was one of the first images a system learned to identify. Same goes for guitars, which are the most common musical instrument. In fact, one of the bizarre findings is that if you show the system the descriptive text, and not the image, they do quite well at answering questions about the image purely through frequent associations”.
And that matters because human language is grounded in the world – it has some physical presence. “Whatever system you are building, you first have to model that world and then match it with language,” explains Copestake. “We have gone back to showing simple images such as coloured squares and triangles and we then ask the system quite complicated questions.”
Professor of Mobile Systems, Cecilia Mascolo, uses machine learning techniques to make sense of the data she and her team gather from mobile wearable systems, mostly phones at the moment, but poised to encompass everything from necklaces and headbands to implants into the body. “The sensors continuously collect data that is both temporal and spatial, allowing us to analyse stress and emotional signals, as well as location,” she says. “The data is gathered in the outside world, so it can be noisy. Through machine learning, we can strip out extraneous information and ensure the mobile device only sends out the data that is important for our scientific work in applications for measuring health. Given the unprecedented scalable effect, we can reach populations for continuous diagnostics and illness progression monitoring: for example, we are working alongside neurologists on the development of techniques to diagnose Alzheimer’s disease.”
Traditionally, machine learning requires huge machines, but Mascolo’s team at the Centre for Mobile, Wearable Systems and Augmented Intelligence is seeking to use it on devices with limited memory and battery. As Mascolo explains: “It is, of course, much easier to use machine learning in the cloud, but our work is on locally gathered data. If we can get this to work, it will help break down the barriers against acceptance of mobile health measuring because, by taking a local approach, individuals will feel more in control of their data.”
Indeed, ‘feeling in control of our data’ will be key to the future of machine learning. Today, the artificial intelligence we encounter is ‘narrow’ – designed to undertake a defined (narrow) task, whether that is voice and facial recognition or driving a car. However, the long-term aim for some scientists is super-intelligence: ‘general’ or ‘strong’ artificial intelligence – designed to perform all tasks as well as or better than humans.
But Richard Turner, for one, prefers to stay away from using the word ‘intelligence’. “It is very loaded. People talk about artificial intelligence for hype, particularly when they are raising money, but there is still nothing out there that can mimic true human intelligence,” he says. “Talk of singularity – the tipping point when artificial intelligence overtakes human thinking – is a distraction in my opinion. The field is split, but it is not clear to me that a general intelligence system is a sensible near-term scientific goal.”
For Ann Copestake, responsible for educating the scientists who will decide what the future looks like, the issue is a live one. “We now teach machine learning to first-year undergraduates, rather than starting at PhD and Master’s level, because it is important to begin early with this fundamentally new way of programming,” she says. “But it remains an experimental, probability-based science, which must be approached in a controlled way, with proper methodology to produce effective results. After all, machine learning can create as well as detect fake news, and we need human intelligence to control it.”
A new DeepMind Chair of Machine Learning to be established at Cambridge will build on existing strengths in computer science and engineering, and be a focal point for the wide range of AI-related research which currently takes place across the University. The post is the result of a gift from DeepMind, the British AI company founded by Demis Hassabis (Queens’ 1994). The first DeepMind Chair is expected to take up their position in October 2019.