AI system can predict the structures of life’s molecules with stunning accuracy – helping to solve one of biology’s biggest problems
By Charlotte Dodson, University of Bath and Richard Bayliss, University of Leeds
AlphaFold 3, unveiled to the world on May 9, is the latest version of an algorithm designed to predict the structures of proteins – vital molecules used by all life – from the “instruction code” in their building blocks.
Predicting protein structures and the way they interact with other molecules has been one of the biggest problems in biology. Yet, AI developer Google DeepMind has gone some way to solving it in the last few years. This new version of the AI system features improved function and accuracy over its predecessors.
As with the next release in a video-game franchise, structural biologists – and, more recently, chemists – have been waiting impatiently to see what it can do.
DNA is widely understood as the instruction book for a living organism but, inside our cells, proteins are the molecules that actually carry out most of the work.
It is proteins that enable our cells to sense the world outside, to integrate information from different signals, to make new molecules within the cell, to decide to grow or to stop growing.
It is also proteins that enable the body to distinguish between foreign invaders (bacteria, viruses) and itself. And it is proteins that are the targets of most drugs that you or I take to treat disease.
Protein Lego
Why does protein structure matter? Proteins are large molecules consisting of thousands of atoms in very specific orders. The order of these atoms, and the way that they are arranged in 3D space, is crucial to a protein being able to carry out its biological function.
This same 3D arrangement also determines the way in which a drug molecule binds to its protein target and treats disease.
Imagine having a Lego set in which the bricks are not based on cuboids, but can be any shape. In order to put two bricks together in this set, each brick will need to fit snugly against the other without any gaps. But this isn’t enough – the two bricks will also need to have the right combination of bumps and holes for the bricks to stay in place.
Designing a new drug molecule is a bit like playing with this new Lego set. Someone has built an enormous model already (the protein target found in our cells), and the job of the drug discovery chemist is to use their tool-kit to put a handful of bricks together that will bind to a particular part of the protein and – in biological terms – stop it carrying out its normal function.
So what does AlphaFold do? Using knowledge of exactly which atoms are in a protein, how the protein’s building blocks have changed through evolution in different species, and what other protein structures look like, AlphaFold is very good at predicting that protein’s 3D structure.
AlphaFold 3, the most recent iteration, has expanded capabilities to model nucleic acids, such as pieces of DNA. It can also predict the shapes of proteins that have been modified with chemical groups that may turn the protein on or off, or with sugar molecules. This gives scientists more than just a bigger, more colourful Lego set to play with. It means they can develop more detailed models of how the genetic code is read and corrected, and of cellular control mechanisms.
This is important in understanding disease processes at a molecular level and in developing drugs that target proteins whose biological role is regulating which genes are turned on or off. The new version of AlphaFold also predicts antibodies with greater accuracy than previous versions.
Antibodies are important proteins in biology in their own right, forming a vital part of the immune system. They are also used as biological drugs such as trastuzumab, for breast cancer, and infliximab, for diseases such as inflammatory bowel disease and rheumatoid arthritis.
The latest version of AlphaFold can predict the structure of proteins bound to drug-like small molecules. Drug discovery chemists can already predict the way in which a potential drug binds to its protein target if the 3D structure of the target has been identified through experiments. The downside is that determining a structure experimentally can take months or even years.
Predicting the way in which potential drugs and protein targets bind to each other is used to help decide which potential drugs to synthesise and test in the laboratory. AlphaFold 3 can not only predict drug binding in the absence of an experimentally identified protein structure but, in testing, it outperformed existing software, even when that software was given the known target structure and drug binding site.
These new capabilities make AlphaFold 3 an exciting addition to the repertoire of tools used to discover new therapeutic drugs. More accurate predictions will enable better decisions to be taken about which potential drugs to test in the lab (and which are unlikely to be effective).
Time and money
This saves both time and money. AlphaFold 3 also provides the opportunity to predict how drugs bind to modified forms of the protein target. These forms are biologically relevant, but such predictions are currently difficult – or impossible – to make using existing software. Examples are proteins modified by chemical groups such as phosphates or sugars.
Of course, as with any new potential drug, extensive experimental testing for safety and efficacy – including in human volunteers – is always needed before approval as a licensed medicine.
AlphaFold 3 does have some limitations. Like its predecessors, it is poor at predicting the behaviour of regions of a protein that lack a fixed or ordered structure. It is also poor at predicting multiple conformations of a protein (which may change shape due to drug binding or as part of its normal biology) and cannot predict protein dynamics.
It can also make some slightly embarrassing chemical mistakes, such as putting atoms on top of each other (physically impossible) and replacing some details of a structure with their mirror images (biologically or chemically impossible).
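For readers who like to see such checks spelled out, here is a minimal sketch in Python of how the two kinds of mistake might be flagged in a predicted structure. It is an illustration only, not part of AlphaFold or of any real validation package: the function names `find_clashes` and `handedness` and the 1.0 angstrom cutoff are our own assumptions, and real tools use element-specific distances and proper chemical definitions of chirality.

```python
from itertools import combinations
from math import dist

# Assumed cutoff (angstroms): any two atoms closer than this are flagged.
# Real validation tools use element- and bond-aware thresholds instead.
CLASH_CUTOFF = 1.0


def find_clashes(coords, cutoff=CLASH_CUTOFF):
    """Return index pairs (i, j) of atoms that sit closer together than `cutoff`."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(coords), 2)
        if dist(a, b) < cutoff
    ]


def handedness(centre, n1, n2, n3):
    """Return +1, -1 or 0: the sign of the volume spanned by three neighbours
    around a central atom. Mirroring the structure flips this sign."""
    ax, ay, az = (n1[k] - centre[k] for k in range(3))
    bx, by, bz = (n2[k] - centre[k] for k in range(3))
    cx, cy, cz = (n3[k] - centre[k] for k in range(3))
    # Scalar triple product a . (b x c)
    triple = (
        ax * (by * cz - bz * cy)
        - ay * (bx * cz - bz * cx)
        + az * (bx * cy - by * cx)
    )
    return (triple > 0) - (triple < 0)


# Toy coordinates: the first two atoms are almost on top of each other.
atoms = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (3.0, 0.0, 0.0)]
clashes = find_clashes(atoms)

# The same local arrangement and its mirror image (z flipped).
original = handedness((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
mirrored = handedness((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, -1))
```

In this toy example the first two atoms are reported as overlapping, and the mirrored arrangement gives the opposite sign to the original, which is how a swapped-handedness error would show up when a prediction is compared with the expected chemistry.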
A more substantial limitation is that the code will – for now at least – be unavailable, so AlphaFold 3 will have to be used on the DeepMind server on a purely non-commercial basis. Although many academic users will not be put off by this, it will limit the enthusiasm of expert modellers and biotechnologists, and restrict many applications in drug discovery.
Despite this, the release of AlphaFold 3 looks certain to stimulate a new wave of creativity in both drug discovery and structural biology more widely – and we’re already looking forward to AlphaFold 4.
Charlotte Dodson, Senior Lecturer in Drug Discovery, Department of Life Sciences, University of Bath and Richard Bayliss, Professor of Molecular Medicine, School of Molecular and Cellular Biology, University of Leeds
This article is republished from The Conversation under a Creative Commons license.