AlphaFold: A new world of protein structure
Tiffany Zhou | January 23, 2023
How do proteins fold into their correct conformation? According to Levinthal’s paradox, a thought experiment published in 1969, there are about 10^300 different ways a protein can possibly fold. This number is vastly larger than the estimated number of protons in the universe (roughly 10^80)! Given so many possibilities, it is an incredible feat of nature that proteins are able to achieve their correct conformation by making complex and accurate folds in milliseconds.
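To get a feel for the scale of Levinthal's paradox, the short Python sketch below estimates how long a protein would need to try every conformation one by one. It takes the count of conformations quoted above as 10 to the power of 300; the sampling rate (10^13 conformations per second) and the age of the universe (about 13.8 billion years) are illustrative assumptions, not figures from this article.

```python
import math

# Illustrative assumptions (only the conformation count comes from the article).
CONFORMATIONS_LOG10 = 300        # log10 of the number of possible conformations
SAMPLES_PER_SECOND_LOG10 = 13    # assumed: 10^13 conformations sampled per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate age of the universe


def search_time_log10_years(conformations_log10, rate_log10):
    """Return log10 of the years needed to visit every conformation once."""
    seconds_log10 = conformations_log10 - rate_log10
    return seconds_log10 - math.log10(SECONDS_PER_YEAR)


years_log10 = search_time_log10_years(CONFORMATIONS_LOG10, SAMPLES_PER_SECOND_LOG10)
universe_ages_log10 = years_log10 - math.log10(AGE_OF_UNIVERSE_YEARS)

print(f"Exhaustive search: about 10^{years_log10:.1f} years")
print(f"That is about 10^{universe_ages_log10:.1f} times the age of the universe")
```

The calculation works with exponents (base-10 logarithms) rather than the raw numbers so the results stay readable. Under these assumptions a random search is hopeless, which is why real proteins cannot be folding by trial and error.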
These possibilities underscore the difficulty scientists face when determining protein structure using traditional experimental methods like X-ray crystallography, NMR (nuclear magnetic resonance), and cryo-EM (cryogenic electron microscopy). While it is known that proteins engage with molecular chaperones that guide their folding through thermodynamic landscapes, it still takes years for scientists to discover and confirm the structure of a protein.
That timeline may now shrink from years to minutes, however, thanks to the recent creation of AlphaFold. A deep learning system created by DeepMind, an artificial intelligence research laboratory owned by Google, AlphaFold is able to predict the 3D structure of almost every known protein.
Why is it so important for scientists to know a protein’s structure? A common adage among structural biologists is “structure determines function.” Knowing the shape of a folded protein is fundamental to understanding its interactions with protein binding partners, and a mutation that affects this native fold can have devastating consequences for the human body.
The recent breakthrough provided by AlphaFold has been praised by experts in the field as one of the greatest achievements in AI in the 21st century, and is expected to revolutionize the fields of structural biology and drug discovery.
Vanderbilt alum Dr. John Jumper (B.S. in Physics and Mathematics, 2007) served as lead research scientist of the team that developed AlphaFold. He and DeepMind CEO Demis Hassabis were awarded the $3 million Breakthrough Prize in Life Sciences in 2022 for this achievement.
How was AlphaFold developed and assessed?
In 2016, DeepMind’s AlphaGo program defeated Go world champion Lee Sedol. This milestone showed that machines could beat humans at a notoriously complex game and might be harnessed to tackle other difficult challenges, setting the stage for the development of AlphaFold. In a recent interview with Scientific American, CEO Demis Hassabis said, “That’s always been the mission of DeepMind: to develop general-purpose algorithms that could be applied really generally across many, many problems. We started off with games because it was really efficient to develop things and test things out in games [but our] end goal was [to develop] things like AlphaFold.”
The AlphaFold system is incredibly complex: it is the subject of a 60-page Nature article and was constructed using 32 different component algorithms. The team built a new variant of the transformer, a machine learning model widely used in natural language processing, and named it Invariant Point Attention (IPA); unlike a language model, it is capable of predicting 3D structures. IPA was trained on existing protein databases, including UniProt and the Protein Data Bank (PDB), which contain various amino acid sequences and the known 3D protein structures of about 17% of all known proteins.
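The "invariant" in Invariant Point Attention refers to a useful property: the result should not change when an entire structure is rotated or shifted in space. The toy Python sketch below illustrates only that idea. It is not AlphaFold's IPA; the function names, the four sample points, and the choice to weight neighbors by squared distance are all assumptions made for illustration.

```python
import math


def attention_from_distances(points):
    """Toy attention: row i is a softmax over j of minus the squared distance i-j."""
    weights = []
    for p in points:
        logits = [-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def rigid_transform(points, angle, shift):
    """Rotate every point about the z axis by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


# Hypothetical 3D positions standing in for four residues.
points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.5), (0.0, 3.0, 1.0)]
moved = rigid_transform(points, angle=0.7, shift=(10.0, -4.0, 2.5))

before = attention_from_distances(points)
after = attention_from_distances(moved)

# Largest change in any attention weight after moving the whole structure.
max_diff = max(
    abs(a - b) for row_a, row_b in zip(before, after) for a, b in zip(row_a, row_b)
)
print(f"largest change in attention weights: {max_diff:.2e}")
```

Because the weights depend only on distances between points, moving the whole structure leaves them unchanged apart from floating-point rounding. A model built this way does not have to relearn the same protein in every possible orientation.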
AlphaFold was put to the test at the Critical Assessment of Protein Structure Prediction (CASP) competition, first at CASP 13 in 2018 and then, as AlphaFold 2, at CASP 14 in November 2020. CASP scores entries with the Global Distance Test, which compares each predicted structure to the ground truth of an experimentally determined structure; a match of over 90% is considered ideal. AlphaFold 2 reportedly scored about 90% on this measure at CASP 14. In comparison, the highest performing groups at CASP 11 and 12 had shown an increase in accuracy from 26% to 47%.
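For readers curious how a Global Distance Test score is put together, here is a simplified Python sketch. It computes the commonly used GDT_TS variant: the average, over distance cutoffs of 1, 2, 4 and 8 angstroms, of the percentage of residues whose predicted position falls within that cutoff of the true position. The sketch assumes the two structures are already superimposed and matched residue by residue, whereas the full test searches over many superpositions; the coordinates are made up for illustration.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    Assumes both structures are already superimposed and residue-aligned,
    with one (x, y, z) position per residue, in angstroms.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and the same length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances) for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)


# Hypothetical four-residue example: the prediction is off by
# 0.5, 1.5, 3.0 and 10.0 angstroms at the four positions.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

score = gdt_ts(predicted, reference)
print(f"GDT_TS = {score:.2f}")
```

In this example one residue in four is within 1 angstrom, two within 2, and three within both 4 and 8, so the score is the average of 25%, 50%, 75% and 75%. A perfect prediction would score 100.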
After winning CASP 13, the DeepMind team worked to further develop AlphaFold 2 with the help of biologists, chemists and other protein-folding specialists. One of the most important changes to the platform was the addition of confidence measures for specific regions of proteins, which allowed biologists to better interpret the results. At the next competition, CASP 14, AlphaFold 2 swept the field with the accuracy of its structure predictions. John Moult, CASP co-founder, declared, “This is the first time a serious scientific problem has been solved by AI.”
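The confidence measure AlphaFold reports for each residue is called pLDDT, a score from 0 to 100. The Python sketch below shows one way a biologist might turn a list of per-residue scores into labeled regions. The band names follow the ones used by the AlphaFold Protein Structure Database (very high above 90, confident from 70 to 90, low from 50 to 70, very low below 50); the handling of scores exactly on a boundary, the function names, and the sample scores are assumptions made for illustration.

```python
from itertools import groupby


def confidence_band(plddt):
    """Map a per-residue pLDDT score (0-100) to a confidence label."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def confidence_regions(scores):
    """Group consecutive residues with the same label into (start, end, label).

    Residue positions are 1-based and inclusive.
    """
    regions = []
    position = 1
    for band, group in groupby(scores, key=confidence_band):
        length = len(list(group))
        regions.append((position, position + length - 1, band))
        position += length
    return regions


# Hypothetical pLDDT scores for an eight-residue stretch.
scores = [95.2, 93.1, 88.0, 75.5, 62.3, 45.0, 41.7, 91.0]
for start, end, band in confidence_regions(scores):
    print(f"residues {start}-{end}: {band}")
```

Summaries like this help researchers decide which parts of a predicted structure to trust. Low-scoring stretches often correspond to flexible or disordered regions rather than a single fixed shape.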
What impact is AlphaFold going to have on future drug discovery and scientific research?
Because its open-source code is available online, AlphaFold has already served as a boon to researchers in fields such as structural biology, drug discovery, and protein design, and even in the fight against COVID-19. So far, the database of structures predicted by AlphaFold has been used in research on neglected tropical diseases, the nuclear pore complex, new malaria vaccines, and a key SARS-CoV-2 protein. On campus, AlphaFold has become an important resource for research labs at Vanderbilt and VUMC.
AlphaFold does have some limitations, however. Proteins are dynamic, constantly undergoing small conformational changes during chemical reactions and interactions with other proteins, so a static model cannot give a full picture of a protein’s shape. Furthermore, AlphaFold cannot replace wet lab research and clinical trials that determine whether new vaccines and drugs are safe and effective in humans. Finally, the structures predicted by AlphaFold are not necessarily accurate enough for the immediate discovery of new drug targets. However, scientists are optimistic that improvements in the training and computing power of the AI technology will make these discoveries possible in the future.
Akdel, Mehmet, et al. “A Structural Biology Community Assessment of AlphaFold2 Applications.” Nature News, Nature Publishing Group, 7 Nov. 2022, https://www.nature.com/articles/s41594-022-00849-w.
“AlphaFold: The Making of a Scientific Breakthrough.” YouTube, 30 Nov. 2020, https://www.youtube.com/watch?v=gg7WjuFs8F4. Accessed 5 Dec. 2022.
Lewis, Tanya. “One of the Biggest Problems in Biology Has Finally Been Solved.” Scientific American, 31 Oct. 2022, https://www.scientificamerican.com/article/one-of-the-biggest-problems-in-biology-has-finally-been-solved/.
Saey, Tina Hesman. “The AlphaFold AI Predicted the Structures of Nearly Every Protein Known to Science.” Science News, 26 Sept. 2022, https://www.sciencenews.org/article/alphafold-ai-protein-structure-folding-prediction.
Toews, Rob. “AlphaFold Is the Most Important Achievement in AI-Ever.” Forbes, 9 Nov. 2022, https://www.forbes.com/sites/robtoews/2021/10/03/alphafold-is-the-most-important-achievement-in-ai-ever/?sh=684565f26e0a.
Trafton, Anne. “Analyzing the Potential of AlphaFold in Drug Discovery.” MIT News | Massachusetts Institute of Technology, https://news.mit.edu/2022/alphafold-potential-protein-drug-0906.
“Ribosome as part of a biological cell constructing mRNA molecule” from iStock