Computational biologist Jue Wang was already working to develop an artificial intelligence (AI) program to churn out candidate medicines when he had to rush his 2-year-old son to the hospital with a potentially lethal respiratory infection. After seeing his son quickly recover from respiratory syncytial virus (RSV), Wang, a postdoctoral assistant at the University of Washington (UW), Seattle, and his colleagues redoubled their efforts. Yesterday, they unveiled in Science new AI software that can “paint” or “hallucinate” structures for proteins that don’t yet exist in nature. The software has already created original compounds for potential use in industrial reactions, cancer treatment, and even a vaccine candidate aimed at preventing RSV infections.
“It’s the perfect use of AI,” says Yang Zhang, a protein designer at the University of Michigan, Ann Arbor, who was not involved with the work. Though researchers have used computers and other means to design novel proteins for decades, AI approaches such as this are likely to increase the successes, Zhang says.
The AI developed by Wang and his colleagues builds on a string of recent advances in using computers to predict the 3D structure of natural proteins from their basic sequence of amino acids. Last year, an AI program called AlphaFold developed by DeepMind, a sister company of Google, whipped out predicted structures for hundreds of thousands of human proteins. AlphaFold and a similar AI software package called RoseTTAFold also offered thousands of likely structures of various proteins, each bound to a partner that it pairs with inside cells. Those feats earned protein structure prediction software Science’s 2021 Breakthrough of the Year.
It’s one thing to predict how natural proteins might fold; it’s another to design new ones from scratch. In 2017, for example, researchers led by Wang’s boss, David Baker, a protein designer at UW, showed they could use an earlier, AI-free structure prediction program they had developed, called simply Rosetta, to design potential protein-based drugs that bind to and inactivate molecular targets on the influenza virus and a bacterial toxin. The team members started by feeding the software an already known piece of what they wanted: a small bit of protein structure, called the binding motif, that is able to bind to their target. They then had Rosetta scan a database of protein structures they had previously designed and find an existing scaffold that could hold the binding motif in the correct shape. The software then put the two pieces together and tweaked the combination to make needed refinements.
The problem is that the approach only worked when Rosetta identified an adequate scaffold. “You had to hope there was a good match,” Baker says. Not anymore. Wang, Baker, and colleagues have now adapted their AI-driven RoseTTAFold to dream up its own proteins from scratch using two different strategies. The first, called inpainting, starts like the previous effort, giving the AI a starting point, such as an active site or another key feature of a desired protein. Much as a word processor’s autocomplete function tries to complete a word after you’ve typed a few characters, the AI then draws on its understanding of how proteins fold to fill in additional parts of the protein around the central feature.
The second approach, known as constrained hallucination, is more wide open. It gives the software a goal for a protein, such as binding to a metal. The program then generates a virtual protein composed of a random sequence of amino acids, and mutates the sequence over and over, evaluating the impact of each change on the protein’s likely shape and, thus, function. The AI keeps pieces it deems effective and mutates the rest, steadily evolving toward the goal.
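To make that mutate-and-evaluate loop concrete, here is a minimal toy sketch in Python. It is not the team’s software: the real method scores each candidate with the RoseTTAFold network’s structure predictions, whereas the `toy_score` function here is a hypothetical stand-in that simply rewards histidine and cysteine, two amino acids that often coordinate metals in natural proteins. The function names, sequence length, and step count are all assumptions chosen for illustration.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(sequence: str) -> float:
    """Hypothetical stand-in for the AI's evaluation of a candidate.

    Returns the fraction of residues that are histidine (H) or cysteine (C).
    The real software instead judges the predicted 3D shape.
    """
    return sum(residue in "HC" for residue in sequence) / len(sequence)


def hallucinate(length=30, steps=2000, seed=0, score=toy_score):
    """Start from a random sequence and keep mutations that do not hurt the score."""
    rng = random.Random(seed)
    sequence = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    best = score("".join(sequence))

    for _ in range(steps):
        # Mutate one randomly chosen position.
        position = rng.randrange(length)
        previous = sequence[position]
        sequence[position] = rng.choice(AMINO_ACIDS)

        candidate = score("".join(sequence))
        if candidate >= best:
            best = candidate               # keep the change
        else:
            sequence[position] = previous  # revert the change

    return "".join(sequence), best


if __name__ == "__main__":
    designed, final_score = hallucinate()
    print(designed, round(final_score, 2))
```

Because a mutation is kept only when the score does not drop, the sequence drifts steadily toward the goal. Swapping in a different `score` function changes what the search evolves toward.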
In both cases, the final predicted proteins can then be made in the lab and tested. And both strategies worked. Baker and his colleagues made novel proteins able to bind to receptors on cancer cells, grab metals in solution, and bind carbon dioxide for possible use in pulling it out of the atmosphere. Finally, to identify potential RSV vaccines, the team’s AI hallucinated 37 proteins aiming to present a key bit of the virus, called F protein site V, to the immune system. Three of the 37 were found to bind to a known RSV neutralizing antibody, indicating their likely effectiveness.
The results aren’t always perfect. In several cases the activity of the new proteins, such as those designed to bind metals, didn’t initially match natural versions, notes Joe Watson, a postdoc in Baker’s lab. But by dreaming up different proteins, the software comes up with structures that have not been seen so far in nature. Researchers can then use those as starting points for other proven techniques for evolving improved proteins in the lab, Watson says. “This gives us a lot of new starting points.”