
When can AI deliver the drug discovery hits?

The CACHE hit-finding competition highlights the potential of AI to identify small molecules that bind to hard-to-drug targets — and the long road ahead for these computational screening approaches.


The dream of the AI revolution in small-molecule drug discovery is clear and compelling. Take a target of choice, run a virtual screen against it, get some hit compounds — and you’re off. The reality is murkier. Despite big claims from biopharma companies, industry teams release scant details of what they are doing and how they are faring. Academic groups are more open, but often lack the resources to rigorously validate their techniques or their could-be contenders. The hype drowns out the hits.


The first results from the Critical Assessment of Computational Hit-finding Experiments (CACHE) competition now provide a crucial glimpse into the black box. Competitors made inroads against a hard target, but they still have a long way to go, the results showed. “CACHE is revealing the state of the art in computational hit finding. In a few years, when we look back, we will call this stone-age art,” says Matthieu Schapira, a computational chemist and CACHE coordinator at the Structural Genomics Consortium, the University of Toronto.


Twenty-three teams predicted over 2,000 compounds that they hoped would bind to the as-yet undrugged WD40 repeat (WDR) domain of LRRK2, a multifunctional enzyme that is associated with Parkinson disease. When these small molecules were tested in the lab, fewer than a dozen actually fit into the WDR pocket. “We’re at less than a 1% hit rate,” says Schapira. The hits that did make the cut were not very potent, he adds, with binding affinities ranging from 20 to 70 micromolar.
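The arithmetic behind that figure can be sketched in a few lines of Python. This is a minimal illustration, not part of the CACHE analysis: the function name is hypothetical, and the inputs are the rounded bounds quoted above (at most a dozen hits, at least 2,000 compounds tested), so the result is an upper limit rather than the exact rate.

```python
def hit_rate_upper_bound(max_hits: int, min_tested: int) -> float:
    """Return an upper bound on the hit rate, as a percentage.

    max_hits is the largest plausible number of confirmed hits and
    min_tested is the smallest plausible number of compounds tested.
    Both names are illustrative assumptions, not CACHE terminology.
    """
    if min_tested <= 0:
        raise ValueError("min_tested must be positive")
    return 100.0 * max_hits / min_tested


# "Fewer than a dozen" hits out of "over 2,000" compounds:
# 12 and 2,000 are the rounded limits from the text, not measured counts.
cache_bound = hit_rate_upper_bound(max_hits=12, min_tested=2000)
print(f"CACHE round 1 hit rate: below {cache_bound:.1f}%")
```

With these bounds the rate comes out below 0.6%, consistent with Schapira's "less than a 1% hit rate".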


“My one surprise is that anything worked at all,” says Brian Shoichet, a virtual docking expert at UCSF who did not compete in the challenge. If you set the potency threshold higher, he cautions, there would have been no hits. “I told them this target was too hard,” he adds.


The difficulty stems from how little the field understands about LRRK2’s WDR domain, he adds. It has no known binders, and the best crystal structure on offer shows a big empty pocket without clear ligand–protein interaction points (Fig. 1). A home run hit against this kind of target would signal the start of a new era in computational drug discovery. The reality is more modest. “It’s better than random, but we are far from a breakthrough,” says Schapira. The point of CACHE, however, is not to find the winners but rather to accelerate the whole field. “We are awash in hype and need to have a more open conversation about the tools we’re using and how effective they are,” says Ryan Merkley, CEO of Conscience, the non-profit that is backing the challenge. “The only way we improve is if we start sharing our results, benchmarking tools against each other, and talking to each other about how to iterate and make those processes better.”


Compute a hit

CACHE is modelled after the Critical Assessment of protein Structure Prediction (CASP) challenge, in which entrants predict the structure of a protein based on its amino acid sequence. CASP launched in 1994, and 26 years later DeepMind’s AlphaFold rocked the field. Researchers are still coming to terms with what AI-enabled protein structure predictions mean for biology, and how to best use these results. Computational methods are already deeply embedded into drug discovery workflows, but continue to need more work. AI advocates hope that various types of machine learning will make these tools as transformative for drug discovery as AlphaFold has been for structure prediction. CACHE — organized initially by the Structural Genomics Consortium — set out to help the field overcome the many hurdles ahead.




In the first challenge, 23 teams signed up to predict ligands that would bind to LRRK2’s WDR domain. Entrants used their computational method of choice to predict 100 compounds that might hit the target. The CACHE team then ordered these compounds from the make-on-demand small-molecule vendor Enamine, tested them in the lab for activity, and scored the results. Teams that identified compounds of interest were invited to suggest another 50 analogues, which were tested again. All of the results have been disclosed to the public. Competitors have the option to remain anonymous.


“CACHE is a great idea,” says Shoichet. Rigorous wet-bench testing of putative hits to rule out artefacts is exactly what the community needs, he says, and sets a higher bar for computational hit validation. “The literature is cluttered with papers where people report that a molecule is active, but they only use one assay and they don’t control for aggregation and this and that, and so you never really trust them.”


For CACHE, a five-member ‘Hit Evaluation Committee’ was tasked with assessing the findings. “A hit is somewhat in the eyes of the beholder, and different people are going to look at that differently,” says Pat Walters, chief data officer at Relay Therapeutics and a member of the committee. Despite the low hit rate and the poor potency of the best performers, seven teams suggested a handful of diverse small molecules with good enough binding profiles and drug-like properties. These hits are “great places to start”, says Walters.


“We’re making incremental changes and improvements, but it wasn’t like there was anything earth-shattering that came out of this first challenge,” he says.


Accuracy trumps both quantity and potency, adds Karen Akinsanya, president of R&D at Schrödinger. Akinsanya was not involved in CACHE, but her company develops widely used computational drug discovery tools. “A low hit rate isn’t a bad thing if your structure and binding mode are accurate,” she says. Whether or not any of these hits have legs may depend on whether the starting structure of WDR represented a good enough model of its biologically relevant form. Figuring out when structures — be they solved or predicted — are ready for virtual screening programmes remains a priority for the field, she wrote recently in Cell.


What does AI add?

CACHE competitors leaned on techniques that are already deeply integrated into many drug discovery organizations. Top-scoring teams used: pharmacophore-based approaches, to figure out the features that small molecules use to interact with the pocket; ultra-high throughput docking, to quickly test the fit of billions of compounds in the pocket; molecular dynamics and free energy calculations, to take closer looks at how particular small molecules could interact with the target; and fragment-based methods, to take lessons from even the smallest ligands.


“There wasn’t a constant theme; everyone did something completely different,” says Walters.


At the outset of CACHE, Schapira expected the challenge to show how simple methods compared with more complex ones. In the end, “they are all rather convoluted,” says Schapira.


Six of the top models used some form of next-generation machine learning. The exception was Christoph Gorgulla, at St. Jude Children’s Research Hospital, who used a suite of classical computational drug discovery tools called VirtualFlow 1.0 to run an ultra-large docking experiment — looking for binders from a library of 1.4 billion compounds.


This computational chemical space is so vast that docking of each and every compound into the pocket remains challenging. Several teams are experimenting with AI as an accelerant to pick and choose which compounds to dock, but VirtualFlow 1.0 uses a large number of CPUs to efficiently parallelize the dockings. (A second-generation VirtualFlow 2.0 surveys an even bigger 69 billion compounds, and relies on a predefined “sampling” strategy to focus its computational power.) Some competitors also used AI to come up with scoring functions to rank how well docked compounds interacted with the target, but in Gorgulla’s hands the classical schemes still worked best. “One takeaway is that classical methods can still keep up with AI-based methods,” says Gorgulla.


The bulk of the computing power across the CACHE winners was dedicated to established computational drug discovery tools, adds Lukas Friedrich, principal scientist at Merck KGaA and another top-scoring competitor in the challenge. “Pharmacophore-based screening and ultra-high throughput docking are already offering solutions. The question is, do we need really complicated, advanced AI technologies at every stage of the process?”


The answer is likely to be context-dependent, varying for example with how much is known about a target’s structure and what it binds, he adds. “It’s not likely we’ll find a universally applicable solution for every target.”


The purpose of the programme matters too — with different tools likely to be useful for different aspects of the small-molecule discovery or optimization process. Friedrich and Christina Schindler, head of computational drug design at Merck KGaA, combined docking with a generative model called REINVENT to design de novo molecules, followed by a similarity search to find purchasable compounds on Enamine. Merck KGaA is already using this generative approach in lead optimization programmes, says Schindler, and wanted to give it a run at hit discovery. “It’s definitely worth a try,” she says, based on her CACHE success.


The opportunity to see how different techniques fare in different scenarios is part of the appeal of CACHE, she adds. “I hope we’ll learn when we have the highest probability of successful virtual screening.”


Ctrl + alt + delete

Future rounds of CACHE are taking on targets with various properties, structural starting points and chemical matter baselines. Three more challenges are underway, another has been announced and others are in the works. “Every four months — boom, boom, boom, boom — we’ll be releasing a new data set. This is where it’s going to get exciting,” says Schapira.


Challenges two and three are looking for hits for the NSP13 helicase and the NSP3 macrodomain of SARS-CoV-2.


The fourth is dedicated to the TKB domain of CBLB, an E3 ubiquitin ligase, a focus of industry investment because of the role of these enzymes in targeted protein degradation. Contestants have access to an experimentally solved crystal structure of CBLB and to data on hundreds of binders.


Most of the entrants in the LRRK2 challenge were academic groups, but nearly 50% of the entrants into the CBLB challenge are biotech firms. “The profile seems to be changing,” says Schapira. “I don’t know whether it’s a trend that will continue, or it just talks to the different types of targets that we are nominating.” The more types of entrants CACHE can attract, the better, he adds.


The recently announced fifth round is spotlighting MCHR1, a GPCR that is involved in sleep, anxiety, depression and learning. Competitors will be given data on some 3,500 compounds with low nanomolar to high micromolar potency against the target. MCHR1 is the first CACHE target without an experimentally solved protein structure — forcing participants to try different ways to find hits.


Ligand-based methods that rely on available binding data will have a chance to shine. So too could structure-prediction tools. “I’m guessing that this will be a new playground for testing AlphaFold or RoseTTAFold,” says Shoichet.


Researchers are still working to understand how to best use predicted protein structures in drug discovery. In most reported studies to date, virtual screens that use crystal structures still seem to outperform those that rely on predicted structures. But Shoichet and colleagues recently reported in bioRxiv that AlphaFold-based screening had the edge for at least two GPCR proteins.


The reported virtual screening hit rate against these better-understood targets was 51–54%, for the σ2 receptor, and 23–26%, for the 5-HT2A receptor. The potency of confirmed hits was higher than those in the first CACHE challenge.


“If you’re careful, AlphaFold structures can be great templates for virtual docking,” says Shoichet. AlphaFold may sample a different set of protein conformations than do the solved crystal structures, he speculates, leading to different binding possibilities.


“This was one of the most genuinely exciting results that I saw last year,” says Walters. 


But drug discovery is about more than just binding, he adds. Computational approaches cannot yet reliably predict whether a small molecule will be soluble, a must-have for any contender. AI has also yet to solve whether a predicted hit can be made in the lab, will cross the cell membrane, or will have off-target liabilities.


Drug discovery is a multi-parameter optimization problem, explains Walters. “Really we are trying to find something with the whole package,” he says.


Schapira sees this as coming, eventually. “We are poised for a breakthrough,” he says. “Some biotechs say it has already happened. Some people say it’s a few months away. I say it’s going to take a few years.”


Competitors will meet at a symposium in Toronto in March to discuss lessons learned from this first challenge.


Parkinson progress

The first round of CACHE results could also provide narrower gains. The Michael J. Fox Foundation (MJFF) backed the first challenge, pointing the team to the WDR domain of LRRK2 because of its role in Parkinson disease. Companies are already targeting LRRK2’s kinase domain, but small-molecule binders of the WDR domain remain elusive.


For Brian Fiske, the MJFF’s co-chief science officer, and Luis Oliveira, senior associate director of translational research, the CACHE results suggest that these candidates are within reach. “The CACHE competition has yielded promising preliminary results confirming the LRRK2 WDR domain is druggable,” wrote Fiske and Oliveira in an email. “Targeting the WDR domain offers an opportunity to diversify the LRRK2 therapeutic pipeline.”


Merkley hopes these results will prompt drug makers to take another look at LRRK2. His team is now considering possible paths forward for the hits that came out of the challenge. “There’s at least one hit that has really excited folks,” he says.


Source: News & analysis by Asher Mullard