Molecular docking is the process of predicting how a drug-like molecule interacts with a target protein. While it is one of the crucial steps in drug discovery, state-of-the-art methods are computationally expensive and require knowledge of the potential binding site. With the growing need for more efficient methods, more deep learning-based models are being explored in the field. This increasing popularity raises a question:
How do we know how certain the model's prediction is?
As a next step in building AIchemy, a platform for Bayesian Deep Virtual Screening, we developed a Bayesian Equivariant Graph Neural Network (BEGNN), which combines the advantages of the deep learning approach with uncertainty prediction. The BEGNN predicts the binding pocket of the receptor (without any prior knowledge of it) and the bound pose and orientation of the ligand, while exploring the posterior density of the ligand’s molecular graph representations. Now you can know how uncertain a prediction is and use that to further qualify your list of top compounds.
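To make "how uncertain the prediction is" a bit more concrete, here is a minimal sketch of one generic way to summarize the spread of a set of sampled poses: the root-mean-square deviation of every atom from its mean position across samples. This is an illustrative assumption, not the BEGNN's actual uncertainty estimate; the function names `mean_pose` and `pose_spread` and the toy coordinates are hypothetical.

```python
import math

def mean_pose(poses):
    """Average position of each atom across all sampled poses.

    `poses` is a list of poses; each pose is a list of (x, y, z) tuples
    with atoms in the same order in every pose (an assumption of this sketch).
    """
    n_samples = len(poses)
    n_atoms = len(poses[0])
    return [
        tuple(sum(pose[a][i] for pose in poses) / n_samples for i in range(3))
        for a in range(n_atoms)
    ]

def pose_spread(poses):
    """RMS deviation of all atoms from the mean pose, in the input units."""
    mean = mean_pose(poses)
    total = 0.0
    for pose in poses:
        total += sum(math.dist(p, m) ** 2 for p, m in zip(pose, mean))
    return math.sqrt(total / (len(poses) * len(mean)))

# Toy data (hypothetical): one-atom "ligands" sampled twice.
tight_samples = [[(1.0, 2.0, 3.0)], [(1.0, 2.0, 3.0)]]  # samples agree
loose_samples = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]  # samples disagree

print(pose_spread(tight_samples))
print(pose_spread(loose_samples))
```

A small spread means the samples agree on where the ligand sits; a large spread flags a compound whose predicted pose deserves less trust when ranking candidates.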
As our test case, we dock Imatinib, an inhibitor drug from Novartis for leukemia and gastrointestinal stromal tumors, against the tyrosine kinase structure 6HD6. We compare the results with Qvina-W, the current state-of-the-art method for blind docking (docking without prior knowledge of the binding site), which builds on the AutoDock Vina search and scoring approach. We sampled 100 poses from the BEGNN, which took about 18 s in total, or roughly 0.18 s per pose! The predictions are shown in green, while the ground truth is shown in red. In blue we show another ligand available in the 6HD6 crystal structure from the PDB. We evaluate the binding pocket prediction using centroid distance, and the ligand position and conformation using RMSD. The darker green indicates the pose with the lowest RMSD compared to the ground truth. We use Qvina-W with default settings, generating 10 poses with a previously centered receptor and a search space of 50x50x50.
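For readers unfamiliar with the two metrics, the sketch below shows how centroid distance and RMSD are commonly computed for a predicted pose against a reference pose. It assumes both poses list the same atoms in the same order and applies no superposition or symmetry correction; the helper names and toy coordinates are hypothetical and are not taken from our evaluation code.

```python
import math

def centroid(coords):
    """Geometric center of a list of (x, y, z) atom coordinates."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))

def centroid_distance(pred, ref):
    """Distance between the centers of two poses (pocket-level accuracy)."""
    return math.dist(centroid(pred), centroid(ref))

def rmsd(pred, ref):
    """Root-mean-square deviation over matched atoms, without alignment."""
    if len(pred) != len(ref):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(ref))

# Toy data (hypothetical): the prediction is the reference shifted by 1 unit in x.
ref_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pred_pose = [(x + 1.0, y, z) for x, y, z in ref_pose]

print(f"centroid distance: {centroid_distance(pred_pose, ref_pose):.2f}")
print(f"RMSD: {rmsd(pred_pose, ref_pose):.2f}")
```

Centroid distance only says whether the ligand landed in the right pocket, while RMSD also penalizes a wrong orientation or conformation, which is why the two are reported together.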
The BEGNN generated 100 poses faster than Qvina-W generated 10, while making a more accurate prediction. As you can see, it detected the binding pocket correctly while generating multiple conformers from the molecular posterior distribution. This is yet another step toward more efficient, greener, and interpretable virtual screening for early-stage small-molecule drug discovery.