{"id":6395,"date":"2023-03-31T13:55:00","date_gmt":"2023-03-31T13:55:00","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2023\/03\/31\/speeding-up-drug-discovery-with-diffusion-generative-models\/"},"modified":"2023-03-31T13:55:00","modified_gmt":"2023-03-31T13:55:00","slug":"speeding-up-drug-discovery-with-diffusion-generative-models","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2023\/03\/31\/speeding-up-drug-discovery-with-diffusion-generative-models\/","title":{"rendered":"Speeding up drug discovery with diffusion generative models"},"content":{"rendered":"<p>Author: Alex Ouyang | Abdul Latif Jameel Clinic for Machine Learning in Health<\/p>\n<div>\n<p>With the release of platforms like DALL-E 2 and Midjourney, diffusion generative models have achieved mainstream popularity, owing to their ability to generate a series of absurd, breathtaking, and often meme-worthy images from text prompts like \u201c<a href=\"https:\/\/twitter.com\/sama\/status\/1511715302265942024?s=20&amp;t=uCJie6au5oeN-alZw6VLDQ\">teddy bears working on new AI research on the moon in the 1980s<\/a>.\u201d But a team of researchers at MIT&#8217;s Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic) thinks there could be more to diffusion generative models than just creating surreal images \u2014 they could accelerate the development of new drugs and reduce the likelihood of adverse side effects.<\/p>\n<p>A <a href=\"https:\/\/arxiv.org\/abs\/2210.01776\" target=\"_blank\" rel=\"noopener\">paper<\/a> introducing this new molecular docking model, called <a href=\"https:\/\/arxiv.org\/abs\/2210.01776\">DiffDock<\/a>, will be presented at the 11th International Conference on Learning Representations. 
The model&#8217;s unique approach to computational drug design is a paradigm shift from current state-of-the-art tools that most pharmaceutical companies use, presenting a major opportunity for an overhaul of the traditional drug development pipeline.<\/p>\n<p>Drugs typically function by interacting with the proteins that make up our bodies, or proteins of bacteria and viruses. Molecular docking was developed to gain insight into these interactions by predicting the atomic 3D coordinates with which a ligand (i.e., drug molecule) and protein could bind together.<\/p>\n<p>Molecular docking has led to the successful identification of drugs that now treat HIV and cancer. But with each drug averaging a decade of development time and <a href=\"https:\/\/www.nature.com\/articles\/nrd.2016.136\">90 percent<\/a> of drug candidates failing costly clinical trials (most studies estimate average drug development costs to be <a href=\"https:\/\/www.cbo.gov\/publication\/57126\">around $1 billion to over $2 billion per drug<\/a>), it\u2019s no wonder that researchers are looking for faster, more efficient ways to sift through potential drug molecules.<\/p>\n<p>Currently, most molecular docking tools used for in-silico drug design take a \u201csampling and scoring\u201d approach, searching for a ligand \u201cpose\u201d that best fits the protein pocket. This time-consuming process evaluates a large number of different poses, then scores them based on how well the ligand binds to the protein.<\/p>\n<p>In previous deep-learning solutions, molecular docking is treated as a regression problem. In other words, \u201cit assumes that you have a single target that you\u2019re trying to optimize for and there\u2019s a single right answer,\u201d says Gabriele Corso, co-author and second-year MIT PhD student in electrical engineering and computer science who is an affiliate of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). 
\u201cWith generative modeling, you assume that there is a distribution of possible answers \u2014 this is critical in the presence of uncertainty.\u201d<\/p>\n<p>\u201cInstead of a single prediction as previously, you now allow multiple poses to be predicted, and each one with a different probability,\u201d adds Hannes St\u00e4rk, co-author and first-year MIT PhD student in electrical engineering and computer science who is an affiliate of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). As a result, the model doesn&#8217;t need to compromise in attempting to arrive at a single conclusion, which can be a recipe for failure.<\/p>\n<p>To understand how diffusion generative models work, it helps to start with image-generating diffusion models. Here, diffusion models gradually add random noise to a 2D image through a series of steps, destroying the data in the image until it becomes nothing but grainy static. A neural network is then trained to recover the original image by reversing this noising process. The model can then generate new data by starting from a random configuration and iteratively removing the noise.<\/p>\n<p>In the case of DiffDock, after being trained on a variety of ligand and protein poses, the model is able to successfully identify multiple binding sites on proteins that it has never encountered before. Instead of generating new image data, it generates new 3D coordinates that help the ligand find potential angles that would allow it to fit into the protein pocket.<\/p>\n<p>This \u201cblind docking\u201d approach creates new opportunities to take advantage of AlphaFold 2 (2020), DeepMind\u2019s famous protein folding AI model. Since AlphaFold 1\u2019s initial release in 2018, there has been a great deal of excitement in the research community over the potential of AlphaFold\u2019s computationally folded protein structures to help identify new drug mechanisms of action. 
But state-of-the-art molecular docking tools have yet to demonstrate that their performance in binding ligands to computationally predicted structures is any better than <a href=\"https:\/\/news.mit.edu\/2022\/alphafold-potential-protein-drug-0906\">random chance<\/a>.<\/p>\n<p>DiffDock is not only significantly more accurate than previous approaches on traditional docking benchmarks; thanks to its ability to reason at a higher scale and implicitly model some of the protein flexibility, it also maintains high performance even as other docking models begin to fail. In the more realistic scenario involving the use of computationally generated unbound protein structures, DiffDock places 22 percent of its predictions within 2 angstroms (widely considered to be the threshold for an accurate pose; 1\u00c5 is one ten-billionth of a meter), more than double the rate of other docking models, some of which barely hover over 10 percent while others drop as low as 1.7 percent.<\/p>\n<p>These improvements create a new landscape of opportunities for biological research and drug discovery. For instance, many drugs are found via a process known as phenotypic screening, in which researchers observe the effects of a given drug on a disease without knowing which proteins the drug is acting upon. Discovering the mechanism of action of the drug is then critical to understanding how the drug can be improved and what its potential side effects are. This process, known as \u201creverse screening,\u201d can be extremely challenging and costly, but a combination of protein folding techniques and DiffDock may allow a large part of the process to be performed in silico, so that potential \u201coff-target\u201d side effects can be identified early on, before clinical trials take place.<\/p>\n<p>\u201cDiffDock makes drug target identification much more possible. Before, one had to do laborious and costly experiments (months to years) with each protein to define the drug docking. 
But now, one can screen many proteins and do the triaging virtually in a day,\u201d Tim Peterson, an assistant professor at the Washington University in St. Louis School of Medicine, says. Peterson used DiffDock in a recent paper to characterize the mechanism of action of a novel drug candidate for aging-related diseases. \u201cThere is a very \u2018fate loves irony\u2019 aspect that Eroom\u2019s law \u2014 that drug discovery takes longer and costs more money each year \u2014 is being solved by its namesake Moore\u2019s law \u2014 that computers get faster and cheaper each year \u2014 using tools such as DiffDock.\u201d<\/p>\n<p>This work was conducted by MIT PhD students Gabriele Corso, Hannes St\u00e4rk, and Bowen Jing, and their advisors, Professor Regina Barzilay and Professor Tommi Jaakkola, and was supported by the Machine Learning for Pharmaceutical Discovery and Synthesis consortium, the Jameel Clinic, the DTRA Discovery of Medical Countermeasures Against New and Emerging Threats program, the DARPA Accelerated Molecular Discovery program, the Sanofi Computational Antibody Design grant, and a Department of Energy Computational Science Graduate Fellowship.<\/p>\n<\/div>\n<p><a href=\"https:\/\/news.mit.edu\/2023\/speeding-drug-discovery-with-diffusion-generative-models-diffdock-0331\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Alex Ouyang | Abdul Latif Jameel Clinic for Machine Learning in Health With the release of platforms like DALL-E 2 and Midjourney, diffusion generative [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2023\/03\/31\/speeding-up-drug-discovery-with-diffusion-generative-models\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":464,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/6395"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=6395"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/6395\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/458"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=6395"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=6395"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=6395"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}