{"id":8982,"date":"2026-04-17T04:00:00","date_gmt":"2026-04-17T04:00:00","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2026\/04\/17\/bringing-ai-driven-protein-design-tools-to-biologists-everywhere\/"},"modified":"2026-04-17T04:00:00","modified_gmt":"2026-04-17T04:00:00","slug":"bringing-ai-driven-protein-design-tools-to-biologists-everywhere","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2026\/04\/17\/bringing-ai-driven-protein-design-tools-to-biologists-everywhere\/","title":{"rendered":"Bringing AI-driven protein-design tools to biologists everywhere"},"content":{"rendered":"<p>Author: Zach Winn | MIT News<\/p>\n<div>\n<p>Artificial intelligence is already proving it can accelerate drug development and improve our understanding of disease. But to turn AI into novel treatments we need to get the latest, most powerful models into the hands of scientists.<\/p>\n<p>The problem is that most scientists aren\u2019t machine-learning experts. Now the company OpenProtein.AI is helping scientists stay on the cutting edge of AI with a no-code platform that gives them access to powerful foundation models and a suite of tools for designing proteins, predicting protein structure and function, and training models.<\/p>\n<p>The company, founded by Tristan Bepler PhD \u201920 and former MIT associate professor Tim Lu PhD \u201907, is already equipping researchers in pharmaceutical and biotech companies of all sizes with its tools, including internally developed foundation models for protein engineering. OpenProtein.AI also offers its platform to scientists in academia for free.<\/p>\n<p>\u201cIt\u2019s a really exciting time right now because these models can not only make protein engineering more efficient \u2014 which shortens development cycles for therapeutics and industrial uses \u2014 they can also enhance our ability to design new proteins with specific traits,\u201d Bepler says. 
\u201cWe\u2019re also thinking about applying these approaches to non-protein modalities. The big picture is we\u2019re creating a language for describing biological systems.\u201d<\/p>\n<p><strong>Advancing biology with AI<\/strong><\/p>\n<p>Bepler came to MIT in 2014 as part of the Computational and Systems Biology PhD Program, studying under Bonnie Berger, MIT\u2019s Simons Professor of Applied Mathematics. It was there that he realized how little we understand about the molecules that make up the building blocks of biology.<\/p>\n<p>\u201cWe hadn\u2019t characterized biomolecules and proteins well enough to create good predictive models of what, say, a whole genome circuit will do, or how a protein interaction network will behave,\u201d Bepler recalls. \u201cIt got me interested in understanding proteins at a more fine-grained level.\u201d<\/p>\n<p>Bepler began exploring ways to predict the chains of amino acids that make up proteins by analyzing evolutionary data. This was before Google released AlphaFold, a powerful prediction model for protein structure. The work led to one of the first generative AI models for understanding and designing proteins \u2014 what the team calls a protein language model.<\/p>\n<p>\u201cI was really excited about the classical framework of proteins and the relationships between their sequence, structure, and function. We don\u2019t understand those links well,\u201d Bepler says. \u201cSo how could we use these foundation models to skip the \u2018structure\u2019 component and go straight from sequence to function?\u201d<\/p>\n<p>After earning his PhD in 2020, Bepler entered Lu\u2019s lab in MIT\u2019s Department of Biological Engineering as a postdoc.<\/p>\n<p>\u201cThis was around the time when the idea of integrating AI with biology was starting to pick up,\u201d Lu recalls. \u201cTristan helped us build better computational models for biologic design. 
We also realized there\u2019s a disconnect between the most cutting-edge tools available and the biologists, who would love to use these things but don\u2019t know how to code. OpenProtein came from the idea of broadening access to these tools.\u201d<\/p>\n<p>Bepler had worked at the forefront of AI as part of his PhD. He knew the technology could help scientists accelerate their work.<\/p>\n<p>\u201cWe started with the idea to build a general-purpose platform for doing machine learning-in-the-loop protein engineering,\u201d Bepler says. \u201cWe wanted to build something that was user friendly because machine-learning ideas are kind of esoteric. They require implementation, GPUs, fine-tuning, designing libraries of sequences. Especially at that time, it was a lot for biologists to learn.\u201d<\/p>\n<p>OpenProtein\u2019s platform, in contrast, features an intuitive web interface for biologists to upload data and conduct protein engineering work with machine learning. It features a range of open-source models, including PoET, OpenProtein\u2019s flagship protein language model.<\/p>\n<p>PoET, short for Protein Evolutionary Transformer, was trained on protein groups to generate sets of related proteins. Bepler and his collaborators showed it could generalize about evolutionary constraints on proteins and incorporate new information on protein sequences without retraining, allowing other researchers to add experimental data to improve the model.<\/p>\n<p>\u201cResearchers can use their own data to train models and optimize protein sequences, and then they can use our other tools to analyze those proteins,\u201d Bepler says. \u201cPeople are generating libraries of protein sequences in silico [on computers] and then running them through predictive models to get validation and structural predictors. 
It\u2019s basically a no-code front-end, but we also have APIs for people who want to access it with code.\u201d<\/p>\n<p>The models help researchers design proteins faster, then decide which ones are promising enough for further lab testing. Researchers can also input proteins of interest, and the models can generate new ones with similar properties.<\/p>\n<p>Since its founding, OpenProtein\u2019s team has continued to add tools to its platform for researchers regardless of their lab size or resources.<\/p>\n<p>\u201cWe\u2019ve tried really hard to make the platform an open-ended toolbox,\u201d Bepler says. \u201cIt has specific workflows, but it\u2019s not tied specifically to one protein function or class of proteins. One of the great things about these models is they are very good at understanding proteins broadly. They learn about the whole space of possible proteins.\u201d<\/p>\n<p><strong>Enabling the next generation of therapies<\/strong><\/p>\n<p>The large pharmaceutical company Boehringer Ingelheim began using OpenProtein\u2019s platform in early 2025. Recently, the companies announced an expanded collaboration that will see OpenProtein\u2019s platform and models embedded into Boehringer Ingelheim\u2019s work as it engineers proteins to treat diseases like cancer and autoimmune or inflammatory conditions.<\/p>\n<p>Last year, OpenProtein also released a new version of its protein language model, PoET-2, that outperforms much larger models while using a small fraction of the computing resources and experimental data.<\/p>\n<p>\u201cWe really want to solve the question of how we describe proteins,\u201d Bepler says. \u201cWhat\u2019s the meaningful, domain-specific language of protein constraints we use as we generate them? How can we bring in more evolutionary constraints? 
How can we describe an enzymatic reaction a protein carries out such that a model can generate sequences to do that reaction?\u201d<\/p>\n<p>Moving forward, the founders are hoping to make models that factor in the changing, interconnected nature of protein function.<\/p>\n<p>\u201cThe area I am excited about is going beyond protein binding events to use these models to predict and design dynamic features, where the protein has to engage two, three, or four biological mechanisms at the same time, or change its function after binding,\u201d says Lu, who currently serves in an advisory role for the company.<\/p>\n<p>As progress in AI races forward, OpenProtein continues to see its mission as giving scientists the best tools to develop new treatments faster.<\/p>\n<p>\u201cAs work gets more complex, with approaches incorporating things like protein logic and dynamic therapies, the existing experimental toolsets become limiting,\u201d Lu says. \u201cIt\u2019s really important to create open ecosystems around AI and biology. There\u2019s a risk that AI resources could get so concentrated that the average researcher can\u2019t use them. Open access is super important for the scientific field to make progress.\u201d<\/p>\n<\/div>\n<p><a href=\"https:\/\/news.mit.edu\/2026\/bringing-ai-driven-protein-design-tools-everywhere-0417\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Zach Winn | MIT News Artificial intelligence is already proving it can accelerate drug development and improve our understanding of disease. 
But to turn [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2026\/04\/17\/bringing-ai-driven-protein-design-tools-to-biologists-everywhere\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":471,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/8982"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=8982"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/8982\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/470"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=8982"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=8982"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=8982"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}