{"id":5155,"date":"2021-10-28T19:00:00","date_gmt":"2021-10-28T19:00:00","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2021\/10\/28\/introducing-pathways-a-next-generation-ai-architecture\/"},"modified":"2021-10-28T19:00:00","modified_gmt":"2021-10-28T19:00:00","slug":"introducing-pathways-a-next-generation-ai-architecture","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2021\/10\/28\/introducing-pathways-a-next-generation-ai-architecture\/","title":{"rendered":"Introducing Pathways: A next-generation AI architecture"},"content":{"rendered":"<p>Author: <\/p>\n<div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>When I reflect on the past two decades of computer science research, few things inspire me more than the remarkable progress we\u2019ve seen in the field of artificial intelligence.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>In 2001, some colleagues sitting just a few feet away from me at Google realized they could use an obscure technique called machine learning to help correct misspelled Search queries. (I remember I was amazed to see it work on everything from \u201cayambic pitnamiter\u201d to \u201cunnblevaiabel\u201d). Today, AI augments many of the things that we do, whether that\u2019s helping you <a href=\"https:\/\/ai.googleblog.com\/2021\/06\/take-all-your-pictures-to-cleaners-with.html\">capture a nice selfie<\/a>, or providing <a href=\"https:\/\/blog.google\/products\/search\/how-ai-making-information-more-useful\/\">more useful search results<\/a>, or warning hundreds of millions of people <a href=\"https:\/\/ai.googleblog.com\/2020\/09\/the-technology-behind-our-recent.html\">when and where flooding will occur<\/a>. Twenty years of advances in research have helped elevate AI from a promising idea to an indispensable aid in billions of people\u2019s daily lives. 
And for all that progress, I\u2019m still excited about its as-yet-untapped potential \u2013 AI is poised to help humanity confront some of the toughest challenges we\u2019ve ever faced, from persistent problems like illness and inequality to emerging threats like climate change.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>But matching the depth and complexity of those urgent challenges will require new, more capable AI systems \u2013 systems that can combine AI\u2019s proven approaches with nascent research directions to solve problems we cannot solve today. To that end, teams across Google Research are working on elements of a next-generation AI architecture we think will help realize such systems.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<h3>We call this new AI architecture Pathways.<\/h3>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Pathways is a new way of thinking about AI that addresses many of the weaknesses of existing systems and synthesizes their strengths. To show you what I mean, let\u2019s walk through some of AI\u2019s current shortcomings and how Pathways can improve upon them.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p><b>Today&#8217;s AI models are typically trained to do only one thing. Pathways will enable us to train a single model to do thousands or millions of things.<\/b><\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Today\u2019s AI systems are often trained from scratch for each new problem \u2013 the mathematical model\u2019s parameters are literally initialized with random numbers. 
Imagine if, every time you learned a new skill (jumping rope, for example),\u00a0you forgot everything you\u2019d learned \u2013 how to balance, how to leap, how to coordinate the movement of your hands \u2013 and started learning each new skill from nothing.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>That\u2019s more or less how we train most machine learning models today. Rather than extending existing models to learn new tasks, we train each new model from nothing to do one thing and one thing only (or we sometimes specialize a general model to a specific task). The result is that we end up developing thousands of models for thousands of individual tasks. Not only does learning each new task take longer this way, but it also requires much more data to learn each new task, since we\u2019re trying to learn everything about the world and the specifics of that task from nothing (completely unlike how people approach new tasks).<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Instead, we\u2019d like to train one model that can not only handle many separate tasks, but also draw upon and combine its existing skills to learn new tasks faster and more effectively. That way what a model learns by training on one task \u2013 say, learning how aerial images can predict the elevation of a landscape \u2013 could help it learn another task &#8212; say, predicting how flood waters will flow through that terrain.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>We want a model to have different capabilities that can be called upon as needed, and stitched together to perform new, more complex tasks \u2013 a bit closer to the way the mammalian brain generalizes across tasks.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p><b>Today&#8217;s models mostly focus on one sense. 
Pathways will enable multiple senses.<\/b><\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>People rely on multiple senses to perceive the world. That\u2019s very different from how contemporary AI systems digest information. Most of today\u2019s models process just one modality of information at a time. They can take in text, images, or speech \u2014 but typically not all three at once.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>Pathways could enable multimodal models that encompass visual, auditory, and language understanding simultaneously. So whether the model is processing the word \u201cleopard,\u201d the sound of someone saying \u201cleopard,\u201d or a video of a leopard running, the same response is activated internally: the concept of a leopard. The result is a model that\u2019s more insightful and less prone to mistakes and biases.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>And of course an AI model needn\u2019t be restricted to these familiar senses; Pathways could handle more abstract forms of data, helping find useful patterns that have eluded human scientists in complex systems such as climate dynamics.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p><b>Today&#8217;s models are dense and inefficient. Pathways will make them sparse and efficient.<\/b><\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>A third problem is that most of today\u2019s models are \u201cdense,\u201d which means the whole neural network activates to accomplish a task, regardless of whether it\u2019s very simple or really complicated.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>This, too, is very unlike the way people approach problems. 
We have many different parts of our brain that are specialized for different tasks, yet we only call upon the relevant pieces for a given situation. There are close to a hundred billion neurons in your brain, but you rely on a small fraction of them to interpret this sentence.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>AI can work the same way. We can build a single model that is \u201csparsely\u201d activated, which means only small pathways through the network are called into action as needed. In fact, the model dynamically learns which parts of the network are good at which tasks &#8212; it learns how to route tasks through the most relevant parts of the model. A big benefit to this kind of architecture is that it not only has a larger capacity to learn a variety of tasks, but it\u2019s also faster and much more energy efficient, because we don\u2019t activate the entire network for every task.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>For example, GShard and Switch Transformer are two of the largest machine learning models we\u2019ve ever created, but because both use sparse activation, they <a href=\"https:\/\/blog.google\/technology\/ai\/minimizing-carbon-footprint\/\">consume less than 1\/10th the energy<\/a> that you\u2019d expect of similarly sized dense models \u2014 while being as accurate as dense models.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>So to recap: today\u2019s machine learning models tend to overspecialize at individual tasks when they could excel at many. They rely on one form of input when they could synthesize several. And too often they resort to brute force when deftness and specialization of expertise would do.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>That\u2019s why we\u2019re building Pathways. 
Pathways will enable a single AI system to generalize across thousands or millions of tasks, to understand different types of data, and to do so with remarkable efficiency \u2013 advancing us from the era of single-purpose models that merely recognize patterns to one in which more general-purpose intelligent systems reflect a deeper understanding of our world and can adapt to new needs.<\/p>\n<\/div>\n<\/div>\n<div class=\"block-paragraph\">\n<div class=\"rich-text\">\n<p>That last point is crucial. We\u2019re familiar with many of today\u2019s biggest global challenges, and working on technologies to <a href=\"https:\/\/blog.google\/outreach-initiatives\/sustainability\/sustainability-2021\/\">help address them<\/a>. But we\u2019re also sure there are major future challenges we haven\u2019t yet anticipated, and many will demand urgent solutions. So, with great care, and always in line with our AI Principles, we\u2019re crafting the kind of next-generation AI system that can quickly adapt to new needs and solve new problems all around the world as they arise, helping humanity make the most of the future ahead of us.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><a href=\"https:\/\/blog.google\/technology\/ai\/introducing-pathways-next-generation-ai-architecture\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: When I reflect on the past two decades of computer science research, few things inspire me more than the remarkable progress we\u2019ve seen in [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2021\/10\/28\/introducing-pathways-a-next-generation-ai-architecture\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":471,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5155"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5155"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5155\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/462"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5155"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5155"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5155"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}