{"id":5652,"date":"2022-05-25T18:55:00","date_gmt":"2022-05-25T18:55:00","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/05\/25\/is-diversity-the-key-to-collaboration-new-ai-research-suggests-so\/"},"modified":"2022-05-25T18:55:00","modified_gmt":"2022-05-25T18:55:00","slug":"is-diversity-the-key-to-collaboration-new-ai-research-suggests-so","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/05\/25\/is-diversity-the-key-to-collaboration-new-ai-research-suggests-so\/","title":{"rendered":"Is diversity the key to collaboration? New AI research suggests so"},"content":{"rendered":"<p>Author: Kylie Foy | MIT Lincoln Laboratory<\/p>\n<div>\n<p>As artificial intelligence gets better at performing tasks once\u00a0solely in the hands of humans, like driving cars, many see teaming intelligence as a next frontier. In this future, humans and AI are true partners in high-stakes jobs, such as performing\u00a0complex surgery or\u00a0<a href=\"https:\/\/www.defense.gov\/News\/News-Stories\/Article\/Article\/2730215\/vice-admiral-discusses-potential-of-ai-in-missile-defense-testing-operations\/\">defending from missiles<\/a>.\u00a0But before teaming intelligence\u00a0can take off, researchers must overcome a\u00a0problem that corrodes cooperation:\u00a0<a href=\"https:\/\/www.ll.mit.edu\/news\/ai-smart-does-it-play-well-others\">humans often do not like or trust their AI partners<\/a>.\u00a0<\/p>\n<p>Now, new research points to diversity as being a key parameter for making AI a better team player. \u00a0<\/p>\n<p>MIT Lincoln Laboratory researchers have found that training an AI model with mathematically &#8220;diverse&#8221; teammates improves its ability to collaborate with other AI it has never worked with before, in the card game Hanabi. 
Moreover, both\u00a0<a href=\"https:\/\/proceedings.mlr.press\/v139\/lupu21a.html\">Facebook<\/a>\u00a0and\u00a0<a href=\"https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/797134c3e42371bb4979a462eb2f042a-Abstract.html\">Google\u2019s DeepMind<\/a>\u00a0concurrently published independent work that also infused diversity into training to improve outcomes in human-AI collaborative games. \u00a0<\/p>\n<p>Altogether, the results may point researchers down a promising path to making AI that can both perform well and be seen as good collaborators by human teammates. \u00a0<\/p>\n<p>&#8220;The fact that we all converged on the same idea \u2014 that if you want to cooperate, you need to train in a diverse setting \u2014 is exciting, and I believe it really sets the stage for the future work in cooperative AI,&#8221; says Ross Allen, a researcher in Lincoln Laboratory\u2019s Artificial Intelligence Technology Group and co-author of a <a href=\"https:\/\/arxiv.org\/abs\/2201.12436\">paper detailing this work<\/a>, which was recently presented at the International Conference on Autonomous Agents and Multi-Agent Systems. \u00a0\u00a0<\/p>\n<p><strong>Adapting to different behaviors<\/strong><\/p>\n<p>To develop cooperative AI, many researchers are using Hanabi as a testing ground. Hanabi challenges players to work together to stack cards in order, but players can\u00a0only see their teammates\u2019 cards and can only give\u00a0sparse clues to each other about which cards they hold.\u00a0<\/p>\n<p>In\u00a0<a href=\"https:\/\/arxiv.org\/abs\/2107.07630\">a previous experiment<\/a>, Lincoln Laboratory researchers tested one of the world&#8217;s best-performing Hanabi AI models with\u00a0humans. They were surprised to find that humans strongly disliked playing with this AI model, calling it a confusing and unpredictable\u00a0teammate. 
&#8220;The conclusion was that we&#8217;re missing something about human preference, and we&#8217;re not yet good at making models that might work in the real world,&#8221; Allen says.<\/p>\n<p>The team wondered if cooperative AI needs to be trained differently. The type of AI being used, called reinforcement learning, traditionally learns how to succeed at complex tasks by discovering which actions yield the highest reward. It is often trained and evaluated against models similar to itself. This process has created unmatched AI players in competitive games like Go and <a href=\"https:\/\/www.deepmind.com\/blog\/alphastar-mastering-the-real-time-strategy-game-starcraft-ii\">StarCraft<\/a>.<\/p>\n<p>But for AI to be a successful collaborator, perhaps it must care not only about maximizing reward when collaborating with other AI agents, but also about something more intrinsic: understanding and adapting to others&#8217; strengths and preferences. In other words, it needs to learn from and adapt to diversity.<\/p>\n<p>How do you train such a diversity-minded AI? The researchers came up with &#8220;Any-Play.&#8221; Any-Play augments the process of training an AI Hanabi agent by adding another objective, besides maximizing the game score: the AI must correctly identify the play-style of its training partner.<\/p>\n<p>This play-style is encoded within the training partner as a latent, or hidden, variable that the agent must estimate. It does this by observing differences in the behavior of its partner. 
This objective also requires its partner to learn distinct, recognizable behaviors in order to convey these differences to the receiving AI agent.<\/p>\n<p>Though this method of inducing diversity is <a href=\"https:\/\/arxiv.org\/abs\/1802.06070\">not new to the field of AI<\/a>, the team extended the concept to collaborative games by leveraging these distinct behaviors as diverse play-styles within the game.<\/p>\n<p>&#8220;The AI agent has to observe its partners&#8217; behavior in order to identify that secret input they received and has to accommodate these various ways of playing to perform well in the game. The idea is that this would result in an AI agent that is good at playing with different play styles,&#8221; says first author and Carnegie Mellon University PhD candidate Keane Lucas, who led the experiments as a former intern at the laboratory.<\/p>\n<p><strong>Playing with others unlike itself<\/strong><\/p>\n<p>The team augmented that earlier <a href=\"https:\/\/arxiv.org\/abs\/2003.02979\">Hanabi model<\/a> (the one they had tested with humans in their prior experiment) with the Any-Play training process. To evaluate whether the approach improved collaboration, the researchers teamed up the model with &#8220;strangers&#8221; \u2014 more than 100 other Hanabi models that it had never encountered before and that were trained by separate algorithms \u2014 in millions of two-player matches.<\/p>\n<p>The Any-Play pairings outperformed all other teams when those teams were likewise made up of partners algorithmically dissimilar to each other. It also scored better when partnering with the original version of itself that was not trained with Any-Play.<\/p>\n<p>The researchers view this type of evaluation, called inter-algorithm cross-play, as the best predictor of how cooperative AI would perform in the real world with humans. 
Inter-algorithm cross-play contrasts with more commonly used evaluations that test a model against copies of itself or against models trained by the same algorithm.<\/p>\n<p>&#8220;We argue that those other metrics can be misleading and artificially boost the apparent performance of some algorithms. Instead, we want to know, &#8216;if you just drop in a partner out of the blue, with no prior knowledge of how they&#8217;ll play, how well can you collaborate?&#8217; We think this type of evaluation is most realistic when evaluating cooperative AI with other AI, when you can&#8217;t test with humans,&#8221; Allen says.<\/p>\n<p>Indeed, this work did not test Any-Play with humans. However, <a href=\"https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/797134c3e42371bb4979a462eb2f042a-Abstract.html\">research published by DeepMind<\/a>, conducted concurrently with the laboratory&#8217;s work, used a similar diversity-training approach to develop an AI agent to play the collaborative game Overcooked with humans. &#8220;The AI agent and humans showed remarkably good cooperation, and this result leads us to believe our approach, which we find to be even more generalized, would also work well with humans,&#8221; Allen says. <a href=\"https:\/\/proceedings.mlr.press\/v139\/lupu21a.html\">Facebook<\/a> similarly used diversity in training to improve collaboration among Hanabi AI agents, but used a more complicated algorithm that required modifications of the Hanabi game rules to be tractable.<\/p>\n<p>Whether inter-algorithm cross-play scores are actually good indicators of human preference remains a hypothesis. To bring the human perspective back into the process, the researchers want to try to correlate a person&#8217;s feelings about an AI, such as distrust or confusion, with specific objectives used to train the AI. Uncovering these connections could help accelerate advances in the field. 
<\/p>\n<p>&#8220;The challenge with developing AI to work better with humans is that we can&#8217;t have humans in the loop during training telling the AI what they like and dislike. It would take millions of hours and personalities. But if we could find some kind of quantifiable proxy for human preference \u2014 and perhaps diversity in training is one such proxy \u2014 then maybe we&#8217;ve found a way through this challenge,&#8221; Allen says.<\/p>\n<\/div>\n<p><a href=\"https:\/\/news.mit.edu\/2022\/is-diversity-key-to-collaboration-0525\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Kylie Foy | MIT Lincoln Laboratory As artificial intelligence gets better at performing tasks once solely in the hands of humans, like driving cars, many [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/05\/25\/is-diversity-the-key-to-collaboration-new-ai-research-suggests-so\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":459,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5652"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5652"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5652\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/467"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5652"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5652"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5652"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}