{"id":7620,"date":"2024-09-24T15:00:00","date_gmt":"2024-09-24T15:00:00","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2024\/09\/24\/3-questions-should-we-label-ai-systems-like-we-do-prescription-drugs\/"},"modified":"2024-09-24T15:00:00","modified_gmt":"2024-09-24T15:00:00","slug":"3-questions-should-we-label-ai-systems-like-we-do-prescription-drugs","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2024\/09\/24\/3-questions-should-we-label-ai-systems-like-we-do-prescription-drugs\/","title":{"rendered":"3 Questions: Should we label AI systems like we do prescription drugs?"},"content":{"rendered":"<p>Author: Adam Zewe | MIT News<\/p>\n<div>\n<p><em>AI systems are increasingly being deployed in safety-critical health care situations. Yet these models sometimes hallucinate incorrect information, make biased predictions, or fail for unexpected reasons, which could have serious consequences for patients and clinicians.<\/em><\/p>\n<p><em>In a <\/em><a href=\"https:\/\/www.nature.com\/articles\/s43588-024-00676-7\" target=\"_blank\" rel=\"noopener\"><em>commentary article published today in\u00a0<\/em>Nature Computational Science<\/a><em>, MIT Associate Professor Marzyeh Ghassemi and Boston University Associate Professor Elaine Nsoesie argue that, to mitigate these potential harms, AI systems should be accompanied by responsible-use labels, similar to U.S. Food and Drug Administration-mandated labels placed on prescription medications.<\/em><\/p>\n<p>MIT News<em> spoke with Ghassemi about the need for such labels, the information they should convey, and how labeling procedures could be implemented.<\/em><\/p>\n<p><strong>Q:\u00a0<\/strong>Why do we need responsible use labels for AI systems in health care settings?<\/p>\n<p><strong>A:<\/strong> In a health setting, we have an interesting situation where doctors often rely on technology or treatments\u00a0 that are not\u00a0fully understood. 
Sometimes this lack of understanding is fundamental \u2014 the mechanism behind acetaminophen, for instance \u2014 but other times this is just a limit of specialization.\u00a0We don\u2019t expect clinicians to know how to service an MRI machine, for instance. Instead, we have certification systems through the FDA or\u00a0other federal agencies that certify the use of a medical\u00a0device or drug in a specific setting.<\/p>\n<p>Importantly, medical devices also\u00a0have service contracts \u2014 a technician from the manufacturer will fix your MRI machine if it is miscalibrated. For approved drugs, there are postmarket surveillance and reporting systems so that adverse effects or events can be addressed, for instance\u00a0if a lot of people taking a\u00a0drug seem to be developing a condition or allergy.<\/p>\n<p>Models and algorithms, whether they incorporate AI or not, skirt a lot of these approval and long-term monitoring processes, and that is something we need to be\u00a0wary of. Many prior studies have shown that predictive models need more careful evaluation and monitoring.\u00a0With more recent\u00a0generative AI specifically, we cite work that has demonstrated\u00a0generation\u00a0is not guaranteed to be appropriate, robust, or unbiased. Because we don\u2019t have the same level of surveillance on model predictions or generation, it would be\u00a0even more\u00a0difficult to catch a model\u2019s problematic responses. The generative models being used by hospitals right now could be biased. Having use labels is one way of ensuring that models don\u2019t automate biases that are learned from human practitioners or miscalibrated clinical decision support scores of the past.<\/p>\n<p><strong>Q:\u00a0<\/strong>Your article describes several components of a responsible use label for AI, following the FDA approach for creating prescription labels, including approved usage, ingredients, potential side effects, etc. 
What core information should these labels convey?<\/p>\n<p><strong>A:<\/strong> The things a label should make obvious\u00a0are time, place, and manner of a model\u2019s intended use. For instance, the user should know that models were trained at a specific time with data from a specific time point. For example, do the data cover the Covid-19 pandemic or not? There were very different health practices during Covid that could impact the data. This is why we advocate for the model \u201cingredients\u201d and \u201ccompleted studies\u201d to be disclosed.<\/p>\n<p>For place, we know from prior research that models trained in one location\u00a0tend to have\u00a0worse performance when\u00a0moved\u00a0to another location. Knowing\u00a0where the data were from and how\u00a0a\u00a0model was optimized within that population\u00a0can help to ensure that users are aware of \u201cpotential side effects,\u201d any \u201cwarnings and precautions,\u201d and \u201cadverse reactions.\u201d<\/p>\n<p>With a\u00a0model trained to predict one outcome,\u00a0knowing\u00a0the time and place of training could help you\u00a0make intelligent judgements about deployment. But many generative\u00a0models are incredibly flexible and can be used for many tasks.\u00a0Here,\u00a0time and place may not be as informative, and more explicit direction about \u201cconditions of labeling\u201d and \u201capproved usage\u201d versus \u201cunapproved usage\u201d comes into play. If\u00a0a developer has evaluated a\u00a0generative model for reading a patient\u2019s clinical notes and generating prospective billing codes,\u00a0they can disclose that\u00a0it has a bias toward overbilling for specific conditions\u00a0or underrecognizing others. A user wouldn\u2019t want to use this same generative model to decide who gets a referral to a specialist, even though they could. 
This\u00a0flexibility\u00a0is why we advocate for additional details on the\u00a0manner in which models\u00a0should be used.<\/p>\n<p>In general, we advocate\u00a0that you should train the best model you can, using the tools available to you. But even then, there should be a lot of disclosure. No model\u00a0is going to be perfect. As a society, we now understand that\u00a0no pill is perfect \u2014 there is always some risk. We should have the same understanding of AI models. Any model \u2014 with or without AI \u2014 is\u00a0limited. It may be\u00a0giving you realistic,\u00a0well-trained forecasts of potential futures, but take that with whatever grain of salt is appropriate.<\/p>\n<p><strong>Q:\u00a0<\/strong>If AI labels were to be implemented, who would do the labeling and how would labels be regulated and enforced?<\/p>\n<p><strong>A:<\/strong>\u00a0If you don\u2019t intend for your model to be used in practice, then\u00a0the\u00a0disclosures you would make for\u00a0a high-quality research publication are sufficient. But once you intend your model to be deployed in a human-facing setting,\u00a0developers and deployers should\u00a0do an initial labeling, based on some of the established frameworks. There should be a validation of these claims prior to deployment; in a safety-critical setting like health care, many agencies of the Department of Health and Human Services could be involved.<\/p>\n<p>For model\u00a0developers, I think that knowing you will need to label the limitations of a system\u00a0induces more careful consideration of the process itself. 
If I know that at some point I am going to have to disclose the population upon which a model was trained, I would not want to disclose that it was trained only on dialogue from male chatbot users, for instance.<\/p>\n<p>Thinking about things like who the data are collected on, over what time period, what the sample size was, and how you decided what data to include or exclude, can open your mind up to potential problems at deployment.<\/p>\n<\/div>\n<p><a href=\"https:\/\/news.mit.edu\/2024\/3-questions-should-we-label-ai-systems-like-prescription-drugs-0924\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Adam Zewe | MIT News AI systems are increasingly being deployed in safety-critical health care situations. Yet these models sometimes hallucinate incorrect information, make [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2024\/09\/24\/3-questions-should-we-label-ai-systems-like-we-do-prescription-drugs\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":460,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/7620"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=7620"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/7620\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/459"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=7620"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=7620"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=7620"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}