{"id":2699,"date":"2019-10-16T04:00:01","date_gmt":"2019-10-16T04:00:01","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/10\/16\/recovering-lost-dimensions-of-images-and-video\/"},"modified":"2019-10-16T04:00:01","modified_gmt":"2019-10-16T04:00:01","slug":"recovering-lost-dimensions-of-images-and-video","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/10\/16\/recovering-lost-dimensions-of-images-and-video\/","title":{"rendered":"Recovering \u201clost dimensions\u201d of images and video"},"content":{"rendered":"<p>Author: Rob Matheson | MIT News Office<\/p>\n<div>\n<p>MIT researchers have developed a model that recovers valuable data lost from images and video that have been \u201ccollapsed\u201d into lower dimensions.<\/p>\n<p>The model could be used to recreate video from motion-blurred images, or from new types of cameras that capture a person\u2019s movement around corners but only as vague one-dimensional lines. While more testing is needed, the researchers think this approach could someday could be used to convert 2D medical images into more informative \u2014 but more expensive \u2014 3D body scans, which could benefit medical imaging in poorer nations.<\/p>\n<p>\u201cIn all these cases, the visual data has one dimension \u2014 in time or space \u2014 that\u2019s completely lost,\u201d says Guha Balakrishnan, a postdoc in Computer Science and Artificial Intelligence Laboratory (CSAIL) and first author on a paper describing the model, which is being presented at next week\u2019s International Conference on Computer Vision. \u201cIf we recover that lost dimension, it can have a lot of important applications.\u201d<\/p>\n<p>Captured visual data often collapses data of multiple dimensions\u00a0of time and space into one or two dimensions, called \u201cprojections.\u201d X-rays, for example, collapse three-dimensional data about anatomical structures into a flat image. 
Or, consider a long-exposure shot of stars moving across the sky: The stars, whose positions change over time, appear as blurred streaks in the still shot.<\/p>\n<p>Likewise, \u201c<a href=\"http:\/\/news.mit.edu\/2017\/artificial-intelligence-for-your-blind-spot-mit-csail-cornercameras-1009\">corner cameras<\/a>,\u201d recently invented at MIT, detect moving people around corners. These could be useful for, say, firefighters finding people in burning buildings. But the cameras aren\u2019t exactly user-friendly. Currently they only produce projections that resemble blurry, squiggly lines, corresponding to a person\u2019s trajectory and speed.<\/p>\n<p>The researchers invented a \u201cvisual deprojection\u201d model that uses a neural network to \u201clearn\u201d patterns that match low-dimensional projections to their original high-dimensional images and videos. Given a new projection, the model uses what it has learned to recreate the original high-dimensional data.<\/p>\n<p>In experiments, the model synthesized accurate video frames showing people walking, by extracting information from single, one-dimensional lines similar to those produced by corner cameras. The model also recovered video frames from single, motion-blurred projections of digits moving around a screen, from the popular <a href=\"http:\/\/www.cs.toronto.edu\/~nitish\/unsupervised_video\/\">Moving MNIST<\/a> dataset.<\/p>\n<p>Joining Balakrishnan on the paper are Amy Zhao, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and CSAIL; EECS professors John Guttag, Fredo Durand, and William T. Freeman; and Adrian Dalca, a faculty member in radiology at Harvard Medical School.<\/p>\n<p><strong>Clues in pixels<\/strong><\/p>\n<p>The work started as a \u201ccool inversion problem\u201d to recreate movement that causes motion blur in long-exposure photography, Balakrishnan says. 
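<\/p>\n<p>One way to see why this inversion is hard: it is ill-posed, since many different signals collapse to exactly the same projection, so any recovery method needs learned knowledge of which signals are plausible. The toy below is a sketch of our own in plain Python (not the paper\u2019s code); it shows two opposite motions producing an identical motion-blurred still.<\/p>

```python
# Two 2-frame 'videos' of a single bright pixel on a 2x2 grid.
# In video_a the pixel moves left-to-right; in video_b, right-to-left.
video_a = [[[1.0, 0.0], [0.0, 0.0]],
           [[0.0, 1.0], [0.0, 0.0]]]
video_b = [[[0.0, 1.0], [0.0, 0.0]],
           [[1.0, 0.0], [0.0, 0.0]]]

def average_over_time(video):
    # Long-exposure-style projection: average each pixel across all frames.
    n_frames = len(video)
    height, width = len(video[0]), len(video[0][0])
    return [[sum(frame[y][x] for frame in video) / n_frames
             for x in range(width)]
            for y in range(height)]

blur_a = average_over_time(video_a)
blur_b = average_over_time(video_b)
# Opposite motions, same streak: the projection alone cannot tell them apart.
assert blur_a == blur_b == [[0.5, 0.5], [0.0, 0.0]]
```

<p>Both trajectories leave the same blurred still, which is why the projection\u2019s pixels can only narrow down \u2014 not uniquely determine \u2014 the original signal.<\/p>\n<p>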
In a projection\u2019s pixels there exist some clues about the high-dimensional source.<\/p>\n<p>Digital cameras capturing long-exposure shots, for instance, aggregate photons on each pixel over a period of time. In capturing an object\u2019s movement over time, the camera takes the average value of the movement-capturing pixels. Then, it applies those average values to corresponding heights and widths of a still image, which creates the signature blurry streaks of the object\u2019s trajectory. From the variations in pixel intensity, the movement can theoretically be recreated.<\/p>\n<p>As the researchers realized, that problem is relevant in many areas: X-rays, for instance, capture height, width, and depth information of anatomical structures, but they use a similar pixel-averaging technique to collapse depth into a 2D image. Corner cameras \u2014 invented in 2017 by Freeman, Durand, and other researchers \u2014\u00a0capture reflected light signals around a hidden scene that carry two-dimensional information about a person\u2019s distance from walls and objects. The pixel-averaging technique then collapses that data into a one-dimensional video \u2014 basically, measurements of different lengths over time in a single line.<\/p>\n<p>The researchers built a general model, based on a convolutional neural network (CNN) \u2014\u00a0a machine-learning model that\u2019s become a powerhouse for image-processing tasks \u2014 that captures clues about any lost dimension in averaged pixels.<\/p>\n<p><strong>Synthesizing signals<\/strong><\/p>\n<p>In training, the researchers fed the CNN thousands of pairs of projections and their high-dimensional sources, called \u201csignals.\u201d The CNN learns pixel patterns in the projections that match those in the signals. Powering the CNN is a framework called a \u201cvariational autoencoder,\u201d which evaluates how well the CNN\u2019s outputs match its inputs in a statistical, probabilistic sense. 
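<\/p>\n<p>The pixel-averaging projections described above are straightforward to write down. The sketch below is plain Python of our own, for illustration only (not the paper\u2019s code): it collapses a toy video the two ways the article mentions \u2014 averaging over time for a long-exposure-style blurred still, and averaging over one spatial axis for a corner-camera-style 1D line per time step.<\/p>

```python
def temporal_projection(video):
    # Collapse time: average each pixel across all frames,
    # producing one motion-blurred still image.
    n_frames = len(video)
    height, width = len(video[0]), len(video[0][0])
    return [[sum(frame[y][x] for frame in video) / n_frames
             for x in range(width)]
            for y in range(height)]

def spatial_projection(video):
    # Collapse one spatial axis: average each column within a frame,
    # leaving a single 1D line per time step.
    width = len(video[0][0])
    return [[sum(row[x] for row in frame) / len(frame)
             for x in range(width)]
            for frame in video]

# Toy 3-frame video (2x3 pixels) of a bright column sweeping left to right.
video = [[[1.0 if x == t else 0.0 for x in range(3)]
          for _ in range(2)]
         for t in range(3)]

blurred = temporal_projection(video)  # uniform streak: each column got 1/3 of the light
lines = spatial_projection(video)     # one 1D line per frame, tracking the column
```

<p>A deprojection network is trained to run these collapses in reverse, scoring candidate high-dimensional signals against the observed projection.<\/p>\n<p>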
From this training, the model learns a \u201cspace\u201d of all possible signals that could have produced a given projection. This creates, in essence, a type of blueprint for how to go from a projection to all possible matching signals.<\/p>\n<p>When shown previously unseen projections, the model notes the pixel patterns and follows the blueprint to all possible signals that could have produced that projection. Then, it synthesizes new images that combine the data in the projection with what it has learned about matching signals. This recreates the high-dimensional signal.<\/p>\n<p>For one experiment, the researchers collected a dataset of 35 videos of 30 people walking in a specified area. They collapsed all frames into projections that they used to train and test the model. From a hold-out set of six unseen projections, the model accurately recreated 24 frames of the person\u2019s gait, down to the position of their legs and the person\u2019s size as they walked toward or away from the camera. The model seems to learn, for instance, that pixels that get darker and wider with time likely correspond to a person walking closer to the camera.<\/p>\n<p>\u201cIt\u2019s almost like magic that we\u2019re able to recover this detail,\u201d Balakrishnan says.<\/p>\n<p>The researchers didn\u2019t test their model on medical images. But they are now collaborating with Cornell University colleagues to recover 3D anatomical information from 2D medical images, such as X-rays, with no added costs \u2014\u00a0which can enable more detailed medical imaging in poorer nations. Doctors mostly prefer 3D scans, such as those captured with CT scans, because they contain far more useful medical information. But CT scans are generally difficult and expensive to acquire.<\/p>\n<p>\u201cIf we can convert X-rays to CT scans, that would be somewhat game-changing,\u201d Balakrishnan says. 
\u201cYou could just take an X-ray and push it through our algorithm and see all the lost information.\u201d<\/p>\n<\/div>\n<p><a href=\"http:\/\/news.mit.edu\/2019\/model-lost-data-images-video-1016\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Rob Matheson | MIT News Office MIT researchers have developed a model that recovers valuable data lost from images and video that have been [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/10\/16\/recovering-lost-dimensions-of-images-and-video\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":467,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2699"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2699"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2699\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/472"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2699"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index
.php\/wp-json\/wp\/v2\/categories?post=2699"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2699"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}