{"id":819,"date":"2018-07-21T06:30:43","date_gmt":"2018-07-21T06:30:43","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/07\/21\/feature-selection-for-unsupervised-learning\/"},"modified":"2018-07-21T06:30:43","modified_gmt":"2018-07-21T06:30:43","slug":"feature-selection-for-unsupervised-learning","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/07\/21\/feature-selection-for-unsupervised-learning\/","title":{"rendered":"Feature Selection For Unsupervised Learning"},"content":{"rendered":"<p>Author: Vincent Granville<\/p>\n<div>\n<p><em>This is my presentation for the <a href=\"https:\/\/ibmdatascienceday.bemyapp.com\/talks\" target=\"_blank\" rel=\"noopener\">IBM data science day<\/a>, July 24.<\/em><\/p>\n<p><strong>Abstract<\/strong><\/p>\n<p>After reviewing popular techniques used in supervised, unsupervised and semi-supervised machine learning, we focus on feature selection methods in these different contexts, especially the metrics used to assess the value of a feature or set of features, whether the variables are binary, continuous or categorical. <\/p>\n<p> We then go into more detail and review modern feature selection techniques for unsupervised learning, which typically rely on entropy-like criteria. While these criteria are usually model-dependent or scale-dependent, we introduce a new model-free, data-driven methodology in this context, with an application to an interesting number theory problem (using a simulated data set) in which each feature has a known theoretical entropy. 
<\/p>\n<p> We also briefly discuss high-precision computing, which is relevant to this peculiar data set, as well as units of information smaller than the bit.<\/p>\n<p>To download the presentation, <a href=\"http:\/\/api.ning.com\/files\/jkjh50MJpjm*kp4ygZl5MTixNNfxcjOR3sSY3sQ1ZVPRwVotcPyTcFJpyhPE1kd*LBWLYafZxDllmV8H2et1uCzHdGEV4B--\/IBMJuly24_b.pptx\" target=\"_self\">click here<\/a>\u00a0(PowerPoint document).<\/p>\n<p><span style=\"font-size: 14pt;\"><b>DSC Resources<\/b><\/span><\/p>\n<ul>\n<li><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/fee-book-applied-stochastic-processes\">Free Book: Applied Stochastic Processes<\/a><\/li>\n<li><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/comprehensive-repository-of-data-science-and-ml-resources\">Comprehensive Repository of Data Science and ML Resources<\/a><\/li>\n<li><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/advanced-machine-learning-with-basic-excel\">Advanced Machine Learning with Basic Excel<\/a><\/li>\n<li><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/difference-between-machine-learning-data-science-ai-deep-learning\">Difference between ML, Data Science, AI, Deep Learning, and Statistics<\/a><\/li>\n<li><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/my-data-science-machine-learning-and-related-articles\">Selected Business Analytics, Data Science and ML articles<\/a><\/li>\n<li><a href=\"http:\/\/careers.analytictalent.com\/jobs\/products\">Hire a Data Scientist<\/a><span>\u00a0<\/span>|<span>\u00a0<\/span><a href=\"http:\/\/www.datasciencecentral.com\/page\/search?q=Python\">Search DSC<\/a><span>\u00a0<\/span>|<span>\u00a0<\/span><a href=\"http:\/\/classifieds.datasciencecentral.com\/\">Classifieds<\/a><span>\u00a0<\/span>|<span>\u00a0<\/span><a href=\"http:\/\/www.analytictalent.com\/\">Find a Job<\/a><\/li>\n<li><a href=\"http:\/\/www.datasciencecentral.com\/profiles\/blog\/new\">Post a 
Blog<\/a><span>\u00a0<\/span>|<span>\u00a0<\/span><a href=\"http:\/\/www.datasciencecentral.com\/forum\/topic\/new\">Forum Questions<\/a><\/li>\n<\/ul>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:745315\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Vincent Granville This is my presentation for the IBM data science day, July 24. Abstract After reviewing popular techniques used in supervised, unsupervised and [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/07\/21\/feature-selection-for-unsupervised-learning\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":468,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/819"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=819"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/819\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/467"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=819"}],"wp:term":[{"taxonomy":"cate
gory","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=819"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=819"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}