{"id":2488,"date":"2019-08-22T06:34:43","date_gmt":"2019-08-22T06:34:43","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/22\/three-way-data-splits-training-test-and-validation-for-model-selection-and-performance-estimation\/"},"modified":"2019-08-22T06:34:43","modified_gmt":"2019-08-22T06:34:43","slug":"three-way-data-splits-training-test-and-validation-for-model-selection-and-performance-estimation","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/22\/three-way-data-splits-training-test-and-validation-for-model-selection-and-performance-estimation\/","title":{"rendered":"Three-way data splits (training, test and validation) for model selection and performance estimation"},"content":{"rendered":"<p>Author: ajit jaokar<\/p>\n<div>\n<p>The use of training, validation and test datasets is common but not easily understood.\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>In this post, I attempt to clarify this concept. The post is part of my forthcoming book on <strong>learning Artificial Intelligence, Machine Learning and Deep Learning based on high school maths<\/strong>.\u00a0 If you want to know more about the book, please follow me on <a href=\"https:\/\/www.linkedin.com\/in\/ajitjaokar\/\">Linkedin Ajit Jaokar<\/a><\/p>\n<p>\u00a0<\/p>\n<h2>Background<\/h2>\n<p>Jason Brownlee provides a good explanation on the <a href=\"https:\/\/machinelearningmastery.com\/difference-test-validation-datasets\/\">three-way data splits (training, test and validation)<\/a><\/p>\n<p><em>\u2013 Training set: A set of examples used for learning, that is to fit the parameters of the classifier.<\/em><\/p>\n<p><em>\u2013 Validation set: A set of examples used to tune the parameters of a classifier, for example to choose the number of hidden units in a neural network.<\/em><\/p>\n<p><em>\u2013 Test set: A set of examples used only to assess the performance of a fully-specified classifier.<\/em><\/p>\n<p>And then comes up with an important 
statement: <strong><em>Reference to a \u201cvalidation dataset\u201d disappears if the practitioner is choosing to tune model hyperparameters using k-fold cross-validation with the training dataset.<\/em><\/strong><\/p>\n<p>Here, I try to explain these ideas in more detail, drawing on the <a href=\"http:\/\/research.cs.tamu.edu\/prism\/lectures\/iss\/iss_l13.pdf\">source by Ricardo Gutierrez-Osuna, Wright State University<\/a>.<\/p>\n<p>\u00a0<\/p>\n<h2>Understanding model validation<\/h2>\n<p>Validation techniques are motivated by two fundamental problems in pattern recognition: <strong>model selection and performance estimation<\/strong>.<\/p>\n<p><strong>Model selection<\/strong> involves choosing a model and its optimal free parameters. Pattern recognition techniques have one or more free parameters \u2013 for example, the number of neighbours in a kNN classifier, or the network size and learning parameters in an MLP. The selection of these hyperparameters determines the quality of the solution. Hyperparameters are set by the user; in contrast, the parameters of a model (such as the weights of an MLP) are learned from the data.<\/p>\n<p>\u00a0<\/p>\n<p><strong>Performance estimation:<\/strong> Once we have chosen a model, we need to estimate its performance. If we had access to an unlimited set of samples (or the whole population), estimating performance would be easy. In practice, however, we only have access to a smaller sample of the population. If we use the entire dataset to train the model, the model is likely to overfit. Overfitting is essentially \u2018learning the noise\u2019 in the training data. Since our goal is to find the model that gives the best results on unseen data, overfitting must be avoided. We can address this problem by evaluating the error function on data that is independent of the data used for training.<\/p>\n<p>\u00a0<\/p>\n<p>The first approach is to split the data into training and test datasets. 
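As a concrete illustration, such a split can be sketched in plain Python; carving out a validation set as well gives the three-way split this post discusses. The 60/20/20 proportions and the fixed seed below are arbitrary illustrative choices, not prescriptions from the source.

```python
# Illustrative three-way split using only the Python standard library.
import random

def three_way_split(data, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle `data` and split it into train / validation / test lists."""
    items = list(data)
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    rng.shuffle(items)          # shuffle first, in case the data are ordered
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # everything left over (~20% here)
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

Shuffling before splitting matters when the records are ordered; for small or imbalanced datasets a stratified split or a resampling method such as cross-validation is usually preferred.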
This is the <strong>holdout method<\/strong>, where you use the training dataset to train the classifier and the test dataset to estimate the error of the trained classifier. The holdout method has limitations: for example, it is not suitable for sparse datasets. The limitations of the holdout method can be overcome with a family of resampling methods such as <strong>cross-validation<\/strong>.<\/p>\n<p>\u00a0<\/p>\n<p>Finally, the\u00a0<strong>test dataset<\/strong>\u00a0is used to provide an unbiased evaluation of a\u00a0<em>final<\/em>\u00a0model fit on the training dataset. The test dataset is used to obtain performance characteristics such as accuracy, sensitivity, specificity, F-measure, and so on.<\/p>\n<p>\u00a0<\/p>\n<h2>Putting it all together<\/h2>\n<p>The overall steps are:<\/p>\n<ol>\n<li>Divide the available data into training, validation and test sets<\/li>\n<li>Select an architecture and training parameters<\/li>\n<li>Train the model using the training set<\/li>\n<li>Evaluate the model using the validation set<\/li>\n<li>Repeat steps 2 through 4 using different architectures and training parameters<\/li>\n<li>Select the best model and train it using data from the training and validation sets<\/li>\n<li>Assess this final model using the test set<\/li>\n<\/ol>\n<p>\u00a0<\/p>\n<ul>\n<li>This outline assumes the holdout method. If cross-validation or bootstrap is used, steps 3 and 4 have to be repeated for each of the K folds<\/li>\n<li>Steps 2, 3 and 4 are part of hyperparameter tuning<\/li>\n<\/ul>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3437704683?profile=original\" target=\"_blank\" rel=\"noopener noreferrer\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/3437704683?profile=RESIZE_710x\" class=\"align-full\"><\/a>\u00a0<\/p>\n<p><strong>Source for image and steps<\/strong> &#8211;\u00a0<a 
href=\"http:\/\/research.cs.tamu.edu\/prism\/lectures\/iss\/iss_l13.pdf\">source by Ricardo Gutierrez-Osuna Wright State University<\/a><\/p>\n<p>I hope you found this useful. The post is part of my forthcoming book on <strong>learning Artificial Intelligence, Machine Learning and Deep Learning based on high school maths<\/strong>.\u00a0 If you want to know more about the book, please follow me on <a href=\"https:\/\/www.linkedin.com\/in\/ajitjaokar\/\">Linkedin Ajit Jaokar<\/a><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:871225\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: ajit jaokar The use of training, validation and test datasets is common but not easily understood.\u00a0 \u00a0 In this post, I attempt to clarify [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/22\/three-way-data-splits-training-test-and-validation-for-model-selection-and-performance-estimation\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":474,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2488"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2488"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2488\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/456"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2488"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2488"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2488"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}