{"id":3073,"date":"2020-01-28T18:00:34","date_gmt":"2020-01-28T18:00:34","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2020\/01\/28\/cost-sensitive-decision-trees-for-imbalanced-classification\/"},"modified":"2020-01-28T18:00:34","modified_gmt":"2020-01-28T18:00:34","slug":"cost-sensitive-decision-trees-for-imbalanced-classification","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2020\/01\/28\/cost-sensitive-decision-trees-for-imbalanced-classification\/","title":{"rendered":"Cost-Sensitive Decision Trees for Imbalanced Classification"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>The decision tree algorithm is effective for balanced classification, although it does not perform well on imbalanced datasets.<\/p>\n<p>The split points of the tree are chosen to best separate examples into two groups with minimum mixing. When both groups are dominated by examples from one class, the criterion used to select a split point will see good separation, when in fact, the examples from the minority class are being ignored.<\/p>\n<p>This problem can be overcome by modifying the criterion used to evaluate split points to take the importance of each class into account, referred to generally as the weighted split-point or weighted decision tree.<\/p>\n<p>In this tutorial, you will discover the weighted decision tree for imbalanced classification.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How the standard decision tree algorithm does not support imbalanced classification.<\/li>\n<li>How the decision tree algorithm can be modified to weight model error by class weight when selecting splits.<\/li>\n<li>How to configure class weight for the decision tree algorithm and how to grid search different class weight configurations.<\/li>\n<\/ul>\n<p>Discover SMOTE, one-class classification, cost-sensitive learning, threshold moving, and much more <a 
href=\"https:\/\/machinelearningmastery.com\/imbalanced-classification-with-python\/\">in my new book<\/a>, with 30 step-by-step tutorials and full Python source code.<\/p>\n<p>Let&rsquo;s get started.<\/p>\n<div id=\"attachment_9495\" style=\"width: 809px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9495\" class=\"size-full wp-image-9495\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/01\/How-to-Implement-Weighted-Decision-Trees-for-Imbalanced-Classification.jpg\" alt=\"How to Implement Weighted Decision Trees for Imbalanced Classification\" width=\"799\" height=\"533\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/01\/How-to-Implement-Weighted-Decision-Trees-for-Imbalanced-Classification.jpg 799w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/01\/How-to-Implement-Weighted-Decision-Trees-for-Imbalanced-Classification-300x200.jpg 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/01\/How-to-Implement-Weighted-Decision-Trees-for-Imbalanced-Classification-768x512.jpg 768w\" sizes=\"(max-width: 799px) 100vw, 799px\"><\/p>\n<p id=\"caption-attachment-9495\" class=\"wp-caption-text\">How to Implement Weighted Decision Trees for Imbalanced Classification<br \/>Photo by <a href=\"https:\/\/flickr.com\/photos\/icetsarina\/33074457825\/\">Bonnie Moreland<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into four parts; they are:<\/p>\n<ol>\n<li>Imbalanced Classification Dataset<\/li>\n<li>Decision Trees for Imbalanced Classification<\/li>\n<li>Weighted Decision Trees With Scikit-Learn<\/li>\n<li>Grid Search Weighted Decision Trees<\/li>\n<\/ol>\n<h2>Imbalanced Classification Dataset<\/h2>\n<p>Before we dive into the modification of decision trees for imbalanced classification, let&rsquo;s first define 
an imbalanced classification dataset.<\/p>\n<p>We can use the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.datasets.make_classification.html\">make_classification() function<\/a> to define a synthetic imbalanced two-class classification dataset. We will generate 10,000 examples with an approximate 1:100 minority to majority class ratio.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define dataset\r\nX, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,\r\n\tn_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=3)<\/pre>\n<p>Once generated, we can summarize the class distribution to confirm that the dataset was created as we expected.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# summarize class distribution\r\ncounter = Counter(y)\r\nprint(counter)<\/pre>\n<p>Finally, we can create a scatter plot of the examples and color them by class label to help understand the challenge of classifying examples from this dataset.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# scatter plot of examples by class label\r\nfor label, _ in counter.items():\r\n\trow_ix = where(y == label)[0]\r\n\tpyplot.scatter(X[row_ix, 0], X[row_ix, 1], label=str(label))\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Tying this together, the complete example of generating the synthetic dataset and plotting the examples is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># Generate and plot a synthetic imbalanced classification dataset\r\nfrom collections import Counter\r\nfrom sklearn.datasets import make_classification\r\nfrom matplotlib import pyplot\r\nfrom numpy import where\r\n# define dataset\r\nX, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,\r\n\tn_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=3)\r\n# summarize class distribution\r\ncounter = Counter(y)\r\nprint(counter)\r\n# scatter plot of examples by class label\r\nfor label, _ in counter.items():\r\n\trow_ix = where(y == 
label)[0]\r\n\tpyplot.scatter(X[row_ix, 0], X[row_ix, 1], label=str(label))\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Running the example first creates the dataset and summarizes the class distribution.<\/p>\n<p>We can see that the dataset has an approximate 1:100 class distribution with a little less than 10,000 examples in the majority class and 100 in the minority class.<\/p>\n<pre class=\"crayon-plain-tag\">Counter({0: 9900, 1: 100})<\/pre>\n<p>Next, a scatter plot of the dataset is created showing the large mass of examples for the majority class (blue) and a small number of examples for the minority class (orange), with some modest class overlap.<\/p>\n<div id=\"attachment_9494\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9494\" class=\"size-full wp-image-9494\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/11\/Scatter-Plot-of-Binary-Classification-Dataset-with-1-to-100-Class-Imbalance-1.png\" alt=\"Scatter Plot of Binary Classification Dataset With 1 to 100 Class Imbalance\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/11\/Scatter-Plot-of-Binary-Classification-Dataset-with-1-to-100-Class-Imbalance-1.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/11\/Scatter-Plot-of-Binary-Classification-Dataset-with-1-to-100-Class-Imbalance-1-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/11\/Scatter-Plot-of-Binary-Classification-Dataset-with-1-to-100-Class-Imbalance-1-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/11\/Scatter-Plot-of-Binary-Classification-Dataset-with-1-to-100-Class-Imbalance-1-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-9494\" 
class=\"wp-caption-text\">Scatter Plot of Binary Classification Dataset With 1 to 100 Class Imbalance<\/p>\n<\/div>\n<p>Next, we can fit a standard decision tree model on the dataset.<\/p>\n<p>A decision tree can be defined using the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.tree.DecisionTreeClassifier.html\">DecisionTreeClassifier<\/a> class in the scikit-learn library.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define model\r\nmodel = DecisionTreeClassifier()<\/pre>\n<p>We will use repeated cross-validation to evaluate the model, with three repeats of <a href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\">10-fold cross-validation<\/a>. The model performance will be reported using the mean ROC area under the curve (ROC AUC) averaged over repeats and all folds.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define evaluation procedure\r\ncv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\r\n# evaluate model\r\nscores = cross_val_score(model, X, y, scoring='roc_auc', cv=cv, n_jobs=-1)\r\n# summarize performance\r\nprint('Mean ROC AUC: %.3f' % mean(scores))<\/pre>\n<p>Tying this together, the complete example of defining and evaluating a standard decision tree model on the imbalanced classification problem is listed below.<\/p>\n<p>Decision trees are an effective model for binary classification tasks, although by default, they are not effective at imbalanced classification.<\/p>\n<pre class=\"crayon-plain-tag\"># fit a decision tree on an imbalanced classification dataset\r\nfrom numpy import mean\r\nfrom sklearn.datasets import make_classification\r\nfrom sklearn.model_selection import cross_val_score\r\nfrom sklearn.model_selection import RepeatedStratifiedKFold\r\nfrom sklearn.tree import DecisionTreeClassifier\r\n# generate dataset\r\nX, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,\r\n\tn_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=3)\r\n# define 
model\r\nmodel = DecisionTreeClassifier()\r\n# define evaluation procedure\r\ncv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\r\n# evaluate model\r\nscores = cross_val_score(model, X, y, scoring='roc_auc', cv=cv, n_jobs=-1)\r\n# summarize performance\r\nprint('Mean ROC AUC: %.3f' % mean(scores))<\/pre>\n<p>Running the example evaluates the standard decision tree model on the imbalanced dataset and reports the mean ROC AUC.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm. Try running the example a few times.<\/p>\n<p>We can see that the model has skill, achieving a ROC AUC above 0.5; in this case, a mean score of 0.746.<\/p>\n<pre class=\"crayon-plain-tag\">Mean ROC AUC: 0.746<\/pre>\n<p>This provides a baseline for comparison for any modifications performed to the standard decision tree algorithm.<\/p>\n<h2>Decision Trees for Imbalanced Classification<\/h2>\n<p>The decision tree algorithm is also known as <a href=\"https:\/\/machinelearningmastery.com\/classification-and-regression-trees-for-machine-learning\/\">Classification and Regression Trees<\/a> (CART) and involves growing a tree to classify examples from the training dataset.<\/p>\n<p>The tree can be thought of as dividing the training dataset, where examples progress down the decision points of the tree to arrive in the leaves of the tree and are assigned a class label.<\/p>\n<p>The tree is constructed by splitting the training dataset using values for variables in the dataset. At each point, the split in the data that results in the purest (least mixed) groups of examples is chosen in a greedy manner.<\/p>\n<p>Here, purity means a clean separation of examples into groups where a group of examples of all 0 or all 1 class is the purest, and a 50-50 mixture of both classes is the least pure. Purity is most commonly calculated using Gini impurity, although it can also be calculated using <a href=\"https:\/\/machinelearningmastery.com\/information-gain-and-mutual-information\/\">entropy<\/a>.<\/p>\n<p>The calculation of a purity measure involves calculating the probability of an example of a given class being misclassified by a split. 
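<\/p>\n<p>For example, the Gini impurity of a single group of examples can be calculated directly from the class counts. The sketch below is illustrative only (the <em>gini_impurity<\/em> function is our own, not part of scikit-learn): a pure group scores 0.0, and a 50-50 mixture scores 0.5, the least pure result for two classes.<\/p>\n<pre class=\"crayon-plain-tag\"># illustrative sketch: gini impurity of one group from its class counts\r\ndef gini_impurity(class_counts):\r\n\t# total number of examples in the group\r\n\ttotal = sum(class_counts)\r\n\t# one minus the sum of squared class probabilities\r\n\treturn 1.0 - sum((count / total) ** 2 for count in class_counts)\r\n\r\n# a pure group (all one class) is scored 0.0\r\nprint(gini_impurity([10, 0]))\r\n# a 50-50 mixture is the least pure, scored 0.5\r\nprint(gini_impurity([5, 5]))<\/pre>\n<p>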
Calculating these probabilities involves summing the number of examples in each class within each group.<\/p>\n<p>The splitting criterion can be updated not only to take the purity of the split into account, but also to be weighted by the importance of each class.<\/p>\n<blockquote>\n<p>Our intuition for cost-sensitive tree induction is to modify the weight of an instance proportional to the cost of misclassifying the class to which the instance belonged &hellip;<\/p>\n<\/blockquote>\n<p>&mdash;<a href=\"https:\/\/ieeexplore.ieee.org\/document\/1000348\">An Instance-weighting Method To Induce Cost-sensitive Trees<\/a>, 2002.<\/p>\n<p>This can be achieved by replacing the count of examples from each class in a group with a weighted sum, where each count is multiplied by the corresponding class weight.<\/p>\n<p>A larger weight is assigned to a class with more importance, and a smaller weight is assigned to a class with less importance.<\/p>\n<ul>\n<li><strong>Small Weight<\/strong>: Less importance, lower impact on node purity.<\/li>\n<li><strong>Large Weight<\/strong>: More importance, higher impact on node purity.<\/li>\n<\/ul>\n<p>A small weight can be assigned to the majority class, reducing its influence on the purity score, meaning that a group dominated by majority class examples but containing some minority class examples no longer appears almost pure. 
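<\/p>\n<p>The effect of this weighted sum can be seen in a small sketch (illustrative only; the <em>weighted_gini_impurity<\/em> function is our own, not the scikit-learn internals). A group of 99 majority and 1 minority examples looks almost pure under equal weights, but scores as highly impure when the majority class is given a weight of 0.01.<\/p>\n<pre class=\"crayon-plain-tag\"># illustrative sketch: gini impurity using class-weighted counts\r\ndef weighted_gini_impurity(class_counts, class_weights):\r\n\t# replace each raw class count with a weighted count\r\n\tweighted = [count * weight for count, weight in zip(class_counts, class_weights)]\r\n\ttotal = sum(weighted)\r\n\t# one minus the sum of squared weighted class proportions\r\n\treturn 1.0 - sum((w / total) ** 2 for w in weighted)\r\n\r\n# 99 examples of class 0 (majority) and 1 example of class 1 (minority)\r\n# equal weights: the group looks almost pure (about 0.02)\r\nprint(weighted_gini_impurity([99, 1], [1.0, 1.0]))\r\n# small majority weight: the same group now looks very impure (about 0.5)\r\nprint(weighted_gini_impurity([99, 1], [0.01, 1.0]))<\/pre>\n<p>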
In turn, this may allow more examples from the majority class to be misclassified in favor of the minority class, better accommodating the examples in the minority class.<\/p>\n<blockquote>\n<p>Higher weights [are] assigned to instances coming from the class with a higher value of misclassification cost.<\/p>\n<\/blockquote>\n<p>&mdash; Page 71, <a href=\"https:\/\/amzn.to\/307Xlva\">Learning from Imbalanced Data Sets<\/a>, 2018.<\/p>\n<p>As such, this modification of the decision tree algorithm is referred to as a weighted decision tree, a class-weighted decision tree, or a cost-sensitive decision tree.<\/p>\n<p>Modification of the split point calculation is the most common approach, although there has been a lot of research into a range of other modifications of the decision tree construction algorithm to better accommodate a class imbalance.<\/p>\n<h2>Weighted Decision Trees With Scikit-Learn<\/h2>\n<p>The scikit-learn Python machine learning library provides an implementation of the decision tree algorithm that supports class weighting.<\/p>\n<p>The <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.tree.DecisionTreeClassifier.html\">DecisionTreeClassifier class<\/a> provides the <em>class_weight<\/em> argument that can be specified as a model hyperparameter. The <em>class_weight<\/em> is a dictionary that defines each class label (e.g. 
0 and 1) and the weighting to apply in the calculation of group purity for splits in the decision tree when fitting the model.<\/p>\n<p>For example, a 1 to 1 weighting for each class 0 and 1 can be defined as follows:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define model\r\nweights = {0:1.0, 1:1.0}\r\nmodel = DecisionTreeClassifier(class_weight=weights)<\/pre>\n<p>The class weighting can be defined in multiple ways; for example:<\/p>\n<ul>\n<li><strong>Domain expertise<\/strong>, determined by talking to subject matter experts.<\/li>\n<li><strong>Tuning<\/strong>, determined by a hyperparameter search such as a grid search.<\/li>\n<li><strong>Heuristic<\/strong>, specified using a general best practice.<\/li>\n<\/ul>\n<p>A best practice for using the class weighting is to use the inverse of the class distribution present in the training dataset.<\/p>\n<p>For example, the class distribution of the training dataset is a 1:100 ratio for the minority class to the majority class. The inverse of this ratio could be used with 1 for the majority class and 100 for the minority class.<\/p>\n<p>For example:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define model\r\nweights = {0:1.0, 1:100.0}\r\nmodel = DecisionTreeClassifier(class_weight=weights)<\/pre>\n<p>We might also define the same ratio using fractions and achieve the same result.<\/p>\n<p>For example:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define model\r\nweights = {0:0.01, 1:1.0}\r\nmodel = DecisionTreeClassifier(class_weight=weights)<\/pre>\n<p>This heuristic is available directly by setting the <em>class_weight<\/em> to &lsquo;<em>balanced<\/em>.&rsquo;<\/p>\n<p>For example:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define model\r\nmodel = DecisionTreeClassifier(class_weight='balanced')<\/pre>\n<p>We can evaluate the decision tree algorithm with a class weighting using the same evaluation procedure defined in the previous section.<\/p>\n<p>We would expect the class-weighted version of the decision tree 
to perform better than the standard version of the decision tree without any class weighting.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># decision tree with class weight on an imbalanced classification dataset\r\nfrom numpy import mean\r\nfrom sklearn.datasets import make_classification\r\nfrom sklearn.model_selection import cross_val_score\r\nfrom sklearn.model_selection import RepeatedStratifiedKFold\r\nfrom sklearn.tree import DecisionTreeClassifier\r\n# generate dataset\r\nX, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,\r\n\tn_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=3)\r\n# define model\r\nmodel = DecisionTreeClassifier(class_weight='balanced')\r\n# define evaluation procedure\r\ncv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\r\n# evaluate model\r\nscores = cross_val_score(model, X, y, scoring='roc_auc', cv=cv, n_jobs=-1)\r\n# summarize performance\r\nprint('Mean ROC AUC: %.3f' % mean(scores))<\/pre>\n<p>Running the example prepares the synthetic imbalanced classification dataset, then evaluates the class-weighted version of the decision tree algorithm using repeated cross-validation.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm. 
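<\/p>\n<p>As an aside, the per-class weights implied by the &lsquo;<em>balanced<\/em>&rsquo; mode can be inspected with the compute_class_weight() utility function (listed under APIs in the further reading). Each class weight is the number of samples divided by the number of classes times the class count, giving roughly 0.505 for the majority class and 50.0 for the minority class on this dataset:<\/p>\n<pre class=\"crayon-plain-tag\"># inspect the class weights implied by the 'balanced' heuristic\r\nfrom numpy import unique\r\nfrom sklearn.datasets import make_classification\r\nfrom sklearn.utils.class_weight import compute_class_weight\r\n# generate the same dataset\r\nX, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,\r\n\tn_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=3)\r\n# balanced weight per class: n_samples / (n_classes * class_count)\r\nweights = compute_class_weight(class_weight='balanced', classes=unique(y), y=y)\r\nprint(weights)<\/pre>\n<p>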
Try running the example a few times.<\/p>\n<p>The mean ROC AUC score is reported, in this case, showing a better score than the unweighted version of the decision tree algorithm: 0.759 as compared to 0.746.<\/p>\n<pre class=\"crayon-plain-tag\">Mean ROC AUC: 0.759<\/pre>\n<h2>Grid Search Weighted Decision Trees<\/h2>\n<p>Using a class weighting that is the inverse ratio of the training data is just a heuristic.<\/p>\n<p>It is possible that better performance can be achieved with a different class weighting, and this too will depend on the choice of performance metric used to evaluate the model.<\/p>\n<p>In this section, we will grid search a range of different class weightings for the weighted decision tree and discover which results in the best ROC AUC score.<\/p>\n<p>We will try the following weightings for class 0 and 1:<\/p>\n<ul>\n<li>Class 0:100, Class 1:1.<\/li>\n<li>Class 0:10, Class 1:1.<\/li>\n<li>Class 0:1, Class 1:1.<\/li>\n<li>Class 0:1, Class 1:10.<\/li>\n<li>Class 0:1, Class 1:100.<\/li>\n<\/ul>\n<p>These can be defined as grid search parameters for the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.model_selection.GridSearchCV.html\">GridSearchCV<\/a> class as follows:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define grid\r\nbalance = [{0:100,1:1}, {0:10,1:1}, {0:1,1:1}, {0:1,1:10}, {0:1,1:100}]\r\nparam_grid = dict(class_weight=balance)<\/pre>\n<p>We can perform the grid search on these parameters using repeated cross-validation and estimate model performance using ROC AUC:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# define evaluation procedure\r\ncv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\r\n# define grid search\r\ngrid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=cv, scoring='roc_auc')<\/pre>\n<p>Once executed, we can summarize the best configuration as well as all of the results as follows:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# report the best 
configuration\r\nprint(\"Best: %f using %s\" % (grid_result.best_score_, grid_result.best_params_))\r\n# report all configurations\r\nmeans = grid_result.cv_results_['mean_test_score']\r\nstds = grid_result.cv_results_['std_test_score']\r\nparams = grid_result.cv_results_['params']\r\nfor mean, stdev, param in zip(means, stds, params):\r\n    print(\"%f (%f) with: %r\" % (mean, stdev, param))<\/pre>\n<p>Tying this together, the example below grid searches five different class weights for the decision tree algorithm on the imbalanced dataset.<\/p>\n<p>We might expect that the heuristic class weighting is the best-performing configuration.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search class weights with decision tree for imbalanced classification\r\nfrom numpy import mean\r\nfrom sklearn.datasets import make_classification\r\nfrom sklearn.model_selection import GridSearchCV\r\nfrom sklearn.model_selection import RepeatedStratifiedKFold\r\nfrom sklearn.tree import DecisionTreeClassifier\r\n# generate dataset\r\nX, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,\r\n\tn_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=3)\r\n# define model\r\nmodel = DecisionTreeClassifier()\r\n# define grid\r\nbalance = [{0:100,1:1}, {0:10,1:1}, {0:1,1:1}, {0:1,1:10}, {0:1,1:100}]\r\nparam_grid = dict(class_weight=balance)\r\n# define evaluation procedure\r\ncv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\r\n# define grid search\r\ngrid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=cv, scoring='roc_auc')\r\n# execute the grid search\r\ngrid_result = grid.fit(X, y)\r\n# report the best configuration\r\nprint(\"Best: %f using %s\" % (grid_result.best_score_, grid_result.best_params_))\r\n# report all configurations\r\nmeans = grid_result.cv_results_['mean_test_score']\r\nstds = grid_result.cv_results_['std_test_score']\r\nparams = grid_result.cv_results_['params']\r\nfor mean, stdev, param in zip(means, 
stds, params):\r\n    print(\"%f (%f) with: %r\" % (mean, stdev, param))<\/pre>\n<p>Running the example evaluates each class weighting using repeated k-fold cross-validation and reports the best configuration and the associated mean ROC AUC score.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm. Try running the example a few times.<\/p>\n<p>In this case, we can see that the 1:100 majority-to-minority class weighting achieved the best mean ROC AUC score. This matches the configuration for the general heuristic.<\/p>\n<p>It might be interesting to explore even more severe class weightings to see their effect on the mean ROC AUC score.<\/p>\n<pre class=\"crayon-plain-tag\">Best: 0.752643 using {'class_weight': {0: 1, 1: 100}}\r\n0.737306 (0.080007) with: {'class_weight': {0: 100, 1: 1}}\r\n0.747306 (0.075298) with: {'class_weight': {0: 10, 1: 1}}\r\n0.740606 (0.074948) with: {'class_weight': {0: 1, 1: 1}}\r\n0.747407 (0.068104) with: {'class_weight': {0: 1, 1: 10}}\r\n0.752643 (0.073195) with: {'class_weight': {0: 1, 1: 100}}<\/pre>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Papers<\/h3>\n<ul>\n<li><a href=\"https:\/\/ieeexplore.ieee.org\/document\/1000348\">An Instance-weighting Method To Induce Cost-sensitive Trees<\/a>, 2002.<\/li>\n<\/ul>\n<h3>Books<\/h3>\n<ul>\n<li><a href=\"https:\/\/amzn.to\/307Xlva\">Learning from Imbalanced Data Sets<\/a>, 2018.<\/li>\n<li><a href=\"https:\/\/amzn.to\/32K9K6d\">Imbalanced Learning: Foundations, Algorithms, and Applications<\/a>, 2013.<\/li>\n<\/ul>\n<h3>APIs<\/h3>\n<ul>\n<li><a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.utils.class_weight.compute_class_weight.html\">sklearn.utils.class_weight.compute_class_weight API<\/a>.<\/li>\n<li><a 
href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.tree.DecisionTreeClassifier.html\">sklearn.tree.DecisionTreeClassifier API<\/a>.<\/li>\n<li><a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.model_selection.GridSearchCV.html\">sklearn.model_selection.GridSearchCV API<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered the weighted decision tree for imbalanced classification.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How the standard decision tree algorithm does not support imbalanced classification.<\/li>\n<li>How the decision tree algorithm can be modified to weight model error by class weight when selecting splits.<\/li>\n<li>How to configure class weight for the decision tree algorithm and how to grid search different class weight configurations.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/cost-sensitive-decision-trees-for-imbalanced-classification\/\">Cost-Sensitive Decision Trees for Imbalanced Classification<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/cost-sensitive-decision-trees-for-imbalanced-classification\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee The decision tree algorithm is effective for balanced classification, although it does not perform well on imbalanced datasets. 
The split points of [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2020\/01\/28\/cost-sensitive-decision-trees-for-imbalanced-classification\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":3074,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/3073"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=3073"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/3073\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/3074"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=3073"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=3073"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=3073"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}