{"id":4387,"date":"2021-02-11T18:00:23","date_gmt":"2021-02-11T18:00:23","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2021\/02\/11\/how-to-develop-a-neural-net-for-predicting-disturbances-in-the-ionosphere\/"},"modified":"2021-02-11T18:00:23","modified_gmt":"2021-02-11T18:00:23","slug":"how-to-develop-a-neural-net-for-predicting-disturbances-in-the-ionosphere","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2021\/02\/11\/how-to-develop-a-neural-net-for-predicting-disturbances-in-the-ionosphere\/","title":{"rendered":"How to Develop a Neural Net for Predicting Disturbances in the Ionosphere"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>It can be challenging to develop a neural network predictive model for a new dataset.<\/p>\n<p>One approach is to first inspect the dataset and develop ideas for what models might work, then explore the learning dynamics of simple models on the dataset, then finally develop and tune a model for the dataset with a robust test harness.<\/p>\n<p>This process can be used to develop effective neural network models for classification and regression predictive modeling problems.<\/p>\n<p>In this tutorial, you will discover how to develop a Multilayer Perceptron neural network model for the ionosphere binary classification dataset.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to load and summarize the ionosphere dataset and use the results to suggest data preparations and model configurations to use.<\/li>\n<li>How to explore the learning dynamics of simple MLP models on the dataset.<\/li>\n<li>How to develop robust estimates of model performance, tune model performance, and make predictions on new data.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_11898\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11898\" loading=\"lazy\" class=\"size-full 
wp-image-11898\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/03\/How-to-Develop-a-Neural-Net-for-Predicting-Disturbances-in-the-Ionosphere.jpg\" alt=\"How to Develop a Neural Net for Predicting Disturbances in the Ionosphere\" width=\"800\" height=\"534\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2021\/03\/How-to-Develop-a-Neural-Net-for-Predicting-Disturbances-in-the-Ionosphere.jpg 800w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2021\/03\/How-to-Develop-a-Neural-Net-for-Predicting-Disturbances-in-the-Ionosphere-300x200.jpg 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2021\/03\/How-to-Develop-a-Neural-Net-for-Predicting-Disturbances-in-the-Ionosphere-768x513.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\"><\/p>\n<p id=\"caption-attachment-11898\" class=\"wp-caption-text\">How to Develop a Neural Net for Predicting Disturbances in the Ionosphere<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/pesterev\/15779556605\/\">Sergey Pesterev<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into four parts; they are:<\/p>\n<ol>\n<li>Ionosphere Binary Classification Dataset<\/li>\n<li>Neural Network Learning Dynamics<\/li>\n<li>Evaluating and Tuning MLP Models<\/li>\n<li>Final Model and Make Predictions<\/li>\n<\/ol>\n<h2>Ionosphere Binary Classification Dataset<\/h2>\n<p>The first step is to define and explore the dataset.<\/p>\n<p>We will be working with the \u201c<em>Ionosphere<\/em>\u201d standard binary classification dataset.<\/p>\n<p>This dataset involves predicting whether a structure is in the atmosphere or not given radar returns.<\/p>\n<p>You can learn more about the dataset here:<\/p>\n<ul>\n<li><a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv\">Ionosphere Dataset 
(ionosphere.csv)<\/a><\/li>\n<li><a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.names\">Ionosphere Dataset Details (ionosphere.names)<\/a><\/li>\n<\/ul>\n<p>You can see the first few rows of the dataset below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300,g\r\n1,0,1,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1,-0.04549,0.50874,-0.67743,0.34432,-0.69707,-0.51685,-0.97515,0.05499,-0.62237,0.33109,-1,-0.13151,-0.45300,-0.18056,-0.35734,-0.20332,-0.26569,-0.20468,-0.18401,-0.19040,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447,b\r\n1,0,1,-0.03365,1,0.00485,1,-0.12062,0.88965,0.01198,0.73082,0.05346,0.85443,0.00827,0.54591,0.00299,0.83775,-0.13644,0.75535,-0.08540,0.70887,-0.27502,0.43385,-0.12062,0.57528,-0.40220,0.58984,-0.22145,0.43100,-0.17365,0.60436,-0.24180,0.56045,-0.38238,g\r\n1,0,1,-0.45161,1,1,0.71216,-1,0,0,0,0,0,0,-1,0.14516,0.54094,-0.39330,-1,-0.54467,-0.69975,1,0,0,1,0.90695,0.51613,1,1,-0.20099,0.25682,1,-0.32382,1,b\r\n1,0,1,-0.02401,0.94140,0.06531,0.92106,-0.23255,0.77152,-0.16399,0.52798,-0.20275,0.56409,-0.00712,0.34395,-0.27457,0.52940,-0.21780,0.45107,-0.17813,0.05982,-0.35575,0.02309,-0.52879,0.03286,-0.65158,0.13290,-0.53206,0.02431,-0.62197,-0.05707,-0.59573,-0.04608,-0.65697,g\r\n...<\/pre>\n<p>We can see that the values are all numeric and perhaps in the range [-1, 1]. 
This suggests some type of scaling would probably not be needed.<\/p>\n<p>We can also see that the label is a string (\u201c<em>g<\/em>\u201d and \u201c<em>b<\/em>\u201c), suggesting that the values will need to be encoded to 0 and 1 prior to fitting a model.<\/p>\n<p>We can load the dataset as a pandas DataFrame directly from the URL; for example:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># load the ionosphere dataset and summarize the shape\r\nfrom pandas import read_csv\r\n# define the location of the dataset\r\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\n# load the dataset\r\ndf = read_csv(url, header=None)\r\n# summarize shape\r\nprint(df.shape)<\/pre>\n<p>Running the example loads the dataset directly from the URL and reports the shape of the dataset.<\/p>\n<p>In this case, we can see that the dataset has 35 variables (34 input and one output) and that the dataset has 351 rows of data.<\/p>\n<p>This is not many rows of data for a neural network and suggests that a small network, perhaps with regularization, would be appropriate.<\/p>\n<p>It also suggests that using <a href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\">k-fold cross-validation<\/a> would be a good idea given that it will give a more reliable estimate of model performance than a train\/test split and because a single model will fit in seconds instead of hours or days with the largest datasets.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(351, 35)<\/pre>\n<p>Next, we can learn more about the dataset by looking at summary statistics and a plot of the data.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># show summary statistics and plots of the ionosphere dataset\r\nfrom pandas import read_csv\r\nfrom matplotlib import pyplot\r\n# define the location of the dataset\r\nurl = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\n# load the dataset\r\ndf = 
read_csv(url, header=None)\r\n# show summary statistics\r\nprint(df.describe())\r\n# plot histograms\r\ndf.hist()\r\npyplot.show()<\/pre>\n<p>Running the example first loads the data and then prints summary statistics for each variable.<\/p>\n<p>We can see that the mean value of each variable is less than 1 in magnitude, and that values range from -1 to 1. This confirms that scaling the data is probably not required.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">0      1           2   ...          31          32          33\r\ncount  351.000000  351.0  351.000000  ...  351.000000  351.000000  351.000000\r\nmean     0.891738    0.0    0.641342  ...   -0.003794    0.349364    0.014480\r\nstd      0.311155    0.0    0.497708  ...    0.513574    0.522663    0.468337\r\nmin      0.000000    0.0   -1.000000  ...   -1.000000   -1.000000   -1.000000\r\n25%      1.000000    0.0    0.472135  ...   -0.242595    0.000000   -0.165350\r\n50%      1.000000    0.0    0.871110  ...    0.000000    0.409560    0.000000\r\n75%      1.000000    0.0    1.000000  ...    0.200120    0.813765    0.171660\r\nmax      1.000000    0.0    1.000000  ...    
1.000000    1.000000    1.000000<\/pre>\n<p>A histogram plot is then created for each variable.<\/p>\n<p>We can see that many variables have a Gaussian or Gaussian-like distribution.<\/p>\n<p>We may have some benefit in using a <a href=\"https:\/\/machinelearningmastery.com\/power-transforms-with-scikit-learn\/\">power transform<\/a> on each variable in order to make the probability distribution less skewed which will likely improve model performance.<\/p>\n<div id=\"attachment_11894\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11894\" loading=\"lazy\" class=\"size-full wp-image-11894\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Histograms-of-the-Ionosphere-Classification-Dataset.png\" alt=\"Histograms of the Ionosphere Classification Dataset\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Histograms-of-the-Ionosphere-Classification-Dataset.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Histograms-of-the-Ionosphere-Classification-Dataset-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Histograms-of-the-Ionosphere-Classification-Dataset-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Histograms-of-the-Ionosphere-Classification-Dataset-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11894\" class=\"wp-caption-text\">Histograms of the Ionosphere Classification Dataset<\/p>\n<\/div>\n<p>Now that we are familiar with the dataset, let\u2019s explore how we might develop a neural network model.<\/p>\n<h2>Neural Network Learning Dynamics<\/h2>\n<p>We will develop a Multilayer Perceptron (MLP) model for the dataset using TensorFlow.<\/p>\n<p>We cannot 
know what model architecture or learning hyperparameters would be good or best for this dataset, so we must experiment and discover what works well.<\/p>\n<p>Given that the dataset is small, a small <a href=\"https:\/\/machinelearningmastery.com\/how-to-control-the-speed-and-stability-of-training-neural-networks-with-gradient-descent-batch-size\/\">batch size<\/a> is probably a good idea, e.g. 16 or 32 rows. Using the Adam version of stochastic gradient descent is a good idea when getting started as it will automatically adapt the <a href=\"https:\/\/machinelearningmastery.com\/understand-the-dynamics-of-learning-rate-on-deep-learning-neural-networks\/\">learning rate<\/a> and works well on most datasets.<\/p>\n<p>Before we evaluate models in earnest, it is a good idea to review the learning dynamics and tune the model architecture and learning configuration until we have stable learning dynamics, then look at getting the most out of the model.<\/p>\n<p>We can do this by using a simple <a href=\"https:\/\/machinelearningmastery.com\/train-test-split-for-evaluating-machine-learning-algorithms\/\">train\/test split<\/a> of the data and reviewing plots of the <a href=\"https:\/\/machinelearningmastery.com\/learning-curves-for-diagnosing-machine-learning-model-performance\/\">learning curves<\/a>. 
This will help us see if we are over-learning or under-learning; then we can adapt the configuration accordingly.<\/p>\n<p>First, we must ensure all input variables are floating-point values and encode the target label as integer values 0 and 1.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)<\/pre>\n<p>Next, we can split the dataset into input and output variables, then into 67\/33 train and test sets.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# split into train and test datasets\r\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)<\/pre>\n<p>We can define a minimal MLP model. In this case, we will use one hidden layer with 10 nodes and one output layer (chosen arbitrarily). We will use the <a href=\"https:\/\/machinelearningmastery.com\/how-to-fix-vanishing-gradients-using-the-rectified-linear-activation-function\/\">ReLU activation function<\/a> in the hidden layer and the \u201c<em>he_normal<\/em>\u201d weight initialization, as together, they are a good practice.<\/p>\n<p>The output of the model is a sigmoid activation for binary classification and we will minimize <a href=\"https:\/\/machinelearningmastery.com\/cross-entropy-for-machine-learning\/\">binary cross-entropy loss<\/a>.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# determine the number of input features\r\nn_features = X.shape[1]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dense(1, activation='sigmoid'))\r\n# compile the model\r\nmodel.compile(optimizer='adam', loss='binary_crossentropy')<\/pre>\n<p>We will fit the model for 200 training epochs (chosen arbitrarily) with a batch 
size of 32 because it is a small dataset.<\/p>\n<p>We are fitting the model on the raw data. We think this might be a good idea, and either way it is an important starting point.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# fit the model\r\nhistory = model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0, validation_data=(X_test,y_test))<\/pre>\n<p>At the end of training, we will evaluate the model\u2019s performance on the test dataset and report performance as the classification accuracy.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# predict test set: threshold the sigmoid output at 0.5\r\nyhat = (model.predict(X_test) &gt; 0.5).astype('int32')\r\n# evaluate predictions\r\nscore = accuracy_score(y_test, yhat)\r\nprint('Accuracy: %.3f' % score)<\/pre>\n<p>Finally, we will plot learning curves of the cross-entropy loss on the train and test sets during training.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# plot learning curves\r\npyplot.title('Learning Curves')\r\npyplot.xlabel('Epoch')\r\npyplot.ylabel('Cross Entropy')\r\npyplot.plot(history.history['loss'], label='train')\r\npyplot.plot(history.history['val_loss'], label='val')\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Tying this all together, the complete example of evaluating our first MLP on the ionosphere dataset is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># fit a simple mlp model on the ionosphere and review learning curves\r\nfrom pandas import read_csv\r\nfrom sklearn.model_selection import train_test_split\r\nfrom sklearn.preprocessing import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure 
all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)\r\n# split into train and test datasets\r\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)\r\n# determine the number of input features\r\nn_features = X.shape[1]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dense(1, activation='sigmoid'))\r\n# compile the model\r\nmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n# fit the model\r\nhistory = model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0, validation_data=(X_test,y_test))\r\n# predict test set: threshold the sigmoid output at 0.5\r\nyhat = (model.predict(X_test) &gt; 0.5).astype('int32')\r\n# evaluate predictions\r\nscore = accuracy_score(y_test, yhat)\r\nprint('Accuracy: %.3f' % score)\r\n# plot learning curves\r\npyplot.title('Learning Curves')\r\npyplot.xlabel('Epoch')\r\npyplot.ylabel('Cross Entropy')\r\npyplot.plot(history.history['loss'], label='train')\r\npyplot.plot(history.history['val_loss'], label='val')\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Running the example first fits the model on the training dataset, then reports the classification accuracy on the test dataset.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. 
Consider running the example a few times and compare the average outcome.<\/p>\n<p>In this case, we can see that the model achieved an accuracy of about 88 percent, which is a good baseline in performance that we might be able to improve upon.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Accuracy: 0.888<\/pre>\n<p>Line plots of the loss on the train and test sets are then created.<\/p>\n<p>We can see that the model appears to converge but has overfit the training dataset.<\/p>\n<div id=\"attachment_11895\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11895\" loading=\"lazy\" class=\"size-full wp-image-11895\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Simple-MLP-on-Ionosphere-Dataset.png\" alt=\"Learning Curves of Simple MLP on Ionosphere Dataset\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Simple-MLP-on-Ionosphere-Dataset.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Simple-MLP-on-Ionosphere-Dataset-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Simple-MLP-on-Ionosphere-Dataset-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Simple-MLP-on-Ionosphere-Dataset-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11895\" class=\"wp-caption-text\">Learning Curves of Simple MLP on Ionosphere Dataset<\/p>\n<\/div>\n<p>Let\u2019s try increasing the capacity of the model.<\/p>\n<p>This will slow down learning for the same learning hyperparameters and may offer better accuracy.<\/p>\n<p>We will add a second hidden layer with eight nodes, chosen 
arbitrarily.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dense(8, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dense(1, activation='sigmoid'))<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># fit a deeper mlp model on the ionosphere and review learning curves\r\nfrom pandas import read_csv\r\nfrom sklearn.model_selection import train_test_split\r\nfrom sklearn.preprocessing import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)\r\n# split into train and test datasets\r\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)\r\n# determine the number of input features\r\nn_features = X.shape[1]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dense(8, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dense(1, activation='sigmoid'))\r\n# compile the model\r\nmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n# fit the model\r\nhistory = model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0, validation_data=(X_test,y_test))\r\n# predict test set: threshold the sigmoid output at 0.5\r\nyhat = (model.predict(X_test) &gt; 0.5).astype('int32')\r\n# evaluate predictions\r\nscore = accuracy_score(y_test, 
yhat)\r\nprint('Accuracy: %.3f' % score)\r\n# plot learning curves\r\npyplot.title('Learning Curves')\r\npyplot.xlabel('Epoch')\r\npyplot.ylabel('Cross Entropy')\r\npyplot.plot(history.history['loss'], label='train')\r\npyplot.plot(history.history['val_loss'], label='val')\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Running the example first fits the model on the training dataset, then reports the accuracy on the test dataset.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.<\/p>\n<p>In this case, we can see a slight improvement in accuracy to about 93 percent, although the high variance of the train\/test split means that this evaluation is not reliable.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Accuracy: 0.931<\/pre>\n<p>Learning curves for the loss on the train and test sets are then plotted. 
We can see that the model still appears to show an overfitting behavior.<\/p>\n<div id=\"attachment_11896\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11896\" loading=\"lazy\" class=\"size-full wp-image-11896\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Deeper-MLP-on-the-Ionosphere-Dataset.png\" alt=\"Learning Curves of Deeper MLP on the Ionosphere Dataset\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Deeper-MLP-on-the-Ionosphere-Dataset.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Deeper-MLP-on-the-Ionosphere-Dataset-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Deeper-MLP-on-the-Ionosphere-Dataset-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Deeper-MLP-on-the-Ionosphere-Dataset-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11896\" class=\"wp-caption-text\">Learning Curves of Deeper MLP on the Ionosphere Dataset<\/p>\n<\/div>\n<p>Finally, we can try a wider network.<\/p>\n<p>We will increase the number of nodes in the first hidden layer from 10 to 50, and in the second hidden layer from 8 to 10.<\/p>\n<p>This will add more capacity to the model, slow down learning, and may further improve results.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dense(1, activation='sigmoid'))<\/pre>\n<p>We will 
also reduce the number of training epochs from 200 to 100.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# fit the model\r\nhistory = model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0, validation_data=(X_test,y_test))<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># fit a wider mlp model on the ionosphere and review learning curves\r\nfrom pandas import read_csv\r\nfrom sklearn.model_selection import train_test_split\r\nfrom sklearn.preprocessing import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)\r\n# split into train and test datasets\r\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)\r\n# determine the number of input features\r\nn_features = X.shape[1]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dense(1, activation='sigmoid'))\r\n# compile the model\r\nmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n# fit the model\r\nhistory = model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0, validation_data=(X_test,y_test))\r\n# predict test set: threshold the sigmoid output at 0.5\r\nyhat = (model.predict(X_test) &gt; 0.5).astype('int32')\r\n# evaluate predictions\r\nscore = accuracy_score(y_test, yhat)\r\nprint('Accuracy: %.3f' % score)\r\n# plot learning curves\r\npyplot.title('Learning 
Curves')\r\npyplot.xlabel('Epoch')\r\npyplot.ylabel('Cross Entropy')\r\npyplot.plot(history.history['loss'], label='train')\r\npyplot.plot(history.history['val_loss'], label='val')\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Running the example first fits the model on the training dataset, then reports the accuracy on the test dataset.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.<\/p>\n<p>In this case, the model achieves a better accuracy score, with a value of about 94 percent. We will ignore model performance for now.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Accuracy: 0.940<\/pre>\n<p>Line plots of the learning curves are created showing that the model achieved a reasonable fit and had more than enough time to converge.<\/p>\n<div id=\"attachment_11897\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11897\" loading=\"lazy\" class=\"size-full wp-image-11897\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Wider-MLP-on-the-Ionosphere-Dataset.png\" alt=\"Learning Curves of Wider MLP on the Ionosphere Dataset\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Wider-MLP-on-the-Ionosphere-Dataset.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Wider-MLP-on-the-Ionosphere-Dataset-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Wider-MLP-on-the-Ionosphere-Dataset-1024x768.png 1024w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Learning-Curves-of-Wider-MLP-on-the-Ionosphere-Dataset-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11897\" class=\"wp-caption-text\">Learning Curves of Wider MLP on the Ionosphere Dataset<\/p>\n<\/div>\n<p>Now that we have some idea of the learning dynamics for simple MLP models on the dataset, we can look at evaluating the performance of the models as well as tuning the configuration of the models.<\/p>\n<h2>Evaluating and Tuning MLP Models<\/h2>\n<p>The k-fold cross-validation procedure can provide a more reliable estimate of MLP performance, although it can be very slow.<\/p>\n<p>This is because <em>k<\/em> models must be fit and evaluated. This is not a problem when the dataset size is small, such as the ionosphere dataset.<\/p>\n<p>We can use the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.model_selection.StratifiedKFold.html\">StratifiedKFold<\/a> class and enumerate each fold manually, fit the model, evaluate it, and then report the mean of the evaluation scores at the end of the procedure.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># prepare cross validation\r\nkfold = StratifiedKFold(10)\r\n# enumerate splits\r\nscores = list()\r\nfor train_ix, test_ix in kfold.split(X, y):\r\n\t# fit and evaluate the model...\r\n\t...\r\n...\r\n# summarize all scores\r\nprint('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))<\/pre>\n<p>We can use this framework to develop a reliable estimate of MLP model performance with a range of different data preparations, model architectures, and learning configurations.<\/p>\n<p>It is important that we first developed an understanding of the learning dynamics of the model on the dataset in the previous section before using k-fold cross-validation to estimate the performance. 
If we started to tune the model directly, we might get good results, but if not, we might have no idea why, e.g. whether the model was overfitting or underfitting.<\/p>\n<p>If we make large changes to the model again, it is a good idea to go back and confirm that the model is converging appropriately.<\/p>\n<p>The complete example of this framework to evaluate the base MLP model from the previous section is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># k-fold cross-validation of base model for the ionosphere dataset\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom pandas import read_csv\r\nfrom sklearn.model_selection import StratifiedKFold\r\nfrom sklearn.preprocessing import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)\r\n# prepare cross validation\r\nkfold = StratifiedKFold(10)\r\n# enumerate splits\r\nscores = list()\r\nfor train_ix, test_ix in kfold.split(X, y):\r\n\t# split data\r\n\tX_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]\r\n\t# determine the number of input features\r\n\tn_features = X.shape[1]\r\n\t# define model\r\n\tmodel = Sequential()\r\n\tmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\n\tmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\n\tmodel.add(Dense(1, activation='sigmoid'))\r\n\t# compile the model\r\n\tmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n\t# fit the 
model\r\n\tmodel.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)\r\n\t# predict test set (predict_classes() was removed in TensorFlow 2.6, so threshold the sigmoid output at 0.5)\r\n\tyhat = (model.predict(X_test, verbose=0) &gt; 0.5).astype('int32').ravel()\r\n\t# evaluate predictions\r\n\tscore = accuracy_score(y_test, yhat)\r\n\tprint('&gt;%.3f' % score)\r\n\tscores.append(score)\r\n# summarize all scores\r\nprint('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))<\/pre>\n<p>Running the example reports the model performance for each iteration of the evaluation procedure and reports the mean and standard deviation of classification accuracy at the end of the run.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see that the MLP model achieved a mean accuracy of about 93.4 percent.<\/p>\n<p>We will use this result as our baseline to see if we can achieve better performance.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;0.972\r\n&gt;0.886\r\n&gt;0.943\r\n&gt;0.886\r\n&gt;0.914\r\n&gt;0.943\r\n&gt;0.943\r\n&gt;1.000\r\n&gt;0.971\r\n&gt;0.886\r\nMean Accuracy: 0.934 (0.039)<\/pre>\n<p>Next, let\u2019s try adding regularization to reduce overfitting of the model.<\/p>\n<p>In this case, we can add dropout layers between the hidden layers of the network. 
For example:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dropout(0.4))\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dropout(0.4))\r\nmodel.add(Dense(1, activation='sigmoid'))<\/pre>\n<p>The complete example of the MLP model with dropout is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># k-fold cross-validation of the MLP with dropout for the ionosphere dataset\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom pandas import read_csv\r\nfrom sklearn.model_selection import StratifiedKFold\r\nfrom sklearn.preprocessing import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\nfrom tensorflow.keras.layers import Dropout\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)\r\n# prepare cross validation\r\nkfold = StratifiedKFold(10)\r\n# enumerate splits\r\nscores = list()\r\nfor train_ix, test_ix in kfold.split(X, y):\r\n\t# split data\r\n\tX_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]\r\n\t# determine the number of input features\r\n\tn_features = X.shape[1]\r\n\t# define model\r\n\tmodel = Sequential()\r\n\tmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\n\tmodel.add(Dropout(0.4))\r\n\tmodel.add(Dense(10, activation='relu', 
kernel_initializer='he_normal'))\r\n\tmodel.add(Dropout(0.4))\r\n\tmodel.add(Dense(1, activation='sigmoid'))\r\n\t# compile the model\r\n\tmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n\t# fit the model\r\n\tmodel.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)\r\n\t# predict test set (predict_classes() was removed in TensorFlow 2.6, so threshold the sigmoid output at 0.5)\r\n\tyhat = (model.predict(X_test, verbose=0) &gt; 0.5).astype('int32').ravel()\r\n\t# evaluate predictions\r\n\tscore = accuracy_score(y_test, yhat)\r\n\tprint('&gt;%.3f' % score)\r\n\tscores.append(score)\r\n# summarize all scores\r\nprint('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))<\/pre>\n<p>Running the example reports the mean and standard deviation of the classification accuracy at the end of the run.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see that the MLP model with dropout achieves better results, with an accuracy of about 94.6 percent compared to 93.4 percent without dropout.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Mean Accuracy: 0.946 (0.043)<\/pre>\n<p>Finally, we will try reducing the batch size from 32 down to 8.<\/p>\n<p>This will result in noisier gradients and may also slow the rate at which the model learns the problem.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# fit the model\r\nmodel.fit(X_train, y_train, epochs=100, batch_size=8, verbose=0)<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># k-fold cross-validation of the MLP with dropout and a smaller batch size for the ionosphere dataset\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom pandas import read_csv\r\nfrom sklearn.model_selection import StratifiedKFold\r\nfrom sklearn.preprocessing 
import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\nfrom tensorflow.keras.layers import Dropout\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\ny = LabelEncoder().fit_transform(y)\r\n# prepare cross validation\r\nkfold = StratifiedKFold(10)\r\n# enumerate splits\r\nscores = list()\r\nfor train_ix, test_ix in kfold.split(X, y):\r\n\t# split data\r\n\tX_train, X_test, y_train, y_test = X[train_ix], X[test_ix], y[train_ix], y[test_ix]\r\n\t# determine the number of input features\r\n\tn_features = X.shape[1]\r\n\t# define model\r\n\tmodel = Sequential()\r\n\tmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\n\tmodel.add(Dropout(0.4))\r\n\tmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\n\tmodel.add(Dropout(0.4))\r\n\tmodel.add(Dense(1, activation='sigmoid'))\r\n\t# compile the model\r\n\tmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n\t# fit the model\r\n\tmodel.fit(X_train, y_train, epochs=100, batch_size=8, verbose=0)\r\n\t# predict test set (predict_classes() was removed in TensorFlow 2.6, so threshold the sigmoid output at 0.5)\r\n\tyhat = (model.predict(X_test, verbose=0) &gt; 0.5).astype('int32').ravel()\r\n\t# evaluate predictions\r\n\tscore = accuracy_score(y_test, yhat)\r\n\tprint('&gt;%.3f' % score)\r\n\tscores.append(score)\r\n# summarize all scores\r\nprint('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))<\/pre>\n<p>Running the example reports the mean and standard deviation of the classification accuracy at the end of the run.<\/p>\n<p><strong>Note<\/strong>: Your <a 
href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see that the MLP model with dropout and a batch size of 8 achieves slightly better results, with an accuracy of about 94.9 percent compared to 94.6 percent with a batch size of 32, a difference that is well within one standard deviation.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Mean Accuracy: 0.949 (0.042)<\/pre>\n<p>We will use this configuration as our final model.<\/p>\n<p>We could continue to test alternate configurations to the model architecture (more or fewer nodes or layers), learning hyperparameters (larger or smaller batch sizes), and data transforms.<\/p>\n<p>I leave this as an exercise; let me know what you discover.<strong> Can you get better results?<\/strong><br \/>\nPost your results in the comments below; I\u2019d love to see what you get.<\/p>\n<p>Next, let\u2019s look at how we might fit a final model and use it to make predictions.<\/p>\n<h2>Final Model and Make Predictions<\/h2>\n<p>Once we choose a model configuration, we can train a final model on all available data and use it to make predictions on new data.<\/p>\n<p>In this case, we will use the model with dropout and a small batch size as our final model.<\/p>\n<p>We can prepare the data and fit the model as before, although on the entire dataset instead of a training subset of the dataset.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\nle = LabelEncoder()\r\ny = le.fit_transform(y)\r\n# determine the number of input features\r\nn_features = X.shape[1]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', 
input_shape=(n_features,)))\r\nmodel.add(Dropout(0.4))\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dropout(0.4))\r\nmodel.add(Dense(1, activation='sigmoid'))\r\n# compile the model\r\nmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n# fit the model on the entire dataset\r\nmodel.fit(X, y, epochs=100, batch_size=8, verbose=0)<\/pre>\n<p>We can then use this model to make predictions on new data.<\/p>\n<p>First, we can define a row of new data.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# define a row of new data\r\nrow = [1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300]<\/pre>\n<p>Note: I took this row from the first row of the dataset and the expected label is a \u2018<em>g<\/em>\u2019.<\/p>\n<p>We can then make a prediction.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# make prediction (asarray() comes from numpy; predict_classes() was removed in TensorFlow 2.6)\r\nyhat = (model.predict(asarray([row]), verbose=0) &gt; 0.5).astype('int32').ravel()<\/pre>\n<p>Then invert the transform on the prediction, so that we can use or interpret the result as the original class label.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# invert transform to get label for class\r\nyhat = le.inverse_transform(yhat)<\/pre>\n<p>And in this case, we will simply report the prediction.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# report prediction\r\nprint('Predicted: %s' % (yhat[0]))<\/pre>\n<p>Tying this all together, the complete example of fitting a final model for the ionosphere dataset and using it to make a prediction on new data is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># fit a final model and make predictions on new data for the ionosphere dataset\r\nfrom pandas import read_csv\r\nfrom sklearn.preprocessing import LabelEncoder\r\nfrom sklearn.metrics import accuracy_score\r\nfrom tensorflow.keras import Sequential\r\nfrom 
tensorflow.keras.layers import Dense\r\nfrom tensorflow.keras.layers import Dropout\r\nfrom numpy import asarray\r\n# load the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/ionosphere.csv'\r\ndf = read_csv(path, header=None)\r\n# split into input and output columns\r\nX, y = df.values[:, :-1], df.values[:, -1]\r\n# ensure all data are floating point values\r\nX = X.astype('float32')\r\n# encode strings to integer\r\nle = LabelEncoder()\r\ny = le.fit_transform(y)\r\n# determine the number of input features\r\nn_features = X.shape[1]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(50, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))\r\nmodel.add(Dropout(0.4))\r\nmodel.add(Dense(10, activation='relu', kernel_initializer='he_normal'))\r\nmodel.add(Dropout(0.4))\r\nmodel.add(Dense(1, activation='sigmoid'))\r\n# compile the model\r\nmodel.compile(optimizer='adam', loss='binary_crossentropy')\r\n# fit the model\r\nmodel.fit(X, y, epochs=100, batch_size=8, verbose=0)\r\n# define a row of new data\r\nrow = [1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300]\r\n# make prediction (predict_classes() was removed in TensorFlow 2.6, so threshold the sigmoid output at 0.5)\r\nyhat = (model.predict(asarray([row]), verbose=0) &gt; 0.5).astype('int32').ravel()\r\n# invert transform to get label for class\r\nyhat = le.inverse_transform(yhat)\r\n# report prediction\r\nprint('Predicted: %s' % (yhat[0]))<\/pre>\n<p>Running the example fits the model on the entire dataset and makes a prediction for a single row of new data.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. 
Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see that the model predicted a \u201cg\u201d label for the input row.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Predicted: g<\/pre>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Tutorials<\/h3>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/results-for-standard-classification-and-regression-machine-learning-datasets\/\">Best Results for Standard Machine Learning Datasets<\/a><\/li>\n<li><a href=\"https:\/\/machinelearningmastery.com\/tensorflow-tutorial-deep-learning-with-tf-keras\/\">TensorFlow 2 Tutorial: Get Started in Deep Learning With tf.keras<\/a><\/li>\n<li><a href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\">A Gentle Introduction to k-fold Cross-Validation<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to develop a Multilayer Perceptron neural network model for the ionosphere binary classification dataset.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to load and summarize the ionosphere dataset and use the results to suggest data preparations and model configurations to use.<\/li>\n<li>How to explore the learning dynamics of simple MLP models on the dataset.<\/li>\n<li>How to develop robust estimates of model performance, tune model performance, and make predictions on new data.<\/li>\n<\/ul>\n<p><strong>Do you have any questions?<\/strong><br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/predicting-disturbances-in-the-ionosphere\/\">How to Develop a Neural Net for Predicting Disturbances in the Ionosphere<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a 
href=\"https:\/\/machinelearningmastery.com\/predicting-disturbances-in-the-ionosphere\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee It can be challenging to develop a neural network predictive model for a new dataset. One approach is to first inspect the [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2021\/02\/11\/how-to-develop-a-neural-net-for-predicting-disturbances-in-the-ionosphere\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":4388,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4387"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=4387"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4387\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/4388"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=4387"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=4387"},{"taxonomy":"post_tag","embeddable":tr
ue,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=4387"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}