{"id":5502,"date":"2022-03-21T14:00:49","date_gmt":"2022-03-21T14:00:49","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/03\/21\/data-visualization-in-python-with-matplotlib-seaborn-and-bokeh\/"},"modified":"2022-03-21T14:00:49","modified_gmt":"2022-03-21T14:00:49","slug":"data-visualization-in-python-with-matplotlib-seaborn-and-bokeh","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/03\/21\/data-visualization-in-python-with-matplotlib-seaborn-and-bokeh\/","title":{"rendered":"Data Visualization in Python with matplotlib, Seaborn and Bokeh"},"content":{"rendered":"<p>Author: Mehreen Saeed<\/p>\n<div>\n<p>Data visualization is an important aspect of all AI and machine learning applications. You can gain key insights into your data through different graphical representations. In this tutorial, we\u2019ll talk about a few options for data visualization in Python. We\u2019ll use the MNIST dataset and the TensorFlow library for number crunching and data manipulation. To illustrate various methods for creating different types of graphs, we\u2019ll use Python\u2019s graphing libraries, namely matplotlib, Seaborn, and Bokeh.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to visualize images in matplotlib<\/li>\n<li>How to make scatter plots in matplotlib, Seaborn and Bokeh<\/li>\n<li>How to make multiline plots in matplotlib, Seaborn and Bokeh<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_13327\" style=\"width: 688px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-scaled.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-13327\" class=\"wp-image-13327 \" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-1024x768.jpg\" alt=\"Picture of Istanbul taken from airplane\" width=\"678\" height=\"509\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-1024x768.jpg 1024w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-300x225.jpg 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-768x576.jpg 768w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-1536x1152.jpg 1536w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/IMG_0570-2048x1536.jpg 2048w\" sizes=\"(max-width: 678px) 100vw, 678px\"><\/a><\/p>\n<p id=\"caption-attachment-13327\" class=\"wp-caption-text\">Data Visualization in Python With matplotlib, Seaborn and Bokeh <br \/>Photo by Mehreen Saeed, some rights reserved.<\/p>\n<\/div>\n<h2 id=\"Tutorial-Overview\">Tutorial Overview<\/h2>\n<p>This tutorial is divided into 7 parts; they are:<\/p>\n<ul>\n<li>Preparation of scatter data<\/li>\n<li>Figures in matplotlib<\/li>\n<li>Scatter plots in matplotlib and Seaborn<\/li>\n<li>Scatter plots in Bokeh<\/li>\n<li>Preparation of line plot data<\/li>\n<li>Line plots in matplotlib, Seaborn, and Bokeh<\/li>\n<li>More on visualization<\/li>\n<\/ul>\n<h2 id=\"The-Import-Section\">Preparation of scatter data<\/h2>\n<p>In this post, we will use matplotlib, Seaborn, and Bokeh. They are all external libraries that need to be installed. 
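If you are not sure whether they are already present, a quick import check such as the following minimal sketch will tell you (any missing library will raise an <code>ImportError<\/code>):<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Check that the plotting libraries can be imported, and print their versions
import matplotlib\r\nimport seaborn\r\nimport bokeh\r\nprint('matplotlib:', matplotlib.__version__)\r\nprint('seaborn:', seaborn.__version__)\r\nprint('bokeh:', bokeh.__version__)<\/pre>\n<p>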
To install them using <code>pip<\/code>, run the following command:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">pip install matplotlib seaborn bokeh<\/pre>\n<p>For demonstration purposes, we will also use the MNIST handwritten digits dataset. We will load it from TensorFlow and run the PCA algorithm on it. Hence we will also need to install TensorFlow and pandas:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">pip install tensorflow pandas<\/pre>\n<p>The code that follows assumes the following imports have been executed:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Importing from tensorflow and keras\r\nfrom tensorflow.keras.datasets import mnist\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Dense, Reshape\r\nfrom tensorflow.keras import utils\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\n# For math operations\r\nimport numpy as np\r\n# For plotting with matplotlib\r\nimport matplotlib.pyplot as plt\r\n# For plotting with seaborn\r\nimport seaborn as sns  \r\n# For plotting with bokeh\r\nfrom bokeh.plotting import figure, show\r\nfrom bokeh.models import Legend, LegendItem\r\n# For pandas dataframe\r\nimport pandas as pd<\/pre>\n<p>We load the MNIST dataset from the <code>keras.datasets<\/code> library. To keep things simple, we\u2019ll retain only the subset of data containing the first three digits. We\u2019ll also ignore the test set for now.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\n# Print the statistics\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Training data has  18623 images\r\nEach image is of size  28 x 28<\/pre>\n<\/p>\n<h2 id=\"Render-Images-Side-By-Side-Using-Matplotlib-and-Subplots\">Figures in matplotlib<\/h2>\n<p>Seaborn is an add-on to matplotlib; therefore, you need to understand how matplotlib handles plots even if you\u2019re using Seaborn.<\/p>\n<p>Matplotlib calls its canvas the figure. You can divide the figure into several sections called subplots, so you can put two visualizations side-by-side.<\/p>\n<p>As an example, let\u2019s visualize the first 16 images of our MNIST dataset using matplotlib. We\u2019ll create 2 rows and 8 columns using the <code>subplots()<\/code> function. The <code>subplots()<\/code> function will create the <strong>axes<\/strong> objects for each unit. Then we will display each image on each axes object using the <code>imshow()<\/code> method. 
Finally, the figure will be shown using the <code>show()<\/code>\u00a0function.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">img_per_row = 8\r\nfig,ax = plt.subplots(nrows=2, ncols=img_per_row,\r\n                      figsize=(18,4),\r\n                      subplot_kw=dict(xticks=[], yticks=[]))\r\nfor row in [0, 1]:\r\n    for col in range(img_per_row):\r\n        ax[row, col].imshow(x_train[row*img_per_row + col].astype('int'))   \r\nplt.show()<\/pre>\n<\/p>\n<div id=\"attachment_13319\" style=\"width: 1024px\" class=\"wp-caption alignnone\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_8_0.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13319\" loading=\"lazy\" class=\"wp-image-13319 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_8_0.png\" alt=\"First 16 images of the training dataset displayed in 2 rows and 8 columns\" width=\"1014\" height=\"235\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_8_0.png 1014w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_8_0-300x70.png 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_8_0-768x178.png 768w\" sizes=\"(max-width: 1014px) 100vw, 1014px\"><\/a><\/p>\n<p id=\"caption-attachment-13319\" class=\"wp-caption-text\">First 16 images of the training dataset displayed in 2 rows and 8 columns<\/p>\n<\/div>\n<p>Here we can see a few properties of matplotlib. There is a default figure and default axes in matplotlib. There are a number of functions defined in matplotlib under the <code>pyplot<\/code> submodule for plotting on the default axes. If we want to plot on a particular axes object, we can use the plotting functions under that axes object. The operations that manipulate a figure are procedural, meaning there is a data structure remembered internally by matplotlib, and our operations mutate it. The <code>show()<\/code> function simply displays the result of a series of operations. Because of that, we can gradually fine-tune a lot of details on the figure. In the example above, we hid the \u201cticks\u201d (i.e., the markers on axes) by setting <code>xticks<\/code> and <code>yticks<\/code> to empty lists.<\/p>\n<h2>Scatter plots in matplotlib and Seaborn<\/h2>\n<p>One of the common visualizations we use in machine learning projects is the scatter plot.<\/p>\n<p>As an example, we apply PCA to the MNIST dataset and extract the first three components of each image. In the code below, we compute the eigenvectors and eigenvalues from the dataset, then project the data of each image along the directions of the eigenvectors, and store the result in <code>x_pca<\/code>. For simplicity, we didn\u2019t normalize the data to zero mean and unit variance before computing the eigenvectors. This omission does not affect our purpose of visualization.<\/p>
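\n<p>If you did want to standardize the data first, a minimal sketch along the following lines would do it. This is plain NumPy, and the variable names are hypothetical, introduced only for illustration:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Hypothetical preprocessing sketch: zero mean and unit variance per pixel\r\nx_flat = np.reshape(x_train, (x_train.shape[0], -1)).astype('float32')\r\nx_centered = x_flat - x_flat.mean(axis=0)\r\nx_scaled = x_centered \/ (x_flat.std(axis=0) + 1e-8)  # small epsilon guards all-black pixels<\/pre>\n<p>Returning to the unnormalized version:<\/p>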
\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = tensordot(x, eigenvectors, axes=1)<\/pre>\n<p>The eigenvalues printed are as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">3 largest eigenvalues:  tf.Tensor([5.1999642e+09 1.1419439e+10 4.8231231e+10], shape=(3,), dtype=float32)<\/pre>\n<p>The array <code>x_pca<\/code> is in shape 18623 x 784. Let\u2019s consider the last two columns as the x- and y-coordinates and plot a point for each row. We can further color each point according to the digit it corresponds to.<\/p>\n<p>The following code generates a scatter plot using matplotlib. The plot is created using the axes object\u2019s <code>scatter()<\/code> function, which takes the x- and y-coordinates as the first two arguments. The <code>c<\/code> argument to the <code>scatter()<\/code> method specifies the value that determines each point\u2019s color. The <code>s<\/code> argument specifies its size. The code also creates a legend and adds a title to the plot.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">fig, ax = plt.subplots(figsize=(12, 8))\r\nscatter = ax.scatter(x_pca[:, -1], x_pca[:, -2], c=train_labels, s=5)\r\nlegend_plt = ax.legend(*scatter.legend_elements(),\r\n                       loc=\"lower left\", title=\"Digits\")\r\nax.add_artist(legend_plt)\r\nplt.title('First Two Dimensions of Projected Data After Applying PCA')\r\nplt.show()<\/pre>\n<\/p>\n<div id=\"attachment_13320\" style=\"width: 734px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_13_0.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13320\" loading=\"lazy\" class=\"wp-image-13320 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_13_0.png\" alt=\"2D scatter plot generated using Matplotlib\" width=\"724\" height=\"482\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_13_0.png 724w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_13_0-300x200.png 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_13_0-600x400.png 600w\" sizes=\"(max-width: 724px) 100vw, 724px\"><\/a><\/p>\n<p id=\"caption-attachment-13320\" class=\"wp-caption-text\">2D scatter plot generated using matplotlib<\/p>\n<\/div>\n<p>Putting the above together, the following is the complete code to generate the 2D scatter plot using matplotlib:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\nimport numpy as np\r\nimport matplotlib.pyplot as plt\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training 
data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = tensordot(x, eigenvectors, axes=1)\r\n\r\n# Create the plot\r\nfig, ax = plt.subplots(figsize=(12, 8))\r\nscatter = ax.scatter(x_pca[:, -1], x_pca[:, -2], c=train_labels, s=5)\r\nlegend_plt = ax.legend(*scatter.legend_elements(),\r\n                       loc=\"lower left\", title=\"Digits\")\r\nax.add_artist(legend_plt)\r\nplt.title('First Two Dimensions of Projected Data After Applying PCA')\r\nplt.show()<\/pre>\n<p>Matplotlib also allows a 3D scatter plot to be produced. To do so, you need to create an axes object with 3D projection first. Then the 3D scatter plot is created with the <code>scatter3D()<\/code> function, with the x-, y-, and z-coordinates as the first three arguments. The code below uses the data projected along the eigenvectors corresponding to the three largest eigenvalues. Instead of creating a legend, this code creates a colorbar.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">fig = plt.figure(figsize=(12, 8))\r\nax = plt.axes(projection='3d')\r\nplt_3d = ax.scatter3D(x_pca[:, -1], x_pca[:, -2], x_pca[:, -3], c=train_labels, s=1)\r\nplt.colorbar(plt_3d)\r\nplt.show()<\/pre>\n<\/p>\n<div id=\"attachment_13321\" style=\"width: 644px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_15_0.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13321\" loading=\"lazy\" class=\"wp-image-13321 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_15_0.png\" alt=\"3D scatter plot generated using Matplotlib\" width=\"634\" height=\"459\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_15_0.png 634w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_15_0-300x217.png 300w\" sizes=\"(max-width: 634px) 100vw, 634px\"><\/a><\/p>\n<p id=\"caption-attachment-13321\" class=\"wp-caption-text\">3D scatter plot generated using matplotlib<\/p>\n<\/div>\n<p>The <code>scatter3D()<\/code> function just puts the points onto the 3D space. Afterwards, we can still modify how the figure is displayed, such as the labels of the axes and the background color. But in 3D plots, one common tweak is the <strong>viewport<\/strong>, namely, the angle from which we look at the 3D space. The viewport is controlled by the <code>view_init()<\/code> function of the axes object:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">ax.view_init(elev=30, azim=-60)<\/pre>\n<p>It is specified by the elevation angle (i.e., the angle to the horizon plane) and the azimuthal angle (i.e., the rotation on the horizon plane). By default, matplotlib uses a 30-degree elevation and a -60-degree azimuth, as shown above.<\/p>
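\n<p>For instance, a minimal sketch of a few alternative viewpoints looks like the following; the angles are arbitrary choices for illustration:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Pick one of these before plt.show(); each sets the camera for the same 3D axes\r\nax.view_init(elev=90, azim=-60)   # looking straight down onto the x-y plane\r\nax.view_init(elev=0, azim=0)      # looking from the level of the horizon plane\r\nax.view_init(elev=45, azim=120)   # an oblique view from another direction<\/pre>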
\n<p>Putting everything together, the following is the complete code to create the 3D scatter plot in matplotlib:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\nimport numpy as np\r\nimport matplotlib.pyplot as plt\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = tensordot(x, eigenvectors, axes=1)\r\n\r\n# Create the plot\r\nfig = plt.figure(figsize=(12, 8))\r\nax = plt.axes(projection='3d')\r\nax.view_init(elev=30, azim=-60)\r\nplt_3d = ax.scatter3D(x_pca[:, -1], x_pca[:, -2], x_pca[:, -3], c=train_labels, s=1)\r\nplt.colorbar(plt_3d)\r\nplt.show()<\/pre>\n<p>Creating scatter plots in Seaborn is similarly easy. The <code>scatterplot()<\/code> method automatically creates a legend and uses different symbols for different classes when plotting the points. By default, the plot is created on the \u201ccurrent axes\u201d from matplotlib, unless the axes object is specified by the <code>ax<\/code> argument.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">fig, ax = plt.subplots(figsize=(12, 8))\r\nsns.scatterplot(x_pca[:, -1], x_pca[:, -2],\r\n                style=train_labels, hue=train_labels,\r\n                palette=[\"red\", \"green\", \"blue\"])\r\nplt.title('First Two Dimensions of Projected Data After Applying PCA')\r\nplt.show()<\/pre>\n<\/p>\n<div id=\"attachment_13322\" style=\"width: 734px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_17_0.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13322\" loading=\"lazy\" class=\"wp-image-13322 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_17_0.png\" alt=\"2D scatter plot generated using Seaborn\" width=\"724\" height=\"482\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_17_0.png 724w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_17_0-300x200.png 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/output_17_0-600x400.png 600w\" sizes=\"(max-width: 724px) 100vw, 724px\"><\/a><\/p>\n<p id=\"caption-attachment-13322\" class=\"wp-caption-text\">2D scatter plot generated using Seaborn<\/p>\n<\/div>\n<p>The benefit of Seaborn over matplotlib is twofold: first, we have a polished default style. 
For example, if we compare the point style in the two scatter plots above, the Seaborn one has a border around each dot to prevent the many points from being smudged together. Indeed, if we run the following line before calling any matplotlib functions:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">sns.set(style = \"darkgrid\")<\/pre>\n<p>we can still use the matplotlib functions but get a better-looking figure by using Seaborn\u2019s style. Second, it is more convenient to use Seaborn if we are using a pandas DataFrame to hold our data. As an example, let\u2019s convert our MNIST data from a tensor into a pandas DataFrame:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">df_mnist = pd.DataFrame(x_pca[:, -3:].numpy(), columns=[\"pca3\",\"pca2\",\"pca1\"])\r\ndf_mnist[\"label\"] = train_labels\r\nprint(df_mnist)<\/pre>\n<p>The DataFrame looks like the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">pca3        pca2         pca1  label\r\n0     -537.730103  926.885254  1965.881592      0\r\n1      167.375885 -947.360107  1070.359375      1\r\n2      553.685425 -163.121826  1754.754272      2\r\n3     -642.905579 -767.283020  1053.937988      1\r\n4     -651.812988 -586.034424   662.468201      1\r\n...           ...         ...          ...    ...\r\n18618  415.358948 -645.245972   853.439209      1\r\n18619  754.555786    7.873116  1897.690552      2\r\n18620 -321.809357  665.038086  1840.480225      0\r\n18621  643.843628  -85.524895  1113.795166      2\r\n18622   94.964279 -549.570984   561.743042      1\r\n\r\n[18623 rows x 4 columns]<\/pre>\n<p>Then we can reproduce Seaborn\u2019s scatter plot with the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">fig, ax = plt.subplots(figsize=(12, 8))\r\nsns.scatterplot(data=df_mnist, x=\"pca1\", y=\"pca2\",\r\n                style=\"label\", hue=\"label\",\r\n                palette=[\"red\", \"green\", \"blue\"])\r\nplt.title('First Two Dimensions of Projected Data After Applying PCA')\r\nplt.show()<\/pre>\n<p>This time we do not pass in arrays as coordinates to the <code>scatterplot()<\/code> function; instead, we use column names for <code>x<\/code> and <code>y<\/code> and supply the DataFrame through the <code>data<\/code> argument.<\/p>\n<p>The following is the complete code to generate a scatter plot using Seaborn with the data stored in pandas:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\nimport numpy as np\r\nimport pandas as pd\r\nimport matplotlib.pyplot as plt\r\nimport seaborn as sns\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = 
tensordot(x, eigenvectors, axes=1)\r\n\r\n# Making pandas DataFrame\r\ndf_mnist = pd.DataFrame(x_pca[:, -3:].numpy(), columns=[\"pca3\",\"pca2\",\"pca1\"])\r\ndf_mnist[\"label\"] = train_labels\r\n\r\n# Create the plot\r\nfig, ax = plt.subplots(figsize=(12, 8))\r\nsns.scatterplot(data=df_mnist, x=\"pca1\", y=\"pca2\",\r\n                style=\"label\", hue=\"label\",\r\n                palette=[\"red\", \"green\", \"blue\"])\r\nplt.title('First Two Dimensions of Projected Data After Applying PCA')\r\nplt.show()<\/pre>\n<p>Seaborn, as a wrapper around some matplotlib functions, does not replace matplotlib entirely. Plotting in 3D, for example, is not supported by Seaborn, and we still need to resort to matplotlib functions for such purposes.<\/p>\n<h2>Scatter plots in Bokeh<\/h2>\n<p>The plots created by matplotlib and Seaborn are static images. If you need to zoom in, pan, or toggle the display of some part of the plot, you should use Bokeh instead.<\/p>\n<p>Creating scatter plots in Bokeh is also easy. The following code generates a scatter plot and adds a legend. The <code>show()<\/code> method from the Bokeh library opens a new browser window to display the image. You can interact with the plot by scaling, zooming, scrolling, and more options that are shown in the toolbar next to the rendered plot. You can also hide part of the scatter by clicking on the legend.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">colormap = {0: \"red\", 1:\"green\", 2:\"blue\"}\r\nmy_scatter = figure(title=\"First Two Dimensions of Projected Data After Applying PCA\", \r\n                    x_axis_label=\"Dimension 1\",\r\n                    y_axis_label=\"Dimension 2\")\r\nfor digit in [0, 1, 2]:\r\n    selection = x_pca[train_labels == digit]\r\n    my_scatter.scatter(selection[:,-1].numpy(), selection[:,-2].numpy(),\r\n                       color=colormap[digit], size=5,\r\n                       legend_label=\"Digit \"+str(digit))\r\nmy_scatter.legend.click_policy = \"hide\"\r\nshow(my_scatter)<\/pre>\n<p>Bokeh will produce the plot in HTML with JavaScript. All your actions to control the plot are handled by some JavaScript functions. Its output looks like the following:<\/p>\n<div id=\"attachment_13326\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13326\" class=\"wp-image-13326 size-large\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter-1024x1016.png\" alt=\"2D scatter plot generated using Bokeh in a new browser window. Note the various options on the right for interacting with the plot.\" width=\"800\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter-1024x1016.png 1024w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter-300x298.png 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter-150x150.png 150w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter-768x762.png 768w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/bokeh_scatter.png 1216w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/a><\/p>\n<p id=\"caption-attachment-13326\" class=\"wp-caption-text\">2D scatter plot generated using Bokeh in a new browser window. 
Note the various options on the right for interacting with the plot.<\/p>\n<\/div>\n<p>The following is the complete code to generate the above scatter plot using Bokeh:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\nimport numpy as np\r\nfrom bokeh.plotting import figure, show\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = tensordot(x, eigenvectors, axes=1)\r\n\r\n# Create scatter plot in Bokeh\r\ncolormap = {0: \"red\", 1:\"green\", 2:\"blue\"}\r\nmy_scatter = figure(title=\"First Two Dimensions of Projected Data After Applying PCA\",\r\n                    x_axis_label=\"Dimension 1\",\r\n                    y_axis_label=\"Dimension 2\")\r\nfor digit in [0, 1, 2]:\r\n    selection = x_pca[train_labels == digit]\r\n    my_scatter.scatter(selection[:,-1].numpy(), selection[:,-2].numpy(),\r\n                       color=colormap[digit], size=5, alpha=0.5,\r\n                       legend_label=\"Digit \"+str(digit))\r\nmy_scatter.legend.click_policy = \"hide\"\r\nshow(my_scatter)<\/pre>\n<p>If you are rendering the Bokeh plot in a Jupyter notebook, you may see that the plot is produced in a new browser window. To put the plot in the Jupyter notebook instead, you need to tell Bokeh that you are under the notebook environment by running the following before the Bokeh functions:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from bokeh.io import output_notebook\r\noutput_notebook()<\/pre>\n<p>Also note that we create the scatter plot of the three digits in a loop, one digit at a time. This is required to make the legend interactive, since each time <code>scatter()<\/code> is called, a new object is created. 
If we create all scatter points at once, like the following, clicking on the legend will hide and show everything instead of only the points of one of the digits.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">colormap = {0: \"red\", 1:\"green\", 2:\"blue\"}\r\ncolors = [colormap[i] for i in train_labels]\r\nmy_scatter = figure(title=\"First Two Dimensions of Projected Data After Applying PCA\", \r\n           x_axis_label=\"Dimension 1\", y_axis_label=\"Dimension 2\")\r\nscatter_obj = my_scatter.scatter(x_pca[:, -1].numpy(), x_pca[:, -2].numpy(), color=colors, size=5)\r\nlegend = Legend(items=[\r\n    LegendItem(label=\"Digit 0\", renderers=[scatter_obj], index=0),\r\n    LegendItem(label=\"Digit 1\", renderers=[scatter_obj], index=1),\r\n    LegendItem(label=\"Digit 2\", renderers=[scatter_obj], index=2),\r\n    ])\r\nmy_scatter.add_layout(legend)\r\nmy_scatter.legend.click_policy = \"hide\"\r\nshow(my_scatter)<\/pre>\n<\/p>\n<h2 id=\"Train-a-Classification-Model-Using-the-Keras-Library\">Preparation of line plot data<\/h2>\n<p>Before we move on to show how we can visualize line plot data, let\u2019s generate some data for illustration. Below is a simple classifier using the Keras library, which we train to classify the handwritten digits. The history object returned by the <code>fit()<\/code> method holds a dictionary that contains all the learning history of the training stage. For simplicity, we\u2019ll train the model using only 10 epochs.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">epochs = 10\r\ny_train = utils.to_categorical(train_labels)\r\ninput_dim = img_length*img_width\r\n# Create a Sequential model\r\nmodel = Sequential()\r\n# First layer for reshaping input images from 2D to 1D\r\nmodel.add(Reshape((input_dim, ), input_shape=(img_length, img_width)))\r\n# Dense layer of 8 neurons\r\nmodel.add(Dense(8, activation='relu'))\r\n# Output layer\r\nmodel.add(Dense(total_classes, activation='softmax'))\r\n# Compile model\r\nmodel.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\r\nhistory = model.fit(x_train, y_train, validation_split=0.33, epochs=epochs, batch_size=10, verbose=0)\r\nprint('Learning history: ', history.history)<\/pre>\n<p>The code above will produce a dictionary with keys <code>loss<\/code>, <code>accuracy<\/code>, <code>val_loss<\/code>, and <code>val_accuracy<\/code>, as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Learning history:  {'loss': [0.5362154245376587, 0.08184114843606949, ...],\r\n'accuracy': [0.9426144361495972, 0.9763565063476562, ...],\r\n'val_loss': [0.09874073415994644, 0.07835448533296585, ...],\r\n'val_accuracy': [0.9716889262199402, 0.9788480401039124, ...]}<\/pre>\n<\/p>\n<h2 id=\"Visualize-the-Learning-History-of-Training-the-Keras-Classifier\">Line plots in matplotlib, Seaborn, and Bokeh<\/h2>\n<p>Let\u2019s look at various options for visualizing the learning history obtained from training our classifier.<\/p>\n<p>Creating a multi-line plot in matplotlib is as trivial as the following. 
We obtain the list of values of the training and validation accuracies from the history, and by default, matplotlib will treat them as sequential data (i.e., x-coordinates are integers counting from 0 onwards).<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">plt.plot(history.history['accuracy'], label=\"Training accuracy\")\r\nplt.plot(history.history['val_accuracy'], label=\"Validation accuracy\")\r\nplt.title('Training and validation accuracy')\r\nplt.xlabel('Epochs')\r\nplt.ylabel('Accuracy')\r\nplt.legend()\r\nplt.show()<\/pre>\n<\/p>\n<div id=\"attachment_13323\" style=\"width: 402px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot1.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13323\" loading=\"lazy\" class=\"wp-image-13323 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot1.png\" alt=\"Multi-line plot using Matplotlib\" width=\"392\" height=\"278\"><\/a><\/p>\n<p id=\"caption-attachment-13323\" class=\"wp-caption-text\">Multi-line plot using Matplotlib<\/p>\n<\/div>\n<p>The complete code for creating the multi-line plot is as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow.keras import utils\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Dense, Reshape\r\nimport numpy as np\r\nimport matplotlib.pyplot as plt\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Prepare for classifier network\r\nepochs = 10\r\ny_train = utils.to_categorical(train_labels)\r\ninput_dim = img_length*img_width\r\n# Create a Sequential model\r\nmodel = Sequential()\r\n# First layer for reshaping input images from 2D to 1D\r\nmodel.add(Reshape((input_dim, ), input_shape=(img_length, img_width)))\r\n# Dense layer of 8 neurons\r\nmodel.add(Dense(8, activation='relu'))\r\n# Output layer\r\nmodel.add(Dense(total_classes, activation='softmax'))\r\n# Compile model\r\nmodel.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\r\nhistory = model.fit(x_train, y_train, validation_split=0.33, epochs=epochs, batch_size=10, verbose=0)\r\nprint('Learning history: ', history.history)\r\n\r\n# Plot accuracy in Matplotlib\r\nplt.plot(history.history['accuracy'], label=\"Training accuracy\")\r\nplt.plot(history.history['val_accuracy'], label=\"Validation accuracy\")\r\nplt.title('Training and validation accuracy')\r\nplt.xlabel('Epochs')\r\nplt.ylabel('Accuracy')\r\nplt.legend()\r\nplt.show()<\/pre>\n<p>We can do the same in Seaborn. As we saw in the case of the scatter plot, we can pass in the data to Seaborn as a series of values explicitly, or through a pandas DataFrame. 
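Passing the values explicitly might look like the following minimal sketch, reusing the <code>history<\/code> object from above:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Sketch: give Seaborn the x- and y-values directly instead of a DataFrame\r\nepochs_list = list(range(len(history.history['accuracy'])))\r\nsns.lineplot(x=epochs_list, y=history.history['accuracy'], label='Training accuracy')\r\nsns.lineplot(x=epochs_list, y=history.history['val_accuracy'], label='Validation accuracy')\r\nplt.legend()\r\nplt.show()<\/pre>\n<p>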
Let\u2019s plot the training loss and validation loss in the following using a pandas DataFrame:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Create pandas DataFrame\r\ndf_history = pd.DataFrame(history.history)\r\nprint(df_history)\r\n\r\n# Plot using Seaborn\r\nmy_plot = sns.lineplot(data=df_history[[\"loss\",\"val_loss\"]])\r\nmy_plot.set_xlabel('Epochs')\r\nmy_plot.set_ylabel('Loss')\r\nplt.legend(labels=[\"Training\", \"Validation\"])\r\nplt.title('Training and Validation Loss')\r\nplt.show()<\/pre>\n<p>It will print the following table, which is the DataFrame we created from the history:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">loss  accuracy  val_loss  val_accuracy\r\n0  0.536215  0.942614  0.098741      0.971689\r\n1  0.081841  0.976357  0.078354      0.978848\r\n2  0.064002  0.978841  0.080637      0.972991\r\n3  0.055695  0.981726  0.064659      0.979987\r\n4  0.054693  0.984371  0.070817      0.983729\r\n5  0.053512  0.985173  0.069099      0.977709\r\n6  0.053916  0.983089  0.068139      0.979662\r\n7  0.048681  0.985093  0.064914      0.977709\r\n8  0.052084  0.982929  0.080508      0.971363\r\n9  0.040484  0.983890  0.111380      0.982590<\/pre>\n<p>The plot it generates is as follows:<\/p>\n<div id=\"attachment_13324\" style=\"width: 396px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot2.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13324\" loading=\"lazy\" class=\"wp-image-13324 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot2.png\" alt=\"Multi-line plot using Seaborn\" width=\"386\" height=\"278\"><\/a><\/p>\n<p id=\"caption-attachment-13324\" class=\"wp-caption-text\">Multi-line plot using Seaborn<\/p>\n<\/div>\n<p>By default, Seaborn will understand the column labels from the DataFrame and use them as the legend. In the above, we provide a new label for each plot. 
Moreover, the x-axis of the line plot is taken from the index of the DataFrame by default, which is an integer running from 0 to 9 in our case, as we can see above.<\/p>\n<p>The complete code for producing the plot in Seaborn is as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow.keras import utils\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Dense, Reshape\r\nimport numpy as np\r\nimport pandas as pd\r\nimport matplotlib.pyplot as plt\r\nimport seaborn as sns\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Prepare for classifier network\r\nepochs = 10\r\ny_train = utils.to_categorical(train_labels)\r\ninput_dim = img_length*img_width\r\n# Create a Sequential model\r\nmodel = Sequential()\r\n# First layer for reshaping input images from 2D to 1D\r\nmodel.add(Reshape((input_dim, ), input_shape=(img_length, img_width)))\r\n# Dense layer of 8 neurons\r\nmodel.add(Dense(8, activation='relu'))\r\n# Output layer\r\nmodel.add(Dense(total_classes, activation='softmax'))\r\n# Compile model\r\nmodel.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\r\nhistory = model.fit(x_train, y_train, validation_split=0.33, epochs=epochs, batch_size=10, verbose=0)\r\n\r\n# Prepare pandas DataFrame\r\ndf_history = pd.DataFrame(history.history)\r\nprint(df_history)\r\n\r\n# Plot loss in seaborn\r\nmy_plot = sns.lineplot(data=df_history[[\"loss\",\"val_loss\"]])\r\nmy_plot.set_xlabel('Epochs')\r\nmy_plot.set_ylabel('Loss')\r\nplt.legend(labels=[\"Training\", \"Validation\"])\r\nplt.title('Training and Validation Loss')\r\nplt.show()<\/pre>\n<p>As you might expect, we can also provide the arguments <code>x<\/code> and <code>y<\/code> together with <code>data<\/code> in our call to <code>lineplot()<\/code>, as in our Seaborn scatter plot example above, if we want to control the x- and y-coordinates precisely.<\/p>\n<p>Bokeh can also generate multi-line plots, as illustrated in the code below. As we saw in the scatter plot example, we need to provide the x- and y-coordinates explicitly and do one line at a time. 
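As an aside, Bokeh also offers a <code>multi_line()<\/code> glyph that accepts lists of coordinate lists; a minimal sketch, assuming the <code>df_history<\/code> DataFrame prepared above, looks like this:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Sketch: draw both accuracy curves with a single multi_line() call\r\nxs = [list(range(epochs)), list(range(epochs))]\r\nys = [list(df_history['accuracy']), list(df_history['val_accuracy'])]\r\np2 = figure(title=\"Training and validation accuracy\",\r\n            x_axis_label=\"Epochs\", y_axis_label=\"Accuracy\")\r\np2.multi_line(xs, ys, line_color=[\"blue\", \"green\"], line_width=2)\r\nshow(p2)<\/pre>\n<p>Since the two curves form a single glyph here, the per-line click-to-hide legend used below is not available in the same way. 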
Again, the <code>show()<\/code> method opens a new browser window to display the plot and you can interact with it.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">p = figure(title=\"Training and validation accuracy\",\r\n           x_axis_label=\"Epochs\", y_axis_label=\"Accuracy\")\r\nepochs_array = np.arange(epochs)\r\np.line(epochs_array, df_history['accuracy'], legend_label=\"Training\",\r\n       color=\"blue\", line_width=2)\r\np.line(epochs_array, df_history['val_accuracy'], legend_label=\"Validation\",\r\n       color=\"green\")\r\np.legend.click_policy = \"hide\"\r\np.legend.location = 'bottom_right'\r\nshow(p)<\/pre>\n<\/p>\n<div id=\"attachment_13325\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot3.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13325\" class=\"wp-image-13325 size-large\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot3.png\" alt=\"Multi-line plot using Bokeh. Note the options for user interaction shown on the toolbar on the right.\" width=\"800\"><\/a><\/p>\n<p id=\"caption-attachment-13325\" class=\"wp-caption-text\">Multi-line plot using Bokeh. Note the options for user interaction shown on the toolbar on the right.<\/p>\n<\/div>\n<p>The complete code for making the Bokeh plot is as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow.keras import utils\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Dense, Reshape\r\nimport numpy as np\r\nimport pandas as pd\r\nfrom bokeh.plotting import figure, show\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n# Prepare for classifier network\r\nepochs = 10\r\ny_train = utils.to_categorical(train_labels)\r\ninput_dim = img_length*img_width\r\n# Create a Sequential model\r\nmodel = Sequential()\r\n# First layer for reshaping input images from 2D to 1D\r\nmodel.add(Reshape((input_dim, ), input_shape=(img_length, img_width)))\r\n# Dense layer of 8 neurons\r\nmodel.add(Dense(8, activation='relu'))\r\n# Output layer\r\nmodel.add(Dense(total_classes, activation='softmax'))\r\n# Compile model\r\nmodel.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\r\nhistory = model.fit(x_train, y_train, validation_split=0.33, epochs=epochs, batch_size=10, verbose=0)\r\n\r\n# Prepare pandas DataFrame\r\ndf_history = pd.DataFrame(history.history)\r\nprint(df_history)\r\n\r\n# Plot accuracy in Bokeh\r\np = figure(title=\"Training and validation accuracy\",\r\n           x_axis_label=\"Epochs\", y_axis_label=\"Accuracy\")\r\nepochs_array = np.arange(epochs)\r\np.line(epochs_array, df_history['accuracy'], legend_label=\"Training\",\r\n       color=\"blue\", line_width=2)\r\np.line(epochs_array, df_history['val_accuracy'], legend_label=\"Validation\",\r\n       color=\"green\")\r\np.legend.click_policy = \"hide\"\r\np.legend.location = 'bottom_right'\r\nshow(p)<\/pre>\n<\/p>\n<h2>More on visualization<\/h2>\n<p>Each of the 
tools we introduced above has a lot more functions for us to control the fine details of a visualization. It is worth searching their respective documentation for the ways you can polish your plots. It is equally important to check out the example code in their documentation to learn how you can possibly make your visualization better.<\/p>\n<p>Without providing too much detail, here are some ideas that you may want to add to your visualization; a short combined sketch follows the list:<\/p>\n<ul>\n<li>Add auxiliary lines, such as to mark the training and validation datasets on time series data. The <code>axvline()<\/code> function from matplotlib can make a vertical line on plots for this purpose.<\/li>\n<li>Add annotations, such as arrows and text labels, to identify key points on the plot. See the <code>annotate()<\/code> function in matplotlib axes objects.<\/li>\n<li>Control the transparency level in case of overlapping graphic elements. All the plotting functions we introduced above allow an <code>alpha<\/code> argument, which takes a value between 0 and 1 for how much we can see through the graph.<\/li>\n<li>If the data is better illustrated this way, we may show some of the axes in log scale. This is usually called a log plot or semilog plot.<\/li>\n<\/ul>
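\n<p>To make these ideas concrete, here is a minimal sketch that combines all four on the loss curves from the learning history above; the position of the vertical line is an arbitrary choice for illustration:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># Sketch: auxiliary line, annotation, transparency, and log scale together\r\nfig, ax = plt.subplots(figsize=(8, 5))\r\nax.semilogy(df_history['loss'], label='Training', alpha=0.7)    # log-scale y-axis\r\nax.semilogy(df_history['val_loss'], label='Validation', alpha=0.7)\r\nax.axvline(x=5, color='gray', linestyle='--')                   # vertical marker at epoch 5\r\nbest = df_history['val_loss'].idxmin()                          # epoch of lowest validation loss\r\nax.annotate('lowest validation loss',\r\n            xy=(best, df_history['val_loss'][best]),\r\n            xytext=(best + 1, df_history['val_loss'][best] * 2),\r\n            arrowprops=dict(arrowstyle='-&gt;'))\r\nax.set_xlabel('Epochs')\r\nax.set_ylabel('Loss (log scale)')\r\nax.legend()\r\nplt.show()<\/pre>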
\n<p>Before we conclude this post, the following is an example of creating a side-by-side visualization in matplotlib, where one of the subplots is drawn using Seaborn:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow.keras import utils\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Dense, Reshape\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\nimport numpy as np\r\nimport pandas as pd\r\nimport matplotlib.pyplot as plt\r\nimport seaborn as sns\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = tensordot(x, eigenvectors, axes=1)\r\n\r\n\r\n# Prepare for classifier network\r\nepochs = 10\r\ny_train = utils.to_categorical(train_labels)\r\ninput_dim = img_length*img_width\r\n# Create a Sequential model\r\nmodel = Sequential()\r\n# First layer for reshaping input images from 2D to 1D\r\nmodel.add(Reshape((input_dim, ), input_shape=(img_length, img_width)))\r\n# Dense layer of 8 neurons\r\nmodel.add(Dense(8, activation='relu'))\r\n# Output layer\r\nmodel.add(Dense(total_classes, activation='softmax'))\r\n# Compile model\r\nmodel.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\r\nhistory = model.fit(x_train, y_train, validation_split=0.33, epochs=epochs, batch_size=10, verbose=0)\r\n\r\n\r\n# Prepare pandas DataFrame\r\ndf_history = pd.DataFrame(history.history)\r\nprint(df_history)\r\n\r\n\r\n# Plot side-by-side\r\nfig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15,6))\r\n# left plot\r\nscatter = ax[0].scatter(x_pca[:, -1], x_pca[:, -2], c=train_labels, s=5)\r\nlegend_plt = ax[0].legend(*scatter.legend_elements(),\r\n                         loc=\"lower left\", title=\"Digits\")\r\nax[0].add_artist(legend_plt)\r\nax[0].set_title('First Two Dimensions of Projected Data After Applying PCA')\r\n# right plot\r\nmy_plot = sns.lineplot(data=df_history[[\"loss\",\"val_loss\"]], ax=ax[1])\r\nmy_plot.set_xlabel('Epochs')\r\nmy_plot.set_ylabel('Loss')\r\nax[1].legend(labels=[\"Training\", \"Validation\"])\r\nax[1].set_title('Training and Validation Loss')\r\nplt.show()<\/pre>\n<\/p>\n<div id=\"attachment_13342\" style=\"width: 902px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-13342\" loading=\"lazy\" class=\"size-full wp-image-13342\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot4.png\" alt=\"\" width=\"892\" height=\"387\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot4.png 892w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot4-300x130.png 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot4-768x333.png 768w\" sizes=\"(max-width: 892px) 100vw, 892px\"><\/p>\n<p id=\"caption-attachment-13342\" class=\"wp-caption-text\">Side-by-side visualization created using matplotlib and Seaborn<\/p>\n<\/div>\n<p>The equivalent in Bokeh is to create each subplot separately and then specify the layout when we show it:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.datasets import mnist\r\nfrom tensorflow.keras import utils\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Dense, Reshape\r\nfrom tensorflow import dtypes, tensordot\r\nfrom tensorflow import convert_to_tensor, linalg, transpose\r\nimport numpy as np\r\nimport pandas as pd\r\nfrom bokeh.plotting import figure, show\r\nfrom bokeh.layouts import row\r\n\r\n# Load dataset\r\n(x_train, train_labels), (_, _) = mnist.load_data()\r\n# Choose only the digits 0, 1, 2\r\ntotal_classes = 3\r\nind = np.where(train_labels &lt; total_classes)\r\nx_train, train_labels = x_train[ind], train_labels[ind]\r\n# Verify the shape of training data\r\ntotal_examples, img_length, img_width = x_train.shape\r\nprint('Training data has ', total_examples, 'images')\r\nprint('Each image is of size ', img_length, 'x', img_width)\r\n\r\n\r\n# Convert the dataset into a 2D array of shape 18623 x 784\r\nx = convert_to_tensor(np.reshape(x_train, (x_train.shape[0], -1)),\r\n                      dtype=dtypes.float32)\r\n# Eigen-decomposition from a 784 x 784 matrix\r\neigenvalues, eigenvectors = linalg.eigh(tensordot(transpose(x), x, axes=1))\r\n# Print the three largest eigenvalues\r\nprint('3 largest eigenvalues: ', eigenvalues[-3:])\r\n# Project the data to eigenvectors\r\nx_pca = tensordot(x, eigenvectors, axes=1)\r\n\r\n\r\n# Prepare for classifier network\r\nepochs = 10\r\ny_train = utils.to_categorical(train_labels)\r\ninput_dim = img_length*img_width\r\n# Create a Sequential model\r\nmodel = Sequential()\r\n# First layer for reshaping input images from 2D to 1D\r\nmodel.add(Reshape((input_dim, ), input_shape=(img_length, img_width)))\r\n# Dense layer of 8 neurons\r\nmodel.add(Dense(8, 
activation='relu'))\r\n# Output layer\r\nmodel.add(Dense(total_classes, activation='softmax'))\r\n# Compile model\r\nmodel.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])\r\nhistory = model.fit(x_train, y_train, validation_split=0.33, epochs=epochs, batch_size=10, verbose=0)\r\n\r\n\r\n# Prepare pandas DataFrame\r\ndf_history = pd.DataFrame(history.history)\r\nprint(df_history)\r\n\r\n\r\n# Create scatter plot in Bokeh\r\ncolormap = {0: \"red\", 1:\"green\", 2:\"blue\"}\r\nmy_scatter = figure(title=\"First Two Dimensions of Projected Data After Applying PCA\",\r\n                    x_axis_label=\"Dimension 1\",\r\n                    y_axis_label=\"Dimension 2\",\r\n                    width=500, height=400)\r\nfor digit in [0, 1, 2]:\r\n    selection = x_pca[train_labels == digit]\r\n    my_scatter.scatter(selection[:,-1].numpy(), selection[:,-2].numpy(),\r\n                       color=colormap[digit], size=5, alpha=0.5,\r\n                       legend_label=\"Digit \"+str(digit))\r\nmy_scatter.legend.click_policy = \"hide\"\r\n\r\n\r\n# Plot accuracy in Bokeh\r\np = figure(title=\"Training and validation accuracy\",\r\n           x_axis_label=\"Epochs\", y_axis_label=\"Accuracy\",\r\n           width=500, height=400)\r\nepochs_array = np.arange(epochs)\r\np.line(epochs_array, df_history['accuracy'], legend_label=\"Training\",\r\n       color=\"blue\", line_width=2)\r\np.line(epochs_array, df_history['val_accuracy'], legend_label=\"Validation\",\r\n       color=\"green\")\r\np.legend.click_policy = \"hide\"\r\np.legend.location = 'bottom_right'\r\n\r\nshow(row(my_scatter, p))<\/pre>\n<\/p>\n<div id=\"attachment_13343\" style=\"width: 2028px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-13343\" loading=\"lazy\" class=\"size-full wp-image-13343\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot5.png\" alt=\"\" width=\"2018\" height=\"808\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot5.png 2018w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot5-300x120.png 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot5-1024x410.png 1024w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot5-768x308.png 768w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/03\/lineplot5-1536x615.png 1536w\" sizes=\"(max-width: 2018px) 100vw, 2018px\"><\/p>\n<p id=\"caption-attachment-13343\" class=\"wp-caption-text\">Side-by-side plot created in Bokeh<\/p>\n<\/div>\n<h2 id=\"Further-Reading\">Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3 id=\"Books\">Books<\/h3>\n<ul>\n<li>\n<a href=\"https:\/\/greenteapress.com\/thinkpython\/html\/index.html\" target=\"_blank\" rel=\"noopener\">Think Python: How to Think Like a Computer Scientist<\/a> by Allen B. 
Downey<\/li>\n<li>\n<a href=\"https:\/\/www.amazon.com\/dp\/B001OFK2DK\/\" target=\"_blank\" rel=\"noopener\">Programming in Python 3: A Complete Introduction to the Python Language<\/a> by Mark Summerfield<\/li>\n<li>\n<a href=\"https:\/\/www.amazon.com\/dp\/1590282418\/\" target=\"_blank\" rel=\"noopener\">Python Programming: An Introduction to Computer Science<\/a> by John Zelle<\/li>\n<li>\n<a href=\"https:\/\/www.amazon.com\/dp\/1491957662\">Python for Data Analysis<\/a>, 2nd edition, by Wes McKinney<\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/data-visualization-methods-in-python\/\">A Gentle Introduction to Data Visualization Methods in Python<\/a><\/li>\n<li class=\"title entry-title\"><a href=\"https:\/\/machinelearningmastery.com\/seaborn-data-visualization-for-machine-learning\/\">How to use Seaborn Data Visualization for Machine Learning<\/a><\/li>\n<\/ul>\n<h3>API Reference<\/h3>\n<ul>\n<li><a href=\"https:\/\/matplotlib.org\/stable\/api\/_as_gen\/matplotlib.pyplot.scatter.html#matplotlib.pyplot.scatter\" target=\"_blank\" rel=\"noopener\">matplotlib.pyplot.scatter<\/a><\/li>\n<li><a href=\"https:\/\/matplotlib.org\/stable\/api\/_as_gen\/matplotlib.pyplot.plot.html\" target=\"_blank\" rel=\"noopener\">matplotlib.pyplot.plot<\/a><\/li>\n<li><a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.scatterplot.html\" target=\"_blank\" rel=\"noopener\">seaborn.scatterplot<\/a><\/li>\n<li><a href=\"https:\/\/seaborn.pydata.org\/generated\/seaborn.lineplot.html\" target=\"_blank\" rel=\"noopener\">seaborn.lineplot<\/a><\/li>\n<li><a href=\"https:\/\/docs.bokeh.org\/en\/latest\/docs\/user_guide\/plotting.html\" target=\"_blank\" rel=\"noopener\">Bokeh plotting with basic glyphs<\/a><\/li>\n<li><a href=\"https:\/\/docs.bokeh.org\/en\/latest\/docs\/reference\/plotting\/figure.html#bokeh.plotting.Figure.scatter\" target=\"_blank\" rel=\"noopener\">Bokeh scatter plots<\/a><\/li>\n<li><a href=\"https:\/\/docs.bokeh.org\/en\/latest\/docs\/first_steps\/first_steps_1.html\" target=\"_blank\" rel=\"noopener\">Bokeh line charts<\/a><\/li>\n<\/ul>\n<h2 id=\"Summary\">Summary<\/h2>\n<p>In this tutorial, you discovered various options for data visualization in Python.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to create subplots in different rows and columns<\/li>\n<li>How to render images using Matplotlib<\/li>\n<li>How to generate 2D and 3D scatter plots using Matplotlib<\/li>\n<li>How to create 2D plots using seaborn and Bokeh<\/li>\n<li>How to create multi-line plots using Matplotlib, Seaborn and Bokeh<\/li>\n<\/ul>\n<p>Do you have any questions about data visualization options discussed in this post? Ask your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/data-visualization-in-python-with-matplotlib-seaborn-and-bokeh\/\">Data Visualization in Python with matplotlib, Seaborn and Bokeh<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/data-visualization-in-python-with-matplotlib-seaborn-and-bokeh\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Mehreen Saeed Data visualization is an important aspect of all AI and machine learning applications. 
You can gain key insights into your data through [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/03\/21\/data-visualization-in-python-with-matplotlib-seaborn-and-bokeh\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":5503,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5502"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5502"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5502\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/5503"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5502"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5502"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5502"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}