{"id":5771,"date":"2022-07-21T06:28:55","date_gmt":"2022-07-21T06:28:55","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/07\/21\/image-augmentation-with-keras-preprocessing-layers-and-tf-image\/"},"modified":"2022-07-21T06:28:55","modified_gmt":"2022-07-21T06:28:55","slug":"image-augmentation-with-keras-preprocessing-layers-and-tf-image","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/07\/21\/image-augmentation-with-keras-preprocessing-layers-and-tf-image\/","title":{"rendered":"Image Augmentation with Keras Preprocessing Layers and tf.image"},"content":{"rendered":"<p>Author: Adrian Tam<\/p>\n<div>\n<p>When we work on a machine learning problem related to images, we not only need to collect images as training data but also need to employ augmentation to create variations in the images. This is especially true for more complex object recognition problems.<\/p>\n<p>There are many ways to perform image augmentation. You may use external libraries or write your own functions for that. There are some modules in TensorFlow and Keras for augmentation, too. 
In this post, you will discover how we can use the Keras preprocessing layers as well as the <code>tf.image<\/code> module in TensorFlow for image augmentation.<\/p>\n<p>After reading this post, you will know:<\/p>\n<ul>\n<li>What the Keras preprocessing layers are and how to use them<\/li>\n<li>What functions the <code>tf.image<\/code> module provides for image augmentation<\/li>\n<li>How to use augmentation together with a <code>tf.data<\/code> dataset<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_13770\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-13770\" class=\"wp-image-13770 size-full\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/steven-kamenar-MMJx78V7xS8-unsplash.jpg\" alt=\"\" width=\"800\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/steven-kamenar-MMJx78V7xS8-unsplash.jpg 1920w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/steven-kamenar-MMJx78V7xS8-unsplash-300x225.jpg 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/steven-kamenar-MMJx78V7xS8-unsplash-1024x768.jpg 1024w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/steven-kamenar-MMJx78V7xS8-unsplash-768x576.jpg 768w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/steven-kamenar-MMJx78V7xS8-unsplash-1536x1152.jpg 1536w\" sizes=\"(max-width: 1920px) 100vw, 1920px\"><\/p>\n<p id=\"caption-attachment-13770\" class=\"wp-caption-text\">Image Augmentation with Keras Preprocessing Layers and tf.image.<br \/>Photo by <a href=\"https:\/\/unsplash.com\/photos\/MMJx78V7xS8\">Steven Kamenar<\/a>. 
Some rights reserved.<\/p>\n<\/div>\n<h2>Overview<\/h2>\n<p>This article is split into five sections; they are:<\/p>\n<ul>\n<li>Getting Images<\/li>\n<li>Visualizing the Images<\/li>\n<li>Keras Preprocessing Layers<\/li>\n<li>Using tf.image API for Augmentation<\/li>\n<li>Using Preprocessing Layers in Neural Networks<\/li>\n<\/ul>\n<h2>Getting Images<\/h2>\n<p>Before we see how we can do augmentation, we need to get the images. Ultimately, we need the images to be represented as arrays, for example, as HxWx3 arrays of 8-bit integers for the RGB pixel values. There are many ways to get the images. Some can be downloaded as a ZIP file. If you\u2019re using TensorFlow, you may get some image datasets from the <code>tensorflow_datasets<\/code> library.<\/p>\n<p>In this tutorial, we are going to use the citrus leaves images, a small dataset of less than 100MB. It can be downloaded from <code>tensorflow_datasets<\/code> as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import tensorflow_datasets as tfds\r\nds, meta = tfds.load('citrus_leaves', with_info=True, split='train', shuffle_files=True)<\/pre>\n<p>Running this code the first time will download the image dataset to your computer with the following output:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Downloading and preparing dataset 63.87 MiB (download: 63.87 MiB, generated: 37.89 MiB, total: 101.76 MiB) to ~\/tensorflow_datasets\/citrus_leaves\/0.1.2...\r\nExtraction completed...: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 1\/1 [00:06&lt;00:00,  6.54s\/ file]\r\nDl Size...: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 63\/63 
[00:06&lt;00:00,  9.63 MiB\/s]\r\nDl Completed...: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 1\/1 [00:06&lt;00:00,  6.54s\/ url]\r\nDataset citrus_leaves downloaded and prepared to ~\/tensorflow_datasets\/citrus_leaves\/0.1.2. Subsequent calls will reuse this data.<\/pre>\n<p>The function above returns the images as a <code>tf.data<\/code> dataset object, together with the metadata. This is a classification dataset. We can print the training labels with the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nfor i in range(meta.features['label'].num_classes):\r\n    print(meta.features['label'].int2str(i))<\/pre>\n<p>and this prints:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Black spot\r\ncanker\r\ngreening\r\nhealthy<\/pre>\n<p>If you run this code again at a later time, the downloaded images will be reused. Another way to load the downloaded images into a <code>tf.data<\/code> dataset is to use the <code>image_dataset_from_directory()<\/code> function.<\/p>\n<p>As we can see from the screen output above, the dataset is downloaded into the directory <code>~\/tensorflow_datasets<\/code>. If you look at that directory, you will see a directory structure as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\/Citrus\/Leaves\r\n\u251c\u2500\u2500 Black spot\r\n\u251c\u2500\u2500 Melanose\r\n\u251c\u2500\u2500 canker\r\n\u251c\u2500\u2500 greening\r\n\u2514\u2500\u2500 healthy<\/pre>\n<p>The directories are the labels, and the images are files stored under their corresponding directory. 
We can let the function read the directory recursively into a dataset:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import tensorflow as tf\r\nfrom tensorflow.keras.utils import image_dataset_from_directory\r\n\r\n# set to fixed image size 256x256\r\nPATH = \"...\/Citrus\/Leaves\"\r\nds = image_dataset_from_directory(PATH,\r\n                                  validation_split=0.2, subset=\"training\",\r\n                                  image_size=(256,256), interpolation=\"bilinear\",\r\n                                  crop_to_aspect_ratio=True,\r\n                                  seed=42, shuffle=True, batch_size=32)<\/pre>\n<p>You may want to set <code>batch_size=None<\/code> if you do not want the dataset to be batched. Usually, we want the dataset to be batched for training a neural network model.<\/p>\n<h2>Visualizing the Images<\/h2>\n<p>It is important to visualize the augmentation result, so we can verify it is what we want it to be. We can use matplotlib for this.<\/p>\n<p>In matplotlib, we have the <code>imshow()<\/code> function to display an image. 
However, for the image to be displayed correctly, the image should be presented as an array of 8-bit unsigned integers (uint8).<\/p>\n<p>Given we have a dataset created using <code>image_dataset_from_directory()<\/code>, we can get the first batch (of 32 images) and display a few of them using <code>imshow()<\/code>, as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nimport matplotlib.pyplot as plt\r\n\r\nfig, ax = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(5,5))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        for j in range(3):\r\n            ax[i][j].imshow(images[i*3+j].numpy().astype(\"uint8\"))\r\n            ax[i][j].set_title(ds.class_names[labels[i*3+j]])\r\nplt.show()<\/pre>\n<p>Here we display 9 images in a grid and label the images with their corresponding classification labels, using <code>ds.class_names<\/code>. The images should be converted to NumPy arrays in uint8 for display. This code displays an image like the following:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13771\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-1.png\" alt=\"\" width=\"317\" height=\"319\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-1.png 317w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-1-298x300.png 298w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-1-150x150.png 150w\" sizes=\"(max-width: 317px) 100vw, 317px\"><\/p>\n<p>The complete code from loading the images to displaying them is as follows.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.utils import image_dataset_from_directory\r\nimport matplotlib.pyplot as plt\r\n\r\n# use image_dataset_from_directory() to load images, with image size scaled to 256x256\r\nPATH='...\/Citrus\/Leaves'  # modify to your path\r\nds = image_dataset_from_directory(PATH,\r\n   
                               validation_split=0.2, subset=\"training\",\r\n                                  image_size=(256,256), interpolation=\"mitchellcubic\",\r\n                                  crop_to_aspect_ratio=True,\r\n                                  seed=42, shuffle=True, batch_size=32)\r\n\r\n# Take one batch from dataset and display the images\r\nfig, ax = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(5,5))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        for j in range(3):\r\n            ax[i][j].imshow(images[i*3+j].numpy().astype(\"uint8\"))\r\n            ax[i][j].set_title(ds.class_names[labels[i*3+j]])\r\nplt.show()<\/pre>\n<p>Note that, if you\u2019re using <code>tensorflow_datasets<\/code> to get the image, the samples are presented as a dictionary instead of a tuple of (image,label). You should change your code slightly into the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import tensorflow_datasets as tfds\r\nimport matplotlib.pyplot as plt\r\n\r\n# use tfds.load() or image_dataset_from_directory() to load images\r\nds, meta = tfds.load('citrus_leaves', with_info=True, split='train', shuffle_files=True)\r\nds = ds.batch(32)\r\n\r\n# Take one batch from dataset and display the images\r\nfig, ax = plt.subplots(3, 3, sharex=True, sharey=True, figsize=(5,5))\r\n\r\nfor sample in ds.take(1):\r\n    images, labels = sample[\"image\"], sample[\"label\"]\r\n    for i in range(3):\r\n        for j in range(3):\r\n            ax[i][j].imshow(images[i*3+j].numpy().astype(\"uint8\"))\r\n            ax[i][j].set_title(meta.features['label'].int2str(labels[i*3+j]))\r\nplt.show()<\/pre>\n<p>In the rest of this post, we assume the dataset is created using <code>image_dataset_from_directory()<\/code>. 
You may need to tweak the code slightly if your dataset is created differently.<\/p>\n<h2>Keras Preprocessing Layers<\/h2>\n<p>Keras comes with many neural network layers, such as convolution layers, that have parameters to train. There are also layers with no parameters to train, such as flatten layers, which convert an array such as an image into a vector.<\/p>\n<p>The preprocessing layers in Keras are specifically designed to be used in the early stages of a neural network. We can use them for image preprocessing, such as resizing or rotating the image or adjusting the brightness and contrast. While the preprocessing layers are supposed to be part of a larger neural network, we can also use them as functions. Below is how we can use the resizing layer as a function to transform some images and display them side-by-side with the originals:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n\r\n# create a resizing layer\r\nout_height, out_width = 128,256\r\nresize = tf.keras.layers.Resizing(out_height, out_width)\r\n\r\n# show original vs resized\r\nfig, ax = plt.subplots(2, 3, figsize=(6,4))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # resize\r\n        ax[1][i].imshow(resize(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"resize\")\r\nplt.show()<\/pre>\n<p>Our images are 256\u00d7256 pixels, and the resizing layer will resize them to 128\u00d7256 pixels (height by width). 
The output of the above code is as follows:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13772\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-2.png\" alt=\"\" width=\"375\" height=\"239\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-2.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-2-300x191.png 300w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>Since the resizing layer is itself a function, we can chain it to the dataset. For example,<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\ndef augment(image, label):\r\n    return resize(image), label\r\n\r\nresized_ds = ds.map(augment)\r\n\r\nfor image, label in resized_ds:\r\n   ...<\/pre>\n<p>The dataset <code>ds<\/code> has samples in the form of <code>(image, label)<\/code>. Hence we created a function that takes such a tuple and preprocesses the image with the resizing layer. We then assigned this function as an argument to <code>map()<\/code> on the dataset. When we draw a sample from the new dataset created with the <code>map()<\/code> function, the image will be a transformed one.<\/p>\n<p>There are more preprocessing layers available. Below, we demonstrate some of them.<\/p>\n<p>As we saw above, we can resize the image. We can also randomly enlarge or shrink the height or width of an image. Similarly, we can zoom in or zoom out on an image. 
Below is an example of manipulating the image size in various ways, with up to a 30% increase or decrease:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n\r\n# Create preprocessing layers\r\nout_height, out_width = 128,256\r\nresize = tf.keras.layers.Resizing(out_height, out_width)\r\nheight = tf.keras.layers.RandomHeight(0.3)\r\nwidth = tf.keras.layers.RandomWidth(0.3)\r\nzoom = tf.keras.layers.RandomZoom(0.3)\r\n\r\n# Visualize images and augmentations\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # resize\r\n        ax[1][i].imshow(resize(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"resize\")\r\n        # height\r\n        ax[2][i].imshow(height(images[i]).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"height\")\r\n        # width\r\n        ax[3][i].imshow(width(images[i]).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"width\")\r\n        # zoom\r\n        ax[4][i].imshow(zoom(images[i]).numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"zoom\")\r\nplt.show()<\/pre>\n<p>This code shows images as follows:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13773\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-3.png\" alt=\"\" width=\"375\" height=\"775\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-3.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-3-145x300.png 145w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>While we specified fixed dimensions for the resize, the other augmentations apply a random amount of manipulation.<\/p>\n<p>We can also do flipping, rotation, cropping, and geometric translation using preprocessing layers:<\/p>\n<pre 
class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# Create preprocessing layers\r\nflip = tf.keras.layers.RandomFlip(\"horizontal_and_vertical\") # or \"horizontal\", \"vertical\"\r\nrotate = tf.keras.layers.RandomRotation(0.2)\r\ncrop = tf.keras.layers.RandomCrop(out_height, out_width)\r\ntranslation = tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2)\r\n\r\n# Visualize augmentations\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # flip\r\n        ax[1][i].imshow(flip(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"flip\")\r\n        # crop\r\n        ax[2][i].imshow(crop(images[i]).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"crop\")\r\n        # translation\r\n        ax[3][i].imshow(translation(images[i]).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"translation\")\r\n        # rotate\r\n        ax[4][i].imshow(rotate(images[i]).numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"rotate\")\r\nplt.show()<\/pre>\n<p>This code shows the following images:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13774\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-4.png\" alt=\"\" width=\"375\" height=\"775\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-4.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-4-145x300.png 145w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>And finally, we can do augmentations on color adjustments as well:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nbrightness = tf.keras.layers.RandomBrightness([-0.8,0.8])\r\ncontrast = tf.keras.layers.RandomContrast(0.2)\r\n\r\n# Visualize augmentation\r\nfig, ax = 
plt.subplots(3, 3, figsize=(6,7))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # brightness\r\n        ax[1][i].imshow(brightness(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"brightness\")\r\n        # contrast\r\n        ax[2][i].imshow(contrast(images[i]).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"contrast\")\r\nplt.show()<\/pre>\n<p>This shows the images as follows:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13775\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-5.png\" alt=\"\" width=\"375\" height=\"414\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-5.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-5-272x300.png 272w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>For completeness, below is the code to display the result of various augmentations:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.utils import image_dataset_from_directory\r\nimport tensorflow as tf\r\nimport matplotlib.pyplot as plt\r\n\r\n# use image_dataset_from_directory() to load images, with image size scaled to 256x256\r\nPATH='...\/Citrus\/Leaves'  # modify to your path\r\nds = image_dataset_from_directory(PATH,\r\n                                  validation_split=0.2, subset=\"training\",\r\n                                  image_size=(256,256), interpolation=\"mitchellcubic\",\r\n                                  crop_to_aspect_ratio=True,\r\n                                  seed=42, shuffle=True, batch_size=32)\r\n\r\n# Create preprocessing layers\r\nout_height, out_width = 128,256\r\nresize = tf.keras.layers.Resizing(out_height, out_width)\r\nheight = tf.keras.layers.RandomHeight(0.3)\r\nwidth = 
tf.keras.layers.RandomWidth(0.3)\r\nzoom = tf.keras.layers.RandomZoom(0.3)\r\n\r\nflip = tf.keras.layers.RandomFlip(\"horizontal_and_vertical\")\r\nrotate = tf.keras.layers.RandomRotation(0.2)\r\ncrop = tf.keras.layers.RandomCrop(out_height, out_width)\r\ntranslation = tf.keras.layers.RandomTranslation(height_factor=0.2, width_factor=0.2)\r\n\r\nbrightness = tf.keras.layers.RandomBrightness([-0.8,0.8])\r\ncontrast = tf.keras.layers.RandomContrast(0.2)\r\n\r\n# Visualize images and augmentations\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # resize\r\n        ax[1][i].imshow(resize(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"resize\")\r\n        # height\r\n        ax[2][i].imshow(height(images[i]).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"height\")\r\n        # width\r\n        ax[3][i].imshow(width(images[i]).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"width\")\r\n        # zoom\r\n        ax[4][i].imshow(zoom(images[i]).numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"zoom\")\r\nplt.show()\r\n\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # flip\r\n        ax[1][i].imshow(flip(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"flip\")\r\n        # crop\r\n        ax[2][i].imshow(crop(images[i]).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"crop\")\r\n        # translation\r\n        ax[3][i].imshow(translation(images[i]).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"translation\")\r\n        # rotate\r\n        ax[4][i].imshow(rotate(images[i]).numpy().astype(\"uint8\"))\r\n        
ax[4][i].set_title(\"rotate\")\r\nplt.show()\r\n\r\nfig, ax = plt.subplots(3, 3, figsize=(6,7))\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # brightness\r\n        ax[1][i].imshow(brightness(images[i]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"brightness\")\r\n        # contrast\r\n        ax[2][i].imshow(contrast(images[i]).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"contrast\")\r\nplt.show()<\/pre>\n<p>Finally, it is important to point out that most neural network models work better if the input images are scaled. While we usually use 8-bit unsigned integers for the pixel values in an image (e.g., for display using <code>imshow()<\/code> as above), neural networks prefer pixel values to be between 0 and 1 or between -1 and +1. This can be done with a preprocessing layer, too. Below is how we can update one of our examples above to add the scaling layer into the augmentation:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nout_height, out_width = 128,256\r\nresize = tf.keras.layers.Resizing(out_height, out_width)\r\nrescale = tf.keras.layers.Rescaling(1\/127.5, offset=-1)  # rescale pixel values to [-1,1]\r\n\r\ndef augment(image, label):\r\n    return rescale(resize(image)), label\r\n\r\nrescaled_resized_ds = ds.map(augment)\r\n\r\nfor image, label in rescaled_resized_ds:\r\n   ...<\/pre>\n<h2>Using tf.image API for Augmentation<\/h2>\n<p>Besides the preprocessing layers, the <code>tf.image<\/code> module also provides some functions for augmentation. Unlike the preprocessing layers, these functions are intended to be used in a user-defined function and assigned to a dataset using <code>map()<\/code>, as we saw above.<\/p>\n<p>The functions provided by <code>tf.image<\/code> are not duplicates of the preprocessing layers, although there is some overlap. 
Below is an example of using the <code>tf.image<\/code> functions to resize and crop images:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        # original\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # resize\r\n        h = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))\r\n        w = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))\r\n        ax[1][i].imshow(tf.image.resize(images[i], [h,w]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"resize\")\r\n        # crop\r\n        y, x, h, w = (128 * tf.random.uniform((4,))).numpy().astype(\"uint8\")\r\n        ax[2][i].imshow(tf.image.crop_to_bounding_box(images[i], y, x, h, w).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"crop\")\r\n        # central crop\r\n        x = tf.random.uniform([], minval=0.4, maxval=1.0)\r\n        ax[3][i].imshow(tf.image.central_crop(images[i], x).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"central crop\")\r\n        # crop to (h,w) at random offset\r\n        h, w = (256 * tf.random.uniform((2,))).numpy().astype(\"uint8\")\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[4][i].imshow(tf.image.stateless_random_crop(images[i], [h,w,3], seed).numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"random crop\")\r\nplt.show()<\/pre>\n<p>Below is the output of the above code:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13776\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-6.png\" alt=\"\" width=\"376\" height=\"792\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-6.png 376w, 
https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-6-142x300.png 142w\" sizes=\"(max-width: 376px) 100vw, 376px\"><\/p>\n<p>While the displayed images match what we would expect from the code, the use of <code>tf.image<\/code> functions is quite different from that of the preprocessing layers. Every <code>tf.image<\/code> function is different. For example, the <code>crop_to_bounding_box()<\/code> function takes pixel coordinates, but the <code>central_crop()<\/code> function takes a fraction as its argument.<\/p>\n<p>These functions also differ in the way randomness is handled. Some of them have no random behavior, so a random resize requires the exact output size to be generated with a random number generator separately before calling the resize function. Other functions, such as <code>stateless_random_crop()<\/code>, can apply augmentation randomly, but a pair of random seeds in <code>int32<\/code> needs to be specified explicitly.<\/p>\n<p>To continue the example, here are the functions for flipping an image and extracting the Sobel edges:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # flip\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[1][i].imshow(tf.image.stateless_random_flip_left_right(images[i], seed).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"flip left-right\")\r\n        # flip\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[2][i].imshow(tf.image.stateless_random_flip_up_down(images[i], seed).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"flip up-down\")\r\n        # sobel edge\r\n        sobel 
= tf.image.sobel_edges(images[i:i+1])\r\n        ax[3][i].imshow(sobel[0, ..., 0].numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"sobel y\")\r\n        # sobel edge\r\n        ax[4][i].imshow(sobel[0, ..., 1].numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"sobel x\")\r\nplt.show()<\/pre>\n<p>which shows the following:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13777\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-7.png\" alt=\"\" width=\"375\" height=\"775\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-7.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-7-145x300.png 145w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>And the following are the functions to manipulate the brightness, contrast, and colors:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\n\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # brightness\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[1][i].imshow(tf.image.stateless_random_brightness(images[i], 0.3, seed).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"brightness\")\r\n        # contrast\r\n        ax[2][i].imshow(tf.image.stateless_random_contrast(images[i], 0.7, 1.3, seed).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"contrast\")\r\n        # saturation\r\n        ax[3][i].imshow(tf.image.stateless_random_saturation(images[i], 0.7, 1.3, seed).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"saturation\")\r\n        # hue\r\n        ax[4][i].imshow(tf.image.stateless_random_hue(images[i], 0.3, seed).numpy().astype(\"uint8\"))\r\n        
ax[4][i].set_title(\"hue\")\r\nplt.show()<\/pre>\n<p>This code shows the following:<img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13778\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-8.png\" alt=\"\" width=\"375\" height=\"775\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-8.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/07\/tfimage-8-145x300.png 145w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>Below is the complete code to display all of the above:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.utils import image_dataset_from_directory\r\nimport tensorflow as tf\r\nimport matplotlib.pyplot as plt\r\n\r\n# use image_dataset_from_directory() to load images, with image size scaled to 256x256\r\nPATH='...\/Citrus\/Leaves'  # modify to your path\r\nds = image_dataset_from_directory(PATH,\r\n                                  validation_split=0.2, subset=\"training\",\r\n                                  image_size=(256,256), interpolation=\"mitchellcubic\",\r\n                                  crop_to_aspect_ratio=True,\r\n                                  seed=42, shuffle=True, batch_size=32)\r\n\r\n# Visualize tf.image augmentations\r\n\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        # original\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # resize\r\n        h = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))\r\n        w = int(256 * tf.random.uniform([], minval=0.8, maxval=1.2))\r\n        ax[1][i].imshow(tf.image.resize(images[i], [h,w]).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"resize\")\r\n        # crop\r\n        y, x, h, w = (128 * tf.random.uniform((4,))).numpy().astype(\"uint8\")\r\n        
ax[2][i].imshow(tf.image.crop_to_bounding_box(images[i], y, x, h, w).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"crop\")\r\n        # central crop\r\n        x = tf.random.uniform([], minval=0.4, maxval=1.0)\r\n        ax[3][i].imshow(tf.image.central_crop(images[i], x).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"central crop\")\r\n        # crop to (h,w) at random offset\r\n        h, w = (256 * tf.random.uniform((2,))).numpy().astype(\"uint8\")\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[4][i].imshow(tf.image.stateless_random_crop(images[i], [h,w,3], seed).numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"random crop\")\r\nplt.show()\r\n\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # flip\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[1][i].imshow(tf.image.stateless_random_flip_left_right(images[i], seed).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"flip left-right\")\r\n        # flip\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[2][i].imshow(tf.image.stateless_random_flip_up_down(images[i], seed).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"flip up-down\")\r\n        # sobel edge\r\n        sobel = tf.image.sobel_edges(images[i:i+1])\r\n        ax[3][i].imshow(sobel[0, ..., 0].numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"sobel y\")\r\n        # sobel edge\r\n        ax[4][i].imshow(sobel[0, ..., 1].numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"sobel x\")\r\nplt.show()\r\n\r\nfig, ax = plt.subplots(5, 3, figsize=(6,14))\r\nfor images, labels in ds.take(1):\r\n    for i in range(3):\r\n        
ax[0][i].imshow(images[i].numpy().astype(\"uint8\"))\r\n        ax[0][i].set_title(\"original\")\r\n        # brightness\r\n        seed = tf.random.uniform((2,), minval=0, maxval=65536).numpy().astype(\"int32\")\r\n        ax[1][i].imshow(tf.image.stateless_random_brightness(images[i], 0.3, seed).numpy().astype(\"uint8\"))\r\n        ax[1][i].set_title(\"brightness\")\r\n        # contrast\r\n        ax[2][i].imshow(tf.image.stateless_random_contrast(images[i], 0.7, 1.3, seed).numpy().astype(\"uint8\"))\r\n        ax[2][i].set_title(\"contrast\")\r\n        # saturation\r\n        ax[3][i].imshow(tf.image.stateless_random_saturation(images[i], 0.7, 1.3, seed).numpy().astype(\"uint8\"))\r\n        ax[3][i].set_title(\"saturation\")\r\n        # hue\r\n        ax[4][i].imshow(tf.image.stateless_random_hue(images[i], 0.3, seed).numpy().astype(\"uint8\"))\r\n        ax[4][i].set_title(\"hue\")\r\nplt.show()<\/pre>\n<p>These augmentation functions should be enough for most uses. But if you have a specific augmentation in mind, you will probably need a more capable image processing library. <a href=\"https:\/\/docs.opencv.org\/4.x\/d6\/d00\/tutorial_py_root.html\">OpenCV<\/a> and <a href=\"https:\/\/pillow.readthedocs.io\/en\/stable\/\">Pillow<\/a> are common and powerful libraries that allow you to transform images in more sophisticated ways.<\/p>\n<h2>Using Preprocessing Layers in Neural Networks<\/h2>\n<p>We used the Keras preprocessing layers as functions in the examples above, but they can also be used as layers in a neural network. They are trivial to use. 
Below is an example of how we can incorporate a preprocessing layer into a classification network and train it using a dataset:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras.utils import image_dataset_from_directory\r\nimport tensorflow as tf\r\nimport matplotlib.pyplot as plt\r\n\r\n# use image_dataset_from_directory() to load images, with image size scaled to 256x256\r\nPATH='...\/Citrus\/Leaves'  # modify to your path\r\nds = image_dataset_from_directory(PATH,\r\n                                  validation_split=0.2, subset=\"training\",\r\n                                  image_size=(256,256), interpolation=\"mitchellcubic\",\r\n                                  crop_to_aspect_ratio=True,\r\n                                  seed=42, shuffle=True, batch_size=32)\r\n\r\nAUTOTUNE = tf.data.AUTOTUNE\r\nds = ds.cache().prefetch(buffer_size=AUTOTUNE)\r\n\r\nnum_classes = 5\r\nmodel = tf.keras.Sequential([\r\n  tf.keras.layers.RandomFlip(\"horizontal_and_vertical\"),\r\n  tf.keras.layers.RandomRotation(0.2),\r\n  tf.keras.layers.Rescaling(1\/127.5, offset=-1),  # scale pixel values from [0,255] to [-1,1]\r\n  tf.keras.layers.Conv2D(32, 3, activation='relu'),\r\n  tf.keras.layers.MaxPooling2D(),\r\n  tf.keras.layers.Conv2D(32, 3, activation='relu'),\r\n  tf.keras.layers.MaxPooling2D(),\r\n  tf.keras.layers.Conv2D(32, 3, activation='relu'),\r\n  tf.keras.layers.MaxPooling2D(),\r\n  tf.keras.layers.Flatten(),\r\n  tf.keras.layers.Dense(128, activation='relu'),\r\n  tf.keras.layers.Dense(num_classes)\r\n])\r\n\r\nmodel.compile(optimizer='adam',\r\n              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\r\n              metrics=['accuracy'])\r\n  \r\nmodel.fit(ds, epochs=3)<\/pre>\n<p>Running this code gives the following output:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Found 609 files belonging to 5 classes.\r\nUsing 488 files for training.\r\nEpoch 1\/3\r\n16\/16 [==============================] - 5s 253ms\/step - loss: 1.4114 - 
accuracy: 0.4283\r\nEpoch 2\/3\r\n16\/16 [==============================] - 4s 259ms\/step - loss: 0.8101 - accuracy: 0.6475\r\nEpoch 3\/3\r\n16\/16 [==============================] - 4s 267ms\/step - loss: 0.7015 - accuracy: 0.7111<\/pre>\n<p>In the code above, we created the dataset with <code>cache()<\/code> and <code>prefetch()<\/code>. This is a performance technique that allows the dataset to prepare data asynchronously while the neural network is being trained. It is especially significant if the dataset has other augmentation assigned using the <code>map()<\/code> function.<\/p>\n<p>You will see some improvement in accuracy if you remove the <code>RandomFlip<\/code> and <code>RandomRotation<\/code> layers, because doing so makes the problem easier. However, since we want the network to predict well on a wide variation of image quality and properties, augmentation helps make the resulting network more powerful.<\/p>\n<h2>Further Reading<\/h2>\n<p>Below are documentation pages from TensorFlow related to the examples above:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/data\/Dataset\"><code>tf.data.Dataset<\/code> API<\/a><\/li>\n<li><a href=\"https:\/\/www.tensorflow.org\/datasets\/catalog\/citrus_leaves\">Citrus leaves dataset<\/a><\/li>\n<li><a href=\"https:\/\/www.tensorflow.org\/tutorials\/load_data\/images\">Load and preprocess images<\/a><\/li>\n<li><a href=\"https:\/\/www.tensorflow.org\/tutorials\/images\/data_augmentation\">Data augmentation<\/a><\/li>\n<li><a href=\"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/image\"><code>tf.image<\/code><\/a> API<\/li>\n<li><a href=\"https:\/\/www.tensorflow.org\/guide\/data_performance\"><code>tf.data<\/code> performance<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this post, you have seen how we can use the <code>tf.data<\/code> dataset with image augmentation functions from Keras and TensorFlow.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to use the 
preprocessing layers from Keras, both as a function and as part of a neural network<\/li>\n<li>How to create your own image augmentation function and apply it to the dataset using the <code>map()<\/code> function<\/li>\n<li>How to use the functions provided by the <code>tf.image<\/code> module for image augmentation<\/li>\n<\/ul>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/image-augmentation-with-keras-preprocessing-layers-and-tf-image\/\">Image Augmentation with Keras Preprocessing Layers and tf.image<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/image-augmentation-with-keras-preprocessing-layers-and-tf-image\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Adrian Tam When we work on a machine learning problem related to images, not only we need to collect some images as training data, [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/07\/21\/image-augmentation-with-keras-preprocessing-layers-and-tf-image\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":5772,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5771"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5771"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5771\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/5772"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5771"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5771"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5771"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}