{"id":2446,"date":"2019-08-08T19:00:38","date_gmt":"2019-08-08T19:00:38","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/08\/how-to-develop-a-cyclegan-for-image-to-image-translation-with-keras\/"},"modified":"2019-08-08T19:00:38","modified_gmt":"2019-08-08T19:00:38","slug":"how-to-develop-a-cyclegan-for-image-to-image-translation-with-keras","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/08\/how-to-develop-a-cyclegan-for-image-to-image-translation-with-keras\/","title":{"rendered":"How to Develop a CycleGAN for Image-to-Image Translation with Keras"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>The Cycle Generative Adversarial Network, or CycleGAN, is an approach to training a deep convolutional neural network for image-to-image translation tasks.<\/p>\n<p>Unlike other GAN models for image translation, the CycleGAN does not require a dataset of paired images. For example, if we are interested in translating photographs of oranges to apples, we do not require a training dataset of oranges that have been manually converted to apples. 
This allows the development of a translation model on problems where training datasets may not exist, such as translating paintings to photographs.<\/p>\n<p>In this tutorial, you will discover how to develop a CycleGAN model to translate photos of horses to zebras, and back again.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to load and prepare the horses to zebras image translation dataset for modeling.<\/li>\n<li>How to train a pair of CycleGAN generator models for translating horses to zebras and zebras to horses.<\/li>\n<li>How to load saved CycleGAN models and use them to translate photographs.<\/li>\n<\/ul>\n<p>Discover how to develop DCGANs, conditional GANs, Pix2Pix, CycleGANs, and more with Keras <a href=\"https:\/\/machinelearningmastery.com\/generative_adversarial_networks\/\" rel=\"nofollow\">in my new GANs book<\/a>, with 29 step-by-step tutorials and full source code.<\/p>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_8408\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8408\" class=\"size-full wp-image-8408\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/08\/How-to-Develop-a-CycleGAN-for-Image-to-Image-Translation-with-Keras.jpg\" alt=\"How to Develop a CycleGAN for Image-to-Image Translation with Keras\" width=\"640\" height=\"396\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/How-to-Develop-a-CycleGAN-for-Image-to-Image-Translation-with-Keras.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/How-to-Develop-a-CycleGAN-for-Image-to-Image-Translation-with-Keras-300x186.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8408\" class=\"wp-caption-text\">How to Develop a CycleGAN for Image-to-Image Translation with Keras<br \/>Photo by <a 
href=\"https:\/\/www.flickr.com\/photos\/tzirma\/4346635061\/\">A. Munar<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into four parts; they are:<\/p>\n<ol>\n<li>What Is the CycleGAN?<\/li>\n<li>How to Prepare the Horses to Zebras Dataset<\/li>\n<li>How to Develop a CycleGAN to Translate Horses to Zebras<\/li>\n<li>How to Perform Image Translation with CycleGAN Generators<\/li>\n<\/ol>\n<h2>What Is the CycleGAN?<\/h2>\n<p>The CycleGAN model was described by <a href=\"https:\/\/people.csail.mit.edu\/junyanz\/\">Jun-Yan Zhu<\/a>, et al. in their 2017 paper titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1703.10593\">Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks<\/a>.\u201d<\/p>\n<p>The benefit of the CycleGAN model is that it can be trained without paired examples. That is, it does not require examples of photographs before and after the translation in order to train the model, e.g. photos of the same city landscape during the day and at night. Instead, the model is able to use a collection of photographs from each domain and extract and harness the underlying style of images in the collection in order to perform the translation.<\/p>\n<p>The model architecture is comprised of two generator models: one generator (Generator-A) for generating images for the first domain (Domain-A) and the second generator (Generator-B) for generating images for the second domain (Domain-B).<\/p>\n<ul>\n<li>Generator-A -> Domain-A<\/li>\n<li>Generator-B -> Domain-B<\/li>\n<\/ul>\n<p>The generator models perform image translation, meaning that the image generation process is conditional on an input image, specifically an image from the other domain. 
Generator-A takes an image from Domain-B as input and Generator-B takes an image from Domain-A as input.<\/p>\n<ul>\n<li>Domain-B -> Generator-A -> Domain-A<\/li>\n<li>Domain-A -> Generator-B -> Domain-B<\/li>\n<\/ul>\n<p>Each generator has a corresponding discriminator model. The first discriminator model (Discriminator-A) takes real images from Domain-A and generated images from Generator-A and predicts whether they are real or fake. The second discriminator model (Discriminator-B) takes real images from Domain-B and generated images from Generator-B and predicts whether they are real or fake.<\/p>\n<ul>\n<li>Domain-A -> Discriminator-A -> [Real\/Fake]<\/li>\n<li>Domain-B -> Generator-A -> Discriminator-A -> [Real\/Fake]<\/li>\n<li>Domain-B -> Discriminator-B -> [Real\/Fake]<\/li>\n<li>Domain-A -> Generator-B -> Discriminator-B -> [Real\/Fake]<\/li>\n<\/ul>\n<p>The discriminator and generator models are trained in an adversarial zero-sum process, like normal GAN models. The generators learn to better fool the discriminators and the discriminators learn to better detect fake images. Together, the models find an equilibrium during the training process.<\/p>\n<p>Additionally, the generator models are regularized not just to create new images in the target domain, but to translate the input images from the source domain in a way that preserves enough information for them to be reconstructed. This is achieved by using generated images as input to the corresponding generator model and comparing the output image to the original image. Passing an image through both generators is called a cycle. Together, each pair of generator models is trained to better reproduce the original source image, referred to as cycle consistency.<\/p>\n<ul>\n<li>Domain-B -> Generator-A -> Domain-A -> Generator-B -> Domain-B<\/li>\n<li>Domain-A -> Generator-B -> Domain-B -> Generator-A -> Domain-A<\/li>\n<\/ul>\n<p>There is one further element to the architecture, referred to as the identity mapping. 
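The cycle described above can be sketched numerically. The example below uses two toy stand-in generators (hypothetical placeholders, not the CNN models developed later in this tutorial) and computes the cycle consistency loss as the L1 distance between a Domain-B image and its reconstruction after a full B -> A -> B cycle.

```python
import numpy as np

# toy stand-ins for the two generators; the real models are deep
# convolutional networks defined later in the tutorial
def generator_A(image_b):
	# Domain-B -> Domain-A (illustrative mapping only)
	return image_b * 0.5

def generator_B(image_a):
	# Domain-A -> Domain-B (exact inverse of the toy generator_A)
	return image_a * 2.0

# L1 (mean absolute error) between a source image and its reconstruction
def cycle_loss(real, reconstructed):
	return np.mean(np.abs(real - reconstructed))

# a random 256x256 color "photo" standing in for a Domain-B image
real_b = np.random.rand(256, 256, 3)
# full cycle: Domain-B -> Generator-A -> Domain-A -> Generator-B -> Domain-B
reconstructed_b = generator_B(generator_A(real_b))
print(cycle_loss(real_b, reconstructed_b))  # -> 0.0 for a perfect cycle
```

Because the toy generators invert each other exactly, the cycle loss is zero; training pushes the two real generator models toward this behavior.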
This is where a generator is provided with images as input from the target domain and is expected to generate the same image without change. This addition to the architecture is optional, although it results in a better matching of the color profile of the input image.<\/p>\n<ul>\n<li>Domain-A -> Generator-A -> Domain-A<\/li>\n<li>Domain-B -> Generator-B -> Domain-B<\/li>\n<\/ul>\n<p>Now that we are familiar with the model architecture, we can take a closer look at each model in turn and how they can be implemented.<\/p>\n<p>The <a href=\"https:\/\/arxiv.org\/abs\/1703.10593\">paper<\/a> provides a good description of the models and training process, although the <a href=\"https:\/\/github.com\/junyanz\/CycleGAN\">official Torch implementation<\/a> was used as the definitive description for each model and training process and provides the basis for the model implementations described below.<\/p>\n<h2>How to Prepare the Horses to Zebras Dataset<\/h2>\n<p>One of the impressive examples of the CycleGAN in the paper was to transform photographs of horses to zebras, and the reverse, zebras to horses.<\/p>\n<p>The authors of the paper referred to this as the problem of \u201c<em>object transfiguration<\/em>\u201d and it was also demonstrated on photographs of apples and oranges.<\/p>\n<p>In this tutorial, we will develop a CycleGAN from scratch for image-to-image translation (or object transfiguration) from horses to zebras and the reverse.<\/p>\n<p>We will refer to this dataset as \u201c<em>horses2zebra<\/em>\u201c. The zip file for this dataset is about 111 megabytes and can be downloaded from the CycleGAN webpage:<\/p>\n<ul>\n<li><a href=\"https:\/\/people.eecs.berkeley.edu\/~taesung_park\/CycleGAN\/datasets\/horse2zebra.zip\">Download Horses to Zebras Dataset (111 megabytes)<\/a><\/li>\n<\/ul>\n<p>Download the dataset into your current working directory.<\/p>\n<p>You will see the following directory structure:<\/p>\n<pre class=\"crayon-plain-tag\">horse2zebra\r\n\u251c\u2500\u2500 testA\r\n\u251c\u2500\u2500 testB\r\n\u251c\u2500\u2500 trainA\r\n\u2514\u2500\u2500 trainB<\/pre>\n<p>The \u201c<em>A<\/em>\u201d category refers to horses and the \u201c<em>B<\/em>\u201d category refers to zebras, and the dataset is comprised of train and test elements. 
We will load all photographs and use them as a training dataset.<\/p>\n<p>The photographs are square with the shape 256\u00d7256 and have filenames like \u201c<em>n02381460_2.jpg<\/em>\u201c.<\/p>\n<p>The example below will load all photographs from the train and test folders and create an array of images for category A and another for category B.<\/p>\n<p>Both arrays are then saved to a new file in compressed NumPy array format.<\/p>\n<pre class=\"crayon-plain-tag\"># example of preparing the horses and zebra dataset\r\nfrom os import listdir\r\nfrom numpy import asarray\r\nfrom numpy import vstack\r\nfrom keras.preprocessing.image import img_to_array\r\nfrom keras.preprocessing.image import load_img\r\nfrom numpy import savez_compressed\r\n\r\n# load all images in a directory into memory\r\ndef load_images(path, size=(256,256)):\r\n\tdata_list = list()\r\n\t# enumerate filenames in directory, assume all are images\r\n\tfor filename in listdir(path):\r\n\t\t# load and resize the image\r\n\t\tpixels = load_img(path + filename, target_size=size)\r\n\t\t# convert to numpy array\r\n\t\tpixels = img_to_array(pixels)\r\n\t\t# store\r\n\t\tdata_list.append(pixels)\r\n\treturn asarray(data_list)\r\n\r\n# dataset path\r\npath = 'horse2zebra\/'\r\n# load dataset A\r\ndataA1 = load_images(path + 'trainA\/')\r\ndataA2 = load_images(path + 'testA\/')\r\ndataA = vstack((dataA1, dataA2))\r\nprint('Loaded dataA: ', dataA.shape)\r\n# load dataset B\r\ndataB1 = load_images(path + 'trainB\/')\r\ndataB2 = load_images(path + 'testB\/')\r\ndataB = vstack((dataB1, dataB2))\r\nprint('Loaded dataB: ', dataB.shape)\r\n# save as compressed numpy array\r\nfilename = 'horse2zebra_256.npz'\r\nsavez_compressed(filename, dataA, dataB)\r\nprint('Saved dataset: ', filename)<\/pre>\n<p>Running the example first loads all images into memory, showing that there are 1,187 photos in category A (horses) and 1,474 in category B (zebras).<\/p>\n<p>The arrays are then saved in compressed NumPy format with 
the filename \u201c<em>horse2zebra_256.npz<\/em>\u201c. Note: this data file is about 570 megabytes, larger than the raw images as we are storing pixel values as 32-bit floating point values.<\/p>\n<pre class=\"crayon-plain-tag\">Loaded dataA:  (1187, 256, 256, 3)\r\nLoaded dataB:  (1474, 256, 256, 3)\r\nSaved dataset:  horse2zebra_256.npz<\/pre>\n<p>We can then load the dataset and plot some of the photos to confirm that we are handling the image data correctly.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># load and plot the prepared dataset\r\nfrom numpy import load\r\nfrom matplotlib import pyplot\r\n# load the dataset\r\ndata = load('horse2zebra_256.npz')\r\ndataA, dataB = data['arr_0'], data['arr_1']\r\nprint('Loaded: ', dataA.shape, dataB.shape)\r\n# plot source images\r\nn_samples = 3\r\nfor i in range(n_samples):\r\n\tpyplot.subplot(2, n_samples, 1 + i)\r\n\tpyplot.axis('off')\r\n\tpyplot.imshow(dataA[i].astype('uint8'))\r\n# plot target image\r\nfor i in range(n_samples):\r\n\tpyplot.subplot(2, n_samples, 1 + n_samples + i)\r\n\tpyplot.axis('off')\r\n\tpyplot.imshow(dataB[i].astype('uint8'))\r\npyplot.show()<\/pre>\n<p>Running the example first loads the dataset, confirming the number of examples and shape of the color images match our expectations.<\/p>\n<pre class=\"crayon-plain-tag\">Loaded: (1187, 256, 256, 3) (1474, 256, 256, 3)<\/pre>\n<p>A plot is created showing a row of three images from the horse photo dataset (dataA) and a row of three images from the zebra dataset (dataB).<\/p>\n<div id=\"attachment_8400\" style=\"width: 1034px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8400\" class=\"size-large wp-image-8400\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-Photographs-from-the-Horses2Zeba-Dataset-1024x768.png\" alt=\"Plot of Photographs from the Horses2Zeba Dataset\" width=\"1024\" height=\"768\" 
srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Photographs-from-the-Horses2Zeba-Dataset-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Photographs-from-the-Horses2Zeba-Dataset-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Photographs-from-the-Horses2Zeba-Dataset-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Photographs-from-the-Horses2Zeba-Dataset.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/p>\n<p id=\"caption-attachment-8400\" class=\"wp-caption-text\">Plot of Photographs from the Horses2Zebra Dataset<\/p>\n<\/div>\n<p>Now that we have prepared the dataset for modeling, we can develop the CycleGAN generator models that can translate photos from one category to the other, and the reverse.<\/p>\n<h2>How to Develop a CycleGAN to Translate Horses to Zebras<\/h2>\n<p>In this section, we will develop the CycleGAN model for translating photos of horses to zebras and photos of zebras to horses.<\/p>\n<p>The same model architecture and configuration described in the paper was used across a range of image-to-image translation tasks. 
This architecture is described in the body of the paper, with additional detail in the appendix, and a <a href=\"https:\/\/github.com\/junyanz\/CycleGAN\/\">fully working implementation<\/a> is provided as open source for the Torch deep learning framework.<\/p>\n<p>The implementation in this section will use the Keras deep learning framework based directly on the model described in the paper and implemented in the author\u2019s codebase, designed to take and generate color images with the size 256\u00d7256 pixels.<\/p>\n<p>The architecture is comprised of four models: two discriminator models and two generator models.<\/p>\n<p>The discriminator is a deep <a href=\"https:\/\/machinelearningmastery.com\/convolutional-layers-for-deep-learning-neural-networks\/\">convolutional neural network<\/a> that performs image classification. It takes a source image as input and predicts the likelihood of whether the image is real or fake. Two discriminator models are used, one for Domain-A (horses) and one for Domain-B (zebras).<\/p>\n<p>The discriminator design is based on the effective receptive field of the model, which defines the relationship between one output of the model and the number of pixels in the input image that it maps to. This is called a PatchGAN model and is carefully designed so that each output prediction of the model maps to a 70\u00d770 square or patch of the input image. The benefit of this approach is that the same model can be applied to input images of different sizes, e.g. larger or smaller than 256\u00d7256 pixels.<\/p>\n<p>The output of the model depends on the size of the input image but may be one value or a square activation map of values. Each value is the probability that a patch in the input image is real. 
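The 70×70 figure can be confirmed with a short calculation. The receptive field of a stack of convolutional layers can be computed by working backward from a single output unit, applying r = (r - 1) * stride + kernel at each layer; the sketch below assumes the PatchGAN configuration described in the paper (three 4×4 stride-2 convolutions followed by two 4×4 stride-1 layers).

```python
# effective receptive field of a stack of convolutions, computed by
# walking backward from one output unit: r = (r - 1) * stride + kernel
def receptive_field(layers):
	# layers: list of (kernel_size, stride), ordered first to last
	rf = 1
	for kernel, stride in reversed(layers):
		rf = (rf - 1) * stride + kernel
	return rf

# PatchGAN configuration from the paper: three 4x4 stride-2 convolutions
# followed by two 4x4 stride-1 layers (C512 and the 1-channel output)
patchgan = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan))  # -> 70
```

The same helper can be used to check the receptive field of any alternative discriminator configuration you may experiment with.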
These values can be averaged to give an overall likelihood or classification score if needed.<\/p>\n<p>A pattern of Convolutional-BatchNorm-LeakyReLU layers is used in the model, which is common to deep convolutional discriminator models. Unlike other models, the CycleGAN discriminator uses <em>InstanceNormalization<\/em> instead of <em>BatchNormalization<\/em>. It is a very simple type of normalization and involves standardizing (e.g. scaling to a standard Gaussian) the values on each output feature map, rather than across features in a batch.<\/p>\n<p>An implementation of instance normalization is provided in the <a href=\"https:\/\/github.com\/keras-team\/keras-contrib\">keras-contrib project<\/a> that provides early access to community supplied Keras features.<\/p>\n<p>The keras-contrib library can be installed via pip as follows:<\/p>\n<pre class=\"crayon-plain-tag\">sudo pip install git+https:\/\/www.github.com\/keras-team\/keras-contrib.git<\/pre>\n<p>Or, if you are using an <a href=\"https:\/\/machinelearningmastery.com\/setup-python-environment-machine-learning-deep-learning-anaconda\/\">Anaconda<\/a> virtual environment, <a href=\"https:\/\/machinelearningmastery.com\/develop-evaluate-large-deep-learning-models-keras-amazon-web-services\/\">such as on EC2<\/a>:<\/p>\n<pre class=\"crayon-plain-tag\">git clone https:\/\/www.github.com\/keras-team\/keras-contrib.git\r\ncd keras-contrib\r\nsudo ~\/anaconda3\/envs\/tensorflow_p36\/bin\/python setup.py install<\/pre>\n<p>The new <em>InstanceNormalization<\/em> layer can then be used as follows:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\nfrom keras_contrib.layers.normalization.instancenormalization import InstanceNormalization\r\n# define layer\r\nlayer = InstanceNormalization(axis=-1)\r\n...<\/pre>\n<p>The \u201c<em>axis<\/em>\u201d argument is set to -1 to ensure that features are normalized per feature map.<\/p>\n<p>The <em>define_discriminator()<\/em> function below implements the 70\u00d770 PatchGAN 
discriminator model as per the design of the model in the paper. The model takes a 256\u00d7256 sized image as input and outputs a patch of predictions. The model is optimized using least squares loss (L2) implemented as mean squared error, and a weighting is used so that updates to the model have half (0.5) the usual effect. The authors of the CycleGAN paper recommend this weighting of model updates to slow down changes to the discriminator, relative to the generator model during training.<\/p>\n<pre class=\"crayon-plain-tag\"># define the discriminator model\r\ndef define_discriminator(image_shape):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# source image input\r\n\tin_image = Input(shape=image_shape)\r\n\t# C64\r\n\td = Conv2D(64, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# C128\r\n\td = Conv2D(128, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# C256\r\n\td = Conv2D(256, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# C512\r\n\td = Conv2D(512, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# second last output layer\r\n\td = Conv2D(512, (4,4), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# patch output\r\n\tpatch_out = Conv2D(1, (4,4), padding='same', kernel_initializer=init)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, patch_out)\r\n\t# compile model\r\n\tmodel.compile(loss='mse', optimizer=Adam(lr=0.0002, beta_1=0.5), loss_weights=[0.5])\r\n\treturn model<\/pre>\n<p>The generator model is more complex than the discriminator model.<\/p>\n<p>The generator is an encoder-decoder model 
architecture. The model takes a source image (e.g. horse photo) and generates a target image (e.g. zebra photo). It does this by first downsampling or encoding the input image down to a bottleneck layer, then interpreting the encoding with a number of ResNet layers that use skip connections, followed by a series of layers that upsample or decode the representation to the size of the output image.<\/p>\n<p>First, we need a function to define the <a href=\"https:\/\/machinelearningmastery.com\/how-to-implement-major-architecture-innovations-for-convolutional-neural-networks\/\">ResNet blocks<\/a>. These are blocks comprised of two 3\u00d73 CNN layers where the input to the block is concatenated to the output of the block, channel-wise.<\/p>\n<p>This is implemented in the <em>resnet_block()<\/em> function that creates two <em>Convolution-InstanceNorm<\/em> blocks with 3\u00d73 filters and <a href=\"https:\/\/machinelearningmastery.com\/padding-and-stride-for-convolutional-neural-networks\/\">1\u00d71 stride<\/a> and without a <a href=\"https:\/\/machinelearningmastery.com\/rectified-linear-activation-function-for-deep-learning-neural-networks\/\">ReLU activation<\/a> after the second block, matching the official Torch implementation in the <a href=\"https:\/\/github.com\/junyanz\/CycleGAN\/blob\/master\/models\/architectures.lua#L197\">build_conv_block() function<\/a>. 
Same padding is used instead of the reflection padding recommended in the paper, for simplicity.<\/p>\n<pre class=\"crayon-plain-tag\"># generate a resnet block\r\ndef resnet_block(n_filters, input_layer):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# first convolutional layer\r\n\tg = Conv2D(n_filters, (3,3), padding='same', kernel_initializer=init)(input_layer)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# second convolutional layer\r\n\tg = Conv2D(n_filters, (3,3), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\t# concatenate merge channel-wise with input layer\r\n\tg = Concatenate()([g, input_layer])\r\n\treturn g<\/pre>\n<p>Next, we can define a function that will create the 9-resnet block version for 256\u00d7256 input images. This can easily be changed to the 6-resnet block version by setting the <em>image_shape<\/em> to (128x128x3) and the <em>n_resnet<\/em> function argument to 6.<\/p>\n<p>Importantly, the model outputs pixel values with the same shape as the input, and pixel values are in the range [-1, 1], typical for GAN generator models.<\/p>\n<pre class=\"crayon-plain-tag\"># define the standalone generator model\r\ndef define_generator(image_shape, n_resnet=9):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# image input\r\n\tin_image = Input(shape=image_shape)\r\n\t# c7s1-64\r\n\tg = Conv2D(64, (7,7), padding='same', kernel_initializer=init)(in_image)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# d128\r\n\tg = Conv2D(128, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# d256\r\n\tg = Conv2D(256, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# R256\r\n\tfor _ in range(n_resnet):\r\n\t\tg = 
resnet_block(256, g)\r\n\t# u128\r\n\tg = Conv2DTranspose(128, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# u64\r\n\tg = Conv2DTranspose(64, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# c7s1-3\r\n\tg = Conv2D(3, (7,7), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tout_image = Activation('tanh')(g)\r\n\t# define model\r\n\tmodel = Model(in_image, out_image)\r\n\treturn model<\/pre>\n<p>The discriminator models are trained directly on real and generated images, whereas the generator models are not.<\/p>\n<p>Instead, the generator models are trained via their related discriminator models. Specifically, they are updated to minimize the loss predicted by the discriminator for generated images marked as \u201c<em>real<\/em>\u201c, called adversarial loss. As such, they are encouraged to generate images that better fit into the target domain.<\/p>\n<p>The generator models are also updated based on how effective they are at the regeneration of a source image when used with the other generator model, called cycle loss. 
Finally, a generator model is expected to output an image without translation when provided with an example from the target domain, called identity loss.<\/p>\n<p>Altogether, each generator model is optimized via the combination of four outputs with four loss functions:<\/p>\n<ul>\n<li>Adversarial loss (L2 or mean squared error).<\/li>\n<li>Identity loss (L1 or mean absolute error).<\/li>\n<li>Forward cycle loss (L1 or mean absolute error).<\/li>\n<li>Backward cycle loss (L1 or mean absolute error).<\/li>\n<\/ul>\n<p>This can be achieved by defining a composite model for each generator that updates only that generator\u2019s weights, while still making use of the weights of the related discriminator model and the other generator model.<\/p>\n<p>This is implemented in the <em>define_composite_model()<\/em> function below that takes a defined generator model (<em>g_model_1<\/em>) as well as the defined discriminator model for the generator model\u2019s output (<em>d_model<\/em>) and the other generator model (<em>g_model_2<\/em>). The weights of the other models are marked as not trainable as we are only interested in updating the first generator model, i.e. the focus of this composite model.<\/p>\n<p>The discriminator is connected to the output of the generator in order to classify generated images as real or fake. A second input for the composite model is defined as an image from the target domain (instead of the source domain), which the generator is expected to output without translation for the identity mapping. Next, forward cycle loss involves connecting the output of the generator to the other generator, which will reconstruct the source image. 
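How these four losses combine into a single generator update can be sketched with hypothetical values. The weightings below (adversarial 1, identity 5, cycle 10) match the configuration described in the paper and official implementation; the loss values themselves are made up for illustration.

```python
# hypothetical per-output losses for a single composite-model update;
# real values come from the discriminator and the L1 reconstructions
adversarial = 0.25     # L2 loss against a "real" (1.0) patch target
identity = 0.10        # L1 loss for the identity mapping
forward_cycle = 0.30   # L1 loss for the forward cycle reconstruction
backward_cycle = 0.20  # L1 loss for the backward cycle reconstruction

# weighted sum used to update the generator: cycle losses weighted 10x
# the adversarial loss, identity weighted at half the cycle weight
weights = (1, 5, 10, 10)
losses = (adversarial, identity, forward_cycle, backward_cycle)
total = sum(w * l for w, l in zip(weights, losses))
print(total)  # -> 5.75
```

The heavy cycle weighting means a generator is penalized far more for failing to preserve the source image through a cycle than for failing to fool the discriminator.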
Finally, for the backward cycle loss, the same image from the target domain used for the identity mapping is also passed through the other generator, and its output is fed into our main generator, which outputs a reconstructed version of that image from the target domain.<\/p>\n<p>To summarize, a composite model has two inputs for the real photos from Domain-A and Domain-B, and four outputs for the discriminator output, identity generated image, forward cycle generated image, and backward cycle generated image.<\/p>\n<p>Only the weights of the first or main generator model are updated for the composite model and this is done via the weighted sum of all loss functions. The cycle loss is given more weight (10-times) than the adversarial loss as described in the paper, and the identity loss is always used with a weighting half that of the cycle loss (5-times), matching the official implementation source code.<\/p>\n<pre class=\"crayon-plain-tag\"># define a composite model for updating generators by adversarial and cycle loss\r\ndef define_composite_model(g_model_1, d_model, g_model_2, image_shape):\r\n\t# ensure the model we're updating is trainable\r\n\tg_model_1.trainable = True\r\n\t# mark discriminator as not trainable\r\n\td_model.trainable = False\r\n\t# mark other generator model as not trainable\r\n\tg_model_2.trainable = False\r\n\t# discriminator element\r\n\tinput_gen = Input(shape=image_shape)\r\n\tgen1_out = g_model_1(input_gen)\r\n\toutput_d = d_model(gen1_out)\r\n\t# identity element\r\n\tinput_id = Input(shape=image_shape)\r\n\toutput_id = g_model_1(input_id)\r\n\t# forward cycle\r\n\toutput_f = g_model_2(gen1_out)\r\n\t# backward cycle\r\n\tgen2_out = g_model_2(input_id)\r\n\toutput_b = g_model_1(gen2_out)\r\n\t# define model graph\r\n\tmodel = Model([input_gen, input_id], [output_d, output_id, output_f, output_b])\r\n\t# define optimization algorithm configuration\r\n\topt = Adam(lr=0.0002, beta_1=0.5)\r\n\t# compile model with weighting 
of least squares loss and L1 loss\r\n\tmodel.compile(loss=['mse', 'mae', 'mae', 'mae'], loss_weights=[1, 5, 10, 10], optimizer=opt)\r\n\treturn model<\/pre>\n<p>We need to create a composite model for each generator model, e.g. the Generator-A (BtoA) for zebra to horse translation, and the Generator-B (AtoB) for horse to zebra translation.<\/p>\n<p>All of this forward and backward across two domains gets confusing. Below is a complete listing of all of the inputs and outputs for each of the composite models. Identity and cycle loss are calculated as the L1 distance between the input and output image for each sequence of translations. Adversarial loss is calculated as the L2 distance between the model output and the target values of 1.0 for real and 0.0 for fake.<\/p>\n<p><strong>Generator-A Composite Model (BtoA or Zebra to Horse)<\/strong><\/p>\n<p>The inputs, transformations, and outputs of the model are as follows:<\/p>\n<ul>\n<li><strong>Adversarial Loss<\/strong>: Domain-B -> Generator-A -> Domain-A -> Discriminator-A -> [real\/fake]<\/li>\n<li><strong>Identity Loss<\/strong>: Domain-A -> Generator-A -> Domain-A<\/li>\n<li><strong>Forward Cycle Loss<\/strong>: Domain-B -> Generator-A -> Domain-A -> Generator-B -> Domain-B<\/li>\n<li><strong>Backward Cycle Loss<\/strong>: Domain-A -> Generator-B -> Domain-B -> Generator-A -> Domain-A<\/li>\n<\/ul>\n<p>We can summarize the inputs and outputs as:<\/p>\n<ul>\n<li><strong>Inputs<\/strong>: Domain-B, Domain-A<\/li>\n<li><strong>Outputs<\/strong>: Real, Domain-A, Domain-B, Domain-A<\/li>\n<\/ul>\n<p><strong>Generator-B Composite Model (AtoB or Horse to Zebra)<\/strong><\/p>\n<p>The inputs, transformations, and outputs of the model are as follows:<\/p>\n<ul>\n<li><strong>Adversarial Loss<\/strong>: Domain-A -> Generator-B -> Domain-B -> Discriminator-B -> [real\/fake]<\/li>\n<li><strong>Identity Loss<\/strong>: Domain-B -> Generator-B -> Domain-B<\/li>\n<li><strong>Forward Cycle Loss<\/strong>: Domain-A -> Generator-B 
-> Domain-B -> Generator-A -> Domain-A<\/li>\n<li><strong>Backward Cycle Loss<\/strong>: Domain-B -> Generator-A -> Domain-A -> Generator-B -> Domain-B<\/li>\n<\/ul>\n<p>We can summarize the inputs and outputs as:<\/p>\n<ul>\n<li><strong>Inputs<\/strong>: Domain-A, Domain-B<\/li>\n<li><strong>Outputs<\/strong>: Real, Domain-B, Domain-A, Domain-B<\/li>\n<\/ul>\n<p>Defining the models is the hard part of the CycleGAN; the rest is standard GAN training and relatively straightforward.<\/p>\n<p>Next, we can load our unpaired image dataset in compressed NumPy array format. This will return a list of two NumPy arrays: the first for the Domain-A (horse) images and the second for the Domain-B (zebra) images.<\/p>\n<pre class=\"crayon-plain-tag\"># load and prepare training images\r\ndef load_real_samples(filename):\r\n\t# load the dataset\r\n\tdata = load(filename)\r\n\t# unpack arrays\r\n\tX1, X2 = data['arr_0'], data['arr_1']\r\n\t# scale from [0,255] to [-1,1]\r\n\tX1 = (X1 - 127.5) \/ 127.5\r\n\tX2 = (X2 - 127.5) \/ 127.5\r\n\treturn [X1, X2]<\/pre>\n<p>Each training iteration, we will require a sample of real images from each domain as input to the discriminator and composite generator models. This can be achieved by selecting a random batch of samples.<\/p>\n<p>The <em>generate_real_samples()<\/em> function below implements this, taking a <a href=\"https:\/\/machinelearningmastery.com\/gentle-introduction-n-dimensional-arrays-python-numpy\/\">NumPy array<\/a> for a domain as input and returning the requested number of randomly selected images, as well as the target for the PatchGAN discriminator model indicating the images are real (<em>target=1.0<\/em>). 
As such, the shape of the PatchGAN output is also provided via the patch_shape function argument; in the case of 256\u00d7256 images this will be 16, corresponding to a 16x16x1 activation map.<\/p>\n<pre class=\"crayon-plain-tag\"># select a batch of random samples, returns images and target\r\ndef generate_real_samples(dataset, n_samples, patch_shape):\r\n\t# choose random instances\r\n\tix = randint(0, dataset.shape[0], n_samples)\r\n\t# retrieve selected images\r\n\tX = dataset[ix]\r\n\t# generate 'real' class labels (1)\r\n\ty = ones((n_samples, patch_shape, patch_shape, 1))\r\n\treturn X, y<\/pre>\n<p>Similarly, a sample of generated images is required to update each discriminator model in each training iteration.<\/p>\n<p>The <em>generate_fake_samples()<\/em> function below generates this sample given a generator model and the sample of real images from the source domain. Again, target values for each generated image are provided with the correct shape of the PatchGAN, indicating that they are fake or generated (<em>target=0.0<\/em>).<\/p>\n<pre class=\"crayon-plain-tag\"># generate a batch of images, returns images and targets\r\ndef generate_fake_samples(g_model, dataset, patch_shape):\r\n\t# generate fake instance\r\n\tX = g_model.predict(dataset)\r\n\t# create 'fake' class labels (0)\r\n\ty = zeros((len(X), patch_shape, patch_shape, 1))\r\n\treturn X, y<\/pre>\n<p>Typically, GAN models do not converge; instead, an equilibrium is found between the generator and discriminator models. As such, we cannot easily judge whether training should stop. 
Therefore, we can save the model and use it to generate sample image-to-image translations periodically during training, such as every one or five training epochs.<\/p>\n<p>We can then review the generated images at the end of training and use the image quality to choose a final model.<\/p>\n<p>The <em>save_models()<\/em> function below will <a href=\"https:\/\/machinelearningmastery.com\/save-load-keras-deep-learning-models\/\">save each generator model<\/a> to the current directory in H5 format, including the training iteration number in the filename. This will require that the <a href=\"https:\/\/www.h5py.org\/\">h5py library is installed<\/a>.<\/p>\n<pre class=\"crayon-plain-tag\"># save the generator models to file\r\ndef save_models(step, g_model_AtoB, g_model_BtoA):\r\n\t# save the first generator model\r\n\tfilename1 = 'g_model_AtoB_%06d.h5' % (step+1)\r\n\tg_model_AtoB.save(filename1)\r\n\t# save the second generator model\r\n\tfilename2 = 'g_model_BtoA_%06d.h5' % (step+1)\r\n\tg_model_BtoA.save(filename2)\r\n\tprint('>Saved: %s and %s' % (filename1, filename2))<\/pre>\n<p>The <em>summarize_performance()<\/em> function below uses a given generator model to generate translated versions of a few randomly selected source photographs and saves the plot to file.<\/p>\n<p>The source images are plotted on the first row and the generated images are plotted on the second row. 
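For reference, the '%06d' format specifier used above zero-pads the iteration number to six digits, so that saved files sort lexicographically in training order. A minimal sketch (the helper function name is illustrative, not part of the tutorial code):

```python
# sketch: zero-padded checkpoint filenames, as used by save_models()
def model_filename(prefix, step):
	# '%06d' pads to six digits so lexicographic order matches training order
	return '%s_%06d.h5' % (prefix, step + 1)

print(model_filename('g_model_AtoB', 1186))    # end of epoch 1 -> g_model_AtoB_001187.h5
print(model_filename('g_model_AtoB', 118699))  # end of epoch 100 -> g_model_AtoB_118700.h5
```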
Again, the plot filename includes the training iteration number.<\/p>\n<pre class=\"crayon-plain-tag\"># generate samples and save as a plot\r\ndef summarize_performance(step, g_model, trainX, name, n_samples=5):\r\n\t# select a sample of input images\r\n\tX_in, _ = generate_real_samples(trainX, n_samples, 0)\r\n\t# generate translated images\r\n\tX_out, _ = generate_fake_samples(g_model, X_in, 0)\r\n\t# scale all pixels from [-1,1] to [0,1]\r\n\tX_in = (X_in + 1) \/ 2.0\r\n\tX_out = (X_out + 1) \/ 2.0\r\n\t# plot real images\r\n\tfor i in range(n_samples):\r\n\t\tpyplot.subplot(2, n_samples, 1 + i)\r\n\t\tpyplot.axis('off')\r\n\t\tpyplot.imshow(X_in[i])\r\n\t# plot translated image\r\n\tfor i in range(n_samples):\r\n\t\tpyplot.subplot(2, n_samples, 1 + n_samples + i)\r\n\t\tpyplot.axis('off')\r\n\t\tpyplot.imshow(X_out[i])\r\n\t# save plot to file\r\n\tfilename1 = '%s_generated_plot_%06d.png' % (name, (step+1))\r\n\tpyplot.savefig(filename1)\r\n\tpyplot.close()<\/pre>\n<p>We are nearly ready to define the training of the models.<\/p>\n<p>The discriminator models are updated directly on real and generated images, although in an effort to further manage how quickly the discriminator models learn, a pool of fake images is maintained.<\/p>\n<p>The paper defines an image pool of 50 generated images for each discriminator model. The pool is first filled with generated images; once full, a new image either replaces a randomly selected image in the pool, in which case the replaced image is used, or is used directly, each with 50% probability. 
We can implement this as a Python list of images for each discriminator and use the <em>update_image_pool()<\/em> function below to maintain each pool list.<\/p>\n<pre class=\"crayon-plain-tag\"># update image pool for fake images\r\ndef update_image_pool(pool, images, max_size=50):\r\n\tselected = list()\r\n\tfor image in images:\r\n\t\tif len(pool) < max_size:\r\n\t\t\t# stock the pool\r\n\t\t\tpool.append(image)\r\n\t\t\tselected.append(image)\r\n\t\telif random() < 0.5:\r\n\t\t\t# use image, but don't add it to the pool\r\n\t\t\tselected.append(image)\r\n\t\telse:\r\n\t\t\t# replace an existing image and use replaced image\r\n\t\t\tix = randint(0, len(pool))\r\n\t\t\tselected.append(pool[ix])\r\n\t\t\tpool[ix] = image\r\n\treturn asarray(selected)<\/pre>\n<p>We can now define the training of each of the generator models.<\/p>\n<p>The <em>train()<\/em> function below takes all six models (two discriminator, two generator, and two composite models) as arguments along with the dataset and trains the models.<\/p>\n<p>The batch size is fixed at one image to match the description in the paper and the models are fit for <a href=\"https:\/\/machinelearningmastery.com\/difference-between-a-batch-and-an-epoch\/\">100 epochs<\/a>. Given that the horses dataset has 1,187 images, one epoch is defined as 1,187 batches and the same number of training iterations. Sample images are generated using both generators at the end of each epoch, and models are saved every five epochs, or every (1,187 * 5) 5,935 training iterations.<\/p>\n<p>The order of model updates is implemented to match the official Torch implementation. First, a batch of real images from each domain is selected, then a batch of fake images for each domain is generated. The fake images are then used to update each discriminator\u2019s fake image pool.<\/p>\n<p>Next, the Generator-A model (zebras to horses) is updated via the composite model, followed by the Discriminator-A model (horses). 
Then the Generator-B (horses to zebras) composite model and Discriminator-B (zebras) models are updated.<\/p>\n<p>Loss for each of the updated models is then reported at the end of the training iteration. Importantly, only the weighted average loss used to update each generator is reported.<\/p>\n<pre class=\"crayon-plain-tag\"># train cyclegan models\r\ndef train(d_model_A, d_model_B, g_model_AtoB, g_model_BtoA, c_model_AtoB, c_model_BtoA, dataset):\r\n\t# define properties of the training run\r\n\tn_epochs, n_batch = 100, 1\r\n\t# determine the output square shape of the discriminator\r\n\tn_patch = d_model_A.output_shape[1]\r\n\t# unpack dataset\r\n\ttrainA, trainB = dataset\r\n\t# prepare image pool for fakes\r\n\tpoolA, poolB = list(), list()\r\n\t# calculate the number of batches per training epoch\r\n\tbat_per_epo = int(len(trainA) \/ n_batch)\r\n\t# calculate the number of training iterations\r\n\tn_steps = bat_per_epo * n_epochs\r\n\t# manually enumerate epochs\r\n\tfor i in range(n_steps):\r\n\t\t# select a batch of real samples\r\n\t\tX_realA, y_realA = generate_real_samples(trainA, n_batch, n_patch)\r\n\t\tX_realB, y_realB = generate_real_samples(trainB, n_batch, n_patch)\r\n\t\t# generate a batch of fake samples\r\n\t\tX_fakeA, y_fakeA = generate_fake_samples(g_model_BtoA, X_realB, n_patch)\r\n\t\tX_fakeB, y_fakeB = generate_fake_samples(g_model_AtoB, X_realA, n_patch)\r\n\t\t# update fakes from pool\r\n\t\tX_fakeA = update_image_pool(poolA, X_fakeA)\r\n\t\tX_fakeB = update_image_pool(poolB, X_fakeB)\r\n\t\t# update generator B->A via adversarial and cycle loss\r\n\t\tg_loss2, _, _, _, _  = c_model_BtoA.train_on_batch([X_realB, X_realA], [y_realA, X_realA, X_realB, X_realA])\r\n\t\t# update discriminator for A -> [real\/fake]\r\n\t\tdA_loss1 = d_model_A.train_on_batch(X_realA, y_realA)\r\n\t\tdA_loss2 = d_model_A.train_on_batch(X_fakeA, y_fakeA)\r\n\t\t# update generator A->B via adversarial and cycle loss\r\n\t\tg_loss1, _, _, _, _ = 
c_model_AtoB.train_on_batch([X_realA, X_realB], [y_realB, X_realB, X_realA, X_realB])\r\n\t\t# update discriminator for B -> [real\/fake]\r\n\t\tdB_loss1 = d_model_B.train_on_batch(X_realB, y_realB)\r\n\t\tdB_loss2 = d_model_B.train_on_batch(X_fakeB, y_fakeB)\r\n\t\t# summarize performance\r\n\t\tprint('>%d, dA[%.3f,%.3f] dB[%.3f,%.3f] g[%.3f,%.3f]' % (i+1, dA_loss1,dA_loss2, dB_loss1,dB_loss2, g_loss1,g_loss2))\r\n\t\t# evaluate the model performance every so often\r\n\t\tif (i+1) % (bat_per_epo * 1) == 0:\r\n\t\t\t# plot A->B translation\r\n\t\t\tsummarize_performance(i, g_model_AtoB, trainA, 'AtoB')\r\n\t\t\t# plot B->A translation\r\n\t\t\tsummarize_performance(i, g_model_BtoA, trainB, 'BtoA')\r\n\t\tif (i+1) % (bat_per_epo * 5) == 0:\r\n\t\t\t# save the models\r\n\t\t\tsave_models(i, g_model_AtoB, g_model_BtoA)<\/pre>\n<p>Tying all of this together, the complete example of training a CycleGAN model to translate photos of horses to zebras and zebras to horses is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of training a cyclegan on the horse2zebra dataset\r\nfrom random import random\r\nfrom numpy import load\r\nfrom numpy import zeros\r\nfrom numpy import ones\r\nfrom numpy import asarray\r\nfrom numpy.random import randint\r\nfrom keras.optimizers import Adam\r\nfrom keras.initializers import RandomNormal\r\nfrom keras.models import Model\r\nfrom keras.models import Input\r\nfrom keras.layers import Conv2D\r\nfrom keras.layers import Conv2DTranspose\r\nfrom keras.layers import LeakyReLU\r\nfrom keras.layers import Activation\r\nfrom keras.layers import Concatenate\r\nfrom keras_contrib.layers.normalization.instancenormalization import InstanceNormalization\r\nfrom matplotlib import pyplot\r\n\r\n# define the discriminator model\r\ndef define_discriminator(image_shape):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# source image input\r\n\tin_image = Input(shape=image_shape)\r\n\t# C64\r\n\td = Conv2D(64, (4,4), 
strides=(2,2), padding='same', kernel_initializer=init)(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# C128\r\n\td = Conv2D(128, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# C256\r\n\td = Conv2D(256, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# C512\r\n\td = Conv2D(512, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# second last output layer\r\n\td = Conv2D(512, (4,4), padding='same', kernel_initializer=init)(d)\r\n\td = InstanceNormalization(axis=-1)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# patch output\r\n\tpatch_out = Conv2D(1, (4,4), padding='same', kernel_initializer=init)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, patch_out)\r\n\t# compile model\r\n\tmodel.compile(loss='mse', optimizer=Adam(lr=0.0002, beta_1=0.5), loss_weights=[0.5])\r\n\treturn model\r\n\r\n# generate a resnet block\r\ndef resnet_block(n_filters, input_layer):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# first convolutional layer\r\n\tg = Conv2D(n_filters, (3,3), padding='same', kernel_initializer=init)(input_layer)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# second convolutional layer\r\n\tg = Conv2D(n_filters, (3,3), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\t# concatenate merge channel-wise with input layer\r\n\tg = Concatenate()([g, input_layer])\r\n\treturn g\r\n\r\n# define the standalone generator model\r\ndef define_generator(image_shape, n_resnet=9):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# image input\r\n\tin_image = Input(shape=image_shape)\r\n\t# c7s1-64\r\n\tg = Conv2D(64, (7,7), padding='same', 
kernel_initializer=init)(in_image)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# d128\r\n\tg = Conv2D(128, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# d256\r\n\tg = Conv2D(256, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# R256\r\n\tfor _ in range(n_resnet):\r\n\t\tg = resnet_block(256, g)\r\n\t# u128\r\n\tg = Conv2DTranspose(128, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# u64\r\n\tg = Conv2DTranspose(64, (3,3), strides=(2,2), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tg = Activation('relu')(g)\r\n\t# c7s1-3\r\n\tg = Conv2D(3, (7,7), padding='same', kernel_initializer=init)(g)\r\n\tg = InstanceNormalization(axis=-1)(g)\r\n\tout_image = Activation('tanh')(g)\r\n\t# define model\r\n\tmodel = Model(in_image, out_image)\r\n\treturn model\r\n\r\n# define a composite model for updating generators by adversarial and cycle loss\r\ndef define_composite_model(g_model_1, d_model, g_model_2, image_shape):\r\n\t# ensure the model we're updating is trainable\r\n\tg_model_1.trainable = True\r\n\t# mark discriminator as not trainable\r\n\td_model.trainable = False\r\n\t# mark other generator model as not trainable\r\n\tg_model_2.trainable = False\r\n\t# discriminator element\r\n\tinput_gen = Input(shape=image_shape)\r\n\tgen1_out = g_model_1(input_gen)\r\n\toutput_d = d_model(gen1_out)\r\n\t# identity element\r\n\tinput_id = Input(shape=image_shape)\r\n\toutput_id = g_model_1(input_id)\r\n\t# forward cycle\r\n\toutput_f = g_model_2(gen1_out)\r\n\t# backward cycle\r\n\tgen2_out = g_model_2(input_id)\r\n\toutput_b = g_model_1(gen2_out)\r\n\t# define model graph\r\n\tmodel = Model([input_gen, 
input_id], [output_d, output_id, output_f, output_b])\r\n\t# define optimization algorithm configuration\r\n\topt = Adam(lr=0.0002, beta_1=0.5)\r\n\t# compile model with weighting of least squares loss and L1 loss\r\n\tmodel.compile(loss=['mse', 'mae', 'mae', 'mae'], loss_weights=[1, 5, 10, 10], optimizer=opt)\r\n\treturn model\r\n\r\n# load and prepare training images\r\ndef load_real_samples(filename):\r\n\t# load the dataset\r\n\tdata = load(filename)\r\n\t# unpack arrays\r\n\tX1, X2 = data['arr_0'], data['arr_1']\r\n\t# scale from [0,255] to [-1,1]\r\n\tX1 = (X1 - 127.5) \/ 127.5\r\n\tX2 = (X2 - 127.5) \/ 127.5\r\n\treturn [X1, X2]\r\n\r\n# select a batch of random samples, returns images and target\r\ndef generate_real_samples(dataset, n_samples, patch_shape):\r\n\t# choose random instances\r\n\tix = randint(0, dataset.shape[0], n_samples)\r\n\t# retrieve selected images\r\n\tX = dataset[ix]\r\n\t# generate 'real' class labels (1)\r\n\ty = ones((n_samples, patch_shape, patch_shape, 1))\r\n\treturn X, y\r\n\r\n# generate a batch of images, returns images and targets\r\ndef generate_fake_samples(g_model, dataset, patch_shape):\r\n\t# generate fake instance\r\n\tX = g_model.predict(dataset)\r\n\t# create 'fake' class labels (0)\r\n\ty = zeros((len(X), patch_shape, patch_shape, 1))\r\n\treturn X, y\r\n\r\n# save the generator models to file\r\ndef save_models(step, g_model_AtoB, g_model_BtoA):\r\n\t# save the first generator model\r\n\tfilename1 = 'g_model_AtoB_%06d.h5' % (step+1)\r\n\tg_model_AtoB.save(filename1)\r\n\t# save the second generator model\r\n\tfilename2 = 'g_model_BtoA_%06d.h5' % (step+1)\r\n\tg_model_BtoA.save(filename2)\r\n\tprint('>Saved: %s and %s' % (filename1, filename2))\r\n\r\n# generate samples and save as a plot and save the model\r\ndef summarize_performance(step, g_model, trainX, name, n_samples=5):\r\n\t# select a sample of input images\r\n\tX_in, _ = generate_real_samples(trainX, n_samples, 0)\r\n\t# generate translated 
images\r\n\tX_out, _ = generate_fake_samples(g_model, X_in, 0)\r\n\t# scale all pixels from [-1,1] to [0,1]\r\n\tX_in = (X_in + 1) \/ 2.0\r\n\tX_out = (X_out + 1) \/ 2.0\r\n\t# plot real images\r\n\tfor i in range(n_samples):\r\n\t\tpyplot.subplot(2, n_samples, 1 + i)\r\n\t\tpyplot.axis('off')\r\n\t\tpyplot.imshow(X_in[i])\r\n\t# plot translated image\r\n\tfor i in range(n_samples):\r\n\t\tpyplot.subplot(2, n_samples, 1 + n_samples + i)\r\n\t\tpyplot.axis('off')\r\n\t\tpyplot.imshow(X_out[i])\r\n\t# save plot to file\r\n\tfilename1 = '%s_generated_plot_%06d.png' % (name, (step+1))\r\n\tpyplot.savefig(filename1)\r\n\tpyplot.close()\r\n\r\n# update image pool for fake images\r\ndef update_image_pool(pool, images, max_size=50):\r\n\tselected = list()\r\n\tfor image in images:\r\n\t\tif len(pool) < max_size:\r\n\t\t\t# stock the pool\r\n\t\t\tpool.append(image)\r\n\t\t\tselected.append(image)\r\n\t\telif random() < 0.5:\r\n\t\t\t# use image, but don't add it to the pool\r\n\t\t\tselected.append(image)\r\n\t\telse:\r\n\t\t\t# replace an existing image and use replaced image\r\n\t\t\tix = randint(0, len(pool))\r\n\t\t\tselected.append(pool[ix])\r\n\t\t\tpool[ix] = image\r\n\treturn asarray(selected)\r\n\r\n# train cyclegan models\r\ndef train(d_model_A, d_model_B, g_model_AtoB, g_model_BtoA, c_model_AtoB, c_model_BtoA, dataset):\r\n\t# define properties of the training run\r\n\tn_epochs, n_batch, = 100, 1\r\n\t# determine the output square shape of the discriminator\r\n\tn_patch = d_model_A.output_shape[1]\r\n\t# unpack dataset\r\n\ttrainA, trainB = dataset\r\n\t# prepare image pool for fakes\r\n\tpoolA, poolB = list(), list()\r\n\t# calculate the number of batches per training epoch\r\n\tbat_per_epo = int(len(trainA) \/ n_batch)\r\n\t# calculate the number of training iterations\r\n\tn_steps = bat_per_epo * n_epochs\r\n\t# manually enumerate epochs\r\n\tfor i in range(n_steps):\r\n\t\t# select a batch of real samples\r\n\t\tX_realA, y_realA = 
generate_real_samples(trainA, n_batch, n_patch)\r\n\t\tX_realB, y_realB = generate_real_samples(trainB, n_batch, n_patch)\r\n\t\t# generate a batch of fake samples\r\n\t\tX_fakeA, y_fakeA = generate_fake_samples(g_model_BtoA, X_realB, n_patch)\r\n\t\tX_fakeB, y_fakeB = generate_fake_samples(g_model_AtoB, X_realA, n_patch)\r\n\t\t# update fakes from pool\r\n\t\tX_fakeA = update_image_pool(poolA, X_fakeA)\r\n\t\tX_fakeB = update_image_pool(poolB, X_fakeB)\r\n\t\t# update generator B->A via adversarial and cycle loss\r\n\t\tg_loss2, _, _, _, _  = c_model_BtoA.train_on_batch([X_realB, X_realA], [y_realA, X_realA, X_realB, X_realA])\r\n\t\t# update discriminator for A -> [real\/fake]\r\n\t\tdA_loss1 = d_model_A.train_on_batch(X_realA, y_realA)\r\n\t\tdA_loss2 = d_model_A.train_on_batch(X_fakeA, y_fakeA)\r\n\t\t# update generator A->B via adversarial and cycle loss\r\n\t\tg_loss1, _, _, _, _ = c_model_AtoB.train_on_batch([X_realA, X_realB], [y_realB, X_realB, X_realA, X_realB])\r\n\t\t# update discriminator for B -> [real\/fake]\r\n\t\tdB_loss1 = d_model_B.train_on_batch(X_realB, y_realB)\r\n\t\tdB_loss2 = d_model_B.train_on_batch(X_fakeB, y_fakeB)\r\n\t\t# summarize performance\r\n\t\tprint('>%d, dA[%.3f,%.3f] dB[%.3f,%.3f] g[%.3f,%.3f]' % (i+1, dA_loss1,dA_loss2, dB_loss1,dB_loss2, g_loss1,g_loss2))\r\n\t\t# evaluate the model performance every so often\r\n\t\tif (i+1) % (bat_per_epo * 1) == 0:\r\n\t\t\t# plot A->B translation\r\n\t\t\tsummarize_performance(i, g_model_AtoB, trainA, 'AtoB')\r\n\t\t\t# plot B->A translation\r\n\t\t\tsummarize_performance(i, g_model_BtoA, trainB, 'BtoA')\r\n\t\tif (i+1) % (bat_per_epo * 5) == 0:\r\n\t\t\t# save the models\r\n\t\t\tsave_models(i, g_model_AtoB, g_model_BtoA)\r\n\r\n# load image data\r\ndataset = load_real_samples('horse2zebra_256.npz')\r\nprint('Loaded', dataset[0].shape, dataset[1].shape)\r\n# define input shape based on the loaded dataset\r\nimage_shape = dataset[0].shape[1:]\r\n# generator: A -> B\r\ng_model_AtoB = 
define_generator(image_shape)\r\n# generator: B -> A\r\ng_model_BtoA = define_generator(image_shape)\r\n# discriminator: A -> [real\/fake]\r\nd_model_A = define_discriminator(image_shape)\r\n# discriminator: B -> [real\/fake]\r\nd_model_B = define_discriminator(image_shape)\r\n# composite: A -> B -> [real\/fake, A]\r\nc_model_AtoB = define_composite_model(g_model_AtoB, d_model_B, g_model_BtoA, image_shape)\r\n# composite: B -> A -> [real\/fake, B]\r\nc_model_BtoA = define_composite_model(g_model_BtoA, d_model_A, g_model_AtoB, image_shape)\r\n# train models\r\ntrain(d_model_A, d_model_B, g_model_AtoB, g_model_BtoA, c_model_AtoB, c_model_BtoA, dataset)<\/pre>\n<p>The example can be run on CPU hardware, although GPU hardware is recommended.<\/p>\n<p>The example might take a number of hours to run on modern GPU hardware.<\/p>\n<p>If needed, you can access cheap GPU hardware via Amazon EC2; see the tutorial:<\/p>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/develop-evaluate-large-deep-learning-models-keras-amazon-web-services\/\">How to Setup Amazon AWS EC2 GPUs to Train Keras Deep Learning Models (step-by-step)<\/a><\/li>\n<\/ul>\n<p><strong>Note<\/strong>: your specific results may vary given the stochastic nature of the learning algorithm. 
Consider running the example a few times.<\/p>\n<p>The loss is reported for each training iteration, including the Discriminator-A loss on real and fake examples (<em>dA<\/em>), Discriminator-B loss on real and fake examples (<em>dB<\/em>), and Generator-AtoB and Generator-BtoA loss, each of which is a weighted average of adversarial, identity, forward, and backward cycle loss (<em>g<\/em>).<\/p>\n<p>If the loss for a discriminator goes to zero and stays there for a long time, consider restarting the training run, as this is a sign of training failure.<\/p>\n<pre class=\"crayon-plain-tag\">>1, dA[2.284,0.678] dB[1.422,0.918] g[18.747,18.452]\r\n>2, dA[2.129,1.226] dB[1.039,1.331] g[19.469,22.831]\r\n>3, dA[1.644,3.909] dB[1.097,1.680] g[19.192,23.757]\r\n>4, dA[1.427,1.757] dB[1.236,3.493] g[20.240,18.390]\r\n>5, dA[1.737,0.808] dB[1.662,2.312] g[16.941,14.915]\r\n...\r\n>118696, dA[0.004,0.016] dB[0.001,0.001] g[2.623,2.359]\r\n>118697, dA[0.001,0.028] dB[0.003,0.002] g[3.045,3.194]\r\n>118698, dA[0.002,0.008] dB[0.001,0.002] g[2.685,2.071]\r\n>118699, dA[0.010,0.010] dB[0.001,0.001] g[2.430,2.345]\r\n>118700, dA[0.002,0.008] dB[0.000,0.004] g[2.487,2.169]\r\n>Saved: g_model_AtoB_118700.h5 and g_model_BtoA_118700.h5<\/pre>\n<p>Plots of generated images are saved at the end of every epoch, or after every 1,187 training iterations, and the iteration number is used in the filename.<\/p>\n<pre class=\"crayon-plain-tag\">AtoB_generated_plot_001187.png\r\nAtoB_generated_plot_002374.png\r\n...\r\nBtoA_generated_plot_001187.png\r\nBtoA_generated_plot_002374.png<\/pre>\n<p>Models are saved after every five epochs, or every (1,187 * 5) 5,935 training iterations, and again the iteration number is used in the filenames.<\/p>\n<pre class=\"crayon-plain-tag\">g_model_AtoB_053415.h5\r\ng_model_AtoB_059350.h5\r\n...\r\ng_model_BtoA_053415.h5\r\ng_model_BtoA_059350.h5<\/pre>\n<p>The plots of generated images can be used to choose a model; note that more training iterations do not necessarily mean 
better quality generated images.<\/p>\n<p>Horses to Zebras translation starts to become reliable after about 50 epochs.<\/p>\n<div id=\"attachment_8401\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8401\" class=\"size-full wp-image-8401\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-Source-Photographs-of-Horses-top-row-and-Translated-Photographs-of-Zebra-bottom-row-after-53415-Training-Iterations.png\" alt=\"Plot of Source Photographs of Horses (top row) and Translated Photographs of Zebras (bottom row) After 53,415 Training Iterations\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Source-Photographs-of-Horses-top-row-and-Translated-Photographs-of-Zebra-bottom-row-after-53415-Training-Iterations.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Source-Photographs-of-Horses-top-row-and-Translated-Photographs-of-Zebra-bottom-row-after-53415-Training-Iterations-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8401\" class=\"wp-caption-text\">Plot of Source Photographs of Horses (top row) and Translated Photographs of Zebras (bottom row) After 53,415 Training Iterations<\/p>\n<\/div>\n<p>The translation from Zebras to Horses appears to be more challenging for the model to learn, although somewhat plausible translations also begin to be generated after 50 to 60 epochs.<\/p>\n<p>I suspect that better quality results could be achieved with an additional 100 training epochs with weight decay, as is used in the paper, and perhaps with a data generator that systematically works through each dataset rather than randomly sampling.<\/p>\n<div id=\"attachment_8402\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" 
decoding=\"async\" aria-describedby=\"caption-attachment-8402\" class=\"size-full wp-image-8402\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-Source-Photographs-of-Zebra-top-row-and-Translated-Photographs-of-Horses-bottom-row-after-90212-Training-Iterations.png\" alt=\"Plot of Source Photographs of Zebras (top row) and Translated Photographs of Horses (bottom row) After 90,212 Training Iterations\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Source-Photographs-of-Zebra-top-row-and-Translated-Photographs-of-Horses-bottom-row-after-90212-Training-Iterations.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-Source-Photographs-of-Zebra-top-row-and-Translated-Photographs-of-Horses-bottom-row-after-90212-Training-Iterations-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8402\" class=\"wp-caption-text\">Plot of Source Photographs of Zebras (top row) and Translated Photographs of Horses (bottom row) After 90,212 Training Iterations<\/p>\n<\/div>\n<p>Now that we have fit our CycleGAN generators, we can use them to translate photographs in an ad hoc manner.<\/p>\n<h2>How to Perform Image Translation With CycleGAN Generators<\/h2>\n<p>The saved generator models can be loaded and used for ad hoc image translation.<\/p>\n<p>The first step is to load the dataset. We can use the same <em>load_real_samples()<\/em> function as we developed in the previous section.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# load dataset\r\nA_data, B_data = load_real_samples('horse2zebra_256.npz')\r\nprint('Loaded', A_data.shape, B_data.shape)<\/pre>\n<p>Review the plots of generated images and select a pair of models that we can use for image generation. In this case, we will use the model saved around epoch 89 (training iteration 89,025). 
Our generator models used a custom layer from the <em>keras_contrib<\/em> library, specifically the <em>InstanceNormalization<\/em> layer. Therefore, we need to specify how to load this layer when loading each generator model.<\/p>\n<p>This can be achieved by specifying a dictionary mapping of the layer name to the object and passing this as an argument to the <em>load_model()<\/em> keras function.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# load the models\r\ncust = {'InstanceNormalization': InstanceNormalization}\r\nmodel_AtoB = load_model('g_model_AtoB_089025.h5', cust)\r\nmodel_BtoA = load_model('g_model_BtoA_089025.h5', cust)<\/pre>\n<p>We can use the <em>select_sample()<\/em> function that we developed in the previous section to select a random photo from the dataset.<\/p>\n<pre class=\"crayon-plain-tag\"># select a random sample of images from the dataset\r\ndef select_sample(dataset, n_samples):\r\n\t# choose random instances\r\n\tix = randint(0, dataset.shape[0], n_samples)\r\n\t# retrieve selected images\r\n\tX = dataset[ix]\r\n\treturn X<\/pre>\n<p>Next, we can use the Generator-AtoB model, first by selecting a random image from Domain-A (horses) as input, using Generator-AtoB to translate it to Domain-B (zebras), then use the Generator-BtoA model to reconstruct the original image (horse).<\/p>\n<pre class=\"crayon-plain-tag\"># plot A->B->A\r\nA_real = select_sample(A_data, 1)\r\nB_generated  = model_AtoB.predict(A_real)\r\nA_reconstructed = model_BtoA.predict(B_generated)<\/pre>\n<p>We can then plot the three photos side by side as the original or real photo, the translated photo, and the reconstruction of the original photo. 
The <em>show_plot()<\/em> function below implements this.<\/p>\n<pre class=\"crayon-plain-tag\"># plot the image, the translation, and the reconstruction\r\ndef show_plot(imagesX, imagesY1, imagesY2):\r\n\timages = vstack((imagesX, imagesY1, imagesY2))\r\n\ttitles = ['Real', 'Generated', 'Reconstructed']\r\n\t# scale from [-1,1] to [0,1]\r\n\timages = (images + 1) \/ 2.0\r\n\t# plot images row by row\r\n\tfor i in range(len(images)):\r\n\t\t# define subplot\r\n\t\tpyplot.subplot(1, len(images), 1 + i)\r\n\t\t# turn off axis\r\n\t\tpyplot.axis('off')\r\n\t\t# plot raw pixel data\r\n\t\tpyplot.imshow(images[i])\r\n\t\t# title\r\n\t\tpyplot.title(titles[i])\r\n\tpyplot.show()<\/pre>\n<p>We can then call this function to plot our real and generated photos.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\nshow_plot(A_real, B_generated, A_reconstructed)<\/pre>\n<p>This is a good test of both models; however, we can also perform the same operation in reverse.<\/p>\n<p>Specifically, a real photo from Domain-B (zebra) is translated to Domain-A (horse), then reconstructed as Domain-B (zebra).<\/p>\n<pre class=\"crayon-plain-tag\"># plot B->A->B\r\nB_real = select_sample(B_data, 1)\r\nA_generated  = model_BtoA.predict(B_real)\r\nB_reconstructed = model_AtoB.predict(A_generated)\r\nshow_plot(B_real, A_generated, B_reconstructed)<\/pre>\n<p>Tying all of this together, the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of using saved cyclegan models for image translation\r\nfrom keras.models import load_model\r\nfrom numpy import load\r\nfrom numpy import vstack\r\nfrom matplotlib import pyplot\r\nfrom numpy.random import randint\r\nfrom keras_contrib.layers.normalization.instancenormalization import InstanceNormalization\r\n\r\n# load and prepare training images\r\ndef load_real_samples(filename):\r\n\t# load the dataset\r\n\tdata = load(filename)\r\n\t# unpack arrays\r\n\tX1, X2 = data['arr_0'], data['arr_1']\r\n\t# scale from [0,255] to 
[-1,1]\r\n\tX1 = (X1 - 127.5) \/ 127.5\r\n\tX2 = (X2 - 127.5) \/ 127.5\r\n\treturn [X1, X2]\r\n\r\n# select a random sample of images from the dataset\r\ndef select_sample(dataset, n_samples):\r\n\t# choose random instances\r\n\tix = randint(0, dataset.shape[0], n_samples)\r\n\t# retrieve selected images\r\n\tX = dataset[ix]\r\n\treturn X\r\n\r\n# plot the image, the translation, and the reconstruction\r\ndef show_plot(imagesX, imagesY1, imagesY2):\r\n\timages = vstack((imagesX, imagesY1, imagesY2))\r\n\ttitles = ['Real', 'Generated', 'Reconstructed']\r\n\t# scale from [-1,1] to [0,1]\r\n\timages = (images + 1) \/ 2.0\r\n\t# plot images row by row\r\n\tfor i in range(len(images)):\r\n\t\t# define subplot\r\n\t\tpyplot.subplot(1, len(images), 1 + i)\r\n\t\t# turn off axis\r\n\t\tpyplot.axis('off')\r\n\t\t# plot raw pixel data\r\n\t\tpyplot.imshow(images[i])\r\n\t\t# title\r\n\t\tpyplot.title(titles[i])\r\n\tpyplot.show()\r\n\r\n# load dataset\r\nA_data, B_data = load_real_samples('horse2zebra_256.npz')\r\nprint('Loaded', A_data.shape, B_data.shape)\r\n# load the models\r\ncust = {'InstanceNormalization': InstanceNormalization}\r\nmodel_AtoB = load_model('g_model_AtoB_089025.h5', cust)\r\nmodel_BtoA = load_model('g_model_BtoA_089025.h5', cust)\r\n# plot A->B->A\r\nA_real = select_sample(A_data, 1)\r\nB_generated  = model_AtoB.predict(A_real)\r\nA_reconstructed = model_BtoA.predict(B_generated)\r\nshow_plot(A_real, B_generated, A_reconstructed)\r\n# plot B->A->B\r\nB_real = select_sample(B_data, 1)\r\nA_generated  = model_BtoA.predict(B_real)\r\nB_reconstructed = model_AtoB.predict(A_generated)\r\nshow_plot(B_real, A_generated, B_reconstructed)<\/pre>\n<p>Running the example first selects a random photo of a horse, translates it, and then tries to reconstruct the original photo.<\/p>\n<div id=\"attachment_8403\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8403\" 
class=\"size-full wp-image-8403\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-a-Real-Photo-of-a-Horse-Translation-to-Zebra-and-Reconstructed-Photo-of-a-Horse-using-CycleGAN.png\" alt=\"Plot of a Real Photo of a Horse, Translation to Zebra, and Reconstructed Photo of a Horse Using CycleGAN.\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-a-Real-Photo-of-a-Horse-Translation-to-Zebra-and-Reconstructed-Photo-of-a-Horse-using-CycleGAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-a-Real-Photo-of-a-Horse-Translation-to-Zebra-and-Reconstructed-Photo-of-a-Horse-using-CycleGAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8403\" class=\"wp-caption-text\">Plot of a Real Photo of a Horse, Translation to Zebra, and Reconstructed Photo of a Horse Using CycleGAN.<\/p>\n<\/div>\n<p>Then a similar process is performed in reverse, selecting a random photo of a zebra, translating it to a horse, then reconstructing the original photo of the zebra.<\/p>\n<div id=\"attachment_8404\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8404\" class=\"size-full wp-image-8404\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-a-Real-Photo-of-a-Zebra-Translation-to-Horse-and-Reconstructed-Photo-of-a-Zebra-using-CycleGAN.png\" alt=\"Plot of a Real Photo of a Zebra, Translation to Horse, and Reconstructed Photo of a Zebra Using CycleGAN.\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-a-Real-Photo-of-a-Zebra-Translation-to-Horse-and-Reconstructed-Photo-of-a-Zebra-using-CycleGAN.png 640w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-a-Real-Photo-of-a-Zebra-Translation-to-Horse-and-Reconstructed-Photo-of-a-Zebra-using-CycleGAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8404\" class=\"wp-caption-text\">Plot of a Real Photo of a Zebra, Translation to Horse, and Reconstructed Photo of a Zebra Using CycleGAN.<\/p>\n<\/div>\n<p><strong>Note<\/strong>: your results will vary given the stochastic training of the CycleGAN model and choice of a random photograph. Try running the example a few times.<\/p>\n<p>The models are not perfect, especially the zebra to horse model, so you may want to generate many translated examples to review.<\/p>\n<p>It also seems that both models are more effective when reconstructing an image, which is interesting as they are essentially performing the same translation task as when operating on real photographs. This may be a sign that the adversarial loss is not strong enough during training.<\/p>\n<p>We may also want to use a generator model in a standalone way on individual photograph files.<\/p>\n<p>First, we can select a photo from the training dataset. 
In this case, we will use \u201c<em>horse2zebra\/trainA\/n02381460_541.jpg<\/em>\u201d.<\/p>\n<div id=\"attachment_8405\" style=\"width: 266px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8405\" class=\"size-full wp-image-8405\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse.jpg\" alt=\"Photograph of a Horse\" width=\"256\" height=\"256\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse.jpg 256w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse-150x150.jpg 150w\" sizes=\"(max-width: 256px) 100vw, 256px\"><\/p>\n<p id=\"caption-attachment-8405\" class=\"wp-caption-text\">Photograph of a Horse<\/p>\n<\/div>\n<p>We can develop a function to load this image, scale it to the preferred size of 256\u00d7256, scale the pixel values to the range [-1,1], and convert the array of pixels to a single sample.<\/p>\n<p>The <em>load_image()<\/em> function below implements this.<\/p>\n<pre class=\"crayon-plain-tag\">def load_image(filename, size=(256,256)):\r\n\t# load and resize the image\r\n\tpixels = load_img(filename, target_size=size)\r\n\t# convert to numpy array\r\n\tpixels = img_to_array(pixels)\r\n\t# transform into a sample\r\n\tpixels = expand_dims(pixels, 0)\r\n\t# scale from [0,255] to [-1,1]\r\n\tpixels = (pixels - 127.5) \/ 127.5\r\n\treturn pixels<\/pre>\n<p>We can then load our selected image as well as the AtoB generator model, as we did before.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# load the image\r\nimage_src = load_image('horse2zebra\/trainA\/n02381460_541.jpg')\r\n# load the model\r\ncust = {'InstanceNormalization': InstanceNormalization}\r\nmodel_AtoB = load_model('g_model_AtoB_089025.h5', cust)<\/pre>\n<p>We can then translate the loaded image, scale the pixel values back to the expected range, and 
plot the result.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# translate image\r\nimage_tar = model_AtoB.predict(image_src)\r\n# scale from [-1,1] to [0,1]\r\nimage_tar = (image_tar + 1) \/ 2.0\r\n# plot the translated image\r\npyplot.imshow(image_tar[0])\r\npyplot.show()<\/pre>\n<p>Tying this all together, the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of using saved cyclegan models for image translation\r\nfrom numpy import load\r\nfrom numpy import expand_dims\r\nfrom keras.models import load_model\r\nfrom keras_contrib.layers.normalization.instancenormalization import InstanceNormalization\r\nfrom keras.preprocessing.image import img_to_array\r\nfrom keras.preprocessing.image import load_img\r\nfrom matplotlib import pyplot\r\n\r\n# load an image to the preferred size\r\ndef load_image(filename, size=(256,256)):\r\n\t# load and resize the image\r\n\tpixels = load_img(filename, target_size=size)\r\n\t# convert to numpy array\r\n\tpixels = img_to_array(pixels)\r\n\t# transform into a sample\r\n\tpixels = expand_dims(pixels, 0)\r\n\t# scale from [0,255] to [-1,1]\r\n\tpixels = (pixels - 127.5) \/ 127.5\r\n\treturn pixels\r\n\r\n# load the image\r\nimage_src = load_image('horse2zebra\/trainA\/n02381460_541.jpg')\r\n# load the model\r\ncust = {'InstanceNormalization': InstanceNormalization}\r\nmodel_AtoB = load_model('g_model_AtoB_089025.h5', cust)\r\n# translate image\r\nimage_tar = model_AtoB.predict(image_src)\r\n# scale from [-1,1] to [0,1]\r\nimage_tar = (image_tar + 1) \/ 2.0\r\n# plot the translated image\r\npyplot.imshow(image_tar[0])\r\npyplot.show()<\/pre>\n<p>Running the example loads the selected image, loads the generator model, translates the photograph of a horse to a zebra, and plots the result.<\/p>\n<div id=\"attachment_8406\" style=\"width: 1034px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8406\" class=\"size-large wp-image-8406\" 
src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse-Translated-to-a-Photograph-of-a-Zebra-using-CycleGAN-1024x768.png\" alt=\"Photograph of a Horse Translated to a Photograph of a Zebra using CycleGAN\" width=\"1024\" height=\"768\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse-Translated-to-a-Photograph-of-a-Zebra-using-CycleGAN-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse-Translated-to-a-Photograph-of-a-Zebra-using-CycleGAN-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse-Translated-to-a-Photograph-of-a-Zebra-using-CycleGAN-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Photograph-of-a-Horse-Translated-to-a-Photograph-of-a-Zebra-using-CycleGAN.png 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/p>\n<p id=\"caption-attachment-8406\" class=\"wp-caption-text\">Photograph of a Horse Translated to a Photograph of a Zebra using CycleGAN<\/p>\n<\/div>\n<h2>Extensions<\/h2>\n<p>This section lists some ideas for extending the tutorial that you may wish to explore.<\/p>\n<ul>\n<li><strong>Smaller Image Size<\/strong>. Update the example to use a smaller image size, such as 128\u00d7128, and adjust the size of the generator model to use 6 ResNet layers as is used in the cycleGAN paper.<\/li>\n<li><strong>Different Dataset<\/strong>. Update the example to use the apples to oranges dataset.<\/li>\n<li><strong>Without Identity Mapping<\/strong>. 
Update the example to train the generator models without the identity mapping and compare results.<\/li>\n<\/ul>\n<p>If you explore any of these extensions, I\u2019d love to know.<br \/>\nPost your findings in the comments below.<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Papers<\/h3>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1703.10593\">Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks<\/a>, 2017.<\/li>\n<\/ul>\n<h3>Projects<\/h3>\n<ul>\n<li><a href=\"https:\/\/github.com\/junyanz\/CycleGAN\/\">CycleGAN Project (official), GitHub<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/junyanz\/pytorch-CycleGAN-and-pix2pix\">pytorch-CycleGAN-and-pix2pix (official), GitHub<\/a>.<\/li>\n<li><a href=\"https:\/\/junyanz.github.io\/CycleGAN\/\">CycleGAN Project Page (official)<\/a><\/li>\n<\/ul>\n<h3>API<\/h3>\n<ul>\n<li><a href=\"https:\/\/keras.io\/datasets\/\">Keras Datasets API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/models\/sequential\/\">Keras Sequential Model API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/layers\/convolutional\/\">Keras Convolutional Layers API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/getting-started\/faq\/#how-can-i-freeze-keras-layers\">How can I \u201cfreeze\u201d Keras layers?<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/keras-team\/keras-contrib\">Keras Contrib Project<\/a><\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/people.eecs.berkeley.edu\/~taesung_park\/CycleGAN\/datasets\">CycleGAN Dataset<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to develop a CycleGAN model to translate photos of horses to zebras, and back again.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to load and prepare the horses to zebras image translation dataset for modeling.<\/li>\n<li>How to train a pair of CycleGAN generator models for translating horses to zebras and zebras 
to horses.<\/li>\n<li>How to load saved CycleGAN models and use them to translate photographs.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/cyclegan-tutorial-with-keras\/\">How to Develop a CycleGAN for Image-to-Image Translation with Keras<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/cyclegan-tutorial-with-keras\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee The Cycle Generative Adversarial Network, or CycleGAN, is an approach to training a deep convolutional neural network for image-to-image translation tasks. Unlike [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/08\/how-to-develop-a-cyclegan-for-image-to-image-translation-with-keras\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":2447,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2446"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2446"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2446\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/2447"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2446"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2446"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2446"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}