{"id":2467,"date":"2019-08-15T19:00:43","date_gmt":"2019-08-15T19:00:43","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/15\/how-to-train-a-progressive-growing-gan-in-keras-for-synthesizing-faces\/"},"modified":"2019-08-15T19:00:43","modified_gmt":"2019-08-15T19:00:43","slug":"how-to-train-a-progressive-growing-gan-in-keras-for-synthesizing-faces","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/15\/how-to-train-a-progressive-growing-gan-in-keras-for-synthesizing-faces\/","title":{"rendered":"How to Train a Progressive Growing GAN in Keras for Synthesizing Faces"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/what-are-generative-adversarial-networks-gans\/\">Generative adversarial networks<\/a>, or GANs, are effective at generating high-quality synthetic images.<\/p>\n<p>A limitation of GANs is that the are only capable of generating relatively small images, such as 64\u00d764 pixels.<\/p>\n<p>The Progressive Growing GAN is an extension to the GAN training procedure that involves training a GAN to generate very small images, such as 4\u00d74, and incrementally increasing the size of the generated images to 8\u00d78, 16\u00d716, until the desired output size is met. 
This has allowed the progressive GAN to generate photorealistic synthetic faces with 1024\u00d71024 pixel resolution.<\/p>\n<p>The key innovation of the progressive growing GAN is the two-phase training procedure that involves the fading-in of new blocks to support higher-resolution images followed by fine-tuning.<\/p>\n<p>In this tutorial, you will discover how to implement and train a progressive growing generative adversarial network for generating celebrity faces.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to prepare the celebrity faces dataset for training a progressive growing GAN model.<\/li>\n<li>How to define and train the progressive growing GAN on the celebrity faces dataset.<\/li>\n<li>How to load saved generator models and use them for generating ad hoc synthetic celebrity faces.<\/li>\n<\/ul>\n<p>Discover how to develop DCGANs, conditional GANs, Pix2Pix, CycleGANs, and more with Keras <a href=\"https:\/\/machinelearningmastery.com\/generative_adversarial_networks\/\" rel=\"nofollow\">in my new GANs book<\/a>, with 29 step-by-step tutorials and full source code.<\/p>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_8474\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8474\" class=\"size-full wp-image-8474\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/08\/How-to-Train-a-Progressive-Growing-GAN-in-Keras-for-Synthesizing-Faces.jpg\" alt=\"How to Train a Progressive Growing GAN in Keras for Synthesizing Faces\" width=\"640\" height=\"382\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/How-to-Train-a-Progressive-Growing-GAN-in-Keras-for-Synthesizing-Faces.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/How-to-Train-a-Progressive-Growing-GAN-in-Keras-for-Synthesizing-Faces-300x179.jpg 300w\" 
sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8474\" class=\"wp-caption-text\">How to Train a Progressive Growing GAN in Keras for Synthesizing Faces.<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/weyes\/14273137213\/\">Alessandro Caproni<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into five parts; they are:<\/p>\n<ol>\n<li>What Is the Progressive Growing GAN<\/li>\n<li>How to Prepare the Celebrity Faces Dataset<\/li>\n<li>How to Develop Progressive Growing GAN Models<\/li>\n<li>How to Train Progressive Growing GAN Models<\/li>\n<li>How to Synthesize Images With a Progressive Growing GAN Model<\/li>\n<\/ol>\n<h2>What Is the Progressive Growing GAN<\/h2>\n<p>GANs are effective at generating crisp synthetic images, although are typically limited in the size of the images that can be generated.<\/p>\n<p>The Progressive Growing GAN is an extension to the GAN that allows the training generator models to be capable of generating large high-quality images, such as photorealistic faces with the size 1024\u00d71024 pixels. It was described in the 2017 paper by <a href=\"https:\/\/research.nvidia.com\/person\/tero-karras\">Tero Karras<\/a>, et al. from Nvidia titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>.\u201d<\/p>\n<p>The key innovation of the Progressive Growing GAN is the incremental increase in the size of images output by the generator, starting with a 4\u00d74 pixel image and doubling to 8\u00d78, 16\u00d716, and so on until the desired output resolution.<\/p>\n<p>This is achieved by a training procedure that involves periods of fine-tuning the model with a given output resolution, and periods of slowly phasing in a new model with a larger resolution. 
All layers remain trainable during the training process, including existing layers when new layers are added.<\/p>\n<p>Progressive Growing GAN involves using a generator and discriminator model with the same general structure and starting with very small images. During training, new blocks of convolutional layers are systematically added to both the generator and the discriminator models.<\/p>\n<p>The incremental addition of the layers allows the models to effectively learn coarse-level detail and later learn ever-finer detail, both on the generator and discriminator sides.<\/p>\n<p>This incremental nature allows the training to first discover large-scale structure of the image distribution and then shift attention to increasingly finer-scale detail, instead of having to learn all scales simultaneously.<\/p>\n<p>The next step is to select a dataset to use for developing a Progressive Growing GAN.<\/p>\n<h2>How to Prepare the Celebrity Faces Dataset<\/h2>\n<p>In this tutorial, we will use the <a href=\"http:\/\/mmlab.ie.cuhk.edu.hk\/projects\/CelebA.html\">Large-scale Celebrity Faces Attributes Dataset<\/a>, referred to as CelebA.<\/p>\n<p>This dataset was developed and published by <a href=\"https:\/\/liuziwei7.github.io\/\">Ziwei Liu<\/a>, et al. for their 2015 paper titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1509.06451\">From Facial Parts Responses to Face Detection: A Deep Learning Approach<\/a>.\u201d<\/p>\n<p>The dataset provides about 200,000 photographs of celebrity faces along with annotations for what appears in given photos, such as glasses, face shape, hats, hair type, etc. As part of the dataset, the authors provide a version of each photo centered on the face and cropped to the portrait with varying sizes around 150 pixels wide and 200 pixels tall. We will use this as the basis for developing our GAN model.<\/p>\n<p>The dataset can be easily downloaded from the Kaggle webpage. Note: this may require an account with Kaggle.<\/p>\n<ul>\n<li><a href=\"https:\/\/www.kaggle.com\/jessicali9530\/celeba-dataset\">CelebFaces Attributes (CelebA) Dataset<\/a><\/li>\n<\/ul>\n<p>Specifically, download the file \u201c<em>img_align_celeba.zip<\/em>\u201c, which is about 1.3 gigabytes. 
To do this, click on the filename on the Kaggle website and then click the download icon.<\/p>\n<p>The download might take a while depending on the speed of your internet connection.<\/p>\n<p>After downloading, unzip the archive.<\/p>\n<p>This will create a new directory named \u201c<em>img_align_celeba<\/em>\u201d that contains all of the images with filenames like <em>202599.jpg<\/em> and <em>202598.jpg<\/em>.<\/p>\n<p>When working with a GAN, it is easier to model a dataset if all of the images are small and square in shape.<\/p>\n<p>Further, as we are only interested in the face in each photo and not the background, we can perform face detection and extract only the face before resizing the result to a fixed size.<\/p>\n<p>There are many ways to perform face detection. In this case, we will use a pre-trained <a href=\"https:\/\/machinelearningmastery.com\/how-to-perform-face-detection-with-classical-and-deep-learning-methods-in-python-with-keras\/\">Multi-Task Cascaded Convolutional Neural Network<\/a>, or MTCNN. This is a state-of-the-art deep learning model for face detection, described in the 2016 paper titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1604.02878\">Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks<\/a>.\u201d<\/p>\n<p>We will use the implementation provided by <a href=\"https:\/\/github.com\/ipazc\">Iv\u00e1n de Paz Centeno<\/a> in the <a href=\"https:\/\/github.com\/ipazc\/mtcnn\">ipazc\/mtcnn project<\/a>. 
This can also be installed via pip as follows:<\/p>\n<pre class=\"crayon-plain-tag\">sudo pip install mtcnn<\/pre>\n<p>We can confirm that the library was installed correctly by importing the library and printing the version; for example:<\/p>\n<pre class=\"crayon-plain-tag\"># confirm mtcnn was installed correctly\r\nimport mtcnn\r\n# print version\r\nprint(mtcnn.__version__)<\/pre>\n<p>Running the example prints the current version of the library.<\/p>\n<pre class=\"crayon-plain-tag\">0.0.8<\/pre>\n<p>The MTCNN model is very easy to use.<\/p>\n<p>First, an instance of the MTCNN model is created, then the <em>detect_faces()<\/em> function can be called passing in the pixel data for one image.<\/p>\n<p>The result is a list of detected faces, with a bounding box defined in pixel offset values.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# prepare model\r\nmodel = MTCNN()\r\n# detect face in the image\r\nfaces = model.detect_faces(pixels)\r\n# extract details of the face\r\nx1, y1, width, height = faces[0]['box']<\/pre>\n<p>Although the progressive growing GAN supports the synthesis of large images, such as 1024\u00d71024, this requires enormous resources, such as a single top-of-the-line GPU training the model for a month.<\/p>\n<p>Instead, we will reduce the size of the generated images to 128\u00d7128, which will, in turn, allow us to <a href=\"https:\/\/machinelearningmastery.com\/develop-evaluate-large-deep-learning-models-keras-amazon-web-services\/\">train a reasonable model on a GPU<\/a> in a few hours and still discover how the progressive growing model can be implemented, trained, and used.<\/p>\n<p>As such, we can develop a function to load a file and extract the face from the photo, then resize the extracted face pixels to a predefined size. 
In this case, we will use the square shape of 128\u00d7128 pixels.<\/p>\n<p>The <em>load_image()<\/em> function below will load a given photo file name as a NumPy array of pixels.<\/p>\n<pre class=\"crayon-plain-tag\"># load an image as an rgb numpy array\r\ndef load_image(filename):\r\n\t# load image from file\r\n\timage = Image.open(filename)\r\n\t# convert to RGB, if needed\r\n\timage = image.convert('RGB')\r\n\t# convert to array\r\n\tpixels = asarray(image)\r\n\treturn pixels<\/pre>\n<p>The <em>extract_face()<\/em> function below takes the MTCNN model and pixel values for a single photograph as arguments and returns a 128x128x3 array of pixel values with just the face, or <em>None<\/em> if no face was detected (which can happen rarely).<\/p>\n<pre class=\"crayon-plain-tag\"># extract the face from a loaded image and resize\r\ndef extract_face(model, pixels, required_size=(128, 128)):\r\n\t# detect face in the image\r\n\tfaces = model.detect_faces(pixels)\r\n\t# skip cases where we could not detect a face\r\n\tif len(faces) == 0:\r\n\t\treturn None\r\n\t# extract details of the face\r\n\tx1, y1, width, height = faces[0]['box']\r\n\t# force detected pixel values to be positive (bug fix)\r\n\tx1, y1 = abs(x1), abs(y1)\r\n\t# convert into coordinates\r\n\tx2, y2 = x1 + width, y1 + height\r\n\t# retrieve face pixels\r\n\tface_pixels = pixels[y1:y2, x1:x2]\r\n\t# resize pixels to the model size\r\n\timage = Image.fromarray(face_pixels)\r\n\timage = image.resize(required_size)\r\n\tface_array = asarray(image)\r\n\treturn face_array<\/pre>\n<p>The <em>load_faces()<\/em> function below enumerates all photograph files in a directory and extracts and resizes the face from each and returns a NumPy array of faces.<\/p>\n<p>We limit the total number of faces loaded via the <em>n_faces<\/em> argument, as we don\u2019t need them all.<\/p>\n<pre class=\"crayon-plain-tag\"># load images and extract faces for all images in a directory\r\ndef load_faces(directory, 
n_faces):\r\n\t# prepare model\r\n\tmodel = MTCNN()\r\n\tfaces = list()\r\n\t# enumerate files\r\n\tfor filename in listdir(directory):\r\n\t\t# load the image\r\n\t\tpixels = load_image(directory + filename)\r\n\t\t# get face\r\n\t\tface = extract_face(model, pixels)\r\n\t\tif face is None:\r\n\t\t\tcontinue\r\n\t\t# store\r\n\t\tfaces.append(face)\r\n\t\tprint(len(faces), face.shape)\r\n\t\t# stop once we have enough\r\n\t\tif len(faces) >= n_faces:\r\n\t\t\tbreak\r\n\treturn asarray(faces)<\/pre>\n<p>Tying this together, the complete example of preparing a dataset of celebrity faces for training a GAN model is listed below.<\/p>\n<p>In this case, we increase the total number of loaded faces to 50,000 to provide a good training dataset for our GAN model.<\/p>\n<pre class=\"crayon-plain-tag\"># example of extracting and resizing faces into a new dataset\r\nfrom os import listdir\r\nfrom numpy import asarray\r\nfrom numpy import savez_compressed\r\nfrom PIL import Image\r\nfrom mtcnn.mtcnn import MTCNN\r\nfrom matplotlib import pyplot\r\n\r\n# load an image as an rgb numpy array\r\ndef load_image(filename):\r\n\t# load image from file\r\n\timage = Image.open(filename)\r\n\t# convert to RGB, if needed\r\n\timage = image.convert('RGB')\r\n\t# convert to array\r\n\tpixels = asarray(image)\r\n\treturn pixels\r\n\r\n# extract the face from a loaded image and resize\r\ndef extract_face(model, pixels, required_size=(128, 128)):\r\n\t# detect face in the image\r\n\tfaces = model.detect_faces(pixels)\r\n\t# skip cases where we could not detect a face\r\n\tif len(faces) == 0:\r\n\t\treturn None\r\n\t# extract details of the face\r\n\tx1, y1, width, height = faces[0]['box']\r\n\t# force detected pixel values to be positive (bug fix)\r\n\tx1, y1 = abs(x1), abs(y1)\r\n\t# convert into coordinates\r\n\tx2, y2 = x1 + width, y1 + height\r\n\t# retrieve face pixels\r\n\tface_pixels = pixels[y1:y2, x1:x2]\r\n\t# resize pixels to the model size\r\n\timage = 
Image.fromarray(face_pixels)\r\n\timage = image.resize(required_size)\r\n\tface_array = asarray(image)\r\n\treturn face_array\r\n\r\n# load images and extract faces for all images in a directory\r\ndef load_faces(directory, n_faces):\r\n\t# prepare model\r\n\tmodel = MTCNN()\r\n\tfaces = list()\r\n\t# enumerate files\r\n\tfor filename in listdir(directory):\r\n\t\t# load the image\r\n\t\tpixels = load_image(directory + filename)\r\n\t\t# get face\r\n\t\tface = extract_face(model, pixels)\r\n\t\tif face is None:\r\n\t\t\tcontinue\r\n\t\t# store\r\n\t\tfaces.append(face)\r\n\t\tprint(len(faces), face.shape)\r\n\t\t# stop once we have enough\r\n\t\tif len(faces) >= n_faces:\r\n\t\t\tbreak\r\n\treturn asarray(faces)\r\n\r\n# directory that contains all images\r\ndirectory = 'img_align_celeba\/'\r\n# load and extract all faces\r\nall_faces = load_faces(directory, 50000)\r\nprint('Loaded: ', all_faces.shape)\r\n# save in compressed format\r\nsavez_compressed('img_align_celeba_128.npz', all_faces)<\/pre>\n<p>Running the example may take a few minutes given the larger number of faces to be loaded.<\/p>\n<p>At the end of the run, the array of extracted and resized faces is saved as a compressed NumPy array with the filename \u2018<em>img_align_celeba_128.npz<\/em>\u2018.<\/p>\n<p>The prepared dataset can then be loaded any time, as follows.<\/p>\n<pre class=\"crayon-plain-tag\"># load the prepared dataset\r\nfrom numpy import load\r\n# load the face dataset\r\ndata = load('img_align_celeba_128.npz')\r\nfaces = data['arr_0']\r\nprint('Loaded: ', faces.shape)<\/pre>\n<p>Loading the dataset summarizes the shape of the array, showing 50K images with the size of 128\u00d7128 pixels and three color channels.<\/p>\n<pre class=\"crayon-plain-tag\">Loaded: (50000, 128, 128, 3)<\/pre>\n<p>We can elaborate on this example and plot the first 100 faces in the dataset as a 10\u00d710 grid. 
The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># load the prepared dataset\r\nfrom numpy import load\r\nfrom matplotlib import pyplot\r\n\r\n# plot a list of loaded faces\r\ndef plot_faces(faces, n):\r\n\tfor i in range(n * n):\r\n\t\t# define subplot\r\n\t\tpyplot.subplot(n, n, 1 + i)\r\n\t\t# turn off axis\r\n\t\tpyplot.axis('off')\r\n\t\t# plot raw pixel data\r\n\t\tpyplot.imshow(faces[i].astype('uint8'))\r\n\tpyplot.show()\r\n\r\n# load the face dataset\r\ndata = load('img_align_celeba_128.npz')\r\nfaces = data['arr_0']\r\nprint('Loaded: ', faces.shape)\r\nplot_faces(faces, 10)<\/pre>\n<p>Running the example loads the dataset and creates a plot of the first 100 images.<\/p>\n<p>We can see that each image only contains the face and all faces have the same square shape. Our goal is to generate new faces with the same general properties.<\/p>\n<div id=\"attachment_8464\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8464\" class=\"size-full wp-image-8464\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-100-Celebrity-Faces-in-a-10x10-Grid.png\" alt=\"Plot of 100 Celebrity Faces in a 10x10 Grid\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-100-Celebrity-Faces-in-a-10x10-Grid.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-100-Celebrity-Faces-in-a-10x10-Grid-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-100-Celebrity-Faces-in-a-10x10-Grid-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-100-Celebrity-Faces-in-a-10x10-Grid-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p 
id=\"caption-attachment-8464\" class=\"wp-caption-text\">Plot of 100 Celebrity Faces in a 10\u00d710 Grid<\/p>\n<\/div>\n<p>We are now ready to develop a GAN model to generate faces using this dataset.<\/p>\n<h2>How to Develop Progressive Growing GAN Models<\/h2>\n<p>There are many ways to implement the progressive growing GAN models.<\/p>\n<p>In this tutorial, we will develop and implement each phase of growth as a separate Keras model and each model will share the same layers and weights.<\/p>\n<p>This approach allows for the convenient training of each model, just like a normal Keras model, although it requires a slightly complicated model construction process to ensure that the layers are reused correctly.<\/p>\n<p>First, we will define some custom layers required in the definition of the generator and discriminator models, then proceed to define functions to create and grow the discriminator and generator models themselves.<\/p>\n<h3>Progressive Growing Custom Layers<\/h3>\n<p>There are three custom layers required to implement the progressive growing generative adversarial network.<\/p>\n<p>They are the layers:<\/p>\n<ul>\n<li><strong>WeightedSum<\/strong>: Used to control the weighted sum of the old and new layers during a growth phase.<\/li>\n<li><strong>MinibatchStdev<\/strong>: Used to summarize statistics for a batch of images in the discriminator.<\/li>\n<li><strong>PixelNormalization<\/strong>: Used to normalize activation maps in the generator model.<\/li>\n<\/ul>\n<p>Additionally, a weight constraint is used in the paper referred to as \u201c<em>equalized learning rate<\/em>\u201c. This too would need to be implemented as a custom layer. 
In the interest of brevity, we won\u2019t use equalized learning rate in this tutorial and instead we use a simple max norm weight constraint.<\/p>\n<h4>WeightedSum Layer<\/h4>\n<p>The <em>WeightedSum<\/em> layer is a merge layer that combines the activations from two input layers, such as two input paths in a discriminator or two output paths in a generator model. It uses a variable called <em>alpha<\/em> that controls how much to weight the first and second inputs.<\/p>\n<p>It is used during the growth phase of training when the model is in transition from one image size to a new image size with double the width and height (quadruple the area), such as from 4\u00d74 to 8\u00d78 pixels.<\/p>\n<p>During the growth phase, the alpha parameter is linearly scaled from 0.0 at the beginning to 1.0 at the end, allowing the output of the layer to transition from giving full weight to the old layers to giving full weight to the new layers (second input).<\/p>\n<ul>\n<li>weighted sum = ((1.0 \u2013 alpha) * input1) + (alpha * input2)<\/li>\n<\/ul>\n<p>The <em>WeightedSum<\/em> class is defined below as an extension to the <em>Add<\/em> merge layer.<\/p>\n<pre class=\"crayon-plain-tag\"># weighted sum output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output<\/pre>\n<\/p>\n<h4>MinibatchStdev<\/h4>\n<p>The mini-batch standard deviation layer, or <em>MinibatchStdev<\/em>, is only used in the output block of the discriminator layer.<\/p>\n<p>The objective of the layer is to provide a statistical summary of the batch 
of activations. The discriminator can then learn to better detect batches of fake samples from batches of real samples. This, in turn, encourages the generator that is trained via the discriminator to create batches of samples with realistic batch statistics.<\/p>\n<p>It is implemented as calculating the standard deviation for each pixel value in the activation maps across the batch, calculating the average of this value, and then creating a new activation map (one channel) that is appended to the list of activation maps provided as input.<\/p>\n<p>The <em>MinibatchStdev<\/em> layer is defined below.<\/p>\n<pre class=\"crayon-plain-tag\"># mini-batch standard deviation layer\r\nclass MinibatchStdev(Layer):\r\n\t# initialize the layer\r\n\tdef __init__(self, **kwargs):\r\n\t\tsuper(MinibatchStdev, self).__init__(**kwargs)\r\n\r\n\t# perform the operation\r\n\tdef call(self, inputs):\r\n\t\t# calculate the mean value for each pixel across channels\r\n\t\tmean = backend.mean(inputs, axis=0, keepdims=True)\r\n\t\t# calculate the squared differences between pixel values and mean\r\n\t\tsqu_diffs = backend.square(inputs - mean)\r\n\t\t# calculate the average of the squared differences (variance)\r\n\t\tmean_sq_diff = backend.mean(squ_diffs, axis=0, keepdims=True)\r\n\t\t# add a small value to avoid a blow-up when we calculate stdev\r\n\t\tmean_sq_diff += 1e-8\r\n\t\t# square root of the variance (stdev)\r\n\t\tstdev = backend.sqrt(mean_sq_diff)\r\n\t\t# calculate the mean standard deviation across each pixel coord\r\n\t\tmean_pix = backend.mean(stdev, keepdims=True)\r\n\t\t# scale this up to be the size of one input feature map for each sample\r\n\t\tshape = backend.shape(inputs)\r\n\t\toutput = backend.tile(mean_pix, (shape[0], shape[1], shape[2], 1))\r\n\t\t# concatenate with the output\r\n\t\tcombined = backend.concatenate([inputs, output], axis=-1)\r\n\t\treturn combined\r\n\r\n\t# define the output shape of the layer\r\n\tdef compute_output_shape(self, 
input_shape):\r\n\t\t# create a copy of the input shape as a list\r\n\t\tinput_shape = list(input_shape)\r\n\t\t# add one to the channel dimension (assume channels-last)\r\n\t\tinput_shape[-1] += 1\r\n\t\t# convert list to a tuple\r\n\t\treturn tuple(input_shape)<\/pre>\n<\/p>\n<h4>PixelNormalization<\/h4>\n<p>The generator and discriminator models don\u2019t use <a href=\"https:\/\/machinelearningmastery.com\/how-to-accelerate-learning-of-deep-neural-networks-with-batch-normalization\/\">batch normalization<\/a> like other GAN models; instead, each pixel in the activation maps is normalized to unit length.<\/p>\n<p>This is a variation of local response normalization and is referred to in the paper as pixelwise feature vector normalization. Also, unlike other GAN models, normalization is only used in the generator model, not the discriminator.<\/p>\n<p>This is a type of activity regularization and could be implemented as an activity constraint, although it is easily implemented as a new layer that scales the activations of the prior layer.<\/p>\n<p>The <em>PixelNormalization<\/em> class below implements this and can be used after each <a href=\"https:\/\/machinelearningmastery.com\/convolutional-layers-for-deep-learning-neural-networks\/\">Convolution layer<\/a> in the generator, but before any activation function.<\/p>\n<pre class=\"crayon-plain-tag\"># pixel-wise feature vector normalization layer\r\nclass PixelNormalization(Layer):\r\n\t# initialize the layer\r\n\tdef __init__(self, **kwargs):\r\n\t\tsuper(PixelNormalization, self).__init__(**kwargs)\r\n\r\n\t# perform the operation\r\n\tdef call(self, inputs):\r\n\t\t# calculate square pixel values\r\n\t\tvalues = inputs**2.0\r\n\t\t# calculate the mean pixel values\r\n\t\tmean_values = backend.mean(values, axis=-1, keepdims=True)\r\n\t\t# ensure the mean is not zero\r\n\t\tmean_values += 1.0e-8\r\n\t\t# calculate the sqrt of the mean squared value (L2 norm)\r\n\t\tl2 = backend.sqrt(mean_values)\r\n\t\t# 
normalize values by the l2 norm\r\n\t\tnormalized = inputs \/ l2\r\n\t\treturn normalized\r\n\r\n\t# define the output shape of the layer\r\n\tdef compute_output_shape(self, input_shape):\r\n\t\treturn input_shape<\/pre>\n<p>We now have all of the custom layers required and can define our models.<\/p>\n<h3>Progressive Growing Discriminator Model<\/h3>\n<p>The discriminator model is defined as a deep convolutional neural network that expects a 4\u00d74 color image as input and predicts whether it is real or fake.<\/p>\n<p>The first hidden layer is a 1\u00d71 convolutional layer. The output block involves a <em>MinibatchStdev<\/em>, 3\u00d73, and 4\u00d74 convolutional layers, and a fully connected layer that outputs a prediction. <a href=\"https:\/\/machinelearningmastery.com\/rectified-linear-activation-function-for-deep-learning-neural-networks\/\">Leaky ReLU activation functions<\/a> are used after all layers and the output layer uses a linear activation function.<\/p>\n<p>This model is trained for a normal interval, then the model undergoes a growth phase to 8\u00d78. This involves adding a block of two 3\u00d73 convolutional layers and an <a href=\"https:\/\/machinelearningmastery.com\/pooling-layers-for-convolutional-neural-networks\/\">average pooling downsample layer<\/a>. The input image passes through a new <a href=\"https:\/\/machinelearningmastery.com\/introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/\">1\u00d71 convolutional hidden layer<\/a> and then through the new block. The input image is also passed through a downsample layer and through the old 1\u00d71 convolutional hidden layer. 
The output of the old 1\u00d71 convolution layer and the new block are then combined via a <em>WeightedSum<\/em> layer.<\/p>\n<p>After an interval of training transitioning the <em>WeightedSum\u2019s<\/em> alpha parameter from 0.0 (all old) to 1.0 (all new), another training phase is run to tune the new model with the old layer and pathway removed.<\/p>\n<p>This process repeats until the desired image size is met, in our case, 128\u00d7128 pixel images.<\/p>\n<p>We can achieve this with two functions: the <em>define_discriminator()<\/em> function that defines the base model that accepts 4\u00d74 images and the <em>add_discriminator_block()<\/em> function that takes a model and creates a growth version of the model with two pathways and the <em>WeightedSum<\/em> and a second version of the model with the same layers\/weights but without the old 1\u00d71 layer and <em>WeightedSum<\/em> layers. The <em>define_discriminator()<\/em> function can then call the <em>add_discriminator_block()<\/em> function as many times as is needed to create the models up to the desired level of growth.<\/p>\n<p>All layers are initialized with small <a href=\"https:\/\/machinelearningmastery.com\/how-to-generate-random-numbers-in-python\/\">Gaussian random numbers<\/a> with a standard deviation of 0.02, which is common for GAN models. A <a href=\"https:\/\/machinelearningmastery.com\/how-to-reduce-overfitting-in-deep-neural-networks-with-weight-constraints-in-keras\/\">maxnorm weight constraint<\/a> is used with a value of 1.0, instead of the more elaborate \u2018<em>equalized learning rate<\/em>\u2018 weight constraint used in the paper.<\/p>\n<p>The paper defines a number of filters that increases with the depth of the model from 16 to 32, 64, all the way up to 512. This requires projection of the number of feature maps during the growth phase so that the weighted sum can be calculated correctly. 
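<\/p>\n<p>For reference, that pattern (filters doubling as the resolution halves, capped at 512) can be sketched as follows; the helper name and formula are our illustration, not code from the paper:<\/p>

```python
# illustrative sketch of the paper's per-resolution filter counts
# (hypothetical helper; this tutorial fixes the filter count instead)
def n_filters(resolution, fmap_max=512):
	# filters double as the resolution halves, e.g. 16 at 1024x1024
	return min(fmap_max, int(16 * 1024 / resolution))

print([n_filters(r) for r in [1024, 512, 256, 128, 64, 32]])  # [16, 32, 64, 128, 256, 512]
```

<p>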
To avoid this complication, we fix the number of filters to be the same in all layers.<\/p>\n<p>Each model is compiled and will be fit. In this case, we will use Wasserstein loss (or WGAN loss) and the <a href=\"https:\/\/machinelearningmastery.com\/adam-optimization-algorithm-for-deep-learning\/\">Adam version of stochastic gradient descent<\/a> configured as is specified in the paper. The authors of the paper recommend exploring using both WGAN-GP loss and least squares loss and found that the former performed slightly better. Nevertheless, we will use Wasserstein loss as it greatly simplifies the implementation.<\/p>\n<p>First, we must define the loss function as the average predicted value multiplied by the target value. The target value will be 1 for real images and -1 for fake images. This means that weight updates will seek to increase the divide between real and fake images.<\/p>\n<pre class=\"crayon-plain-tag\"># calculate wasserstein loss\r\ndef wasserstein_loss(y_true, y_pred):\r\n\treturn backend.mean(y_true * y_pred)<\/pre>\n<p>The functions for defining and creating the growth versions of the discriminator models are listed below.<\/p>\n<p>We make careful use of the <a href=\"https:\/\/machinelearningmastery.com\/keras-functional-api-deep-learning\/\">functional API<\/a> and knowledge of the model structure to create the two models for each growth phase. 
The growth phase also always doubles the expected input shape.<\/p>\n<pre class=\"crayon-plain-tag\"># add a discriminator block\r\ndef add_discriminator_block(old_model, n_input_layers=3):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\t# get shape of existing model\r\n\tin_shape = list(old_model.input.shape)\r\n\t# define new input shape as double the size\r\n\tinput_shape = (in_shape[-2].value*2, in_shape[-2].value*2, in_shape[-1].value)\r\n\tin_image = Input(shape=input_shape)\r\n\t# define new input processing layer\r\n\td = Conv2D(128, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# define new block\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = AveragePooling2D()(d)\r\n\tblock_new = d\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel1 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel1.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# downsample the new larger image\r\n\tdownsample = AveragePooling2D()(in_image)\r\n\t# connect old input processing to downsampled new input\r\n\tblock_old = old_model.layers[1](downsample)\r\n\tblock_old = old_model.layers[2](block_old)\r\n\t# fade in output of old model input layer with new input\r\n\td = WeightedSum()([block_old, block_new])\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel2 = Model(in_image, 
d)\r\n\t# compile model\r\n\tmodel2.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\treturn [model1, model2]\r\n\r\n# define the discriminator models for each image resolution\r\ndef define_discriminator(n_blocks, input_shape=(4,4,3)):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\tmodel_list = list()\r\n\t# base model input\r\n\tin_image = Input(shape=input_shape)\r\n\t# conv 1x1\r\n\td = Conv2D(128, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 3x3 (output block)\r\n\td = MinibatchStdev()(d)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 4x4\r\n\td = Conv2D(128, (4,4), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# dense output layer\r\n\td = Flatten()(d)\r\n\tout_class = Dense(1)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, out_class)\r\n\t# compile model\r\n\tmodel.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-on\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_discriminator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list<\/pre>\n<p>The <em>define_discriminator()<\/em> function is called by specifying the number of blocks to create.<\/p>\n<p>We will create 6 blocks, which will create 6 pairs of models that expect the input image sizes of 4\u00d74, 8\u00d78, 16\u00d716, 32\u00d732, 64\u00d764, 128\u00d7128.<\/p>\n<p>The function returns a list where each element in the list contains two 
models. The first model is the \u2018<em>normal model<\/em>\u2018 or straight-through model, and the second is the version of the model that includes the old 1\u00d71 and new block with the weighted sum, used for the transition or growth phase of training.<\/p>\n<h3>Progressive Growing Generator Model<\/h3>\n<p>The generator model takes a random point from the latent space as input and generates a synthetic image.<\/p>\n<p>The generator models are defined in the same way as the discriminator models.<\/p>\n<p>Specifically, a base model for generating 4\u00d74 images is defined, and growth versions of the model are created for each larger image output size.<\/p>\n<p>The main difference is that during the growth phase, the output of the model is the output of the <em>WeightedSum<\/em> layer. The growth phase version of the model involves first adding a nearest neighbor upsampling layer; this is then connected both to the new block with its new output layer and to the old output layer. The old and new output layers are then combined via a <em>WeightedSum<\/em> output layer.<\/p>\n<p>The base model has an input block defined with a fully connected layer with a sufficient number of activations to create a given number of 4\u00d74 feature maps. This is followed by <a href=\"https:\/\/machinelearningmastery.com\/convolutional-layers-for-deep-learning-neural-networks\/\">4\u00d74 and 3\u00d73 convolution layers<\/a> and a <a href=\"https:\/\/machinelearningmastery.com\/introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/\">1\u00d71 output layer<\/a> that generates color images. New blocks are added with an upsample layer and two 3\u00d73 convolutional layers.<\/p>\n<p>The <em>LeakyReLU<\/em> activation function is used, and the <em>PixelNormalization<\/em> layer is applied after each convolutional layer.
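<\/p>\n<p>What pixel-wise normalization computes can be sketched with NumPy before seeing the custom Keras layer later in the tutorial: each pixel\u2019s channel vector is divided by the square root of the mean of its squared values (the helper name <em>pixel_norm<\/em> here is hypothetical):<\/p>

```python
import numpy as np

# pixel-wise feature vector normalization: x / sqrt(mean(x^2) + eps),
# where the mean is taken across channels for each pixel independently
def pixel_norm(x, eps=1.0e-8):
    return x / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps)

# a single 1x1 "feature map" with 4 channels
x = np.array([[[1.0, 2.0, 3.0, 4.0]]])
normed = pixel_norm(x)
# after normalization, each pixel vector has (approximately) unit mean square
print(np.mean(normed**2, axis=-1))
```
<p>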
A linear activation function is used in the output layer, instead of the more common tanh function, yet real images are still scaled to the range [-1,1], which is common for most GAN models.<\/p>\n<p>The paper defines the number of feature maps decreasing with the depth of the model from 512 to 16. As with the discriminator, the difference in the number of feature maps across blocks introduces a challenge for the <em>WeightedSum<\/em>, so for simplicity, we fix all layers to have the same number of filters.<\/p>\n<p>Also like the discriminator model, weights are initialized with <a href=\"https:\/\/machinelearningmastery.com\/how-to-generate-random-numbers-in-python\/\">Gaussian random numbers<\/a> with a standard deviation of 0.02 and the <a href=\"https:\/\/machinelearningmastery.com\/introduction-to-weight-constraints-to-reduce-generalization-error-in-deep-learning\/\">maxnorm weight constraint<\/a> is used with a value of 1.0, instead of the equalized learning rate weight constraint used in the paper.<\/p>\n<p>The functions for defining and growing the generator models are defined below.<\/p>\n<pre class=\"crayon-plain-tag\"># add a generator block\r\ndef add_generator_block(old_model):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\t# get the end of the last block\r\n\tblock_end = old_model.layers[-2].output\r\n\t# upsample, and define new block\r\n\tupsampling = UpSampling2D()(block_end)\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(upsampling)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# add new output layer\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\t# define model\r\n\tmodel1 = 
Model(old_model.input, out_image)\r\n\t# get the output layer from old model\r\n\tout_old = old_model.layers[-1]\r\n\t# connect the upsampling to the old output layer\r\n\tout_image2 = out_old(upsampling)\r\n\t# define new output image as the weighted sum of the old and new models\r\n\tmerged = WeightedSum()([out_image2, out_image])\r\n\t# define model\r\n\tmodel2 = Model(old_model.input, merged)\r\n\treturn [model1, model2]\r\n\r\n# define generator models\r\ndef define_generator(latent_dim, n_blocks, in_dim=4):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\tmodel_list = list()\r\n\t# base model latent input\r\n\tin_latent = Input(shape=(latent_dim,))\r\n\t# linear scale up to activation maps\r\n\tg  = Dense(128 * in_dim * in_dim, kernel_initializer=init, kernel_constraint=const)(in_latent)\r\n\tg = Reshape((in_dim, in_dim, 128))(g)\r\n\t# conv 4x4, input block\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 3x3\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 1x1, output block\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\t# define model\r\n\tmodel = Model(in_latent, out_image)\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-on\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_generator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list<\/pre>\n<p>Calling the <em>define_generator()<\/em> function requires that the size of the latent space be defined.<\/p>\n<p>Like the 
discriminator, we will set the <em>n_blocks<\/em> argument to 6 to create six pairs of models.<\/p>\n<p>The function returns a list of models where each item in the list contains the normal or straight-through version of each generator and the growth version for phasing in the new block at the larger output image size.<\/p>\n<h3>Composite Models for Training the Generators<\/h3>\n<p>The generator models are not compiled as they are not trained directly.<\/p>\n<p>Instead, the generator models are trained via the discriminator models using Wasserstein loss.<\/p>\n<p>This involves presenting generated images to the discriminator as real images and calculating the loss that is then used to update the generator models.<\/p>\n<p>A given generator model must be paired with a given discriminator model both in terms of the same image size (e.g. 4\u00d74 or 8\u00d78) and in terms of the same phase of training, such as growth phase (introducing the new block) or fine-tuning phase (normal or straight-through).<\/p>\n<p>We can achieve this by creating a new model for each pair of models that stacks the generator on top of the discriminator so that the synthetic image feeds directly into the discriminator model to be deemed real or fake. This composite model can then be used to train the generator via the discriminator and the weights of the discriminator can be marked as not trainable (only in this model) to ensure they are not changed during this misleading process.<\/p>\n<p>As such, we can create pairs of composite models, e.g. 
six pairs for the six levels of image growth, where each pair is comprised of a composite model for the normal or straight-through model, and the growth version of the model.<\/p>\n<p>The <em>define_composite()<\/em> function implements this and is defined below.<\/p>\n<pre class=\"crayon-plain-tag\"># define composite models for training generators via discriminators\r\ndef define_composite(discriminators, generators):\r\n\tmodel_list = list()\r\n\t# create composite models\r\n\tfor i in range(len(discriminators)):\r\n\t\tg_models, d_models = generators[i], discriminators[i]\r\n\t\t# straight-through model\r\n\t\td_models[0].trainable = False\r\n\t\tmodel1 = Sequential()\r\n\t\tmodel1.add(g_models[0])\r\n\t\tmodel1.add(d_models[0])\r\n\t\tmodel1.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# fade-in model\r\n\t\td_models[1].trainable = False\r\n\t\tmodel2 = Sequential()\r\n\t\tmodel2.add(g_models[1])\r\n\t\tmodel2.add(d_models[1])\r\n\t\tmodel2.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# store\r\n\t\tmodel_list.append([model1, model2])\r\n\treturn model_list<\/pre>\n<p>Now that we have seen how to define the generator and discriminator models, let\u2019s look at how we can fit these models on the celebrity faces dataset.<\/p>\n<h2>How to Train Progressive Growing GAN Models<\/h2>\n<p>First, we need to define some convenience functions for working with samples of data.<\/p>\n<p>The <em>load_real_samples()<\/em> function below loads our prepared celebrity faces dataset, then converts the pixels to floating point values and scales them to the range [-1,1], common to most GAN implementations.<\/p>\n<pre class=\"crayon-plain-tag\"># load dataset\r\ndef load_real_samples(filename):\r\n\t# load dataset\r\n\tdata = load(filename)\r\n\t# extract numpy array\r\n\tX = data['arr_0']\r\n\t# convert from ints to floats\r\n\tX = X.astype('float32')\r\n\t# scale 
from [0,255] to [-1,1]\r\n\tX = (X - 127.5) \/ 127.5\r\n\treturn X<\/pre>\n<p>Next, we need to be able to retrieve a random sample of images used to update the discriminator.<\/p>\n<p>The <em>generate_real_samples()<\/em> function below implements this, returning a random sample of images from the loaded dataset and their corresponding target value of <em>class=1<\/em> to indicate that the images are real.<\/p>\n<pre class=\"crayon-plain-tag\"># select real samples\r\ndef generate_real_samples(dataset, n_samples):\r\n\t# choose random instances\r\n\tix = randint(0, dataset.shape[0], n_samples)\r\n\t# select images\r\n\tX = dataset[ix]\r\n\t# generate class labels\r\n\ty = ones((n_samples, 1))\r\n\treturn X, y<\/pre>\n<p>Next, we need a sample of latent points used to create synthetic images with the generator model.<\/p>\n<p>The <em>generate_latent_points()<\/em> function below implements this, returning a batch of latent points with the required dimensionality.<\/p>\n<pre class=\"crayon-plain-tag\"># generate points in latent space as input for the generator\r\ndef generate_latent_points(latent_dim, n_samples):\r\n\t# generate points in the latent space\r\n\tx_input = randn(latent_dim * n_samples)\r\n\t# reshape into a batch of inputs for the network\r\n\tx_input = x_input.reshape(n_samples, latent_dim)\r\n\treturn x_input<\/pre>\n<p>The latent points can be used as input to the generator to create a batch of synthetic images.<\/p>\n<p>This is required to update the discriminator model. It is also required to update the generator model via the discriminator model with the composite models defined in the previous section.<\/p>\n<p>The <em>generate_fake_samples()<\/em> function below takes a generator model and generates and returns a batch of synthetic images and the corresponding target for the discriminator of <em>class=-1<\/em> to indicate that the images are fake. 
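<\/p>\n<p>The shape bookkeeping performed by the <em>generate_latent_points()<\/em> function defined above can be checked in isolation (a standalone sketch using only NumPy):<\/p>

```python
from numpy.random import randn

# generate points in latent space as input for the generator
def generate_latent_points(latent_dim, n_samples):
    # a flat vector of Gaussian values, reshaped to one row per sample
    x_input = randn(latent_dim * n_samples)
    return x_input.reshape(n_samples, latent_dim)

points = generate_latent_points(100, 16)
print(points.shape)  # (16, 100)
```
<p>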
The <em>generate_latent_points()<\/em> function is called to create the required batch\u2019s worth of random latent points.<\/p>\n<pre class=\"crayon-plain-tag\"># use the generator to generate n fake examples, with class labels\r\ndef generate_fake_samples(generator, latent_dim, n_samples):\r\n\t# generate points in latent space\r\n\tx_input = generate_latent_points(latent_dim, n_samples)\r\n\t# predict outputs\r\n\tX = generator.predict(x_input)\r\n\t# create class labels\r\n\ty = -ones((n_samples, 1))\r\n\treturn X, y<\/pre>\n<p>Training the models occurs in two phases: a fade-in phase that involves the transition from a lower-resolution to a higher-resolution image, and a normal phase that involves the fine-tuning of the models at a given higher resolution.<\/p>\n<p>During the fade-in, the <em>alpha<\/em> value of the <em>WeightedSum<\/em> layers in the discriminator and generator models at a given level requires a linear transition from 0.0 to 1.0 based on the training step. The <em>update_fadein()<\/em> function below implements this; given a list of models (such as the generator, discriminator, and composite model), the function locates the <em>WeightedSum<\/em> layers in each and sets the value of the alpha attribute based on the current training step number.<\/p>\n<p>Importantly, this alpha attribute is not a constant but is instead defined as a changeable variable in the <em>WeightedSum<\/em> class, whose value can be changed using the Keras backend <em>set_value()<\/em> function.<\/p>\n<p>This is a clumsy but effective approach to changing the <em>alpha<\/em> values.
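<\/p>\n<p>The linear alpha schedule that drives the fade-in can be checked on its own (a standalone sketch of the same arithmetic; the helper name <em>fade_alpha<\/em> is hypothetical):<\/p>

```python
# linear fade-in schedule: alpha moves from 0.0 at the first step
# to 1.0 at the last step of the training phase
def fade_alpha(step, n_steps):
    return step / float(n_steps - 1)

n_steps = 5
print([fade_alpha(i, n_steps) for i in range(n_steps)])
# [0.0, 0.25, 0.5, 0.75, 1.0]
```
<p>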
Perhaps a cleaner implementation would involve a Keras Callback; this is left as an exercise for the reader.<\/p>\n<pre class=\"crayon-plain-tag\"># update the alpha value on each instance of WeightedSum\r\ndef update_fadein(models, step, n_steps):\r\n\t# calculate current alpha (linear from 0 to 1)\r\n\talpha = step \/ float(n_steps - 1)\r\n\t# update the alpha for each model\r\n\tfor model in models:\r\n\t\tfor layer in model.layers:\r\n\t\t\tif isinstance(layer, WeightedSum):\r\n\t\t\t\tbackend.set_value(layer.alpha, alpha)<\/pre>\n<p>Next, we can define the procedure for training the models for a given training phase.<\/p>\n<p>A training phase takes one generator, discriminator, and composite model and updates them on the dataset for a given number of training epochs. The training phase may be a fade-in transition to a higher resolution, in which case the <em>update_fadein()<\/em> function must be called each iteration, or it may be a normal fine-tuning phase, in which case there are no <em>WeightedSum<\/em> layers present.<\/p>\n<p>The <em>train_epochs()<\/em> function below implements the training of the discriminator and generator models for a single training phase.<\/p>\n<p>A single training iteration involves first selecting a half batch of real images from the dataset and generating a half batch of fake images from the current state of the generator model.
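<\/p>\n<p>The iteration bookkeeping at the top of <em>train_epochs()<\/em> can be sketched numerically (the helper name <em>phase_steps<\/em> and the example numbers are hypothetical, chosen only for illustration):<\/p>

```python
# training-iteration bookkeeping, as in train_epochs()
def phase_steps(n_images, n_batch, n_epochs):
    bat_per_epo = int(n_images / n_batch)  # batches per training epoch
    n_steps = bat_per_epo * n_epochs       # total training iterations
    half_batch = int(n_batch / 2)          # real/fake samples per update
    return n_steps, half_batch

# e.g. 50,000 images, batch size 16, 5 epochs
print(phase_steps(50000, 16, 5))  # (15625, 8)
```
<p>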
These samples are then used to update the discriminator model.<\/p>\n<p>Next, the generator model is updated via the discriminator with the composite model, indicating that the generated images are, in fact, real, and updating generator weights in an effort to better fool the discriminator.<\/p>\n<p>A summary of model performance is printed at the end of each training iteration, summarizing the loss of the discriminator on the real (d1) and fake (d2) images and the loss of the generator (g).<\/p>\n<pre class=\"crayon-plain-tag\"># train a generator and discriminator\r\ndef train_epochs(g_model, d_model, gan_model, dataset, n_epochs, n_batch, fadein=False):\r\n\t# calculate the number of batches per training epoch\r\n\tbat_per_epo = int(dataset.shape[0] \/ n_batch)\r\n\t# calculate the number of training iterations\r\n\tn_steps = bat_per_epo * n_epochs\r\n\t# calculate the size of half a batch of samples\r\n\thalf_batch = int(n_batch \/ 2)\r\n\t# manually enumerate epochs\r\n\tfor i in range(n_steps):\r\n\t\t# update alpha for all WeightedSum layers when fading in new blocks\r\n\t\tif fadein:\r\n\t\t\tupdate_fadein([g_model, d_model, gan_model], i, n_steps)\r\n\t\t# prepare real and fake samples\r\n\t\tX_real, y_real = generate_real_samples(dataset, half_batch)\r\n\t\tX_fake, y_fake = generate_fake_samples(g_model, latent_dim, half_batch)\r\n\t\t# update discriminator model\r\n\t\td_loss1 = d_model.train_on_batch(X_real, y_real)\r\n\t\td_loss2 = d_model.train_on_batch(X_fake, y_fake)\r\n\t\t# update the generator via the discriminator's error\r\n\t\tz_input = generate_latent_points(latent_dim, n_batch)\r\n\t\ty_real2 = ones((n_batch, 1))\r\n\t\tg_loss = gan_model.train_on_batch(z_input, y_real2)\r\n\t\t# summarize loss on this batch\r\n\t\tprint('>%d, d1=%.3f, d2=%.3f g=%.3f' % (i+1, d_loss1, d_loss2, g_loss))<\/pre>\n<p>Next, we need to call the <em>train_epochs()<\/em> function for each training phase.<\/p>\n<p>This involves first scaling the training dataset to 
the required pixel dimensions, such as 4\u00d74 or 8\u00d78. The <em>scale_dataset()<\/em> function below implements this, taking the dataset and returning a scaled version.<\/p>\n<p>These scaled versions of the dataset could be pre-computed and loaded instead of re-scaled on each run. This might be a nice extension if you intend to run the example many times.<\/p>\n<pre class=\"crayon-plain-tag\"># scale images to preferred size\r\ndef scale_dataset(images, new_shape):\r\n\timages_list = list()\r\n\tfor image in images:\r\n\t\t# resize with nearest neighbor interpolation\r\n\t\tnew_image = resize(image, new_shape, 0)\r\n\t\t# store\r\n\t\timages_list.append(new_image)\r\n\treturn asarray(images_list)<\/pre>\n<p>After each training run, we also need to save a plot of generated images and the current state of the generator model.<\/p>\n<p>This is useful so that at the end of the run we can see the progression of the capability and quality of the model, and load and use a generator model at any point during the training process. A generator model could be used to create ad hoc images, or used as the starting point for continued training.<\/p>\n<p>The <em>summarize_performance()<\/em> function below implements this, given a status string such as \u201c<em>faded<\/em>\u201d or \u201c<em>tuned<\/em>\u201c, a generator model, and the size of the latent space. 
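<\/p>\n<p>The file-naming convention that this function uses can be previewed in isolation (the helper name <em>devise_name<\/em> is hypothetical; the real function builds the string inline from the generator\u2019s output shape):<\/p>

```python
# unique name for the state of the system, as in summarize_performance():
# zero-padded width and height plus the training status string
def devise_name(width, height, status):
    return '%03dx%03d-%s' % (width, height, status)

print(devise_name(4, 4, 'faded'))      # 004x004-faded
print(devise_name(128, 128, 'tuned'))  # 128x128-tuned
```
<p>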
The function will proceed to create a unique name for the state of the system using the \u201c<em>status<\/em>\u201d string such as \u201c<em>004x004-faded<\/em>\u201c, then create a plot of 25 generated images and save the plot and the generator model to file using the defined name.<\/p>\n<pre class=\"crayon-plain-tag\"># generate samples and save as a plot and save the model\r\ndef summarize_performance(status, g_model, latent_dim, n_samples=25):\r\n\t# devise name\r\n\tgen_shape = g_model.output_shape\r\n\tname = '%03dx%03d-%s' % (gen_shape[1], gen_shape[2], status)\r\n\t# generate images\r\n\tX, _ = generate_fake_samples(g_model, latent_dim, n_samples)\r\n\t# normalize pixel values to the range [0,1]\r\n\tX = (X - X.min()) \/ (X.max() - X.min())\r\n\t# plot generated images\r\n\tsquare = int(sqrt(n_samples))\r\n\tfor i in range(n_samples):\r\n\t\tpyplot.subplot(square, square, 1 + i)\r\n\t\tpyplot.axis('off')\r\n\t\tpyplot.imshow(X[i])\r\n\t# save plot to file\r\n\tfilename1 = 'plot_%s.png' % (name)\r\n\tpyplot.savefig(filename1)\r\n\tpyplot.close()\r\n\t# save the generator model\r\n\tfilename2 = 'model_%s.h5' % (name)\r\n\tg_model.save(filename2)\r\n\tprint('>Saved: %s and %s' % (filename1, filename2))<\/pre>\n<p>The <em>train()<\/em> function below pulls this together, taking the lists of defined models as input as well as the list of batch sizes and the number of training epochs for the normal and fade-in phases at each level of growth for the model.<\/p>\n<p>The first generator and discriminator models for 4\u00d74 images are fit by calling <em>train_epochs()<\/em> and saved by calling <em>summarize_performance()<\/em>.<\/p>\n<p>Then the steps of growth are enumerated, involving first scaling the image dataset to the preferred size, training and saving the fade-in model for the new image size, then training and saving the normal or fine-tuned model for the new image size.<\/p>\n<pre class=\"crayon-plain-tag\"># train the generator and discriminator\r\ndef 
train(g_models, d_models, gan_models, dataset, latent_dim, e_norm, e_fadein, n_batch):\r\n\t# fit the baseline model\r\n\tg_normal, d_normal, gan_normal = g_models[0][0], d_models[0][0], gan_models[0][0]\r\n\t# scale dataset to appropriate size\r\n\tgen_shape = g_normal.output_shape\r\n\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\tprint('Scaled Data', scaled_data.shape)\r\n\t# train normal or straight-through models\r\n\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm[0], n_batch[0])\r\n\tsummarize_performance('tuned', g_normal, latent_dim)\r\n\t# process each level of growth\r\n\tfor i in range(1, len(g_models)):\r\n\t\t# retrieve models for this level of growth\r\n\t\t[g_normal, g_fadein] = g_models[i]\r\n\t\t[d_normal, d_fadein] = d_models[i]\r\n\t\t[gan_normal, gan_fadein] = gan_models[i]\r\n\t\t# scale dataset to appropriate size\r\n\t\tgen_shape = g_normal.output_shape\r\n\t\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\t\tprint('Scaled Data', scaled_data.shape)\r\n\t\t# train fade-in models for next level of growth\r\n\t\ttrain_epochs(g_fadein, d_fadein, gan_fadein, scaled_data, e_fadein[i], n_batch[i], True)\r\n\t\tsummarize_performance('faded', g_normal, latent_dim)\r\n\t\t# train normal or straight-through models\r\n\t\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm[i], n_batch[i])\r\n\t\tsummarize_performance('tuned', g_normal, latent_dim)<\/pre>\n<p>We can then define the configuration, models, and call <em>train()<\/em> to start the training process.<\/p>\n<p>The paper recommends using a batch size of 16 for images sized between 4\u00d74 and 128\u00d7128 before reducing the size. It also recommends training each phase for about 800K images. The paper also recommends a latent space of 512 dimensions.<\/p>\n<p>The models are defined with six levels of growth to meet the 128\u00d7128 pixel size of our dataset. 
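<\/p>\n<p>The relationship between the number of growth levels and the final resolution can be confirmed with a quick calculation (a standalone sketch; the helper name <em>resolutions<\/em> is hypothetical):<\/p>

```python
# each growth phase doubles the resolution, starting from the 4x4 base model
def resolutions(n_blocks, in_dim=4):
    return [in_dim * 2**i for i in range(n_blocks)]

print(resolutions(6))  # [4, 8, 16, 32, 64, 128]
```
<p>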
We also shrink the latent space accordingly to 100 dimensions.<\/p>\n<p>Instead of keeping the <a href=\"https:\/\/machinelearningmastery.com\/how-to-control-the-speed-and-stability-of-training-neural-networks-with-gradient-descent-batch-size\/\">batch size and number of epochs<\/a> constant, we vary it to speed up the training process, using larger batch sizes for early training phases and smaller batch sizes for later training phases for fine-tuning and stability. Additionally, fewer training epochs are used for the smaller models and more epochs for the larger models.<\/p>\n<p>The choice of batch sizes and training epochs is somewhat arbitrary and you may want to experiment with different values and review their effects.<\/p>\n<pre class=\"crayon-plain-tag\"># number of growth phases, e.g. 6 == [4, 8, 16, 32, 64, 128]\r\nn_blocks = 6\r\n# size of the latent space\r\nlatent_dim = 100\r\n# define models\r\nd_models = define_discriminator(n_blocks)\r\n# define models\r\ng_models = define_generator(latent_dim, n_blocks)\r\n# define composite models\r\ngan_models = define_composite(d_models, g_models)\r\n# load image data\r\ndataset = load_real_samples('img_align_celeba_128.npz')\r\nprint('Loaded', dataset.shape)\r\n# train model\r\nn_batch = [16, 16, 16, 8, 4, 4]\r\n# 10 epochs == 500K images per training phase\r\nn_epochs = [5, 8, 8, 10, 10, 10]\r\ntrain(g_models, d_models, gan_models, dataset, latent_dim, n_epochs, n_epochs, n_batch)<\/pre>\n<p>We can tie all of this together.<\/p>\n<p>The complete example of training a progressive growing generative adversarial network on the celebrity faces dataset is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of progressive growing gan on celebrity faces dataset\r\nfrom math import sqrt\r\nfrom numpy import load\r\nfrom numpy import asarray\r\nfrom numpy import zeros\r\nfrom numpy import ones\r\nfrom numpy.random import randn\r\nfrom numpy.random import randint\r\nfrom skimage.transform import resize\r\nfrom 
keras.optimizers import Adam\r\nfrom keras.models import Sequential\r\nfrom keras.models import Model\r\nfrom keras.layers import Input\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Reshape\r\nfrom keras.layers import Conv2D\r\nfrom keras.layers import UpSampling2D\r\nfrom keras.layers import AveragePooling2D\r\nfrom keras.layers import LeakyReLU\r\nfrom keras.layers import Layer\r\nfrom keras.layers import Add\r\nfrom keras.constraints import max_norm\r\nfrom keras.initializers import RandomNormal\r\nfrom keras import backend\r\nfrom matplotlib import pyplot\r\n\r\n# pixel-wise feature vector normalization layer\r\nclass PixelNormalization(Layer):\r\n\t# initialize the layer\r\n\tdef __init__(self, **kwargs):\r\n\t\tsuper(PixelNormalization, self).__init__(**kwargs)\r\n\r\n\t# perform the operation\r\n\tdef call(self, inputs):\r\n\t\t# calculate square pixel values\r\n\t\tvalues = inputs**2.0\r\n\t\t# calculate the mean pixel values\r\n\t\tmean_values = backend.mean(values, axis=-1, keepdims=True)\r\n\t\t# ensure the mean is not zero\r\n\t\tmean_values += 1.0e-8\r\n\t\t# calculate the sqrt of the mean squared value (L2 norm)\r\n\t\tl2 = backend.sqrt(mean_values)\r\n\t\t# normalize values by the l2 norm\r\n\t\tnormalized = inputs \/ l2\r\n\t\treturn normalized\r\n\r\n\t# define the output shape of the layer\r\n\tdef compute_output_shape(self, input_shape):\r\n\t\treturn input_shape\r\n\r\n# mini-batch standard deviation layer\r\nclass MinibatchStdev(Layer):\r\n\t# initialize the layer\r\n\tdef __init__(self, **kwargs):\r\n\t\tsuper(MinibatchStdev, self).__init__(**kwargs)\r\n\r\n\t# perform the operation\r\n\tdef call(self, inputs):\r\n\t\t# calculate the mean value for each pixel across channels\r\n\t\tmean = backend.mean(inputs, axis=0, keepdims=True)\r\n\t\t# calculate the squared differences between pixel values and mean\r\n\t\tsqu_diffs = backend.square(inputs - mean)\r\n\t\t# calculate the average of the 
squared differences (variance)\r\n\t\tmean_sq_diff = backend.mean(squ_diffs, axis=0, keepdims=True)\r\n\t\t# add a small value to avoid a blow-up when we calculate stdev\r\n\t\tmean_sq_diff += 1e-8\r\n\t\t# square root of the variance (stdev)\r\n\t\tstdev = backend.sqrt(mean_sq_diff)\r\n\t\t# calculate the mean standard deviation across each pixel coord\r\n\t\tmean_pix = backend.mean(stdev, keepdims=True)\r\n\t\t# scale this up to be the size of one input feature map for each sample\r\n\t\tshape = backend.shape(inputs)\r\n\t\toutput = backend.tile(mean_pix, (shape[0], shape[1], shape[2], 1))\r\n\t\t# concatenate with the output\r\n\t\tcombined = backend.concatenate([inputs, output], axis=-1)\r\n\t\treturn combined\r\n\r\n\t# define the output shape of the layer\r\n\tdef compute_output_shape(self, input_shape):\r\n\t\t# create a copy of the input shape as a list\r\n\t\tinput_shape = list(input_shape)\r\n\t\t# add one to the channel dimension (assume channels-last)\r\n\t\tinput_shape[-1] += 1\r\n\t\t# convert list to a tuple\r\n\t\treturn tuple(input_shape)\r\n\r\n# weighted sum output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output\r\n\r\n# calculate wasserstein loss\r\ndef wasserstein_loss(y_true, y_pred):\r\n\treturn backend.mean(y_true * y_pred)\r\n\r\n# add a discriminator block\r\ndef add_discriminator_block(old_model, n_input_layers=3):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\t# get shape of existing 
model\r\n\tin_shape = list(old_model.input.shape)\r\n\t# define new input shape as double the size\r\n\tinput_shape = (in_shape[-2].value*2, in_shape[-2].value*2, in_shape[-1].value)\r\n\tin_image = Input(shape=input_shape)\r\n\t# define new input processing layer\r\n\td = Conv2D(128, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# define new block\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = AveragePooling2D()(d)\r\n\tblock_new = d\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel1 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel1.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# downsample the new larger image\r\n\tdownsample = AveragePooling2D()(in_image)\r\n\t# connect old input processing to downsampled new input\r\n\tblock_old = old_model.layers[1](downsample)\r\n\tblock_old = old_model.layers[2](block_old)\r\n\t# fade in output of old model input layer with new input\r\n\td = WeightedSum()([block_old, block_new])\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel2 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel2.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\treturn [model1, model2]\r\n\r\n# define the discriminator models for each image resolution\r\ndef define_discriminator(n_blocks, input_shape=(4,4,3)):\r\n\t# weight initialization\r\n\tinit = 
RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\tmodel_list = list()\r\n\t# base model input\r\n\tin_image = Input(shape=input_shape)\r\n\t# conv 1x1\r\n\td = Conv2D(128, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 3x3 (output block)\r\n\td = MinibatchStdev()(d)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 4x4\r\n\td = Conv2D(128, (4,4), padding='same', kernel_initializer=init, kernel_constraint=const)(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# dense output layer\r\n\td = Flatten()(d)\r\n\tout_class = Dense(1)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, out_class)\r\n\t# compile model\r\n\tmodel.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_discriminator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list\r\n\r\n# add a generator block\r\ndef add_generator_block(old_model):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\t# get the end of the last block\r\n\tblock_end = old_model.layers[-2].output\r\n\t# upsample, and define new block\r\n\tupsampling = UpSampling2D()(block_end)\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(upsampling)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# add new 
output layer\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\t# define model\r\n\tmodel1 = Model(old_model.input, out_image)\r\n\t# get the output layer from old model\r\n\tout_old = old_model.layers[-1]\r\n\t# connect the upsampling to the old output layer\r\n\tout_image2 = out_old(upsampling)\r\n\t# define new output image as the weighted sum of the old and new models\r\n\tmerged = WeightedSum()([out_image2, out_image])\r\n\t# define model\r\n\tmodel2 = Model(old_model.input, merged)\r\n\treturn [model1, model2]\r\n\r\n# define generator models\r\ndef define_generator(latent_dim, n_blocks, in_dim=4):\r\n\t# weight initialization\r\n\tinit = RandomNormal(stddev=0.02)\r\n\t# weight constraint\r\n\tconst = max_norm(1.0)\r\n\tmodel_list = list()\r\n\t# base model latent input\r\n\tin_latent = Input(shape=(latent_dim,))\r\n\t# linear scale up to activation maps\r\n\tg = Dense(128 * in_dim * in_dim, kernel_initializer=init, kernel_constraint=const)(in_latent)\r\n\tg = Reshape((in_dim, in_dim, 128))(g)\r\n\t# conv 3x3, input block\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 3x3\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\tg = PixelNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 1x1, output block\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer=init, kernel_constraint=const)(g)\r\n\t# define model\r\n\tmodel = Model(in_latent, out_image)\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_generator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn 
model_list\r\n\r\n# define composite models for training generators via discriminators\r\ndef define_composite(discriminators, generators):\r\n\tmodel_list = list()\r\n\t# create composite models\r\n\tfor i in range(len(discriminators)):\r\n\t\tg_models, d_models = generators[i], discriminators[i]\r\n\t\t# straight-through model\r\n\t\td_models[0].trainable = False\r\n\t\tmodel1 = Sequential()\r\n\t\tmodel1.add(g_models[0])\r\n\t\tmodel1.add(d_models[0])\r\n\t\tmodel1.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# fade-in model\r\n\t\td_models[1].trainable = False\r\n\t\tmodel2 = Sequential()\r\n\t\tmodel2.add(g_models[1])\r\n\t\tmodel2.add(d_models[1])\r\n\t\tmodel2.compile(loss=wasserstein_loss, optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# store\r\n\t\tmodel_list.append([model1, model2])\r\n\treturn model_list\r\n\r\n# load dataset\r\ndef load_real_samples(filename):\r\n\t# load dataset\r\n\tdata = load(filename)\r\n\t# extract numpy array\r\n\tX = data['arr_0']\r\n\t# convert from ints to floats\r\n\tX = X.astype('float32')\r\n\t# scale from [0,255] to [-1,1]\r\n\tX = (X - 127.5) \/ 127.5\r\n\treturn X\r\n\r\n# select real samples\r\ndef generate_real_samples(dataset, n_samples):\r\n\t# choose random instances\r\n\tix = randint(0, dataset.shape[0], n_samples)\r\n\t# select images\r\n\tX = dataset[ix]\r\n\t# generate class labels\r\n\ty = ones((n_samples, 1))\r\n\treturn X, y\r\n\r\n# generate points in latent space as input for the generator\r\ndef generate_latent_points(latent_dim, n_samples):\r\n\t# generate points in the latent space\r\n\tx_input = randn(latent_dim * n_samples)\r\n\t# reshape into a batch of inputs for the network\r\n\tx_input = x_input.reshape(n_samples, latent_dim)\r\n\treturn x_input\r\n\r\n# use the generator to generate n fake examples, with class labels\r\ndef generate_fake_samples(generator, latent_dim, n_samples):\r\n\t# generate points in latent 
space\r\n\tx_input = generate_latent_points(latent_dim, n_samples)\r\n\t# predict outputs\r\n\tX = generator.predict(x_input)\r\n\t# create class labels\r\n\ty = -ones((n_samples, 1))\r\n\treturn X, y\r\n\r\n# update the alpha value on each instance of WeightedSum\r\ndef update_fadein(models, step, n_steps):\r\n\t# calculate current alpha (linear from 0 to 1)\r\n\talpha = step \/ float(n_steps - 1)\r\n\t# update the alpha for each model\r\n\tfor model in models:\r\n\t\tfor layer in model.layers:\r\n\t\t\tif isinstance(layer, WeightedSum):\r\n\t\t\t\tbackend.set_value(layer.alpha, alpha)\r\n\r\n# train a generator and discriminator\r\ndef train_epochs(g_model, d_model, gan_model, dataset, n_epochs, n_batch, fadein=False):\r\n\t# calculate the number of batches per training epoch\r\n\tbat_per_epo = int(dataset.shape[0] \/ n_batch)\r\n\t# calculate the number of training iterations\r\n\tn_steps = bat_per_epo * n_epochs\r\n\t# calculate the size of half a batch of samples\r\n\thalf_batch = int(n_batch \/ 2)\r\n\t# manually enumerate epochs\r\n\tfor i in range(n_steps):\r\n\t\t# update alpha for all WeightedSum layers when fading in new blocks\r\n\t\tif fadein:\r\n\t\t\tupdate_fadein([g_model, d_model, gan_model], i, n_steps)\r\n\t\t# prepare real and fake samples\r\n\t\tX_real, y_real = generate_real_samples(dataset, half_batch)\r\n\t\tX_fake, y_fake = generate_fake_samples(g_model, latent_dim, half_batch)\r\n\t\t# update discriminator model\r\n\t\td_loss1 = d_model.train_on_batch(X_real, y_real)\r\n\t\td_loss2 = d_model.train_on_batch(X_fake, y_fake)\r\n\t\t# update the generator via the discriminator's error\r\n\t\tz_input = generate_latent_points(latent_dim, n_batch)\r\n\t\ty_real2 = ones((n_batch, 1))\r\n\t\tg_loss = gan_model.train_on_batch(z_input, y_real2)\r\n\t\t# summarize loss on this batch\r\n\t\tprint('>%d, d1=%.3f, d2=%.3f g=%.3f' % (i+1, d_loss1, d_loss2, g_loss))\r\n\r\n# scale images to preferred size\r\ndef scale_dataset(images, 
new_shape):\r\n\timages_list = list()\r\n\tfor image in images:\r\n\t\t# resize with nearest neighbor interpolation\r\n\t\tnew_image = resize(image, new_shape, 0)\r\n\t\t# store\r\n\t\timages_list.append(new_image)\r\n\treturn asarray(images_list)\r\n\r\n# generate samples, save them as a plot, and save the model\r\ndef summarize_performance(status, g_model, latent_dim, n_samples=25):\r\n\t# devise name\r\n\tgen_shape = g_model.output_shape\r\n\tname = '%03dx%03d-%s' % (gen_shape[1], gen_shape[2], status)\r\n\t# generate images\r\n\tX, _ = generate_fake_samples(g_model, latent_dim, n_samples)\r\n\t# normalize pixel values to the range [0,1]\r\n\tX = (X - X.min()) \/ (X.max() - X.min())\r\n\t# plot generated images\r\n\tsquare = int(sqrt(n_samples))\r\n\tfor i in range(n_samples):\r\n\t\tpyplot.subplot(square, square, 1 + i)\r\n\t\tpyplot.axis('off')\r\n\t\tpyplot.imshow(X[i])\r\n\t# save plot to file\r\n\tfilename1 = 'plot_%s.png' % (name)\r\n\tpyplot.savefig(filename1)\r\n\tpyplot.close()\r\n\t# save the generator model\r\n\tfilename2 = 'model_%s.h5' % (name)\r\n\tg_model.save(filename2)\r\n\tprint('>Saved: %s and %s' % (filename1, filename2))\r\n\r\n# train the generator and discriminator\r\ndef train(g_models, d_models, gan_models, dataset, latent_dim, e_norm, e_fadein, n_batch):\r\n\t# fit the baseline model\r\n\tg_normal, d_normal, gan_normal = g_models[0][0], d_models[0][0], gan_models[0][0]\r\n\t# scale dataset to appropriate size\r\n\tgen_shape = g_normal.output_shape\r\n\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\tprint('Scaled Data', scaled_data.shape)\r\n\t# train normal or straight-through models\r\n\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm[0], n_batch[0])\r\n\tsummarize_performance('tuned', g_normal, latent_dim)\r\n\t# process each level of growth\r\n\tfor i in range(1, len(g_models)):\r\n\t\t# retrieve models for this level of growth\r\n\t\t[g_normal, g_fadein] = g_models[i]\r\n\t\t[d_normal, d_fadein] = 
d_models[i]\r\n\t\t[gan_normal, gan_fadein] = gan_models[i]\r\n\t\t# scale dataset to appropriate size\r\n\t\tgen_shape = g_normal.output_shape\r\n\t\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\t\tprint('Scaled Data', scaled_data.shape)\r\n\t\t# train fade-in models for next level of growth\r\n\t\ttrain_epochs(g_fadein, d_fadein, gan_fadein, scaled_data, e_fadein[i], n_batch[i], True)\r\n\t\tsummarize_performance('faded', g_normal, latent_dim)\r\n\t\t# train normal or straight-through models\r\n\t\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm[i], n_batch[i])\r\n\t\tsummarize_performance('tuned', g_normal, latent_dim)\r\n\r\n# number of growth phases, e.g. 6 == [4, 8, 16, 32, 64, 128]\r\nn_blocks = 6\r\n# size of the latent space\r\nlatent_dim = 100\r\n# define discriminator models\r\nd_models = define_discriminator(n_blocks)\r\n# define generator models\r\ng_models = define_generator(latent_dim, n_blocks)\r\n# define composite models\r\ngan_models = define_composite(d_models, g_models)\r\n# load image data\r\ndataset = load_real_samples('img_align_celeba_128.npz')\r\nprint('Loaded', dataset.shape)\r\n# train model\r\nn_batch = [16, 16, 16, 8, 4, 4]\r\n# 10 epochs == 500K images per training phase\r\nn_epochs = [5, 8, 8, 10, 10, 10]\r\ntrain(g_models, d_models, gan_models, dataset, latent_dim, n_epochs, n_epochs, n_batch)<\/pre>\n<p><strong>Note<\/strong>: The example can be run on the CPU, although a <a href=\"https:\/\/machinelearningmastery.com\/develop-evaluate-large-deep-learning-models-keras-amazon-web-services\/\">GPU is recommended<\/a>.<\/p>\n<p>Running the example may take several hours to complete on modern GPU hardware.<\/p>\n<p><strong>Note<\/strong>: Your specific results will vary given the stochastic nature of the learning algorithm. 
Consider running the example a few times.<\/p>\n<p>If the loss values go to zero or to very large or very small numbers during training, this may indicate a failure mode, and the training process may need to be restarted.<\/p>\n<p>Running the example first reports the successful loading of the prepared dataset and its scaling to the first image size, then reports the loss of each model for each step of the training process.<\/p>\n<pre class=\"crayon-plain-tag\">Loaded (50000, 128, 128, 3)\r\nScaled Data (50000, 4, 4, 3)\r\n>1, d1=0.993, d2=0.001 g=0.951\r\n>2, d1=0.861, d2=0.118 g=0.982\r\n>3, d1=0.829, d2=0.126 g=0.875\r\n>4, d1=0.774, d2=0.202 g=0.912\r\n>5, d1=0.687, d2=0.035 g=0.911\r\n...<\/pre>\n<p>Plots of generated images and the generator model are saved after each fade-in training phase with filenames like:<\/p>\n<ul>\n<li><em>plot_008x008-faded.png<\/em><\/li>\n<li><em>model_008x008-faded.h5<\/em><\/li>\n<\/ul>\n<p>Plots and models are also saved after each tuning phase, with filenames like:<\/p>\n<ul>\n<li><em>plot_008x008-tuned.png<\/em><\/li>\n<li><em>model_008x008-tuned.h5<\/em><\/li>\n<\/ul>\n<p>Reviewing plots of the generated images at each point helps to see the progression both in the size of supported images and their quality before and after the tuning phase.<\/p>\n<p>For example, below is a sample of images generated after the first 4\u00d74 training phase (<em>plot_004x004-tuned.png<\/em>). 
At this point, we cannot see much at all.<\/p>\n<div id=\"attachment_8465\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8465\" class=\"size-full wp-image-8465\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-4x4-Resolution-Generated-by-the-Progressive-Growing-GAN.png\" alt=\"Synthetic Celebrity Faces at 4x4 Resolution Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-4x4-Resolution-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-4x4-Resolution-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8465\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 4\u00d74 Resolution Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<p>Reviewing generated images after the fade-in training phase for 8\u00d78 images shows more structure (<em>plot_008x008-faded.png<\/em>). 
The images are blocky, but we can see faces.<\/p>\n<div id=\"attachment_8466\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8466\" class=\"size-full wp-image-8466\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-8x8-Resolution-After-Fade-In-Generated-by-the-Progressive-Growing-GAN.png\" alt=\"Synthetic Celebrity Faces at 8x8 Resolution After Fade-In Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-8x8-Resolution-After-Fade-In-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-8x8-Resolution-After-Fade-In-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8466\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 8\u00d78 Resolution After Fade-In Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<p>Next, we can contrast the generated images for 16\u00d716 after the fade-in training phase (<em>plot_016x016-faded.png<\/em>) and after the tuning training phase (<em>plot_016x016-tuned.png<\/em>).<\/p>\n<p>The images are now clearly faces, and the fine-tuning phase appears to improve the coloring or tone of the faces and perhaps their structure.<\/p>\n<div id=\"attachment_8467\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8467\" class=\"size-full wp-image-8467\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-16x16-Resolution-After-Fade-In-Generated-by-the-Progressive-Growing-GAN.png\" 
alt=\"Synthetic Celebrity Faces at 16x16 Resolution After Fade-In Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-16x16-Resolution-After-Fade-In-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-16x16-Resolution-After-Fade-In-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8467\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 16\u00d716 Resolution After Fade-In Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<div id=\"attachment_8468\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8468\" class=\"size-full wp-image-8468\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-16x16-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png\" alt=\"Synthetic Celebrity Faces at 16x16 Resolution After Tuning Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-16x16-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-16x16-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8468\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 16\u00d716 Resolution After Tuning Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<p>Finally, we can review 
generated faces after tuning for the remaining 32\u00d732, 64\u00d764, and 128\u00d7128 resolutions. We can see that with each step up in resolution, the image quality improves, allowing the model to fill in more structure and detail.<\/p>\n<p>Although not perfect, the generated images show that the progressive growing GAN is not only capable of generating plausible human faces at different resolutions, but is also able to build upon what was learned at lower resolutions to generate plausible faces at higher resolutions.<\/p>\n<div id=\"attachment_8469\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8469\" class=\"size-full wp-image-8469\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-32x32-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png\" alt=\"Synthetic Celebrity Faces at 32x32 Resolution After Tuning Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-32x32-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-32x32-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8469\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 32\u00d732 Resolution After Tuning Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<div id=\"attachment_8470\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8470\" class=\"size-full wp-image-8470\" 
src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-64x64-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png\" alt=\"Synthetic Celebrity Faces at 64x64 Resolution After Tuning Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-64x64-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-64x64-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8470\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 64\u00d764 Resolution After Tuning Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<div id=\"attachment_8471\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8471\" class=\"size-full wp-image-8471\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-128x128-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png\" alt=\"Synthetic Celebrity Faces at 128x128 Resolution After Tuning Generated by the Progressive Growing GAN\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-128x128-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN.png 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Synthetic-Celebrity-Faces-at-128x128-Resolution-After-Tuning-Generated-by-the-Progressive-Growing-GAN-300x225.png 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p 
id=\"caption-attachment-8471\" class=\"wp-caption-text\">Synthetic Celebrity Faces at 128\u00d7128 Resolution After Tuning Generated by the Progressive Growing GAN<\/p>\n<\/div>\n<p>Now that we have seen how to fit the generator models, we can look at how to load and use a saved generator model.<\/p>\n<h2>How to Synthesize Images With a Progressive Growing GAN Model<\/h2>\n<p>In this section, we will explore how to load a generator model and use it to generate synthetic images on demand.<\/p>\n<p>The <a href=\"https:\/\/machinelearningmastery.com\/save-load-keras-deep-learning-models\/\">saved Keras models can be loaded<\/a> via the <em>load_model()<\/em> function.<\/p>\n<p>Because the generator models use custom layers, we must tell Keras how to load them. This is achieved by providing a dictionary to the <em>load_model()<\/em> function that maps each custom layer name to the appropriate class.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# load model\r\ncust = {'PixelNormalization': PixelNormalization, 'MinibatchStdev': MinibatchStdev, 'WeightedSum': WeightedSum}\r\nmodel = load_model('model_016x016-tuned.h5', cust)<\/pre>\n<p>We can then use the <em>generate_latent_points()<\/em> function from the previous section to generate points in latent space as input for the generator model.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# size of the latent space\r\nlatent_dim = 100\r\n# number of images to generate\r\nn_images = 25\r\n# generate points in latent space\r\nlatent_points = generate_latent_points(latent_dim, n_images)\r\n# generate images\r\nX = model.predict(latent_points)<\/pre>\n<p>We can then plot the results by first scaling the pixel values to the range [0,1] and plotting each image, in this case in a square grid pattern.<\/p>\n<pre class=\"crayon-plain-tag\"># create a plot of generated images\r\ndef plot_generated(images, n_images):\r\n\t# plot images\r\n\tsquare = int(sqrt(n_images))\r\n\t# normalize pixel values to the range [0,1]\r\n\timages = (images 
- images.min()) \/ (images.max() - images.min())\r\n\tfor i in range(n_images):\r\n\t\t# define subplot\r\n\t\tpyplot.subplot(square, square, 1 + i)\r\n\t\t# turn off axis\r\n\t\tpyplot.axis('off')\r\n\t\t# plot raw pixel data\r\n\t\tpyplot.imshow(images[i])\r\n\tpyplot.show()<\/pre>\n<p>Tying this together, the complete example of loading a saved progressive growing GAN generator model and using it to generate new faces is listed below.<\/p>\n<p>In this case, we demonstrate loading the tuned model for generating 16\u00d716 faces.<\/p>\n<pre class=\"crayon-plain-tag\"># example of loading the generator model and generating images\r\nfrom math import sqrt\r\nfrom numpy import asarray\r\nfrom numpy.random import randn\r\nfrom numpy.random import randint\r\nfrom keras.layers import Layer\r\nfrom keras.layers import Add\r\nfrom keras import backend\r\nfrom keras.models import load_model\r\nfrom matplotlib import pyplot\r\n\r\n# pixel-wise feature vector normalization layer\r\nclass PixelNormalization(Layer):\r\n\t# initialize the layer\r\n\tdef __init__(self, **kwargs):\r\n\t\tsuper(PixelNormalization, self).__init__(**kwargs)\r\n\r\n\t# perform the operation\r\n\tdef call(self, inputs):\r\n\t\t# calculate square pixel values\r\n\t\tvalues = inputs**2.0\r\n\t\t# calculate the mean pixel values\r\n\t\tmean_values = backend.mean(values, axis=-1, keepdims=True)\r\n\t\t# ensure the mean is not zero\r\n\t\tmean_values += 1.0e-8\r\n\t\t# calculate the sqrt of the mean squared value (L2 norm)\r\n\t\tl2 = backend.sqrt(mean_values)\r\n\t\t# normalize values by the l2 norm\r\n\t\tnormalized = inputs \/ l2\r\n\t\treturn normalized\r\n\r\n\t# define the output shape of the layer\r\n\tdef compute_output_shape(self, input_shape):\r\n\t\treturn input_shape\r\n\r\n# mini-batch standard deviation layer\r\nclass MinibatchStdev(Layer):\r\n\t# initialize the layer\r\n\tdef __init__(self, **kwargs):\r\n\t\tsuper(MinibatchStdev, self).__init__(**kwargs)\r\n\r\n\t# perform the 
operation\r\n\tdef call(self, inputs):\r\n\t\t# calculate the mean value for each pixel across channels\r\n\t\tmean = backend.mean(inputs, axis=0, keepdims=True)\r\n\t\t# calculate the squared differences between pixel values and mean\r\n\t\tsqu_diffs = backend.square(inputs - mean)\r\n\t\t# calculate the average of the squared differences (variance)\r\n\t\tmean_sq_diff = backend.mean(squ_diffs, axis=0, keepdims=True)\r\n\t\t# add a small value to avoid a blow-up when we calculate stdev\r\n\t\tmean_sq_diff += 1e-8\r\n\t\t# square root of the variance (stdev)\r\n\t\tstdev = backend.sqrt(mean_sq_diff)\r\n\t\t# calculate the mean standard deviation across each pixel coord\r\n\t\tmean_pix = backend.mean(stdev, keepdims=True)\r\n\t\t# scale this up to be the size of one input feature map for each sample\r\n\t\tshape = backend.shape(inputs)\r\n\t\toutput = backend.tile(mean_pix, (shape[0], shape[1], shape[2], 1))\r\n\t\t# concatenate with the output\r\n\t\tcombined = backend.concatenate([inputs, output], axis=-1)\r\n\t\treturn combined\r\n\r\n\t# define the output shape of the layer\r\n\tdef compute_output_shape(self, input_shape):\r\n\t\t# create a copy of the input shape as a list\r\n\t\tinput_shape = list(input_shape)\r\n\t\t# add one to the channel dimension (assume channels-last)\r\n\t\tinput_shape[-1] += 1\r\n\t\t# convert list to a tuple\r\n\t\treturn tuple(input_shape)\r\n\r\n# weighted sum output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output\r\n\r\n# generate points in latent space as input for 
the generator\r\ndef generate_latent_points(latent_dim, n_samples):\r\n\t# generate points in the latent space\r\n\tx_input = randn(latent_dim * n_samples)\r\n\t# reshape into a batch of inputs for the network\r\n\tz_input = x_input.reshape(n_samples, latent_dim)\r\n\treturn z_input\r\n\r\n# create a plot of generated images\r\ndef plot_generated(images, n_images):\r\n\t# plot images\r\n\tsquare = int(sqrt(n_images))\r\n\t# normalize pixel values to the range [0,1]\r\n\timages = (images - images.min()) \/ (images.max() - images.min())\r\n\tfor i in range(n_images):\r\n\t\t# define subplot\r\n\t\tpyplot.subplot(square, square, 1 + i)\r\n\t\t# turn off axis\r\n\t\tpyplot.axis('off')\r\n\t\t# plot raw pixel data\r\n\t\tpyplot.imshow(images[i])\r\n\tpyplot.show()\r\n\r\n# load model\r\ncust = {'PixelNormalization': PixelNormalization, 'MinibatchStdev': MinibatchStdev, 'WeightedSum': WeightedSum}\r\nmodel = load_model('model_016x016-tuned.h5', cust)\r\n# size of the latent space\r\nlatent_dim = 100\r\n# number of images to generate\r\nn_images = 25\r\n# generate images\r\nlatent_points = generate_latent_points(latent_dim, n_images)\r\n# generate images\r\nX  = model.predict(latent_points)\r\n# plot the result\r\nplot_generated(X, n_images)<\/pre>\n<p>Running the example loads the model and generates 25 faces that are plotted in a 5\u00d75 grid.<\/p>\n<div id=\"attachment_8472\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8472\" class=\"size-full wp-image-8472\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-16x16-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model.png\" alt=\"Plot of 25 Synthetic Faces with 16x16 Resolution Generated With a Final Progressive Growing GAN Model\" width=\"1280\" height=\"960\" 
srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-16x16-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-16x16-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-16x16-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-16x16-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-8472\" class=\"wp-caption-text\">Plot of 25 Synthetic Faces with 16\u00d716 Resolution Generated With a Final Progressive Growing GAN Model<\/p>\n<\/div>\n<p>We can then change the filename to a different model, such as the tuned model for generating 128\u00d7128 faces.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\nmodel = load_model('model_128x128-tuned.h5', cust)<\/pre>\n<p>Re-running the example generates a plot of higher-resolution synthetic faces.<\/p>\n<div id=\"attachment_8473\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8473\" class=\"size-full wp-image-8473\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-128x128-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model.png\" alt=\"Plot of 25 Synthetic Faces With 128x128 Resolution Generated With a Final Progressive Growing GAN Model\" width=\"1280\" height=\"960\" 
srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-128x128-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-128x128-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-128x128-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-25-Synthetic-Faces-with-128x128-Resolution-Generated-with-a-Final-Progressive-Growing-GAN-Model-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-8473\" class=\"wp-caption-text\">Plot of 25 Synthetic Faces With 128\u00d7128 Resolution Generated With a Final Progressive Growing GAN Model<\/p>\n<\/div>\n<h2>Extensions<\/h2>\n<p>This section lists some ideas for extending the tutorial that you may wish to explore.<\/p>\n<ul>\n<li><strong>Change Alpha via Callback<\/strong>. Update the example to use a Keras callback to update the alpha value for the WeightedSum layers during fade-in training.<\/li>\n<li><strong>Pre-Scale Dataset<\/strong>. Update the example to pre-scale each dataset and save each version to file to be loaded when needed during training.<\/li>\n<li><strong>Equalized Learning Rate<\/strong>. Update the example to implement the equalized learning rate weight scaling method described in the paper.<\/li>\n<li><strong>Progression in Number of Filters<\/strong>. 
Update the example to decrease the number of filters with depth in the generator and increase the number of filters with depth in the discriminator to match the configuration in the paper.<\/li>\n<li><strong>Larger Image Size<\/strong>. Update the example to generate larger image sizes, such as 512\u00d7512.<\/li>\n<\/ul>\n<p>If you explore any of these extensions, I\u2019d love to know.<br \/>\nPost your findings in the comments below.<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Official<\/h3>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/li>\n<li><a href=\"https:\/\/research.nvidia.com\/publication\/2017-10_Progressive-Growing-of\">Progressive Growing of GANs for Improved Quality, Stability, and Variation, Official<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/tkarras\/progressive_growing_of_gans\">progressive_growing_of_gans Project (official), GitHub<\/a>.<\/li>\n<li><a href=\"https:\/\/openreview.net\/forum?id=Hk99zCeAb&#038;noteId=Hk99zCeAb\">Progressive Growing of GANs for Improved Quality, Stability, and Variation. 
Open Review<\/a>.<\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=G06dEcZ-QTg\">Progressive Growing of GANs for Improved Quality, Stability, and Variation, YouTube<\/a>.<\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=ReZiqCybQPA\">Progressive growing of GANs for improved quality, stability and variation, Keynote, YouTube<\/a>.<\/li>\n<\/ul>\n<h3>API<\/h3>\n<ul>\n<li><a href=\"https:\/\/keras.io\/datasets\/\">Keras Datasets API<\/a>.<\/li>\n<li><a href=\"https:\/\/keras.io\/models\/sequential\/\">Keras Sequential Model API<\/a>.<\/li>\n<li><a href=\"https:\/\/keras.io\/layers\/convolutional\/\">Keras Convolutional Layers API<\/a>.<\/li>\n<li><a href=\"https:\/\/keras.io\/getting-started\/faq\/#how-can-i-freeze-keras-layers\">How can I \u201cfreeze\u201d Keras layers?<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/keras-team\/keras-contrib\">Keras Contrib Project<\/a>.<\/li>\n<li><a href=\"https:\/\/scikit-image.org\/docs\/dev\/api\/skimage.transform.html#skimage.transform.resize\">skimage.transform.resize API<\/a>.<\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/github.com\/MSC-BUAA\/Keras-progressive_growing_of_gans\">Keras-progressive_growing_of_gans Project, GitHub<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/PacktPublishing\/Hands-On-Generative-Adversarial-Networks-with-Keras\">Hands-On-Generative-Adversarial-Networks-with-Keras Project, GitHub<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to implement and train a progressive growing generative adversarial network for generating celebrity faces.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to prepare the celebrity faces dataset for training a progressive growing GAN model.<\/li>\n<li>How to define and train the progressive growing GAN on the celebrity faces dataset.<\/li>\n<li>How to load saved generator models and use them for generating ad hoc synthetic celebrity faces.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your 
questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/how-to-train-a-progressive-growing-gan-in-keras-for-synthesizing-faces\/\">How to Train a Progressive Growing GAN in Keras for Synthesizing Faces<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/how-to-train-a-progressive-growing-gan-in-keras-for-synthesizing-faces\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Generative adversarial networks, or GANs, are effective at generating high-quality synthetic images. A limitation of GANs is that they are only capable [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/15\/how-to-train-a-progressive-growing-gan-in-keras-for-synthesizing-faces\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":2468,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2467"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2467"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2467\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/2468"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2467"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2467"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2467"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}