{"id":2460,"date":"2019-08-13T19:00:10","date_gmt":"2019-08-13T19:00:10","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/13\/how-to-implement-progressive-growing-gan-models-in-keras\/"},"modified":"2019-08-13T19:00:10","modified_gmt":"2019-08-13T19:00:10","slug":"how-to-implement-progressive-growing-gan-models-in-keras","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/13\/how-to-implement-progressive-growing-gan-models-in-keras\/","title":{"rendered":"How to Implement Progressive Growing GAN Models in Keras"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>The progressive growing generative adversarial network is an approach for training a deep convolutional neural network model for generating synthetic images.<\/p>\n<p>It is an extension of the more traditional GAN architecture that involves incrementally growing the size of the generated image during training, starting with a very small image, such as 4\u00d74 pixels. 
This allows the stable training and growth of GAN models capable of generating very large high-quality images, such as images of synthetic celebrity faces with a size of 1024\u00d71024 pixels.<\/p>\n<p>In this tutorial, you will discover how to develop progressive growing generative adversarial network models from scratch with Keras.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to develop pre-defined discriminator and generator models at each level of output image growth.<\/li>\n<li>How to define composite models for training the generator models via the discriminator models.<\/li>\n<li>How to cycle the training of the fade-in and normal versions of the models at each level of output image growth.<\/li>\n<\/ul>\n<p>Discover how to develop DCGANs, conditional GANs, Pix2Pix, CycleGANs, and more with Keras <a href=\"https:\/\/machinelearningmastery.com\/generative_adversarial_networks\/\" rel=\"nofollow\">in my new GANs book<\/a>, with 29 step-by-step tutorials and full source code.<\/p>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_8434\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8434\" class=\"size-full wp-image-8434\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/08\/How-to-Implement-Progressive-Growing-GAN-Models-in-Keras.jpg\" alt=\"How to Implement Progressive Growing GAN Models in Keras\" width=\"640\" height=\"360\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/How-to-Implement-Progressive-Growing-GAN-Models-in-Keras.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/How-to-Implement-Progressive-Growing-GAN-Models-in-Keras-300x169.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-8434\" class=\"wp-caption-text\">How to Implement Progressive Growing GAN Models 
in Keras<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/dsantoss\/32782524633\/\">Diogo Santos Silva<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into five parts; they are:<\/p>\n<ol>\n<li>What Is the Progressive Growing GAN Architecture?<\/li>\n<li>How to Implement the Progressive Growing GAN Discriminator Model<\/li>\n<li>How to Implement the Progressive Growing GAN Generator Model<\/li>\n<li>How to Implement Composite Models for Updating the Generator<\/li>\n<li>How to Train Discriminator and Generator Models<\/li>\n<\/ol>\n<h2>What Is the Progressive Growing GAN Architecture?<\/h2>\n<p>GANs are effective at generating crisp synthetic images, although they are typically limited in the size of the images that can be generated.<\/p>\n<p>The Progressive Growing GAN is an extension to the GAN that allows the training of generator models capable of outputting large high-quality images, such as photorealistic faces with a size of 1024\u00d71024 pixels. It was described in the 2017 paper by <a href=\"https:\/\/research.nvidia.com\/person\/tero-karras\">Tero Karras<\/a>, et al. 
from Nvidia titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>.\u201d<\/p>\n<p>The key innovation of the Progressive Growing GAN is the incremental increase in the size of images output by the generator, starting with a 4\u00d74 pixel image and doubling to 8\u00d78, 16\u00d716, and so on until the desired output resolution is reached.<\/p>\n<blockquote>\n<p>Our primary contribution is a training methodology for GANs where we start with low-resolution images, and then progressively increase the resolution by adding layers to the networks.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/p>\n<p>This is achieved by a training procedure that involves periods of fine-tuning the model with a given output resolution, and periods of slowly phasing in a new model with a larger resolution.<\/p>\n<blockquote>\n<p>When doubling the resolution of the generator (G) and discriminator (D) we fade in the new layers smoothly<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/p>\n<p>All layers remain trainable during the training process, including existing layers when new layers are added.<\/p>\n<blockquote>\n<p>All existing layers in both networks remain trainable throughout the training process.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/p>\n<p>Progressive Growing GAN involves using a generator and discriminator model with the same general structure and starting with very small images. 
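<\/p>\n<p>As a quick aside, the schedule of image sizes is simple doubling arithmetic. The short sketch below (illustrative only, and not part of the models developed in this tutorial) lists the resolutions visited when growing from 4\u00d74 up to 1024\u00d71024:<\/p>\n<pre class=\"crayon-plain-tag\"># resolutions visited when repeatedly doubling from 4x4 up to 1024x1024\r\nresolutions = [4 * (2 ** i) for i in range(9)]\r\nprint(resolutions)<\/pre>\n<p>Running this prints [4, 8, 16, 32, 64, 128, 256, 512, 1024], matching the growth schedule described above.<\/p>\n<p>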
During training, new blocks of convolutional layers are systematically added to both the generator model and the discriminator models.<\/p>\n<div id=\"attachment_8429\" style=\"width: 1034px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8429\" class=\"size-large wp-image-8429\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Example-of-Progressively-Adding-Layers-to-Generator-and-Discriminator-Models.-1024x623.png\" alt=\"Example of Progressively Adding Layers to Generator and Discriminator Models.\" width=\"1024\" height=\"623\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Example-of-Progressively-Adding-Layers-to-Generator-and-Discriminator-Models.-1024x623.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Example-of-Progressively-Adding-Layers-to-Generator-and-Discriminator-Models.-300x183.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Example-of-Progressively-Adding-Layers-to-Generator-and-Discriminator-Models.-768x467.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Example-of-Progressively-Adding-Layers-to-Generator-and-Discriminator-Models..png 1144w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/p>\n<p id=\"caption-attachment-8429\" class=\"wp-caption-text\">Example of Progressively Adding Layers to Generator and Discriminator Models.<br \/>Taken from: Progressive Growing of GANs for Improved Quality, Stability, and Variation.<\/p>\n<\/div>\n<p>The incremental addition of the layers allows the models to effectively learn coarse-level detail and later learn ever finer detail, both on the generator and discriminator side.<\/p>\n<blockquote>\n<p>This incremental nature allows the training to first discover the large-scale structure of the image 
distribution and then shift attention to increasingly finer-scale detail, instead of having to learn all scales simultaneously.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/p>\n<p>The model architecture is complex and cannot be implemented directly.<\/p>\n<p>In this tutorial, we will focus on how the progressive growing GAN can be implemented using the Keras deep learning library.<\/p>\n<p>We will step through how each of the discriminator and generator models can be defined, how the generator can be trained via the discriminator model, and how each model can be updated during the training process.<\/p>\n<p>These implementation details will provide the basis for you developing a progressive growing GAN for your own applications.<\/p>\n<h2>How to Implement the Progressive Growing GAN Discriminator Model<\/h2>\n<p>The discriminator model is given images as input and must classify them as either real (from the dataset) or fake (generated).<\/p>\n<p>During the training process, the discriminator must grow to support images with ever-increasing size, starting with 4\u00d74 pixel color images and doubling to 8\u00d78, 16\u00d716, 32\u00d732, and so on.<\/p>\n<p>This is achieved by inserting a new input layer to support the larger input image followed by a new block of layers. The output of this new block is then downsampled. Additionally, the new image is also downsampled directly and passed through the old input processing layer before it is combined with the output of the new block.<\/p>\n<p>During the transition from a lower resolution to a higher resolution, e.g. 16\u00d716 to 32\u00d732, the discriminator model will have two input pathways as follows:<\/p>\n<ul>\n<li>[32\u00d732 Image] -> [fromRGB Conv] -> [NewBlock] -> [Downsample] -><\/li>\n<li>[32\u00d732 Image] -> [Downsample] -> [fromRGB Conv] -><\/li>\n<\/ul>\n<p>The output of the new block that is downsampled and the output of the old input processing layer are combined using a weighted average, where the weighting is controlled by a new hyperparameter called <em>alpha<\/em>. The weighted sum is calculated as follows:<\/p>\n<ul>\n<li>Output = ((1 \u2013 alpha) * fromRGB) + (alpha * NewBlock)<\/li>\n<\/ul>\n<p>The weighted average of the two pathways is then fed into the rest of the existing model.<\/p>\n<p>Initially, the weighting is completely biased towards the old input processing layer (<em>alpha=0<\/em>) and is linearly increased over training iterations so that the new block is given more weight until eventually, the output is entirely the product of the new block (<em>alpha=1<\/em>). 
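<\/p>\n<p>As a standalone sketch of this fade-in arithmetic (plain NumPy, separate from the Keras models developed below; the names <em>weighted_sum<\/em> and <em>n_steps<\/em> are illustrative only), the weighted average and the linear alpha schedule can be demonstrated as follows:<\/p>\n<pre class=\"crayon-plain-tag\"># standalone sketch of the fade-in weighted sum and linear alpha schedule\r\nfrom numpy import ones\r\n\r\n# ((1 - a) * old pathway) + (a * new pathway)\r\ndef weighted_sum(old, new, alpha):\r\n\treturn ((1.0 - alpha) * old) + (alpha * new)\r\n\r\n# two pathways producing activation maps of the same shape\r\nold_path = ones((4, 4, 64)) * 0.5\r\nnew_path = ones((4, 4, 64)) * 2.0\r\n# linearly increase alpha from 0 to 1 over a fixed number of steps\r\nn_steps = 10\r\nfor step in range(n_steps + 1):\r\n\talpha = step \/ float(n_steps)\r\n\tout = weighted_sum(old_path, new_path, alpha)\r\n\tprint(alpha, out[0, 0, 0])<\/pre>\n<p>At <em>alpha=0<\/em> the output is entirely the old pathway (0.5 in this sketch), and at <em>alpha=1<\/em> it is entirely the output of the new block (2.0).<\/p>\n<p>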
At this time, the old pathway can be removed.<\/p>\n<p>This can be summarized with the following figure taken from the paper showing a model before growing (a), during the phase-in of the larger resolution (b), and the model after the phase-in (c).<\/p>\n<div id=\"attachment_8430\" style=\"width: 1014px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8430\" class=\"size-full wp-image-8430\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Discriminator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution.png\" alt=\"Figure Showing the Growing of the Discriminator Model, Before (a) During (b) and After (c) the Phase-In of a High Resolution\" width=\"1004\" height=\"310\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Discriminator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution.png 1004w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Discriminator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution-300x93.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Discriminator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution-768x237.png 768w\" sizes=\"(max-width: 1004px) 100vw, 1004px\"><\/p>\n<p id=\"caption-attachment-8430\" class=\"wp-caption-text\">Figure Showing the Growing of the Discriminator Model, Before (a) During (b) and After (c) the Phase-In of a High Resolution.<br \/>Taken from: Progressive Growing of GANs for Improved Quality, Stability, and Variation.<\/p>\n<\/div>\n<p>The <em>fromRGB<\/em> layers are implemented as a <a 
href=\"https:\/\/machinelearningmastery.com\/convolutional-layers-for-deep-learning-neural-networks\/\">1\u00d71 convolutional layer<\/a>. A block is comprised of two convolutional layers with 3\u00d73 sized filters and the leaky ReLU activation function with a slope of 0.2, followed by a downsampling layer. Average pooling is used for downsampling, unlike most other GAN models, which downsample using strided convolutional layers.<\/p>\n<p>The output of the model involves two convolutional layers with 3\u00d73 and 4\u00d74 sized filters and Leaky ReLU activation, followed by a fully connected layer that outputs the single value prediction. The model uses a linear activation function instead of a sigmoid activation function like other discriminator models and can be trained with either Wasserstein loss (specifically WGAN-GP) or least squares loss; we will use the latter in this tutorial. Model weights are initialized using He Gaussian (he_normal), which is very similar to the method used in the paper.<\/p>\n<p>The model uses a custom layer called Minibatch standard deviation at the beginning of the output block, and instead of batch normalization, each layer uses local response normalization, referred to as pixel-wise normalization in the paper. We will leave out the minibatch standard deviation layer and use batch normalization in place of pixel-wise normalization in this tutorial for brevity.<\/p>\n<p>One approach to implementing the progressive growing GAN would be to manually expand a model on demand during training. Another approach is to pre-define all of the models prior to training and carefully use the Keras functional API to ensure that layers are shared across the models and continue training.<\/p>\n<p>I believe the latter approach might be easier and is the approach we will use in this tutorial.<\/p>\n<p>First, we must define a custom layer that we can use when fading in a new higher-resolution input image and block. 
This new layer must take two sets of activation maps with the same dimensions (width, height, channels) and add them together using a weighted sum.<\/p>\n<p>We can implement this as a new layer called <em>WeightedSum<\/em> that extends the <em>Add<\/em> merge layer and uses a hyperparameter \u2018<em>alpha<\/em>\u2019 to control the contribution of each input. This new class is defined below. The layer assumes only two inputs: the first for the output of the old or existing layers and the second for the newly added layers. The new hyperparameter is defined as a backend variable, meaning that we can change it at any time by changing the value of the variable.<\/p>\n<pre class=\"crayon-plain-tag\"># weighted sum output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output<\/pre>\n<p>The discriminator model is far more complex to grow than the generator because we have to change the model input, so let\u2019s step through this slowly.<\/p>\n<p>Firstly, we can define a discriminator model that takes a 4\u00d74 color image as input and outputs a prediction of whether the image is real or fake. 
The model is comprised of a 1\u00d71 input processing layer (fromRGB) and an output block.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# base model input\r\nin_image = Input(shape=(4,4,3))\r\n# conv 1x1\r\nd = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\nd = LeakyReLU(alpha=0.2)(d)\r\n# conv 3x3 (output block)\r\nd = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\nd = BatchNormalization()(d)\r\nd = LeakyReLU(alpha=0.2)(d)\r\n# conv 4x4\r\nd = Conv2D(128, (4,4), padding='same', kernel_initializer='he_normal')(d)\r\nd = BatchNormalization()(d)\r\nd = LeakyReLU(alpha=0.2)(d)\r\n# dense output layer\r\nd = Flatten()(d)\r\nout_class = Dense(1)(d)\r\n# define model\r\nmodel = Model(in_image, out_class)\r\n# compile model\r\nmodel.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))<\/pre>\n<p>Next, we need to define a new model that handles the intermediate stage between this model and a new discriminator model that takes 8\u00d78 color images as input.<\/p>\n<p>The existing input processing layer must receive a downsampled version of the new 8\u00d78 image. A new input processing layer must be defined that takes the 8\u00d78 input image and passes it through a new block of two convolutional layers and a downsampling layer. The output of the new block after downsampling and the old input processing layer must be added together using a weighted sum via our new <em>WeightedSum<\/em> layer and then must reuse the same output block (two convolutional layers and the output layer).<\/p>\n<p>Given the first defined model and our knowledge about this model (e.g. 
the number of layers in the input processing layer is 2 for the Conv2D and LeakyReLU), we can construct this new intermediate or fade-in model using layer indexes from the old model.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\nold_model = model\r\n# get shape of existing model\r\nin_shape = list(old_model.input.shape)\r\n# define new input shape as double the size\r\ninput_shape = (in_shape[-2].value*2, in_shape[-2].value*2, in_shape[-1].value)\r\nin_image = Input(shape=input_shape)\r\n# define new input processing layer\r\nd = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\nd = LeakyReLU(alpha=0.2)(d)\r\n# define new block\r\nd = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\nd = BatchNormalization()(d)\r\nd = LeakyReLU(alpha=0.2)(d)\r\nd = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\nd = BatchNormalization()(d)\r\nd = LeakyReLU(alpha=0.2)(d)\r\nd = AveragePooling2D()(d)\r\n# downsample the new larger image\r\ndownsample = AveragePooling2D()(in_image)\r\n# connect old input processing to downsampled new input\r\nblock_old = old_model.layers[1](downsample)\r\nblock_old = old_model.layers[2](block_old)\r\n# fade in output of old model input layer with new input\r\nd = WeightedSum()([block_old, d])\r\n# skip the input, 1x1 and activation for the old model\r\nfor i in range(3, len(old_model.layers)):\r\n\td = old_model.layers[i](d)\r\n# define fade-in model\r\nmodel = Model(in_image, d)\r\n# compile model\r\nmodel.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))<\/pre>\n<p>So far, so good.<\/p>\n<p>We also need a version of the same model with the same layers without the fade-in of the input from the old model\u2019s input processing layers.<\/p>\n<p>This straight-through version is required for training before we fade in the next doubling of the input image size.<\/p>\n<p>We can update the above example to create two versions of the model. 
First, the straight-through version as it is simpler, then the version used for the fade-in that reuses the layers from the new block and the output layers of the old model.<\/p>\n<p>The <em>add_discriminator_block()<\/em> function below implements this, returning a list of the two defined models (straight-through and fade-in), and takes the old model as an argument and defines the number of input layers as a default argument (3).<\/p>\n<p>To ensure that the <em>WeightedSum<\/em> layer works correctly, we have fixed all convolutional layers to always have 64 filters, and in turn, output 64 feature maps. If there is a mismatch between the old model\u2019s input processing layer and the new block\u2019s output in terms of the number of feature maps (channels), then the weighted sum will fail.<\/p>\n<pre class=\"crayon-plain-tag\"># add a discriminator block\r\ndef add_discriminator_block(old_model, n_input_layers=3):\r\n\t# get shape of existing model\r\n\tin_shape = list(old_model.input.shape)\r\n\t# define new input shape as double the size\r\n\tinput_shape = (in_shape[-2].value*2, in_shape[-2].value*2, in_shape[-1].value)\r\n\tin_image = Input(shape=input_shape)\r\n\t# define new input processing layer\r\n\td = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# define new block\r\n\td = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = AveragePooling2D()(d)\r\n\tblock_new = d\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel1 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel1.compile(loss='mse', optimizer=Adam(lr=0.001, 
beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# downsample the new larger image\r\n\tdownsample = AveragePooling2D()(in_image)\r\n\t# connect old input processing to downsampled new input\r\n\tblock_old = old_model.layers[1](downsample)\r\n\tblock_old = old_model.layers[2](block_old)\r\n\t# fade in output of old model input layer with new input\r\n\td = WeightedSum()([block_old, block_new])\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define fade-in model\r\n\tmodel2 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel2.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\treturn [model1, model2]<\/pre>\n<p>It is not an elegant function as we have some repetition, but it is readable and will get the job done.<\/p>\n<p>We can then call this function again and again as we double the size of input images. Importantly, the function expects the straight-through version of the prior model as input.<\/p>\n<p>The example below defines a new function called <em>define_discriminator()<\/em> that defines our base model that expects a 4\u00d74 color image as input, then repeatedly adds blocks to create new versions of the discriminator model, each of which expects images with quadruple the area.<\/p>\n<pre class=\"crayon-plain-tag\"># define the discriminator models for each image resolution\r\ndef define_discriminator(n_blocks, input_shape=(4,4,3)):\r\n\tmodel_list = list()\r\n\t# base model input\r\n\tin_image = Input(shape=input_shape)\r\n\t# conv 1x1\r\n\td = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 3x3 (output block)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 4x4\r\n\td = Conv2D(128, (4,4), padding='same', 
kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# dense output layer\r\n\td = Flatten()(d)\r\n\tout_class = Dense(1)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, out_class)\r\n\t# compile model\r\n\tmodel.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_discriminator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list<\/pre>\n<p>This function will return a list of models, where each item in the list is a two-element list that contains first the straight-through version of the model at that resolution, and second the fade-in version of the model for that resolution.<\/p>\n<p>We can tie all of this together and define a new \u201cdiscriminator model\u201d that will grow from 4\u00d74, through to 8\u00d78, and finally to 16\u00d716. 
This is achieved by setting the <em>n_blocks<\/em> argument to 3 when calling the <em>define_discriminator()<\/em> function, for the creation of three sets of models.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of defining discriminator models for the progressive growing gan\r\nfrom keras.optimizers import Adam\r\nfrom keras.models import Model\r\nfrom keras.layers import Input\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Conv2D\r\nfrom keras.layers import AveragePooling2D\r\nfrom keras.layers import LeakyReLU\r\nfrom keras.layers import BatchNormalization\r\nfrom keras.layers import Add\r\nfrom keras.utils.vis_utils import plot_model\r\nfrom keras import backend\r\n\r\n# weighted sum output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output\r\n\r\n# add a discriminator block\r\ndef add_discriminator_block(old_model, n_input_layers=3):\r\n\t# get shape of existing model\r\n\tin_shape = list(old_model.input.shape)\r\n\t# define new input shape as double the size\r\n\tinput_shape = (in_shape[-2].value*2, in_shape[-2].value*2, in_shape[-1].value)\r\n\tin_image = Input(shape=input_shape)\r\n\t# define new input processing layer\r\n\td = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# define new block\r\n\td = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = 
LeakyReLU(alpha=0.2)(d)\r\n\td = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = AveragePooling2D()(d)\r\n\tblock_new = d\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel1 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel1.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# downsample the new larger image\r\n\tdownsample = AveragePooling2D()(in_image)\r\n\t# connect old input processing to downsampled new input\r\n\tblock_old = old_model.layers[1](downsample)\r\n\tblock_old = old_model.layers[2](block_old)\r\n\t# fade in output of old model input layer with new input\r\n\td = WeightedSum()([block_old, block_new])\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define fade-in model\r\n\tmodel2 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel2.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\treturn [model1, model2]\r\n\r\n# define the discriminator models for each image resolution\r\ndef define_discriminator(n_blocks, input_shape=(4,4,3)):\r\n\tmodel_list = list()\r\n\t# base model input\r\n\tin_image = Input(shape=input_shape)\r\n\t# conv 1x1\r\n\td = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 3x3 (output block)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 4x4\r\n\td = Conv2D(128, (4,4), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# dense output 
layer\r\n\td = Flatten()(d)\r\n\tout_class = Dense(1)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, out_class)\r\n\t# compile model\r\n\tmodel.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-on\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_discriminator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list\r\n\r\n# define models\r\ndiscriminators = define_discriminator(3)\r\n# spot check\r\nm = discriminators[2][1]\r\nm.summary()\r\nplot_model(m, to_file='discriminator_plot.png', show_shapes=True, show_layer_names=True)<\/pre>\n<p>Running the example first summarizes the fade-in version of the third model showing the 16\u00d716 color image inputs and the single value output.<\/p>\n<pre class=\"crayon-plain-tag\">__________________________________________________________________________________________________\r\nLayer (type)                    Output Shape         Param #     Connected to\r\n==================================================================================================\r\ninput_3 (InputLayer)            (None, 16, 16, 3)    0\r\n__________________________________________________________________________________________________\r\nconv2d_7 (Conv2D)               (None, 16, 16, 64)   256         input_3[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_7 (LeakyReLU)       (None, 16, 16, 64)   0           conv2d_7[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_8 (Conv2D)               (None, 16, 16, 64)   36928       
leaky_re_lu_7[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_5 (BatchNor (None, 16, 16, 64)   256         conv2d_8[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_8 (LeakyReLU)       (None, 16, 16, 64)   0           batch_normalization_5[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_9 (Conv2D)               (None, 16, 16, 64)   36928       leaky_re_lu_8[0][0]\r\n__________________________________________________________________________________________________\r\naverage_pooling2d_4 (AveragePoo (None, 8, 8, 3)      0           input_3[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_6 (BatchNor (None, 16, 16, 64)   256         conv2d_9[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_4 (Conv2D)               (None, 8, 8, 64)     256         average_pooling2d_4[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_9 (LeakyReLU)       (None, 16, 16, 64)   0           batch_normalization_6[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_4 (LeakyReLU)       (None, 8, 8, 64)     0           conv2d_4[1][0]\r\n__________________________________________________________________________________________________\r\naverage_pooling2d_3 (AveragePoo (None, 8, 8, 64)     0           leaky_re_lu_9[0][0]\r\n__________________________________________________________________________________________________\r\nweighted_sum_2 (WeightedSum)    (None, 8, 8, 64)     0           leaky_re_lu_4[1][0]\r\n                                                                 
average_pooling2d_3[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_5 (Conv2D)               (None, 8, 8, 64)     36928       weighted_sum_2[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_3 (BatchNor (None, 8, 8, 64)     256         conv2d_5[2][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_5 (LeakyReLU)       (None, 8, 8, 64)     0           batch_normalization_3[2][0]\r\n__________________________________________________________________________________________________\r\nconv2d_6 (Conv2D)               (None, 8, 8, 64)     36928       leaky_re_lu_5[2][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_4 (BatchNor (None, 8, 8, 64)     256         conv2d_6[2][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_6 (LeakyReLU)       (None, 8, 8, 64)     0           batch_normalization_4[2][0]\r\n__________________________________________________________________________________________________\r\naverage_pooling2d_1 (AveragePoo (None, 4, 4, 64)     0           leaky_re_lu_6[2][0]\r\n__________________________________________________________________________________________________\r\nconv2d_2 (Conv2D)               (None, 4, 4, 128)    73856       average_pooling2d_1[2][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_1 (BatchNor (None, 4, 4, 128)    512         conv2d_2[4][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_2 (LeakyReLU)       (None, 4, 4, 128)    0           
batch_normalization_1[4][0]\r\n__________________________________________________________________________________________________\r\nconv2d_3 (Conv2D)               (None, 4, 4, 128)    262272      leaky_re_lu_2[4][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_2 (BatchNor (None, 4, 4, 128)    512         conv2d_3[4][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_3 (LeakyReLU)       (None, 4, 4, 128)    0           batch_normalization_2[4][0]\r\n__________________________________________________________________________________________________\r\nflatten_1 (Flatten)             (None, 2048)         0           leaky_re_lu_3[4][0]\r\n__________________________________________________________________________________________________\r\ndense_1 (Dense)                 (None, 1)            2049        flatten_1[4][0]\r\n==================================================================================================\r\nTotal params: 488,449\r\nTrainable params: 487,425\r\nNon-trainable params: 1,024\r\n__________________________________________________________________________________________________<\/pre>\n<p>A plot of the same fade-in version of the model is created and saved to file.<\/p>\n<p><strong>Note<\/strong>: creating this plot assumes that the pygraphviz and pydot libraries are installed. If this is a problem, comment out the import statement and call to plot_model().<\/p>\n<p>The plot shows the 16\u00d716 input image that is downsampled and passed through the 8\u00d78 input processing layers from the prior model (left). 
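The weighted average at the heart of this fade-in is simple arithmetic. As a minimal sketch of the idea in plain Python (the helper names fadein_alpha and weighted_sum are illustrative, not part of the tutorial code), with alpha annealed linearly from 0.0 to 1.0 over the fade-in phase:

```python
# fade-in coefficient for a given training step, annealed linearly
def fadein_alpha(step, n_steps):
	# 0.0 at the first step (old pathway only), 1.0 at the last (new pathway only)
	return step / float(n_steps - 1)

# the merge computed by the WeightedSum layer: ((1-a) * old) + (a * new)
def weighted_sum(old, new, alpha):
	return [(1.0 - alpha) * o + alpha * n for o, n in zip(old, new)]
```

At alpha=0.0 only the old, downsampled pathway contributes; at alpha=1.0 only the new block does, which is exactly the behavior of the WeightedSum layer used above.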
It also shows the addition of the new block (right) and the weighted average that combines both streams of input, before using the existing model layers to continue processing and outputting a prediction.<\/p>\n<div id=\"attachment_8431\" style=\"width: 416px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8431\" class=\"size-large wp-image-8431\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Discriminator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Input-Images-406x1024.png\" alt=\"Plot of the Fade-In Discriminator Model For the Progressive Growing GAN Transitioning From 8x8 to 16x16 Input Images\" width=\"406\" height=\"1024\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Discriminator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Input-Images-406x1024.png 406w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Discriminator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Input-Images-119x300.png 119w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Discriminator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Input-Images-768x1939.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Discriminator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Input-Images.png 1125w\" sizes=\"(max-width: 406px) 100vw, 406px\"><\/p>\n<p id=\"caption-attachment-8431\" class=\"wp-caption-text\">Plot of the Fade-In Discriminator Model For the Progressive Growing GAN Transitioning From 8\u00d78 to 16\u00d716 Input Images<\/p>\n<\/div>\n<p>Now that we have seen how we can define the 
discriminator models, let\u2019s look at how we can define the generator models.<\/p>\n<h2>How to Implement the Progressive Growing GAN Generator Model<\/h2>\n<p>The generator models for the progressive growing GAN are easier to implement in Keras than the discriminator models.<\/p>\n<p>This is because each fade-in requires only a minor change to the output of the model.<\/p>\n<p>Increasing the resolution of the generator involves first upsampling the output of the end of the last block. This is then connected to the new block and a new output layer for an image that is double the height and width dimensions or quadruple the area. During the phase-in, the upsampling is also connected to the output layer from the old model and the output from both output layers is merged using a weighted average.<\/p>\n<p>After the phase-in is complete, the old output layer is removed.<\/p>\n<p>This can be summarized with the following figure, taken from the paper, showing a model before growing (a), during the phase-in of the larger resolution (b), and the model after the phase-in (c).<\/p>\n<div id=\"attachment_8432\" style=\"width: 1034px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8432\" class=\"size-large wp-image-8432\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Generator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution-1024x271.png\" alt=\"Figure Showing the Growing of the Generator Model, Before (a) During (b) and After (c) the Phase-In of a High Resolution\" width=\"1024\" height=\"271\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Generator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution-1024x271.png 1024w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Generator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution-300x80.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Generator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution-768x204.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Figure-Showing-the-Growing-of-the-Generator-Model-Before-a-During-b-and-After-c-the-Phase-In-of-a-High-Resolution.png 1026w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/p>\n<p id=\"caption-attachment-8432\" class=\"wp-caption-text\">Figure Showing the Growing of the Generator Model, Before (a), During (b), and After (c) the Phase-In of a High Resolution.<br \/>Taken from: Progressive Growing of GANs for Improved Quality, Stability, and Variation.<\/p>\n<\/div>\n<p>The toRGB layer is a convolutional layer with three 1\u00d71 filters, sufficient to output a color image.<\/p>\n<p>The model takes a point in the latent space as input, such as a 100-element or 512-element vector as described in the paper. This is scaled up to provide the basis for 4\u00d74 activation maps, followed by a convolutional layer with 4\u00d74 filters and another with 3\u00d73 filters. Like the discriminator, LeakyReLU activations are used, as is pixel normalization, which we will substitute with <a href=\"https:\/\/machinelearningmastery.com\/how-to-accelerate-learning-of-deep-neural-networks-with-batch-normalization\/\">batch normalization<\/a> for brevity.<\/p>\n<p>A block involves an upsample layer followed by two convolutional layers with 3\u00d73 filters. Upsampling is achieved using a nearest neighbor method (e.g. 
duplicating input rows and columns) via an UpSampling2D layer instead of the more common transpose convolutional layer.<\/p>\n<p>We can define the baseline model that will take a point in latent space as input and output a 4\u00d74 color image as follows:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# base model latent input\r\nin_latent = Input(shape=(100,))\r\n# linear scale up to activation maps\r\ng  = Dense(128 * 4 * 4, kernel_initializer='he_normal')(in_latent)\r\ng = Reshape((4, 4, 128))(g)\r\n# conv 4x4, input block\r\ng = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\ng = BatchNormalization()(g)\r\ng = LeakyReLU(alpha=0.2)(g)\r\n# conv 3x3\r\ng = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\ng = BatchNormalization()(g)\r\ng = LeakyReLU(alpha=0.2)(g)\r\n# conv 1x1, output block\r\nout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n# define model\r\nmodel = Model(in_latent, out_image)<\/pre>\n<p>Next, we need to define a version of the model that uses all of the same input layers, but adds a new block (upsample and 2 convolutional layers) and a new output layer (a 1\u00d71 convolutional layer).<\/p>\n<p>This would be the model after the phase-in to the new output resolution. This can be achieved by using our knowledge of the baseline model and the fact that the end of the last block is the second last layer, e.g. 
layer at index -2 in the model\u2019s list of layers.<\/p>\n<p>The new model with the addition of a new block and output layer is defined as follows:<\/p>\n<pre class=\"crayon-plain-tag\">...\r\nold_model = model\r\n# get the end of the last block\r\nblock_end = old_model.layers[-2].output\r\n# upsample, and define new block\r\nupsampling = UpSampling2D()(block_end)\r\ng = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(upsampling)\r\ng = BatchNormalization()(g)\r\ng = LeakyReLU(alpha=0.2)(g)\r\ng = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\ng = BatchNormalization()(g)\r\ng = LeakyReLU(alpha=0.2)(g)\r\n# add new output layer\r\nout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n# define model\r\nmodel = Model(old_model.input, out_image)<\/pre>\n<p>That is pretty straightforward; we have chopped off the old output layer at the end of the last block and grafted on a new block and output layer.<\/p>\n<p>Now we need a version of this new model to use during the fade-in.<\/p>\n<p>This involves connecting the old output layer to the new upsampling layer at the start of the new block and using an instance of our WeightedSum layer defined in the previous section to combine the output of the old and new output layers.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# get the output layer from old model\r\nout_old = old_model.layers[-1]\r\n# connect the upsampling to the old output layer\r\nout_image2 = out_old(upsampling)\r\n# define new output image as the weighted sum of the old and new models\r\nmerged = WeightedSum()([out_image2, out_image])\r\n# define model\r\nmodel2 = Model(old_model.input, merged)<\/pre>\n<p>We can combine the definition of these two operations into a function named <em>add_generator_block()<\/em>, defined below, that will expand a given model and return both the new generator model with the added block (<em>model1<\/em>) and a version of the model with the fading in of the 
new block with the old output layer (<em>model2<\/em>).<\/p>\n<pre class=\"crayon-plain-tag\"># add a generator block\r\ndef add_generator_block(old_model):\r\n\t# get the end of the last block\r\n\tblock_end = old_model.layers[-2].output\r\n\t# upsample, and define new block\r\n\tupsampling = UpSampling2D()(block_end)\r\n\tg = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(upsampling)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\tg = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# add new output layer\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n\t# define model\r\n\tmodel1 = Model(old_model.input, out_image)\r\n\t# get the output layer from old model\r\n\tout_old = old_model.layers[-1]\r\n\t# connect the upsampling to the old output layer\r\n\tout_image2 = out_old(upsampling)\r\n\t# define new output image as the weighted sum of the old and new models\r\n\tmerged = WeightedSum()([out_image2, out_image])\r\n\t# define model\r\n\tmodel2 = Model(old_model.input, merged)\r\n\treturn [model1, model2]<\/pre>\n<p>We can then call this function with our baseline model to create models with one added block and continue to call it with subsequent models to keep adding blocks.<\/p>\n<p>The <em>define_generator()<\/em> function below implements this, taking the size of the latent space and number of blocks to add (models to create).<\/p>\n<p>The baseline model is defined as outputting a color image with the shape 4\u00d74, controlled by the default argument <em>in_dim<\/em>.<\/p>\n<pre class=\"crayon-plain-tag\"># define generator models\r\ndef define_generator(latent_dim, n_blocks, in_dim=4):\r\n\tmodel_list = list()\r\n\t# base model latent input\r\n\tin_latent = Input(shape=(latent_dim,))\r\n\t# linear scale up to activation maps\r\n\tg  = Dense(128 * in_dim * in_dim, 
kernel_initializer='he_normal')(in_latent)\r\n\tg = Reshape((in_dim, in_dim, 128))(g)\r\n\t# conv 4x4, input block\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 3x3\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 1x1, output block\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n\t# define model\r\n\tmodel = Model(in_latent, out_image)\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_generator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list<\/pre>\n<p>We can tie all of this together and define a baseline generator and the addition of two blocks, so three models in total, where a straight-through and fade-in version of each model is defined.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of defining generator models for the progressive growing gan\r\nfrom keras.models import Model\r\nfrom keras.layers import Input\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Reshape\r\nfrom keras.layers import Conv2D\r\nfrom keras.layers import UpSampling2D\r\nfrom keras.layers import LeakyReLU\r\nfrom keras.layers import BatchNormalization\r\nfrom keras.layers import Add\r\nfrom keras.utils.vis_utils import plot_model\r\nfrom keras import backend\r\n\r\n# weighted sum output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output 
a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output\r\n\r\n# add a generator block\r\ndef add_generator_block(old_model):\r\n\t# get the end of the last block\r\n\tblock_end = old_model.layers[-2].output\r\n\t# upsample, and define new block\r\n\tupsampling = UpSampling2D()(block_end)\r\n\tg = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(upsampling)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\tg = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# add new output layer\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n\t# define model\r\n\tmodel1 = Model(old_model.input, out_image)\r\n\t# get the output layer from old model\r\n\tout_old = old_model.layers[-1]\r\n\t# connect the upsampling to the old output layer\r\n\tout_image2 = out_old(upsampling)\r\n\t# define new output image as the weighted sum of the old and new models\r\n\tmerged = WeightedSum()([out_image2, out_image])\r\n\t# define model\r\n\tmodel2 = Model(old_model.input, merged)\r\n\treturn [model1, model2]\r\n\r\n# define generator models\r\ndef define_generator(latent_dim, n_blocks, in_dim=4):\r\n\tmodel_list = list()\r\n\t# base model latent input\r\n\tin_latent = Input(shape=(latent_dim,))\r\n\t# linear scale up to activation maps\r\n\tg  = Dense(128 * in_dim * in_dim, kernel_initializer='he_normal')(in_latent)\r\n\tg = Reshape((in_dim, in_dim, 128))(g)\r\n\t# conv 4x4, input block\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 3x3\r\n\tg = Conv2D(128, (3,3), padding='same', 
kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 1x1, output block\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n\t# define model\r\n\tmodel = Model(in_latent, out_image)\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_generator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list\r\n\r\n# define models\r\ngenerators = define_generator(100, 3)\r\n# spot check\r\nm = generators[2][1]\r\nm.summary()\r\nplot_model(m, to_file='generator_plot.png', show_shapes=True, show_layer_names=True)<\/pre>\n<p>The example chooses the fade-in version of the last model to summarize.<\/p>\n<p>Running the example first summarizes a linear list of the layers in the model. 
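Each added block doubles the output resolution via UpSampling2D, so the expected output size at each growth level is easy to compute. A small illustrative helper (output_resolutions is a hypothetical name, not part of the tutorial code):

```python
# output resolution at each growth level, assuming a 4x4 base (in_dim=4)
def output_resolutions(n_blocks, in_dim=4):
	# each added block doubles height and width via UpSampling2D
	return [in_dim * 2 ** i for i in range(n_blocks)]
```

For a 4x4 base and three growth levels this gives 4, 8, and 16 pixels per side, matching the model summary that follows.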
We can see that the last model takes a point from the latent space and outputs a 16\u00d716 image.<\/p>\n<p>This matches our expectations: the baseline model outputs a 4\u00d74 image, adding one block increases this to 8\u00d78, and adding one more block increases this to 16\u00d716.<\/p>\n<pre class=\"crayon-plain-tag\">__________________________________________________________________________________________________\r\nLayer (type)                    Output Shape         Param #     Connected to\r\n==================================================================================================\r\ninput_1 (InputLayer)            (None, 100)          0\r\n__________________________________________________________________________________________________\r\ndense_1 (Dense)                 (None, 2048)         206848      input_1[0][0]\r\n__________________________________________________________________________________________________\r\nreshape_1 (Reshape)             (None, 4, 4, 128)    0           dense_1[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_1 (Conv2D)               (None, 4, 4, 128)    147584      reshape_1[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_1 (BatchNor (None, 4, 4, 128)    512         conv2d_1[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_1 (LeakyReLU)       (None, 4, 4, 128)    0           batch_normalization_1[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_2 (Conv2D)               (None, 4, 4, 128)    147584      leaky_re_lu_1[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_2 (BatchNor (None, 4, 4, 128)    512         
conv2d_2[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_2 (LeakyReLU)       (None, 4, 4, 128)    0           batch_normalization_2[0][0]\r\n__________________________________________________________________________________________________\r\nup_sampling2d_1 (UpSampling2D)  (None, 8, 8, 128)    0           leaky_re_lu_2[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_4 (Conv2D)               (None, 8, 8, 64)     73792       up_sampling2d_1[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_3 (BatchNor (None, 8, 8, 64)     256         conv2d_4[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_3 (LeakyReLU)       (None, 8, 8, 64)     0           batch_normalization_3[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_5 (Conv2D)               (None, 8, 8, 64)     36928       leaky_re_lu_3[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_4 (BatchNor (None, 8, 8, 64)     256         conv2d_5[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_4 (LeakyReLU)       (None, 8, 8, 64)     0           batch_normalization_4[0][0]\r\n__________________________________________________________________________________________________\r\nup_sampling2d_2 (UpSampling2D)  (None, 16, 16, 64)   0           leaky_re_lu_4[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_7 (Conv2D)               (None, 16, 16, 64)   36928       
up_sampling2d_2[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_5 (BatchNor (None, 16, 16, 64)   256         conv2d_7[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_5 (LeakyReLU)       (None, 16, 16, 64)   0           batch_normalization_5[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_8 (Conv2D)               (None, 16, 16, 64)   36928       leaky_re_lu_5[0][0]\r\n__________________________________________________________________________________________________\r\nbatch_normalization_6 (BatchNor (None, 16, 16, 64)   256         conv2d_8[0][0]\r\n__________________________________________________________________________________________________\r\nleaky_re_lu_6 (LeakyReLU)       (None, 16, 16, 64)   0           batch_normalization_6[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_6 (Conv2D)               multiple             195         up_sampling2d_2[0][0]\r\n__________________________________________________________________________________________________\r\nconv2d_9 (Conv2D)               (None, 16, 16, 3)    195         leaky_re_lu_6[0][0]\r\n__________________________________________________________________________________________________\r\nweighted_sum_2 (WeightedSum)    (None, 16, 16, 3)    0           conv2d_6[1][0]\r\n                                                                 conv2d_9[0][0]\r\n==================================================================================================\r\nTotal params: 689,030\r\nTrainable params: 688,006\r\nNon-trainable params: 1,024\r\n__________________________________________________________________________________________________<\/pre>\n<p>A plot of the same fade-in version of the model is created and saved to 
file.<\/p>\n<p><strong>Note<\/strong>: creating this plot assumes that the pygraphviz and pydot libraries are installed. If this is a problem, comment out the import statement and call to <em>plot_model()<\/em>.<\/p>\n<p>We can see that the output from the last block passes through an UpSampling2D layer before feeding the added block and a new output layer as well as the old output layer before being merged via a weighted sum into the final output layer.<\/p>\n<div id=\"attachment_8433\" style=\"width: 367px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8433\" class=\"size-large wp-image-8433\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Generator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Output-Images-357x1024.png\" alt=\"Plot of the Fade-In Generator Model For the Progressive Growing GAN Transitioning From 8x8 to 16x16 Output Images\" width=\"357\" height=\"1024\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Generator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Output-Images-357x1024.png 357w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Generator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Output-Images-105x300.png 105w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Generator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Output-Images-768x2204.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/06\/Plot-of-the-Fade-In-Generator-Model-For-the-Progressive-Growing-GAN-Transitioning-from-8x8-to-16x16-Output-Images.png 951w\" sizes=\"(max-width: 357px) 100vw, 357px\"><\/p>\n<p 
id=\"caption-attachment-8433\" class=\"wp-caption-text\">Plot of the Fade-In Generator Model For the Progressive Growing GAN Transitioning From 8\u00d78 to 16\u00d716 Output Images<\/p>\n<\/div>\n<p>Now that we have seen how to define the generator models, we can review how the generator models may be updated via the discriminator models.<\/p>\n<h2>How to Implement Composite Models for Updating the Generator<\/h2>\n<p>The discriminator models are trained directly with real and fake images as input and a target value of 0 for fake and 1 for real.<\/p>\n<p>The generator models are not trained directly; instead, they are trained indirectly via the discriminator models, just like a normal GAN model.<\/p>\n<p>We can create a composite model for each level of growth of the model, e.g. pair 4\u00d74 generators and 4\u00d74 discriminators. We can also pair the straight-through models together, and the fade-in models together.<\/p>\n<p>For example, we can retrieve the generator and discriminator models for a given level of growth.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\ng_models, d_models = generators[0], discriminators[0]<\/pre>\n<p>Then we can use them to create a composite model for training the straight-through generator, where the output of the generator is fed directly to the discriminator in order to classify.<\/p>\n<pre class=\"crayon-plain-tag\"># straight-through model\r\nd_models[0].trainable = False\r\nmodel1 = Sequential()\r\nmodel1.add(g_models[0])\r\nmodel1.add(d_models[0])\r\nmodel1.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))<\/pre>\n<p>And do the same for the composite model for the fade-in generator.<\/p>\n<pre class=\"crayon-plain-tag\"># fade-in model\r\nd_models[1].trainable = False\r\nmodel2 = Sequential()\r\nmodel2.add(g_models[1])\r\nmodel2.add(d_models[1])\r\nmodel2.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))<\/pre>\n<p>The function below, named 
<em>define_composite()<\/em>, automates this; given a list of defined discriminator and generator models, it will create an appropriate composite model for training each generator model.<\/p>\n<pre class=\"crayon-plain-tag\"># define composite models for training generators via discriminators\r\ndef define_composite(discriminators, generators):\r\n\tmodel_list = list()\r\n\t# create composite models\r\n\tfor i in range(len(discriminators)):\r\n\t\tg_models, d_models = generators[i], discriminators[i]\r\n\t\t# straight-through model\r\n\t\td_models[0].trainable = False\r\n\t\tmodel1 = Sequential()\r\n\t\tmodel1.add(g_models[0])\r\n\t\tmodel1.add(d_models[0])\r\n\t\tmodel1.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# fade-in model\r\n\t\td_models[1].trainable = False\r\n\t\tmodel2 = Sequential()\r\n\t\tmodel2.add(g_models[1])\r\n\t\tmodel2.add(d_models[1])\r\n\t\tmodel2.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# store\r\n\t\tmodel_list.append([model1, model2])\r\n\treturn model_list<\/pre>\n<p>Tying this together with the definition of the discriminator and generator models above, the complete example of defining all models at each pre-defined level of growth is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of defining composite models for the progressive growing gan\r\nfrom keras.optimizers import Adam\r\nfrom keras.models import Sequential\r\nfrom keras.models import Model\r\nfrom keras.layers import Input\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Reshape\r\nfrom keras.layers import Conv2D\r\nfrom keras.layers import UpSampling2D\r\nfrom keras.layers import AveragePooling2D\r\nfrom keras.layers import LeakyReLU\r\nfrom keras.layers import BatchNormalization\r\nfrom keras.layers import Add\r\nfrom keras.utils.vis_utils import plot_model\r\nfrom keras import backend\r\n\r\n# weighted sum 
output\r\nclass WeightedSum(Add):\r\n\t# init with default value\r\n\tdef __init__(self, alpha=0.0, **kwargs):\r\n\t\tsuper(WeightedSum, self).__init__(**kwargs)\r\n\t\tself.alpha = backend.variable(alpha, name='ws_alpha')\r\n\r\n\t# output a weighted sum of inputs\r\n\tdef _merge_function(self, inputs):\r\n\t\t# only supports a weighted sum of two inputs\r\n\t\tassert (len(inputs) == 2)\r\n\t\t# ((1-a) * input1) + (a * input2)\r\n\t\toutput = ((1.0 - self.alpha) * inputs[0]) + (self.alpha * inputs[1])\r\n\t\treturn output\r\n\r\n# add a discriminator block\r\ndef add_discriminator_block(old_model, n_input_layers=3):\r\n\t# get shape of existing model\r\n\tin_shape = list(old_model.input.shape)\r\n\t# define new input shape as double the size\r\n\tinput_shape = (in_shape[-2].value*2, in_shape[-2].value*2, in_shape[-1].value)\r\n\tin_image = Input(shape=input_shape)\r\n\t# define new input processing layer\r\n\td = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# define new block\r\n\td = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\td = AveragePooling2D()(d)\r\n\tblock_new = d\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define straight-through model\r\n\tmodel1 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel1.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# downsample the new larger image\r\n\tdownsample = AveragePooling2D()(in_image)\r\n\t# connect old input processing to downsampled new input\r\n\tblock_old = old_model.layers[1](downsample)\r\n\tblock_old = old_model.layers[2](block_old)\r\n\t# fade in 
output of old model input layer with new input\r\n\td = WeightedSum()([block_old, block_new])\r\n\t# skip the input, 1x1 and activation for the old model\r\n\tfor i in range(n_input_layers, len(old_model.layers)):\r\n\t\td = old_model.layers[i](d)\r\n\t# define fade-in model\r\n\tmodel2 = Model(in_image, d)\r\n\t# compile model\r\n\tmodel2.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\treturn [model1, model2]\r\n\r\n# define the discriminator models for each image resolution\r\ndef define_discriminator(n_blocks, input_shape=(4,4,3)):\r\n\tmodel_list = list()\r\n\t# base model input\r\n\tin_image = Input(shape=input_shape)\r\n\t# conv 1x1\r\n\td = Conv2D(64, (1,1), padding='same', kernel_initializer='he_normal')(in_image)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 3x3 (output block)\r\n\td = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# conv 4x4\r\n\td = Conv2D(128, (4,4), padding='same', kernel_initializer='he_normal')(d)\r\n\td = BatchNormalization()(d)\r\n\td = LeakyReLU(alpha=0.2)(d)\r\n\t# dense output layer\r\n\td = Flatten()(d)\r\n\tout_class = Dense(1)(d)\r\n\t# define model\r\n\tmodel = Model(in_image, out_class)\r\n\t# compile model\r\n\tmodel.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for next resolution\r\n\t\tmodels = add_discriminator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list\r\n\r\n# add a generator block\r\ndef add_generator_block(old_model):\r\n\t# get the end of the last block\r\n\tblock_end = old_model.layers[-2].output\r\n\t# upsample, and define new block\r\n\tupsampling = 
UpSampling2D()(block_end)\r\n\tg = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(upsampling)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\tg = Conv2D(64, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# add new output layer\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n\t# define straight-through model\r\n\tmodel1 = Model(old_model.input, out_image)\r\n\t# get the output layer from old model\r\n\tout_old = old_model.layers[-1]\r\n\t# connect the upsampling to the old output layer\r\n\tout_image2 = out_old(upsampling)\r\n\t# define new output image as the weighted sum of the old and new models\r\n\tmerged = WeightedSum()([out_image2, out_image])\r\n\t# define fade-in model\r\n\tmodel2 = Model(old_model.input, merged)\r\n\treturn [model1, model2]\r\n\r\n# define generator models\r\ndef define_generator(latent_dim, n_blocks, in_dim=4):\r\n\tmodel_list = list()\r\n\t# base model latent input\r\n\tin_latent = Input(shape=(latent_dim,))\r\n\t# linear scale up to activation maps\r\n\tg = Dense(128 * in_dim * in_dim, kernel_initializer='he_normal')(in_latent)\r\n\tg = Reshape((in_dim, in_dim, 128))(g)\r\n\t# conv 3x3, input block\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 3x3\r\n\tg = Conv2D(128, (3,3), padding='same', kernel_initializer='he_normal')(g)\r\n\tg = BatchNormalization()(g)\r\n\tg = LeakyReLU(alpha=0.2)(g)\r\n\t# conv 1x1, output block\r\n\tout_image = Conv2D(3, (1,1), padding='same', kernel_initializer='he_normal')(g)\r\n\t# define model\r\n\tmodel = Model(in_latent, out_image)\r\n\t# store model\r\n\tmodel_list.append([model, model])\r\n\t# create submodels\r\n\tfor i in range(1, n_blocks):\r\n\t\t# get prior model without the fade-in\r\n\t\told_model = model_list[i - 1][0]\r\n\t\t# create new model for 
next resolution\r\n\t\tmodels = add_generator_block(old_model)\r\n\t\t# store model\r\n\t\tmodel_list.append(models)\r\n\treturn model_list\r\n\r\n# define composite models for training generators via discriminators\r\ndef define_composite(discriminators, generators):\r\n\tmodel_list = list()\r\n\t# create composite models\r\n\tfor i in range(len(discriminators)):\r\n\t\tg_models, d_models = generators[i], discriminators[i]\r\n\t\t# straight-through model\r\n\t\td_models[0].trainable = False\r\n\t\tmodel1 = Sequential()\r\n\t\tmodel1.add(g_models[0])\r\n\t\tmodel1.add(d_models[0])\r\n\t\tmodel1.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# fade-in model\r\n\t\td_models[1].trainable = False\r\n\t\tmodel2 = Sequential()\r\n\t\tmodel2.add(g_models[1])\r\n\t\tmodel2.add(d_models[1])\r\n\t\tmodel2.compile(loss='mse', optimizer=Adam(lr=0.001, beta_1=0, beta_2=0.99, epsilon=10e-8))\r\n\t\t# store\r\n\t\tmodel_list.append([model1, model2])\r\n\treturn model_list\r\n\r\n# define discriminator models\r\ndiscriminators = define_discriminator(3)\r\n# define generator models\r\ngenerators = define_generator(100, 3)\r\n# define composite models\r\ncomposite = define_composite(discriminators, generators)<\/pre>\n<p>Now that we know how to define all of the models, we can review how the models might be updated during training.<\/p>\n<h2>How to Train Discriminator and Generator Models<\/h2>\n<p>Pre-defining the generator, discriminator, and composite models was the hard part; training the models is straightforward and much like training any other GAN.<\/p>\n<p>Importantly, in each training iteration the alpha variable in each <em>WeightedSum<\/em> layer must be set to a new value. This must be set for the layer in both the generator and discriminator models and allows for the smooth linear transition from the old model layers to the new model layers, e.g. 
with the alpha value increased linearly from 0 to 1 over a fixed number of training iterations.<\/p>\n<p>The <em>update_fadein()<\/em> function below implements this: it loops through a list of models and sets the alpha value of each <em>WeightedSum<\/em> layer based on the current step within a given total number of training steps. You may be able to implement this more elegantly using a callback.<\/p>\n<pre class=\"crayon-plain-tag\"># update the alpha value on each instance of WeightedSum\r\ndef update_fadein(models, step, n_steps):\r\n\t# calculate current alpha (linear from 0 to 1)\r\n\talpha = step \/ float(n_steps - 1)\r\n\t# update the alpha for each model\r\n\tfor model in models:\r\n\t\tfor layer in model.layers:\r\n\t\t\tif isinstance(layer, WeightedSum):\r\n\t\t\t\tbackend.set_value(layer.alpha, alpha)<\/pre>\n<p>We can define a generic function for training a given generator, discriminator, and composite model for a given number of training epochs.<\/p>\n<p>The <em>train_epochs()<\/em> function below implements this: first the discriminator model is updated on real and fake images, then the generator model is updated, and the process is repeated for the required number of training iterations based on the dataset size and the number of epochs.<\/p>\n<p>This function calls helper functions for retrieving a batch of real images via <em>generate_real_samples()<\/em>, generating a batch of fake samples with the generator via <em>generate_fake_samples()<\/em>, and generating a sample of points in latent space via <em>generate_latent_points()<\/em>. 
You can define these functions yourself quite trivially.<\/p>\n<pre class=\"crayon-plain-tag\"># train a generator and discriminator\r\ndef train_epochs(g_model, d_model, gan_model, dataset, n_epochs, n_batch, fadein=False):\r\n\t# calculate the number of batches per training epoch\r\n\tbat_per_epo = int(dataset.shape[0] \/ n_batch)\r\n\t# calculate the number of training iterations\r\n\tn_steps = bat_per_epo * n_epochs\r\n\t# calculate the size of half a batch of samples\r\n\thalf_batch = int(n_batch \/ 2)\r\n\t# manually enumerate training iterations\r\n\tfor i in range(n_steps):\r\n\t\t# update alpha for all WeightedSum layers when fading in new blocks\r\n\t\tif fadein:\r\n\t\t\tupdate_fadein([g_model, d_model, gan_model], i, n_steps)\r\n\t\t# prepare real and fake samples (latent_dim is defined in the enclosing script)\r\n\t\tX_real, y_real = generate_real_samples(dataset, half_batch)\r\n\t\tX_fake, y_fake = generate_fake_samples(g_model, latent_dim, half_batch)\r\n\t\t# update discriminator model\r\n\t\td_loss1 = d_model.train_on_batch(X_real, y_real)\r\n\t\td_loss2 = d_model.train_on_batch(X_fake, y_fake)\r\n\t\t# update the generator via the discriminator's error\r\n\t\tz_input = generate_latent_points(latent_dim, n_batch)\r\n\t\ty_real2 = ones((n_batch, 1))\r\n\t\tg_loss = gan_model.train_on_batch(z_input, y_real2)\r\n\t\t# summarize loss on this batch\r\n\t\tprint('>%d, d1=%.3f, d2=%.3f g=%.3f' % (i+1, d_loss1, d_loss2, g_loss))<\/pre>\n<p>The images must be scaled to the size of each model. 
If the images are in-memory, we can define a simple <em>scale_dataset()<\/em> function to scale the loaded images.<\/p>\n<p>In this case, we are using the <a href=\"https:\/\/scikit-image.org\/docs\/dev\/api\/skimage.transform.html#skimage.transform.resize\">skimage.transform.resize<\/a> function from the <a href=\"https:\/\/scikit-image.org\/\">scikit-image library<\/a> to resize the NumPy array of pixels to the required size, using nearest neighbor interpolation.<\/p>\n<pre class=\"crayon-plain-tag\">from numpy import asarray\r\nfrom skimage.transform import resize\r\n\r\n# scale images to preferred size\r\ndef scale_dataset(images, new_shape):\r\n\timages_list = list()\r\n\tfor image in images:\r\n\t\t# resize with nearest neighbor interpolation (order=0)\r\n\t\tnew_image = resize(image, new_shape, 0)\r\n\t\t# store\r\n\t\timages_list.append(new_image)\r\n\treturn asarray(images_list)<\/pre>\n<p>First, the baseline model must be fit for a given number of training epochs, e.g. the model that outputs 4\u00d74 sized images.<\/p>\n<p>This will require that the loaded images be scaled to the required size defined by the shape of the generator model's output layer.<\/p>\n<pre class=\"crayon-plain-tag\"># fit the baseline model\r\ng_normal, d_normal, gan_normal = g_models[0][0], d_models[0][0], gan_models[0][0]\r\n# scale dataset to appropriate size\r\ngen_shape = g_normal.output_shape\r\nscaled_data = scale_dataset(dataset, gen_shape[1:])\r\nprint('Scaled Data', scaled_data.shape)\r\n# train normal or straight-through models\r\ntrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm, n_batch)<\/pre>\n<p>We can then process each level of growth, e.g. 
the first being 8\u00d78.<\/p>\n<p>This involves first retrieving the models, scaling the data to the appropriate size, then fitting the fade-in model, followed by training the straight-through version of the model for fine-tuning.<\/p>\n<p>We can repeat this for each level of growth in a loop.<\/p>\n<pre class=\"crayon-plain-tag\"># process each level of growth\r\nfor i in range(1, len(g_models)):\r\n\t# retrieve models for this level of growth\r\n\t[g_normal, g_fadein] = g_models[i]\r\n\t[d_normal, d_fadein] = d_models[i]\r\n\t[gan_normal, gan_fadein] = gan_models[i]\r\n\t# scale dataset to appropriate size\r\n\tgen_shape = g_normal.output_shape\r\n\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\tprint('Scaled Data', scaled_data.shape)\r\n\t# train fade-in models for next level of growth\r\n\ttrain_epochs(g_fadein, d_fadein, gan_fadein, scaled_data, e_fadein, n_batch, True)\r\n\t# train normal or straight-through models\r\n\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm, n_batch)<\/pre>\n<p>We can tie this together and define a function called <em>train()<\/em> to train the progressive growing GAN.<\/p>\n<pre class=\"crayon-plain-tag\"># train the generator and discriminator\r\ndef train(g_models, d_models, gan_models, dataset, latent_dim, e_norm, e_fadein, n_batch):\r\n\t# fit the baseline model\r\n\tg_normal, d_normal, gan_normal = g_models[0][0], d_models[0][0], gan_models[0][0]\r\n\t# scale dataset to appropriate size\r\n\tgen_shape = g_normal.output_shape\r\n\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\tprint('Scaled Data', scaled_data.shape)\r\n\t# train normal or straight-through models\r\n\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm, n_batch)\r\n\t# process each level of growth\r\n\tfor i in range(1, len(g_models)):\r\n\t\t# retrieve models for this level of growth\r\n\t\t[g_normal, g_fadein] = g_models[i]\r\n\t\t[d_normal, d_fadein] = d_models[i]\r\n\t\t[gan_normal, gan_fadein] = 
gan_models[i]\r\n\t\t# scale dataset to appropriate size\r\n\t\tgen_shape = g_normal.output_shape\r\n\t\tscaled_data = scale_dataset(dataset, gen_shape[1:])\r\n\t\tprint('Scaled Data', scaled_data.shape)\r\n\t\t# train fade-in models for next level of growth\r\n\t\ttrain_epochs(g_fadein, d_fadein, gan_fadein, scaled_data, e_fadein, n_batch, True)\r\n\t\t# train normal or straight-through models\r\n\t\ttrain_epochs(g_normal, d_normal, gan_normal, scaled_data, e_norm, n_batch)<\/pre>\n<p>The number of epochs for the normal phase is defined by the <em>e_norm<\/em> argument and the number of epochs during the fade-in phase is defined by the <em>e_fadein<\/em> argument.<\/p>\n<p>The number of epochs must be chosen based on the size of the image dataset; the same number of epochs can be used for each phase, as was done in the paper.<\/p>\n<blockquote>\n<p>We start with 4\u00d74 resolution and train the networks until we have shown the discriminator 800k real images in total. We then alternate between two phases: fade in the first 3-layer block during the next 800k images, stabilize the networks for 800k images, fade in the next 3-layer block during 800k images, etc.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/p>\n<p>We can then define our models as we did in the previous section, then call the training function.<\/p>\n<pre class=\"crayon-plain-tag\"># number of growth phases, e.g. 
3 = 16x16 images\r\nn_blocks = 3\r\n# size of the latent space\r\nlatent_dim = 100\r\n# define discriminator models\r\nd_models = define_discriminator(n_blocks)\r\n# define generator models\r\ng_models = define_generator(latent_dim, n_blocks)\r\n# define composite models\r\ngan_models = define_composite(d_models, g_models)\r\n# load image data\r\ndataset = load_real_samples()\r\n# train model\r\ntrain(g_models, d_models, gan_models, dataset, latent_dim, 100, 100, 16)<\/pre>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Official<\/h3>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1710.10196\">Progressive Growing of GANs for Improved Quality, Stability, and Variation<\/a>, 2017.<\/li>\n<li><a href=\"https:\/\/research.nvidia.com\/publication\/2017-10_Progressive-Growing-of\">Progressive Growing of GANs for Improved Quality, Stability, and Variation, Official<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/tkarras\/progressive_growing_of_gans\">progressive_growing_of_gans Project (official), GitHub<\/a>.<\/li>\n<li><a href=\"https:\/\/openreview.net\/forum?id=Hk99zCeAb&#038;noteId=Hk99zCeAb\">Progressive Growing of GANs for Improved Quality, Stability, and Variation. 
Open Review<\/a>.<\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=G06dEcZ-QTg\">Progressive Growing of GANs for Improved Quality, Stability, and Variation, YouTube<\/a>.<\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=ReZiqCybQPA\">Progressive growing of GANs for improved quality, stability and variation, KeyNote, YouTube<\/a>.<\/li>\n<\/ul>\n<h3>API<\/h3>\n<ul>\n<li><a href=\"https:\/\/keras.io\/datasets\/\">Keras Datasets API<\/a>.<\/li>\n<li><a href=\"https:\/\/keras.io\/models\/sequential\/\">Keras Sequential Model API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/layers\/convolutional\/\">Keras Convolutional Layers API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/getting-started\/faq\/#how-can-i-freeze-keras-layers\">How can I \u201cfreeze\u201d Keras layers?<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/keras-team\/keras-contrib\">Keras Contrib Project<\/a><\/li>\n<li><a href=\"https:\/\/scikit-image.org\/docs\/dev\/api\/skimage.transform.html#skimage.transform.resize\">skimage.transform.resize API<\/a><\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/github.com\/MSC-BUAA\/Keras-progressive_growing_of_gans\">Keras-progressive_growing_of_gans Project, GitHub<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/PacktPublishing\/Hands-On-Generative-Adversarial-Networks-with-Keras\">Hands-On-Generative-Adversarial-Networks-with-Keras Project, GitHub<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to develop progressive growing generative adversarial network models from scratch with Keras.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to develop pre-defined discriminator and generator models at each level of output image growth.<\/li>\n<li>How to define composite models for training the generator models via the discriminator models.<\/li>\n<li>How to cycle the training of fade-in version and normal versions of models at each level of output image growth.<\/li>\n<\/ul>\n<p>Do you have any 
questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/how-to-implement-progressive-growing-gan-models-in-keras\/\">How to Implement Progressive Growing GAN Models in Keras<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/how-to-implement-progressive-growing-gan-models-in-keras\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee The progressive growing generative adversarial network is an approach for training a deep convolutional neural network model for generating synthetic images. It [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/13\/how-to-implement-progressive-growing-gan-models-in-keras\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":2461,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2460"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2460"}],"version-history":[{"count":0,"href":"https:\/\/w
ww.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2460\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/2461"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2460"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2460"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2460"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}