{"id":2068,"date":"2019-04-28T19:00:08","date_gmt":"2019-04-28T19:00:08","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/04\/28\/a-gentle-introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/"},"modified":"2019-04-28T19:00:08","modified_gmt":"2019-04-28T19:00:08","slug":"a-gentle-introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/04\/28\/a-gentle-introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/","title":{"rendered":"A Gentle Introduction to 1\u00d71 Convolutions to Reduce the Complexity of Convolutional Neural Networks"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>Pooling can be used to down sample the content of feature maps, reducing their width and height whilst maintaining their salient features.<\/p>\n<p>A problem with deep convolutional neural networks is that the number of feature maps often increases with the depth of the network. This problem can result in a dramatic increase in the number of parameters and computation required when larger filter sizes are used, such as 5\u00d75 and 7\u00d77.<\/p>\n<p>To address this problem, a 1\u00d71 convolutional layer can be used that offers a channel-wise pooling, often called feature map pooling or a projection layer. This simple technique can be used for dimensionality reduction, decreasing the number of feature maps whilst retaining their salient features. 
It can also be used directly to create a one-to-one projection of the feature maps to pool features across channels or to increase the number of feature maps, such as after traditional pooling layers.<\/p>\n<p>In this tutorial, you will discover how to use 1\u00d71 filters to control the number of feature maps in a convolutional neural network.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>The 1\u00d71 filter can be used to create a linear projection of a stack of feature maps.<\/li>\n<li>The projection created by a 1\u00d71 can act like channel-wise pooling and be used for dimensionality reduction.<\/li>\n<li>The projection created by a 1\u00d71 can also be used directly or be used to increase the number of feature maps in a model.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_7500\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7500\" class=\"size-full wp-image-7500\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/04\/A-Gentle-Introduction-to-1x1-Convolutions-to-Reduce-the-Complexity-of-Convolutional-Neural-Networks.jpg\" alt=\"A Gentle Introduction to 1x1 Convolutions to Reduce the Complexity of Convolutional Neural Networks\" width=\"640\" height=\"480\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/04\/A-Gentle-Introduction-to-1x1-Convolutions-to-Reduce-the-Complexity-of-Convolutional-Neural-Networks.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/04\/A-Gentle-Introduction-to-1x1-Convolutions-to-Reduce-the-Complexity-of-Convolutional-Neural-Networks-300x225.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p id=\"caption-attachment-7500\" class=\"wp-caption-text\">A Gentle Introduction to 1\u00d71 Convolutions to Reduce the Complexity of Convolutional Neural Networks<br \/><a 
href=\"https:\/\/www.flickr.com\/photos\/98703782@N04\/9278049442\/\">Photo copyright<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into five parts; they are:<\/p>\n<ol>\n<li>Convolutions Over Channels<\/li>\n<li>Problem of Too Many Feature Maps<\/li>\n<li>Downsample Feature Maps With 1\u00d71 Filters<\/li>\n<li>Examples of How to Use 1\u00d71 Convolutions<\/li>\n<li>Examples of 1\u00d71 Filters in CNN Model Architectures<\/li>\n<\/ol>\n<h2>Convolutions Over Channels<\/h2>\n<p>Recall that a convolutional operation is a linear application of a smaller filter to a larger input that results in an output feature map.<\/p>\n<p>A filter applied to an input image or input feature map always results in a single number. The systematic left-to-right and top-to-bottom application of the filter to the input results in a two-dimensional feature map. One filter creates one corresponding feature map.<\/p>\n<p>A filter must have the same depth or number of channels as the input, yet, regardless of the depth of the input and the filter, the resulting output is a single number and one filter creates a feature map with a single channel.<\/p>\n<p>Let\u2019s make this concrete with some examples:<\/p>\n<ul>\n<li>If the input has one channel such as a grayscale image, then a 3\u00d73 filter will be applied in 3x3x1 blocks.<\/li>\n<li>If the input image has three channels for red, green, and blue, then a 3\u00d73 filter will be applied in 3x3x3 blocks.<\/li>\n<li>If the input is a block of feature maps from another convolutional or pooling layer and has the depth of 64, then the 3\u00d73 filter will be applied in 3x3x64 blocks to create the single values to make up the single output feature map.<\/li>\n<\/ul>\n<p>The depth of the output of one convolutional layer is only defined by the number of parallel filters applied to the input.<\/p>\n<h2>Problem of Too Many Feature Maps<\/h2>\n<p>The depth of the input or number of filters used in convolutional layers often increases with the depth of the network, resulting in an increase in the number of resulting feature maps. It is a common model design pattern.<\/p>\n<p>Further, some network architectures, such as the inception architecture, may also concatenate the output feature maps from multiple convolutional layers, which may also dramatically increase the depth of the input to subsequent convolutional layers.<\/p>\n<p>A large number of feature maps in a convolutional neural network can cause a problem as a convolutional operation must be performed down through the depth of the input. 
This is a particular problem when the filters being applied are relatively large, such as 5\u00d75 or 7\u00d77, as it can result in considerably more parameters (weights) and, in turn, more computation to perform the convolutional operations (large space and time complexity).<\/p>\n<p>Pooling layers are designed to downscale feature maps, systematically halving the width and height of feature maps in the network. Nevertheless, pooling layers do not change the depth of the feature maps, that is, the number of channels.<\/p>\n<p>Deep convolutional neural networks therefore require a corresponding pooling-like layer that can downsample, or reduce, the depth or number of feature maps.<\/p>\n<h2>Downsample Feature Maps With 1\u00d71 Filters<\/h2>\n<p>The solution is to use a 1\u00d71 filter to downsample the depth or number of feature maps.<\/p>\n<p>A 1\u00d71 filter has only a single parameter or weight for each channel in the input, and, like the application of any filter, it results in a single output value. This structure allows the 1\u00d71 filter to act like a single neuron with an input from the same position across each of the feature maps in the input. This single neuron can then be applied systematically with a <a href=\"https:\/\/machinelearningmastery.com\/padding-and-stride-for-convolutional-neural-networks\/\">stride of one<\/a>, left-to-right and top-to-bottom, without any need for padding, resulting in a feature map with the same width and height as the input.<\/p>\n<p>The 1\u00d71 filter is so simple that it does not involve any neighboring pixels in the input; it is arguably not a convolutional operation at all. Instead, it is a linear weighting, or projection, of the input. Further, as with other convolutional layers, a nonlinearity is applied, allowing the projection to perform non-trivial computation on the input feature maps.<\/p>\n<p>This simple 1\u00d71 filter provides a way to usefully summarize the input feature maps. 
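To make this projection idea concrete, here is a minimal sketch in plain Python (illustrative only, not one of the tutorial's Keras examples): a 1x1 convolution with a stride of one is just the same weighted sum over channels, applied independently at every pixel position.

```python
import random

# A 1x1 convolution is a per-pixel linear projection across channels:
# each spatial position is handled independently, and the filter weights
# simply mix the c_in input channels into c_out output channels.
h, w, c_in, c_out = 4, 4, 8, 3  # small, arbitrary example sizes
x = [[[random.random() for _ in range(c_in)] for _ in range(w)] for _ in range(h)]
weights = [[random.random() for _ in range(c_out)] for _ in range(c_in)]

def conv1x1(x, weights):
    # At every (row, col), compute a weighted sum over the input channels
    # for each output channel -- no neighboring pixels are involved.
    return [[[sum(pixel[i] * weights[i][o] for i in range(len(weights)))
              for o in range(len(weights[0]))]
             for pixel in row]
            for row in x]

out = conv1x1(x, weights)
print(len(out), len(out[0]), len(out[0][0]))  # 4 4 3: same width and height, new depth
```

Note that the width and height are untouched; only the depth changes, here from 8 channels down to 3.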
The use of multiple 1\u00d71 filters, in turn, allows the number of such summaries to be tuned, effectively allowing the depth of the feature maps to be increased or decreased as needed.<\/p>\n<p>A convolutional layer with a 1\u00d71 filter can, therefore, be used at any point in a convolutional neural network to control the number of feature maps. As such, it is often referred to as a projection operation or projection layer, or even a feature map or channel pooling layer.<\/p>\n<p>Now that we know that we can control the number of feature maps with 1\u00d71 filters, let\u2019s make it concrete with some examples.<\/p>\n<h2>Examples of How to Use 1\u00d71 Convolutions<\/h2>\n<p>Consider that we have a convolutional neural network that expects color image input with the square shape of 256x256x3 pixels.<\/p>\n<p>These images then pass through a first hidden layer with 512 filters, each of size 3\u00d73 with same padding, followed by a ReLU activation function.<\/p>\n<p>The example below demonstrates this simple model.<\/p>\n<pre class=\"crayon-plain-tag\"># example of simple cnn model\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Conv2D\r\n# create model\r\nmodel = Sequential()\r\nmodel.add(Conv2D(512, (3,3), padding='same', activation='relu', input_shape=(256, 256, 3)))\r\n# summarize model\r\nmodel.summary()<\/pre>\n<p>Running the example creates the model and summarizes the model architecture.<\/p>\n<p>There are no surprises; the output of the first hidden layer is a block of feature maps with the three-dimensional shape of 256x256x512.<\/p>\n<pre class=\"crayon-plain-tag\">_________________________________________________________________\r\nLayer (type)                 Output Shape              Param #\r\n=================================================================\r\nconv2d_1 (Conv2D)            (None, 256, 256, 512)     14336\r\n=================================================================\r\nTotal params: 14,336\r\nTrainable params: 14,336\r\nNon-trainable params: 0\r\n_________________________________________________________________<\/pre>\n<h3>Example of Projecting Feature Maps<\/h3>\n<p>A 1\u00d71 filter can be used to create a projection of the feature maps.<\/p>\n<p>The number of feature maps created will be the same, and the effect may be a refinement of the features already extracted. This is often called channel-wise pooling, as opposed to traditional feature-wise pooling on each channel. It can be implemented as follows:<\/p>\n<pre class=\"crayon-plain-tag\">model.add(Conv2D(512, (1,1), activation='relu'))<\/pre>\n<p>We can see that we use the same number of feature maps and still follow the application of the filter with a rectified linear activation function.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of a 1x1 filter for projection\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Conv2D\r\n# create model\r\nmodel = Sequential()\r\nmodel.add(Conv2D(512, (3,3), padding='same', activation='relu', input_shape=(256, 256, 3)))\r\nmodel.add(Conv2D(512, (1,1), activation='relu'))\r\n# summarize model\r\nmodel.summary()<\/pre>\n<p>Running the example creates the model and summarizes the architecture.<\/p>\n<p>We can see that no change is made to the width or height of the feature maps, and by design, the number of feature maps is kept constant with a simple projection operation applied.<\/p>\n<pre class=\"crayon-plain-tag\">_________________________________________________________________\r\nLayer (type)                 Output Shape              Param #\r\n=================================================================\r\nconv2d_1 (Conv2D)            (None, 256, 256, 512)     14336\r\n_________________________________________________________________\r\nconv2d_2 (Conv2D)            (None, 256, 256, 512)     262656\r\n=================================================================\r\nTotal params: 276,992\r\nTrainable params: 276,992\r\nNon-trainable params: 0\r\n_________________________________________________________________<\/pre>\n<h3>Example of Decreasing Feature Maps<\/h3>\n<p>The 1\u00d71 filter can be used to decrease the number of feature maps.<\/p>\n<p>This is the most common application of this type of filter and, in this way, the layer is often called a feature map pooling layer.<\/p>\n<p>In this example, we can decrease the depth (or channels) from 512 to 64. This might be useful if the subsequent layer we were going to add to our model would be another convolutional layer with 7\u00d77 filters. These filters would only be applied at a depth of 64 rather than 512.<\/p>\n<pre class=\"crayon-plain-tag\">model.add(Conv2D(64, (1,1), activation='relu'))<\/pre>\n<p>The composition of the 64 feature maps is not the same as the original 512, but is a useful summary, a dimensionality reduction that captures the salient features, such that the 7\u00d77 operation may have a similar effect on the 64 feature maps as it might have on the original 512.<\/p>\n<p>Further, a 7\u00d77 convolutional layer with 64 filters itself applied to the 512 feature maps output by the first hidden layer would result in approximately 1.6 million parameters (weights). 
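These parameter counts can be verified with quick arithmetic (counting weights only and ignoring bias terms; a conv layer has filter_height x filter_width x input_channels x filters weights):

```python
# weights in a conv layer = filter_h * filter_w * input_channels * filters
direct_7x7 = 7 * 7 * 512 * 64    # 7x7 layer applied straight to 512 feature maps
projection = 1 * 1 * 512 * 64    # 1x1 layer reducing 512 feature maps to 64
reduced_7x7 = 7 * 7 * 64 * 64    # 7x7 layer applied to the reduced 64 feature maps

print(direct_7x7)                # 1605632
print(projection + reduced_7x7)  # 233472
```

Even counting the extra 1x1 projection layer, the bottlenecked pair uses roughly one seventh of the weights of the direct 7x7 layer.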
If the 1\u00d71 filter is used to reduce the number of feature maps to 64 first, then the number of parameters required for the 7\u00d77 layer is only approximately 200,000, an enormous difference.<\/p>\n<p>The complete example of using a 1\u00d71 filter for dimensionality reduction is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of a 1x1 filter for dimensionality reduction\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Conv2D\r\n# create model\r\nmodel = Sequential()\r\nmodel.add(Conv2D(512, (3,3), padding='same', activation='relu', input_shape=(256, 256, 3)))\r\nmodel.add(Conv2D(64, (1,1), activation='relu'))\r\n# summarize model\r\nmodel.summary()<\/pre>\n<p>Running the example creates the model and summarizes its structure.<\/p>\n<p>We can see that the width and height of the feature maps are unchanged, yet the number of feature maps was reduced from 512 to 64.<\/p>\n<pre class=\"crayon-plain-tag\">_________________________________________________________________\r\nLayer (type)                 Output Shape              Param #\r\n=================================================================\r\nconv2d_1 (Conv2D)            (None, 256, 256, 512)     14336\r\n_________________________________________________________________\r\nconv2d_2 (Conv2D)            (None, 256, 256, 64)      32832\r\n=================================================================\r\nTotal params: 47,168\r\nTrainable params: 47,168\r\nNon-trainable params: 0\r\n_________________________________________________________________<\/pre>\n<\/p>\n<h3>Example of Increasing Feature Maps<\/h3>\n<p>The 1\u00d71 filter can be used to increase the number of feature maps.<\/p>\n<p>This is a common operation used after a pooling layer prior to applying another convolutional layer.<\/p>\n<p>The projection effect of the filter can be applied as many times as needed to the input, allowing the number of feature maps to be scaled up and yet have a composition that 
captures the salient features of the original.<\/p>\n<p>We can double the number of feature maps, from the 512 output by the first hidden layer to 1,024 feature maps.<\/p>\n<pre class=\"crayon-plain-tag\">model.add(Conv2D(1024, (1,1), activation='relu'))<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of a 1x1 filter to increase dimensionality\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Conv2D\r\n# create model\r\nmodel = Sequential()\r\nmodel.add(Conv2D(512, (3,3), padding='same', activation='relu', input_shape=(256, 256, 3)))\r\nmodel.add(Conv2D(1024, (1,1), activation='relu'))\r\n# summarize model\r\nmodel.summary()<\/pre>\n<p>Running the example creates the model and summarizes its structure.<\/p>\n<p>We can see that the width and height of the feature maps are unchanged and that the number of feature maps was doubled from 512 to 1,024.<\/p>\n<pre class=\"crayon-plain-tag\">_________________________________________________________________\r\nLayer (type)                 Output Shape              Param #\r\n=================================================================\r\nconv2d_1 (Conv2D)            (None, 256, 256, 512)     14336\r\n_________________________________________________________________\r\nconv2d_2 (Conv2D)            (None, 256, 256, 1024)    525312\r\n=================================================================\r\nTotal params: 539,648\r\nTrainable params: 539,648\r\nNon-trainable params: 0\r\n_________________________________________________________________<\/pre>\n<p>Now that we are familiar with how to use 1\u00d71 filters, let\u2019s look at some examples where they have been used in the architecture of convolutional neural network models.<\/p>\n<h2>Examples of 1\u00d71 Filters in CNN Model Architectures<\/h2>\n<p>In this section, we will highlight some important examples where 1\u00d71 filters have been used as key elements in 
modern convolutional neural network model architectures.<\/p>\n<h3>Network in Network<\/h3>\n<p>The 1\u00d71 filter was perhaps first described and popularized by Min Lin, et al. in their 2013 paper titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1312.4400\">Network In Network<\/a>.\u201d<\/p>\n<p>In the paper, the authors propose an MLP convolutional layer and cross-channel pooling to promote learning across channels.<\/p>\n<blockquote>\n<p>This cascaded cross channel parametric pooling structure allows complex and learnable interactions of cross channel information.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1312.4400\">Network In Network<\/a>, 2013.<\/p>\n<p>They describe a 1\u00d71 convolutional layer as a specific implementation of cross-channel parametric pooling, which, in effect, is exactly what a 1\u00d71 filter achieves.<\/p>\n<blockquote>\n<p>Each pooling layer performs weighted linear recombination on the input feature maps, which then go through a rectifier linear unit. [\u2026] The cross channel parametric pooling layer is also equivalent to a convolution layer with 1\u00d71 convolution kernel.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1312.4400\">Network In Network<\/a>, 2013.<\/p>\n<h3>Inception Architecture<\/h3>\n<p>The 1\u00d71 filter was used explicitly for dimensionality reduction and for increasing the dimensionality of feature maps after pooling in the design of the inception module, used in the GoogLeNet model by Christian Szegedy, et al. 
in their 2014 paper titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1409.4842\">Going Deeper with Convolutions<\/a>.\u201d<\/p>\n<p>The paper describes an \u201c<em>inception module<\/em>\u201d where an input block of feature maps is processed in parallel by different convolutional layers, each with differently sized filters, where a 1\u00d71 filter is one of the layers used.<\/p>\n<div id=\"attachment_7497\" style=\"width: 788px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7497\" class=\"size-full wp-image-7497\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Naive-Inception-Module-1.png\" alt=\"Example of the Naive Inception Module\" width=\"778\" height=\"382\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Naive-Inception-Module-1.png 778w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Naive-Inception-Module-1-300x147.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Naive-Inception-Module-1-768x377.png 768w\" sizes=\"(max-width: 778px) 100vw, 778px\"><\/p>\n<p id=\"caption-attachment-7497\" class=\"wp-caption-text\">Example of the Naive Inception Module<br \/>Taken from Going Deeper with Convolutions, 2014.<\/p>\n<\/div>\n<p>The outputs of the parallel layers are then stacked, channel-wise, resulting in very deep stacks of feature maps to be processed by subsequent inception modules.<\/p>\n<blockquote>\n<p>The merging of the output of the pooling layer with the outputs of convolutional layers would lead to an inevitable increase in the number of outputs from stage to stage. 
Even while this architecture might cover the optimal sparse structure, it would do it very inefficiently, leading to a computational blow up within a few stages.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1409.4842\">Going Deeper with Convolutions<\/a>, 2014.<\/p>\n<p>The inception module is then redesigned to use 1\u00d71 filters to reduce the number of feature maps prior to the parallel convolutional layers with 3\u00d73 and 5\u00d75 sized filters.<\/p>\n<blockquote>\n<p>This leads to the second idea of the proposed architecture: judiciously applying dimension reductions and projections wherever the computational requirements would increase too much otherwise. [\u2026] That is, 1\u00d71 convolutions are used to compute reductions before the expensive 3\u00d73 and 5\u00d75 convolutions. Besides being used as reductions, they also include the use of rectified linear activation which makes them dual-purpose<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1409.4842\">Going Deeper with Convolutions<\/a>, 2014.<\/p>\n<p>The 1\u00d71 filter is also used to increase the number of feature maps after pooling, artificially creating more projections of the downsampled feature map content.<\/p>\n<div id=\"attachment_7498\" style=\"width: 780px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7498\" class=\"size-full wp-image-7498\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Inception-Module-with-Dimensionality-Reduction.png\" alt=\"Example of the Inception Module with Dimensionality Reduction\" width=\"770\" height=\"422\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Inception-Module-with-Dimensionality-Reduction.png 770w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Inception-Module-with-Dimensionality-Reduction-300x164.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-the-Inception-Module-with-Dimensionality-Reduction-768x421.png 768w\" sizes=\"(max-width: 770px) 100vw, 770px\"><\/p>\n<p id=\"caption-attachment-7498\" class=\"wp-caption-text\">Example of the Inception Module With Dimensionality Reduction<br \/>Taken from Going Deeper with Convolutions, 2014.<\/p>\n<\/div>\n<h3>Residual Architecture<\/h3>\n<p>The 1\u00d71 filter was used as a projection technique to match the depth of the input to the output of residual modules in the design of the residual network by Kaiming He, et al. in their 2015 paper titled \u201c<a href=\"https:\/\/arxiv.org\/abs\/1512.03385\">Deep Residual Learning for Image Recognition<\/a>.\u201d<\/p>\n<p>The authors describe an architecture composed of \u201c<em>residual modules<\/em>\u201d where the input to a module is added to the output of the module in what is referred to as a shortcut connection.<\/p>\n<p>Because the input is added to the output of the module, the dimensionality must match in terms of width, height, and depth. Width and height can be maintained via padding, while a 1\u00d71 filter is used to change the depth of the input as needed so that it can be added to the output of the module. 
This type of connection is referred to as a projection shortcut connection.<\/p>\n<p>Further, the residual modules use a bottleneck design with 1\u00d71 filters to reduce the number of feature maps for computational efficiency reasons.<\/p>\n<blockquote>\n<p>The three layers are 1\u00d71, 3\u00d73, and 1\u00d71 convolutions, where the 1\u00d71 layers are responsible for reducing and then increasing (restoring) dimensions, leaving the 3\u00d73 layer a bottleneck with smaller input\/output dimensions.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/arxiv.org\/abs\/1512.03385\">Deep Residual Learning for Image Recognition<\/a>, 2015.<\/p>\n<div id=\"attachment_7499\" style=\"width: 734px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7499\" class=\"size-full wp-image-7499\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/02\/Example-of-a-Normal-and-Bottleneck-Residual-Modules-with-Shortcut-Connections.png\" alt=\"Example of a Normal and Bottleneck Residual Modules with Shortcut Connections\" width=\"724\" height=\"300\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-a-Normal-and-Bottleneck-Residual-Modules-with-Shortcut-Connections.png 724w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/02\/Example-of-a-Normal-and-Bottleneck-Residual-Modules-with-Shortcut-Connections-300x124.png 300w\" sizes=\"(max-width: 724px) 100vw, 724px\"><\/p>\n<p id=\"caption-attachment-7499\" class=\"wp-caption-text\">Example of a Normal and Bottleneck Residual Modules With Shortcut Connections<br \/>Taken from Deep Residual Learning for Image Recognition, 2015.<\/p>\n<\/div>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Papers<\/h3>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1312.4400\">Network In Network<\/a>, 
2013.<\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1409.4842\">Going Deeper with Convolutions<\/a>, 2014.<\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1512.03385\">Deep Residual Learning for Image Recognition<\/a>, 2015.<\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/iamaaditya.github.io\/2016\/03\/one-by-one-convolution\/\">One by One [ 1 x 1 ] Convolution \u2013 counter-intuitively useful<\/a>, 2016.<\/li>\n<li><a href=\"https:\/\/www.facebook.com\/yann.lecun\/posts\/10152820758292143\">Yann LeCun on No Fully Connected Layers in CNN<\/a>, 2015.<\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=c1RBQzKsDCk\">Networks in Networks and 1\u00d71 Convolutions, YouTube<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to use 1\u00d71 filters to control the number of feature maps in a convolutional neural network.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>The 1\u00d71 filter can be used to create a linear projection of a stack of feature maps.<\/li>\n<li>The projection created by a 1\u00d71 can act like channel-wise pooling and be used for dimensionality reduction.<\/li>\n<li>The projection created by a 1\u00d71 can also be used directly or be used to increase the number of feature maps in a model.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/\">A Gentle Introduction to 1\u00d71 Convolutions to Reduce the Complexity of Convolutional Neural Networks<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/\">Go to 
Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Pooling can be used to down sample the content of feature maps, reducing their width and height whilst maintaining their salient features. [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/04\/28\/a-gentle-introduction-to-1x1-convolutions-to-reduce-the-complexity-of-convolutional-neural-networks\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":2069,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2068"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2068"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2068\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/2069"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2068"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2068"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-j
son\/wp\/v2\/tags?post=2068"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}