{"id":1282,"date":"2018-11-11T18:00:29","date_gmt":"2018-11-11T18:00:29","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/11\/11\/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting\/"},"modified":"2018-11-11T18:00:29","modified_gmt":"2018-11-11T18:00:29","slug":"how-to-develop-convolutional-neural-network-models-for-time-series-forecasting","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/11\/11\/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting\/","title":{"rendered":"How to Develop Convolutional Neural Network Models for Time Series Forecasting"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>Convolutional Neural Network models, or CNNs for short, can be applied to time series forecasting.<\/p>\n<p>There are many types of CNN models that can be used for each specific type of time series forecasting problem.<\/p>\n<p>In this tutorial, you will discover how to develop a suite of CNN models for a range of standard time series forecasting problems.<\/p>\n<p>The objective of this tutorial is to provide standalone examples of each model on each type of time series problem as a template that you can copy and adapt for your specific time series forecasting problem.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to develop CNN models for univariate time series forecasting.<\/li>\n<li>How to develop CNN models for multivariate time series forecasting.<\/li>\n<li>How to develop CNN models for multi-step time series forecasting.<\/li>\n<\/ul>\n<p>This is a large and important post; you may want to bookmark it for future reference.<\/p>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_6433\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6433\" 
src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/11\/How-to-Develop-Convolutional-Neural-Network-Models-for-Time-Series-Forecasting.jpg\" alt=\"How to Develop Convolutional Neural Network Models for Time Series Forecasting\" width=\"640\" height=\"457\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/11\/How-to-Develop-Convolutional-Neural-Network-Models-for-Time-Series-Forecasting.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/11\/How-to-Develop-Convolutional-Neural-Network-Models-for-Time-Series-Forecasting-300x214.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p class=\"wp-caption-text\">How to Develop Convolutional Neural Network Models for Time Series Forecasting<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/blmoregon\/35464087364\/\">Bureau of Land Management<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>In this tutorial, we will explore how to develop a suite of different types of CNN models for time series forecasting.<\/p>\n<p>The models are demonstrated on small contrived time series problems intended to give the flavor of the type of time series problem being addressed. 
The chosen configuration of the models is arbitrary and not optimized for each problem; that was not the goal.<\/p>\n<p>This tutorial is divided into four parts; they are:<\/p>\n<ol>\n<li>Univariate CNN Models<\/li>\n<li>Multivariate CNN Models<\/li>\n<li>Multi-Step CNN Models<\/li>\n<li>Multivariate Multi-Step CNN Models<\/li>\n<\/ol>\n<h2>Univariate CNN Models<\/h2>\n<p>Although traditionally developed for two-dimensional image data, CNNs can be used to model univariate time series forecasting problems.<\/p>\n<p>Univariate time series are datasets comprised of a single series of observations with a temporal ordering and a model is required to learn from the series of past observations to predict the next value in the sequence.<\/p>\n<p>This section is divided into two parts; they are:<\/p>\n<ol>\n<li>Data Preparation<\/li>\n<li>CNN Model<\/li>\n<\/ol>\n<h3>Data Preparation<\/h3>\n<p>Before a univariate series can be modeled, it must be prepared.<\/p>\n<p>The CNN model will learn a function that maps a sequence of past observations as input to an output observation. 
As such, the sequence of observations must be transformed into multiple examples from which the model can learn.<\/p>\n<p>Consider a given univariate sequence:<\/p>\n<pre class=\"crayon-plain-tag\">[10, 20, 30, 40, 50, 60, 70, 80, 90]<\/pre>\n<p>We can divide the sequence into multiple input\/output patterns called samples, where three time steps are used as input and one time step is used as output for the one-step prediction that is being learned.<\/p>\n<pre class=\"crayon-plain-tag\">X,\t\t\t\ty\r\n10, 20, 30\t\t40\r\n20, 30, 40\t\t50\r\n30, 40, 50\t\t60\r\n...<\/pre>\n<p>The <em>split_sequence()<\/em> function below implements this behavior and will split a given univariate sequence into multiple samples where each sample has a specified number of time steps and the output is a single time step.<\/p>\n<pre class=\"crayon-plain-tag\"># split a univariate sequence into samples\r\ndef split_sequence(sequence, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequence)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the sequence\r\n\t\tif end_ix > len(sequence)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequence[i:end_ix], sequence[end_ix]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)<\/pre>\n<p>We can demonstrate this function on our small contrived dataset above.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate data preparation\r\nfrom numpy import array\r\n\r\n# split a univariate sequence into samples\r\ndef split_sequence(sequence, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequence)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the sequence\r\n\t\tif end_ix > len(sequence)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequence[i:end_ix], 
sequence[end_ix]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nraw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]\r\n# choose a number of time steps\r\nn_steps = 3\r\n# split into samples\r\nX, y = split_sequence(raw_seq, n_steps)\r\n# summarize the data\r\nfor i in range(len(X)):\r\n\tprint(X[i], y[i])<\/pre>\n<p>Running the example splits the univariate series into six samples where each sample has three input time steps and one output time step.<\/p>\n<pre class=\"crayon-plain-tag\">[10 20 30] 40\r\n[20 30 40] 50\r\n[30 40 50] 60\r\n[40 50 60] 70\r\n[50 60 70] 80\r\n[60 70 80] 90<\/pre>\n<p>Now that we know how to prepare a univariate series for modeling, let\u2019s look at developing a CNN model that can learn the mapping of inputs to outputs.<\/p>\n<h3>CNN Model<\/h3>\n<p>A one-dimensional CNN is a CNN model that has a convolutional hidden layer that operates over a 1D sequence. This is followed by perhaps a second convolutional layer in some cases, such as very long input sequences, and then a pooling layer whose job it is to distill the output of the convolutional layer to the most salient elements.<\/p>\n<p>The convolutional and pooling layers are followed by a dense fully connected layer that interprets the features extracted by the convolutional part of the model. A flatten layer is used between the convolutional layers and the dense layer to reduce the feature maps to a single one-dimensional vector.<\/p>\n<p>We can define a 1D CNN Model for univariate time series forecasting as follows.<\/p>\n<pre class=\"crayon-plain-tag\"># define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')<\/pre>\n<p>Key in the definition is the shape of the input; that is what the model expects as input for each sample in terms of the number of time steps and the number of features.<\/p>\n<p>We are working with a univariate series, so the number of features is one, for one variable.<\/p>\n<p>The number of time steps as input is the number we chose when preparing our dataset as an argument to the <em>split_sequence()<\/em> function.<\/p>\n<p>The input shape for each sample is specified in the <em>input_shape<\/em> argument on the definition of the first hidden layer.<\/p>\n<p>We almost always have multiple samples, therefore, the model will expect the input component of training data to have the dimensions or shape:<\/p>\n<pre 
class=\"crayon-plain-tag\">[samples, timesteps, features]<\/pre>\n<p>Our <em>split_sequence()<\/em> function in the previous section outputs X with the shape [<em>samples, timesteps<\/em>], so we can easily reshape it to have an additional dimension for the one feature.<\/p>\n<pre class=\"crayon-plain-tag\"># reshape from [samples, timesteps] into [samples, timesteps, features]\r\nn_features = 1\r\nX = X.reshape((X.shape[0], X.shape[1], n_features))<\/pre>\n<p>The CNN does not actually view the data as having time steps; instead, it is treated as a sequence over which convolutional read operations can be performed, like a one-dimensional image.<\/p>\n<p>In this example, we define a convolutional layer with 64 filter maps and a kernel size of 2. With three input time steps, the convolution produces two output steps per filter, which max pooling with a pool size of 2 reduces to one. This is followed by a flatten layer and a dense layer that interprets the extracted features. An output layer is specified that predicts a single numerical value.<\/p>\n<p>The model is fit using the efficient <a href=\"https:\/\/machinelearningmastery.com\/adam-optimization-algorithm-for-deep-learning\/\">Adam version of stochastic gradient descent<\/a> and optimized using the mean squared error, or \u2018<em>mse<\/em>\u2019, loss function.<\/p>\n<p>Once the model is defined, we can fit it on the training dataset.<\/p>\n<pre class=\"crayon-plain-tag\"># fit model\r\nmodel.fit(X, y, epochs=1000, verbose=0)<\/pre>\n<p>After the model is fit, we can use it to make a prediction.<\/p>\n<p>We can predict the next value in the sequence by providing the input:<\/p>\n<pre class=\"crayon-plain-tag\">[70, 80, 90]<\/pre>\n<p>And expecting the model to predict something like:<\/p>\n<pre class=\"crayon-plain-tag\">[100]<\/pre>\n<p>The model expects the input shape to be three-dimensional with [<em>samples, timesteps, features<\/em>]; therefore, we must reshape the single input sample before making the prediction.<\/p>\n<pre class=\"crayon-plain-tag\"># demonstrate prediction\r\nx_input = array([70, 80, 90])\r\nx_input = 
x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)<\/pre>\n<p>We can tie all of this together and demonstrate how to develop a 1D CNN model for univariate time series forecasting and make a single prediction.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate cnn example\r\nfrom numpy import array\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Conv1D\r\nfrom keras.layers import MaxPooling1D\r\n\r\n# split a univariate sequence into samples\r\ndef split_sequence(sequence, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequence)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the sequence\r\n\t\tif end_ix > len(sequence)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequence[i:end_ix], sequence[end_ix]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nraw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]\r\n# choose a number of time steps\r\nn_steps = 3\r\n# split into samples\r\nX, y = split_sequence(raw_seq, n_steps)\r\n# reshape from [samples, timesteps] into [samples, timesteps, features]\r\nn_features = 1\r\nX = X.reshape((X.shape[0], X.shape[1], n_features))\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, y, epochs=1000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([70, 80, 90])\r\nx_input = x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the 
data, fits the model, and makes a prediction.<\/p>\n<p>Your results may vary given the stochastic nature of the algorithm; try running the example a few times.<\/p>\n<p>We can see that the model predicts the next value in the sequence.<\/p>\n<pre class=\"crayon-plain-tag\">[[101.67965]]<\/pre>\n<\/p>\n<h2>Multivariate CNN Models<\/h2>\n<p>Multivariate time series data means data where there is more than one observation for each time step.<\/p>\n<p>There are two main models that we may require with multivariate time series data; they are:<\/p>\n<ol>\n<li>Multiple Input Series.<\/li>\n<li>Multiple Parallel Series.<\/li>\n<\/ol>\n<p>Let\u2019s take a look at each in turn.<\/p>\n<h3>Multiple Input Series<\/h3>\n<p>A problem may have two or more parallel input time series and an output time series that is dependent on the input time series.<\/p>\n<p>The input time series are parallel because each series has observations at the same time steps.<\/p>\n<p>We can demonstrate this with a simple example of two parallel input time series where the output series is the simple addition of the input series.<\/p>\n<pre class=\"crayon-plain-tag\"># define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])<\/pre>\n<p>We can reshape these three arrays of data as a single dataset where each row is a time step and each column is a separate time series.<\/p>\n<p>This is a standard way of storing parallel time series in a CSV file.<\/p>\n<pre class=\"crayon-plain-tag\"># convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate data 
preparation\r\nfrom numpy import array\r\nfrom numpy import hstack\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\nprint(dataset)<\/pre>\n<p>Running the example prints the dataset with one row per time step and one column for each of the two input and one output parallel time series.<\/p>\n<pre class=\"crayon-plain-tag\">[[ 10  15  25]\r\n [ 20  25  45]\r\n [ 30  35  65]\r\n [ 40  45  85]\r\n [ 50  55 105]\r\n [ 60  65 125]\r\n [ 70  75 145]\r\n [ 80  85 165]\r\n [ 90  95 185]]<\/pre>\n<p>As with the univariate time series, we must structure these data into samples with input and output samples.<\/p>\n<p>A 1D CNN model needs sufficient context to learn a mapping from an input sequence to an output value. CNNs can support parallel input time series as separate channels, like red, green, and blue components of an image. 
Therefore, we need to split the data into samples maintaining the order of observations across the two input sequences.<\/p>\n<p>If we choose three input time steps, then the first sample would look as follows:<\/p>\n<p>Input:<\/p>\n<pre class=\"crayon-plain-tag\">10, 15\r\n20, 25\r\n30, 35<\/pre>\n<p>Output:<\/p>\n<pre class=\"crayon-plain-tag\">65<\/pre>\n<p>That is, the first three time steps of each parallel series are provided as input to the model and the model associates this with the value in the output series at the third time step, in this case, 65.<\/p>\n<p>We can see that, in transforming the time series into input\/output samples to train the model, we will have to discard some values from the output time series where we do not have values in the input time series at prior time steps. In turn, the number of input time steps chosen will have an important effect on how much of the training data is used.<\/p>\n<p>We can define a function named <em>split_sequences()<\/em> that will take a dataset as we have defined it with rows for time steps and columns for parallel series and return input\/output samples.<\/p>\n<pre class=\"crayon-plain-tag\"># split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)<\/pre>\n<p>We can test this function on our dataset using three time steps for each input time series as input.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate data preparation\r\nfrom numpy import array\r\nfrom numpy import 
hstack\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps = 3\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps)\r\nprint(X.shape, y.shape)\r\n# summarize the data\r\nfor i in range(len(X)):\r\n\tprint(X[i], y[i])<\/pre>\n<p>Running the example first prints the shape of the <em>X<\/em> and <em>y<\/em> components.<\/p>\n<p>We can see that the <em>X<\/em> component has a three-dimensional structure.<\/p>\n<p>The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. Finally, the last dimension specifies the number of parallel time series or the number of variables, in this case 2 for the two parallel series.<\/p>\n<p>This is the exact three-dimensional structure expected by a 1D CNN as input. 
The data is ready to use without further reshaping.<\/p>\n<p>We can then see that the input and output for each sample is printed, showing the three time steps for each of the two input series and the associated output for each sample.<\/p>\n<pre class=\"crayon-plain-tag\">(7, 3, 2) (7,)\r\n\r\n[[10 15]\r\n [20 25]\r\n [30 35]] 65\r\n[[20 25]\r\n [30 35]\r\n [40 45]] 85\r\n[[30 35]\r\n [40 45]\r\n [50 55]] 105\r\n[[40 45]\r\n [50 55]\r\n [60 65]] 125\r\n[[50 55]\r\n [60 65]\r\n [70 75]] 145\r\n[[60 65]\r\n [70 75]\r\n [80 85]] 165\r\n[[70 75]\r\n [80 85]\r\n [90 95]] 185<\/pre>\n<p>We are now ready to fit a 1D CNN model on this data, specifying the expected number of time steps and features to expect for each input sample, in this case three and two respectively.<\/p>\n<pre class=\"crayon-plain-tag\"># define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')<\/pre>\n<p>When making a prediction, the model expects three time steps for two input time series.<\/p>\n<p>We can predict the next value in the output series providing the input values of:<\/p>\n<pre class=\"crayon-plain-tag\">80,\t 85\r\n90,\t 95\r\n100, 105<\/pre>\n<p>The shape of the one sample with three time steps and two variables must be [1, 3, 2].<\/p>\n<p>We would expect the next value in the sequence to be 100 + 105 or 205.<\/p>\n<pre class=\"crayon-plain-tag\"># demonstrate prediction\r\nx_input = array([[80, 85], [90, 95], [100, 105]])\r\nx_input = x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate cnn example\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Sequential\r\nfrom 
keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Conv1D\r\nfrom keras.layers import MaxPooling1D\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps = 3\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps)\r\n# the dataset knows the number of features, e.g. 
2\r\nn_features = X.shape[2]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, y, epochs=1000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([[80, 85], [90, 95], [100, 105]])\r\nx_input = x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the data, fits the model, and makes a prediction.<\/p>\n<pre class=\"crayon-plain-tag\">[[206.0161]]<\/pre>\n<p>There is another, more elaborate way to model the problem.<\/p>\n<p>Each input series can be handled by a separate CNN and the output of each of these submodels can be combined before a prediction is made for the output sequence.<\/p>\n<p>We can refer to this as a multi-headed CNN model. It may offer more flexibility or better performance depending on the specifics of the problem that is being modeled. 
For example, it allows you to configure each sub-model differently for each input series, such as the number of filter maps and the kernel size.<\/p>\n<p>This type of model can be defined in Keras using the <a href=\"https:\/\/machinelearningmastery.com\/keras-functional-api-deep-learning\/\">Keras functional API<\/a>.<\/p>\n<p>First, we can define the first input model as a 1D CNN with an input layer that expects vectors with <em>n_steps<\/em> and 1 feature.<\/p>\n<pre class=\"crayon-plain-tag\"># first input model\r\nvisible1 = Input(shape=(n_steps, n_features))\r\ncnn1 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible1)\r\ncnn1 = MaxPooling1D(pool_size=2)(cnn1)\r\ncnn1 = Flatten()(cnn1)<\/pre>\n<p>We can define the second input submodel in the same way.<\/p>\n<pre class=\"crayon-plain-tag\"># second input model\r\nvisible2 = Input(shape=(n_steps, n_features))\r\ncnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)\r\ncnn2 = MaxPooling1D(pool_size=2)(cnn2)\r\ncnn2 = Flatten()(cnn2)<\/pre>\n<p>Now that both input submodels have been defined, we can merge the output from each model into one long vector which can be interpreted before making a prediction for the output sequence.<\/p>\n<pre class=\"crayon-plain-tag\"># merge input models\r\nmerge = concatenate([cnn1, cnn2])\r\ndense = Dense(50, activation='relu')(merge)\r\noutput = Dense(1)(dense)<\/pre>\n<p>We can then tie the inputs and outputs together.<\/p>\n<pre class=\"crayon-plain-tag\">model = Model(inputs=[visible1, visible2], outputs=output)<\/pre>\n<p>The image below provides a schematic for how this model looks, including the shape of the inputs and outputs of each layer.<\/p>\n<div id=\"attachment_6431\" style=\"width: 1002px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6431\" 
src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Headed-1D-CNN-for-Multivariate-Time-Series-Forecasting.png\" alt=\"Plot of Multi-Headed 1D CNN for Multivariate Time Series Forecasting\" width=\"992\" height=\"737\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Headed-1D-CNN-for-Multivariate-Time-Series-Forecasting.png 992w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Headed-1D-CNN-for-Multivariate-Time-Series-Forecasting-300x223.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Headed-1D-CNN-for-Multivariate-Time-Series-Forecasting-768x571.png 768w\" sizes=\"(max-width: 992px) 100vw, 992px\"><\/p>\n<p class=\"wp-caption-text\">Plot of Multi-Headed 1D CNN for Multivariate Time Series Forecasting<\/p>\n<\/div>\n<p>This model requires input to be provided as a list of two elements where each element in the list contains data for one of the submodels.<\/p>\n<p>In order to achieve this, we can split the 3D input data into two separate arrays of input data; that is, from one array with the shape [7, 3, 2] to two 3D arrays, each with the shape [7, 3, 1].<\/p>\n<pre class=\"crayon-plain-tag\"># one time series per head\r\nn_features = 1\r\n# separate input data\r\nX1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)\r\nX2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)<\/pre>\n<p>These data can then be provided in order to fit the model.<\/p>\n<pre class=\"crayon-plain-tag\"># fit model\r\nmodel.fit([X1, X2], y, epochs=1000, verbose=0)<\/pre>\n<p>Similarly, we must prepare the data for a single sample as two separate three-dimensional arrays, each with the shape [1, 3, 1], when making a single one-step prediction.<\/p>\n<pre class=\"crayon-plain-tag\">x_input = array([[80, 85], [90, 95], [100, 105]])\r\nx1 = x_input[:, 0].reshape((1, n_steps, n_features))\r\nx2 = 
x_input[:, 1].reshape((1, n_steps, n_features))<\/pre>\n<p>We can tie all of this together; the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate multi-headed 1d cnn example\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Model\r\nfrom keras.layers import Input\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Conv1D\r\nfrom keras.layers import MaxPooling1D\r\nfrom keras.layers import concatenate\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps = 3\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps)\r\n# one time series per head\r\nn_features = 1\r\n# separate input data\r\nX1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)\r\nX2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)\r\n# first input model\r\nvisible1 = Input(shape=(n_steps, n_features))\r\ncnn1 = Conv1D(filters=64, 
kernel_size=2, activation='relu')(visible1)\r\ncnn1 = MaxPooling1D(pool_size=2)(cnn1)\r\ncnn1 = Flatten()(cnn1)\r\n# second input model\r\nvisible2 = Input(shape=(n_steps, n_features))\r\ncnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)\r\ncnn2 = MaxPooling1D(pool_size=2)(cnn2)\r\ncnn2 = Flatten()(cnn2)\r\n# merge input models\r\nmerge = concatenate([cnn1, cnn2])\r\ndense = Dense(50, activation='relu')(merge)\r\noutput = Dense(1)(dense)\r\nmodel = Model(inputs=[visible1, visible2], outputs=output)\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit([X1, X2], y, epochs=1000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([[80, 85], [90, 95], [100, 105]])\r\nx1 = x_input[:, 0].reshape((1, n_steps, n_features))\r\nx2 = x_input[:, 1].reshape((1, n_steps, n_features))\r\nyhat = model.predict([x1, x2], verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the data, fits the model, and makes a prediction.<\/p>\n<pre class=\"crayon-plain-tag\">[[205.871]]<\/pre>\n<\/p>\n<h3>Multiple Parallel Series<\/h3>\n<p>An alternate time series problem is the case where there are multiple parallel time series and a value must be predicted for each.<\/p>\n<p>For example, given the data from the previous section:<\/p>\n<pre class=\"crayon-plain-tag\">[[ 10  15  25]\r\n [ 20  25  45]\r\n [ 30  35  65]\r\n [ 40  45  85]\r\n [ 50  55 105]\r\n [ 60  65 125]\r\n [ 70  75 145]\r\n [ 80  85 165]\r\n [ 90  95 185]]<\/pre>\n<p>We may want to predict the value for each of the three time series for the next time step.<\/p>\n<p>This might be referred to as multivariate forecasting.<\/p>\n<p>Again, the data must be split into input\/output samples in order to train a model.<\/p>\n<p>The first sample of this dataset would be:<\/p>\n<p>Input:<\/p>\n<pre class=\"crayon-plain-tag\">10, 15, 25\r\n20, 25, 45\r\n30, 35, 65<\/pre>\n<p>Output:<\/p>\n<pre class=\"crayon-plain-tag\">40, 45, 85<\/pre>\n<p>The <em>split_sequences()<\/em> function 
below will split multiple parallel time series with rows for time steps and one series per column into the required input\/output shape.<\/p>\n<pre class=\"crayon-plain-tag\"># split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)<\/pre>\n<p>We can demonstrate this on the contrived problem; the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate output data prep\r\nfrom numpy import array\r\nfrom numpy import hstack\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps = 3\r\n# convert into 
input\/output\r\nX, y = split_sequences(dataset, n_steps)\r\nprint(X.shape, y.shape)\r\n# summarize the data\r\nfor i in range(len(X)):\r\n\tprint(X[i], y[i])<\/pre>\n<p>Running the example first prints the shape of the prepared X and y components.<\/p>\n<p>The shape of X is three-dimensional, including the number of samples (6), the number of time steps chosen per sample (3), and the number of parallel time series or features (3).<\/p>\n<p>The shape of y is two-dimensional as we might expect for the number of samples (6) and the number of time variables per sample to be predicted (3).<\/p>\n<p>The data is ready to use in a 1D CNN model that expects three-dimensional input and two-dimensional output shapes for the X and y components of each sample.<\/p>\n<p>Then, each of the samples is printed showing the input and output components of each sample.<\/p>\n<pre class=\"crayon-plain-tag\">(6, 3, 3) (6, 3)\r\n\r\n[[10 15 25]\r\n [20 25 45]\r\n [30 35 65]] [40 45 85]\r\n[[20 25 45]\r\n [30 35 65]\r\n [40 45 85]] [ 50  55 105]\r\n[[ 30  35  65]\r\n [ 40  45  85]\r\n [ 50  55 105]] [ 60  65 125]\r\n[[ 40  45  85]\r\n [ 50  55 105]\r\n [ 60  65 125]] [ 70  75 145]\r\n[[ 50  55 105]\r\n [ 60  65 125]\r\n [ 70  75 145]] [ 80  85 165]\r\n[[ 60  65 125]\r\n [ 70  75 145]\r\n [ 80  85 165]] [ 90  95 185]<\/pre>\n<p>We are now ready to fit a 1D CNN model on this data.<\/p>\n<p>In this model, the number of time steps and parallel series (features) are specified for the input layer via the <em>input_shape<\/em> argument.<\/p>\n<p>The number of parallel series is also used in the specification of the number of values to predict by the model in the output layer; again, this is three.<\/p>\n<pre class=\"crayon-plain-tag\"># define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, 
activation='relu'))\r\nmodel.add(Dense(n_features))\r\nmodel.compile(optimizer='adam', loss='mse')<\/pre>\n<p>We can predict the next value in each of the three parallel series by providing an input of three time steps for each series.<\/p>\n<pre class=\"crayon-plain-tag\">70, 75, 145\r\n80, 85, 165\r\n90, 95, 185<\/pre>\n<p>The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3].<\/p>\n<pre class=\"crayon-plain-tag\"># demonstrate prediction\r\nx_input = array([[70,75,145], [80,85,165], [90,95,185]])\r\nx_input = x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)<\/pre>\n<p>We would expect the vector output to be:<\/p>\n<pre class=\"crayon-plain-tag\">[100, 105, 205]<\/pre>\n<p>We can tie all of this together and demonstrate a 1D CNN for multivariate output time series forecasting below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate output 1d cnn example\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers.convolutional import Conv1D\r\nfrom keras.layers.convolutional import MaxPooling1D\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif end_ix > len(sequences)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, 
columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps = 3\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps)\r\n# the dataset knows the number of features, e.g. 3\r\nn_features = X.shape[2]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(n_features))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, y, epochs=3000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([[70,75,145], [80,85,165], [90,95,185]])\r\nx_input = x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the data, fits the model, and makes a prediction.<\/p>\n<pre class=\"crayon-plain-tag\">[[100.11272 105.32213 205.53436]]<\/pre>\n<p>As with multiple input series, there is another more elaborate way to model the problem.<\/p>\n<p>Each output series can be handled by a separate output layer of the same CNN model.<\/p>\n<p>We can refer to this as a multi-output CNN model. 
It may offer more flexibility or better performance depending on the specifics of the problem that is being modeled.<\/p>\n<p>This type of model can be defined in Keras using the <a href=\"https:\/\/machinelearningmastery.com\/keras-functional-api-deep-learning\/\">Keras functional API<\/a>.<\/p>\n<p>First, we can define the input model as a 1D CNN model.<\/p>\n<pre class=\"crayon-plain-tag\"># define model\r\nvisible = Input(shape=(n_steps, n_features))\r\ncnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)\r\ncnn = MaxPooling1D(pool_size=2)(cnn)\r\ncnn = Flatten()(cnn)\r\ncnn = Dense(50, activation='relu')(cnn)<\/pre>\n<p>We can then define one output layer for each of the three series that we wish to forecast, where each output submodel will forecast a single time step.<\/p>\n<pre class=\"crayon-plain-tag\"># define output 1\r\noutput1 = Dense(1)(cnn)\r\n# define output 2\r\noutput2 = Dense(1)(cnn)\r\n# define output 3\r\noutput3 = Dense(1)(cnn)<\/pre>\n<p>We can then tie the input and output layers together into a single model.<\/p>\n<pre class=\"crayon-plain-tag\"># tie together\r\nmodel = Model(inputs=visible, outputs=[output1, output2, output3])\r\nmodel.compile(optimizer='adam', loss='mse')<\/pre>\n<p>The schematic below shows the three separate output layers of the model and the input and output shapes of each layer.<\/p>\n<div id=\"attachment_6432\" style=\"width: 1021px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6432\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Output-1D-CNN-for-Multivariate-Time-Series-Forecasting.png\" alt=\"Plot of Multi-Output 1D CNN for Multivariate Time Series Forecasting\" width=\"1011\" height=\"627\" 
srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Output-1D-CNN-for-Multivariate-Time-Series-Forecasting.png 1011w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Output-1D-CNN-for-Multivariate-Time-Series-Forecasting-300x186.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/08\/Plot-of-Multi-Output-1D-CNN-for-Multivariate-Time-Series-Forecasting-768x476.png 768w\" sizes=\"(max-width: 1011px) 100vw, 1011px\"><\/p>\n<p class=\"wp-caption-text\">Plot of Multi-Output 1D CNN for Multivariate Time Series Forecasting<\/p>\n<\/div>\n<p>Training this model requires three separate output arrays, one for each output layer. We can achieve this by converting the output training data that has the shape [6, 3] to three arrays with the shape [6, 1].<\/p>\n<pre class=\"crayon-plain-tag\"># separate output\r\ny1 = y[:, 0].reshape((y.shape[0], 1))\r\ny2 = y[:, 1].reshape((y.shape[0], 1))\r\ny3 = y[:, 2].reshape((y.shape[0], 1))<\/pre>\n<p>These arrays can be provided to the model during training.<\/p>\n<pre class=\"crayon-plain-tag\"># fit model\r\nmodel.fit(X, [y1,y2,y3], epochs=2000, verbose=0)<\/pre>\n<p>Tying all of this together, the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate output 1d cnn example\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Model\r\nfrom keras.layers import Input\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers.convolutional import Conv1D\r\nfrom keras.layers.convolutional import MaxPooling1D\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps\r\n\t\t# check if we are beyond the dataset\r\n\t\tif 
end_ix > len(sequences)-1:\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps = 3\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps)\r\n# the dataset knows the number of features, e.g. 3\r\nn_features = X.shape[2]\r\n# separate output\r\ny1 = y[:, 0].reshape((y.shape[0], 1))\r\ny2 = y[:, 1].reshape((y.shape[0], 1))\r\ny3 = y[:, 2].reshape((y.shape[0], 1))\r\n# define model\r\nvisible = Input(shape=(n_steps, n_features))\r\ncnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)\r\ncnn = MaxPooling1D(pool_size=2)(cnn)\r\ncnn = Flatten()(cnn)\r\ncnn = Dense(50, activation='relu')(cnn)\r\n# define output 1\r\noutput1 = Dense(1)(cnn)\r\n# define output 2\r\noutput2 = Dense(1)(cnn)\r\n# define output 3\r\noutput3 = Dense(1)(cnn)\r\n# tie together\r\nmodel = Model(inputs=visible, outputs=[output1, output2, output3])\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, [y1,y2,y3], epochs=2000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([[70,75,145], [80,85,165], [90,95,185]])\r\nx_input = x_input.reshape((1, n_steps, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the data, fits the model, and makes a prediction.<\/p>\n<pre 
class=\"crayon-plain-tag\">[array([[100.96118]], dtype=float32),\r\n array([[105.502686]], dtype=float32),\r\n array([[205.98045]], dtype=float32)]<\/pre>\n<\/p>\n<h2>Multi-Step CNN Models<\/h2>\n<p>In practice, there is little difference to the 1D CNN model in predicting a vector output that represents different output variables (as in the previous example), or a vector output that represents multiple time steps of one variable.<\/p>\n<p>Nevertheless, there are subtle and important differences in the way the training data is prepared. In this section, we will demonstrate the case of developing a multi-step forecast model using a vector model.<\/p>\n<p>Before we look at the specifics of the model, let\u2019s first look at the preparation of data for multi-step forecasting.<\/p>\n<h3>Data Preparation<\/h3>\n<p>As with one-step forecasting, a time series used for multi-step time series forecasting must be split into samples with input and output components.<\/p>\n<p>Both the input and output components will be comprised of multiple time steps and may or may not have the same number of steps.<\/p>\n<p>For example, given the univariate time series:<\/p>\n<pre class=\"crayon-plain-tag\">[10, 20, 30, 40, 50, 60, 70, 80, 90]<\/pre>\n<p>We could use the last three time steps as input and forecast the next two time steps.<\/p>\n<p>The first sample would look as follows:<\/p>\n<p>Input:<\/p>\n<pre class=\"crayon-plain-tag\">[10, 20, 30]<\/pre>\n<p>Output:<\/p>\n<pre class=\"crayon-plain-tag\">[40, 50]<\/pre>\n<p>The <em>split_sequence()<\/em> function below implements this behavior and will split a given univariate time series into samples with a specified number of input and output time steps.<\/p>\n<pre class=\"crayon-plain-tag\"># split a univariate sequence into samples\r\ndef split_sequence(sequence, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequence)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + 
n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out\r\n\t\t# check if we are beyond the sequence\r\n\t\tif out_end_ix > len(sequence):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)<\/pre>\n<p>We can demonstrate this function on the small contrived dataset.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multi-step data preparation\r\nfrom numpy import array\r\n\r\n# split a univariate sequence into samples\r\ndef split_sequence(sequence, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequence)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out\r\n\t\t# check if we are beyond the sequence\r\n\t\tif out_end_ix > len(sequence):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nraw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]\r\n# choose a number of time steps\r\nn_steps_in, n_steps_out = 3, 2\r\n# split into samples\r\nX, y = split_sequence(raw_seq, n_steps_in, n_steps_out)\r\n# summarize the data\r\nfor i in range(len(X)):\r\n\tprint(X[i], y[i])<\/pre>\n<p>Running the example splits the univariate series into input and output time steps and prints the input and output components of each.<\/p>\n<pre class=\"crayon-plain-tag\">[10 20 30] [40 50]\r\n[20 30 40] [50 60]\r\n[30 40 50] [60 70]\r\n[40 50 60] [70 80]\r\n[50 60 70] [80 90]<\/pre>\n<p>Now that we know how to prepare data for multi-step forecasting, let\u2019s look at a 1D CNN model that can learn this mapping.<\/p>\n<h3>Vector Output Model<\/h3>\n<p>The 1D CNN can output a vector directly that can be interpreted as a 
multi-step forecast.<\/p>\n<p>This approach was seen in the previous section, where one time step of each output time series was forecast as a vector.<\/p>\n<p>As with the 1D CNN models for univariate data in a prior section, the prepared samples must first be reshaped. The CNN expects data to have a three-dimensional structure of [<em>samples, timesteps, features<\/em>], and in this case, we only have one feature so the reshape is straightforward.<\/p>\n<pre class=\"crayon-plain-tag\"># reshape from [samples, timesteps] into [samples, timesteps, features]\r\nn_features = 1\r\nX = X.reshape((X.shape[0], X.shape[1], n_features))<\/pre>\n<p>With the number of input and output steps specified in the <em>n_steps_in<\/em> and <em>n_steps_out<\/em> variables, we can define a multi-step time-series forecasting model.<\/p>\n<pre class=\"crayon-plain-tag\"># define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(n_steps_out))\r\nmodel.compile(optimizer='adam', loss='mse')<\/pre>\n<p>The model can make a prediction for a single sample. 
We can predict the next two steps beyond the end of the dataset by providing the input:<\/p>\n<pre class=\"crayon-plain-tag\">[70, 80, 90]<\/pre>\n<p>We would expect the predicted output to be:<\/p>\n<pre class=\"crayon-plain-tag\">[100, 110]<\/pre>\n<p>As expected by the model, the shape of the single sample of input data when making the prediction must be [1, 3, 1] for the 1 sample, 3 time steps of the input, and the single feature.<\/p>\n<pre class=\"crayon-plain-tag\"># demonstrate prediction\r\nx_input = array([70, 80, 90])\r\nx_input = x_input.reshape((1, n_steps_in, n_features))\r\nyhat = model.predict(x_input, verbose=0)<\/pre>\n<p>Tying all of this together, the 1D CNN for multi-step forecasting with a univariate time series is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate multi-step vector-output 1d cnn example\r\nfrom numpy import array\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers.convolutional import Conv1D\r\nfrom keras.layers.convolutional import MaxPooling1D\r\n\r\n# split a univariate sequence into samples\r\ndef split_sequence(sequence, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequence)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out\r\n\t\t# check if we are beyond the sequence\r\n\t\tif out_end_ix > len(sequence):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nraw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]\r\n# choose a number of time steps\r\nn_steps_in, n_steps_out = 3, 2\r\n# split into samples\r\nX, y = split_sequence(raw_seq, n_steps_in, n_steps_out)\r\n# reshape from [samples, timesteps] into [samples, timesteps, features]\r\nn_features = 
1\r\nX = X.reshape((X.shape[0], X.shape[1], n_features))\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(n_steps_out))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, y, epochs=2000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([70, 80, 90])\r\nx_input = x_input.reshape((1, n_steps_in, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example forecasts and prints the next two time steps in the sequence.<\/p>\n<pre class=\"crayon-plain-tag\">[[102.86651 115.08979]]<\/pre>\n<\/p>\n<h2>Multivariate Multi-Step CNN Models<\/h2>\n<p>In the previous sections, we have looked at univariate, multivariate, and multi-step time series forecasting.<\/p>\n<p>It is possible to mix and match the different types of 1D CNN models presented so far for the different problems. 
This too applies to time series forecasting problems that involve multivariate and multi-step forecasting, but it may be a little more challenging.<\/p>\n<p>In this section, we will explore short examples of data preparation and modeling for multivariate multi-step time series forecasting as a template to ease this challenge, specifically:<\/p>\n<ol>\n<li>Multiple Input Multi-Step Output.<\/li>\n<li>Multiple Parallel Input and Multi-Step Output.<\/li>\n<\/ol>\n<p>Perhaps the biggest stumbling block is in the preparation of data, so this is where we will focus our attention.<\/p>\n<h3>Multiple Input Multi-Step Output<\/h3>\n<p>There are those multivariate time series forecasting problems where the output series is separate but dependent upon the input time series, and multiple time steps are required for the output series.<\/p>\n<p>For example, consider our multivariate time series from a prior section:<\/p>\n<pre class=\"crayon-plain-tag\">[[ 10  15  25]\r\n [ 20  25  45]\r\n [ 30  35  65]\r\n [ 40  45  85]\r\n [ 50  55 105]\r\n [ 60  65 125]\r\n [ 70  75 145]\r\n [ 80  85 165]\r\n [ 90  95 185]]<\/pre>\n<p>We may use three prior time steps of each of the two input time series to predict two time steps of the output time series.<\/p>\n<p>Input:<\/p>\n<pre class=\"crayon-plain-tag\">10, 15\r\n20, 25\r\n30, 35<\/pre>\n<p>Output:<\/p>\n<pre class=\"crayon-plain-tag\">65\r\n85<\/pre>\n<p>The <em>split_sequences()<\/em> function below implements this behavior.<\/p>\n<pre class=\"crayon-plain-tag\"># split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out-1\r\n\t\t# check if we are beyond the dataset\r\n\t\tif out_end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], 
sequences[end_ix-1:out_end_ix, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)<\/pre>\n<p>We can demonstrate this on our contrived dataset. The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate multi-step data preparation\r\nfrom numpy import array\r\nfrom numpy import hstack\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out-1\r\n\t\t# check if we are beyond the dataset\r\n\t\tif out_end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps_in, n_steps_out = 3, 2\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps_in, n_steps_out)\r\nprint(X.shape, y.shape)\r\n# summarize the data\r\nfor i in range(len(X)):\r\n\tprint(X[i], y[i])<\/pre>\n<p>Running the example first prints the shape of the prepared training data.<\/p>\n<p>We can see that the shape of the input portion of the samples is three-dimensional, comprised of six samples, with three time steps and two variables for the two input time series.<\/p>\n<p>The output portion of 
the samples is two-dimensional for the six samples and the two time steps for each sample to be predicted.<\/p>\n<p>The prepared samples are then printed to confirm that the data was prepared as we specified.<\/p>\n<pre class=\"crayon-plain-tag\">(6, 3, 2) (6, 2)\r\n\r\n[[10 15]\r\n [20 25]\r\n [30 35]] [65 85]\r\n[[20 25]\r\n [30 35]\r\n [40 45]] [ 85 105]\r\n[[30 35]\r\n [40 45]\r\n [50 55]] [105 125]\r\n[[40 45]\r\n [50 55]\r\n [60 65]] [125 145]\r\n[[50 55]\r\n [60 65]\r\n [70 75]] [145 165]\r\n[[60 65]\r\n [70 75]\r\n [80 85]] [165 185]<\/pre>\n<p>We can now develop a 1D CNN model for multi-step predictions.<\/p>\n<p>In this case, we will demonstrate a vector output model. The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate multi-step 1d cnn example\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers.convolutional import Conv1D\r\nfrom keras.layers.convolutional import MaxPooling1D\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out-1\r\n\t\t# check if we are beyond the dataset\r\n\t\tif out_end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = 
in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps_in, n_steps_out = 3, 2\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps_in, n_steps_out)\r\n# the dataset knows the number of features, e.g. 2\r\nn_features = X.shape[2]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(n_steps_out))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, y, epochs=2000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([[70, 75], [80, 85], [90, 95]])\r\nx_input = x_input.reshape((1, n_steps_in, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example fits the model and predicts the next two time steps of the output sequence beyond the dataset.<\/p>\n<p>We would expect the next two steps to be [185, 205].<\/p>\n<p>It is a challenging framing of the problem with very little data, and the arbitrarily configured version of the model gets close.<\/p>\n<pre class=\"crayon-plain-tag\">[[185.57011 207.77893]]<\/pre>\n<\/p>\n<h3>Multiple Parallel Input and Multi-Step Output<\/h3>\n<p>A problem with parallel time series may require the prediction of multiple time steps of each time series.<\/p>\n<p>For example, consider our multivariate time series from a prior section:<\/p>\n<pre class=\"crayon-plain-tag\">[[ 10  15  25]\r\n [ 20  25  45]\r\n [ 30  35  65]\r\n [ 40  45  85]\r\n [ 50  55 105]\r\n [ 60  65 125]\r\n [ 70  75 145]\r\n [ 80  85 165]\r\n [ 90  95 185]]<\/pre>\n<p>We may use the last three time steps from each of the three time series as input to the model, and predict the next time steps of each of the 
three time series as output.<\/p>\n<p>The first sample in the training dataset would be the following.<\/p>\n<p>Input:<\/p>\n<pre class=\"crayon-plain-tag\">10, 15, 25\r\n20, 25, 45\r\n30, 35, 65<\/pre>\n<p>Output:<\/p>\n<pre class=\"crayon-plain-tag\">40, 45, 85\r\n50, 55, 105<\/pre>\n<p>The <em>split_sequences()<\/em> function below implements this behavior.<\/p>\n<pre class=\"crayon-plain-tag\"># split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out\r\n\t\t# check if we are beyond the dataset\r\n\t\tif out_end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)<\/pre>\n<p>We can demonstrate this function on the small contrived dataset.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate multi-step data preparation\r\nfrom numpy import array\r\nfrom numpy import hstack\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out\r\n\t\t# check if we are beyond the dataset\r\n\t\tif out_end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, 
:]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps_in, n_steps_out = 3, 2\r\n# convert into input\/output\r\nX, y = split_sequences(dataset, n_steps_in, n_steps_out)\r\nprint(X.shape, y.shape)\r\n# summarize the data\r\nfor i in range(len(X)):\r\n\tprint(X[i], y[i])<\/pre>\n<p>Running the example first prints the shape of the prepared training dataset.<\/p>\n<p>We can see that both the input (<em>X<\/em>) and output (<em>y<\/em>) elements of the dataset are three-dimensional for the number of samples, time steps, and variables or parallel time series, respectively.<\/p>\n<p>The input and output elements of each sample are then printed side by side so that we can confirm that the data was prepared as we expected.<\/p>\n<pre class=\"crayon-plain-tag\">(5, 3, 3) (5, 2, 3)\r\n\r\n[[10 15 25]\r\n [20 25 45]\r\n [30 35 65]] [[ 40  45  85]\r\n [ 50  55 105]]\r\n[[20 25 45]\r\n [30 35 65]\r\n [40 45 85]] [[ 50  55 105]\r\n [ 60  65 125]]\r\n[[ 30  35  65]\r\n [ 40  45  85]\r\n [ 50  55 105]] [[ 60  65 125]\r\n [ 70  75 145]]\r\n[[ 40  45  85]\r\n [ 50  55 105]\r\n [ 60  65 125]] [[ 70  75 145]\r\n [ 80  85 165]]\r\n[[ 50  55 105]\r\n [ 60  65 125]\r\n [ 70  75 145]] [[ 80  85 165]\r\n [ 90  95 185]]<\/pre>\n<p>We can now develop a 1D CNN model for this dataset.<\/p>\n<p>We will use a vector-output model in this case. 
As such, we must flatten the three-dimensional structure of the output portion of each sample in order to train the model. This means that, instead of predicting two steps for each of the three series, the model is trained on and expected to predict a flat vector of six numbers (two time steps for each of three series) directly.<\/p>\n<pre class=\"crayon-plain-tag\"># flatten output\r\nn_output = y.shape[1] * y.shape[2]\r\ny = y.reshape((y.shape[0], n_output))<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate output multi-step 1d cnn example\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import Flatten\r\nfrom keras.layers import Conv1D\r\nfrom keras.layers import MaxPooling1D\r\n\r\n# split a multivariate sequence into samples\r\ndef split_sequences(sequences, n_steps_in, n_steps_out):\r\n\tX, y = list(), list()\r\n\tfor i in range(len(sequences)):\r\n\t\t# find the end of this pattern\r\n\t\tend_ix = i + n_steps_in\r\n\t\tout_end_ix = end_ix + n_steps_out\r\n\t\t# check if we are beyond the dataset\r\n\t\tif out_end_ix > len(sequences):\r\n\t\t\tbreak\r\n\t\t# gather input and output parts of the pattern\r\n\t\tseq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]\r\n\t\tX.append(seq_x)\r\n\t\ty.append(seq_y)\r\n\treturn array(X), array(y)\r\n\r\n# define input sequence\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])\r\nout_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])\r\n# convert to [rows, columns] structure\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2, out_seq))\r\n# choose a number of time steps\r\nn_steps_in, n_steps_out = 3, 2\r\n# convert into input\/output\r\nX, y = 
split_sequences(dataset, n_steps_in, n_steps_out)\r\n# flatten output\r\nn_output = y.shape[1] * y.shape[2]\r\ny = y.reshape((y.shape[0], n_output))\r\n# the dataset knows the number of features, e.g. 3\r\nn_features = X.shape[2]\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))\r\nmodel.add(MaxPooling1D(pool_size=2))\r\nmodel.add(Flatten())\r\nmodel.add(Dense(50, activation='relu'))\r\nmodel.add(Dense(n_output))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit(X, y, epochs=7000, verbose=0)\r\n# demonstrate prediction\r\nx_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])\r\nx_input = x_input.reshape((1, n_steps_in, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example fits the model and predicts the values for each of the three time series for the next two time steps following the input sequence.<\/p>\n<p>We would expect the values for these series and time steps to be as follows:<\/p>\n<pre class=\"crayon-plain-tag\">90, 95, 185\r\n100, 105, 205<\/pre>\n<p>We can see that the model forecast gets reasonably close to the expected values.<\/p>\n<pre class=\"crayon-plain-tag\">[[ 90.47855 95.621284 186.02629 100.48118 105.80815 206.52821 ]]<\/pre>\n<\/p>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to develop a suite of CNN models for a range of standard time series forecasting problems.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to develop CNN models for univariate time series forecasting.<\/li>\n<li>How to develop CNN models for multivariate time series forecasting.<\/li>\n<li>How to develop CNN models for multi-step time series forecasting.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" 
href=\"https:\/\/machinelearningmastery.com\/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting\/\">How to Develop Convolutional Neural Network Models for Time Series Forecasting<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Convolutional Neural Network models, or CNNs for short, can be applied to time series forecasting. There are many types of CNN models [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/11\/11\/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":1283,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1282"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1282"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1282\/revisions"}
],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/1283"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1282"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1282"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1282"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}