{"id":1263,"date":"2018-11-06T18:00:05","date_gmt":"2018-11-06T18:00:05","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/11\/06\/how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras\/"},"modified":"2018-11-06T18:00:05","modified_gmt":"2018-11-06T18:00:05","slug":"how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/11\/06\/how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras\/","title":{"rendered":"How to Use the TimeseriesGenerator for Time Series Forecasting in Keras"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>Time series data must be transformed into a structure of samples with input and output components before it can be used to fit a supervised learning model.<\/p>\n<p>This can be challenging if you have to perform this transformation manually. The Keras deep learning library provides the TimeseriesGenerator to automatically transform both univariate and multivariate time series data into samples, ready to train deep learning models.<\/p>\n<p>In this tutorial, you will discover how to use the Keras TimeseriesGenerator for preparing time series data for modeling with deep learning methods.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to define the TimeseriesGenerator generator and use it to fit deep learning models.<\/li>\n<li>How to prepare a generator for univariate time series and fit MLP and LSTM models.<\/li>\n<li>How to prepare a generator for multivariate time series and fit an LSTM model.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_6423\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6423\" 
src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/11\/How-to-Use-the-TimeseriesGenerator-for-Time-Series-Forecasting-in-Keras.jpg\" alt=\"How to Use the TimeseriesGenerator for Time Series Forecasting in Keras\" width=\"640\" height=\"424\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/11\/How-to-Use-the-TimeseriesGenerator-for-Time-Series-Forecasting-in-Keras.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/11\/How-to-Use-the-TimeseriesGenerator-for-Time-Series-Forecasting-in-Keras-300x199.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p class=\"wp-caption-text\">How to Use the TimeseriesGenerator for Time Series Forecasting in Keras<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/chrisfithall\/13933989150\/\">Chris Fithall<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into six parts; they are:<\/p>\n<ol>\n<li>Problem with Time Series for Supervised Learning<\/li>\n<li>How to Use the TimeseriesGenerator<\/li>\n<li>Univariate Time Series Example<\/li>\n<li>Multivariate Time Series Example<\/li>\n<li>Multivariate Inputs and Dependent Series Example<\/li>\n<li>Multi-step Forecasts Example<\/li>\n<\/ol>\n<h2>Problem with Time Series for Supervised Learning<\/h2>\n<p>Time series data requires preparation before it can be used to train a supervised learning model, such as a deep learning model.<\/p>\n<p>For example, a univariate time series is represented as a vector of observations:<\/p>\n<pre class=\"crayon-plain-tag\">[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]<\/pre>\n<p>A supervised learning algorithm requires that data is provided as a collection of samples, where each sample has an input component (<em>X<\/em>) and an output component (<em>y<\/em>).<\/p>\n<pre class=\"crayon-plain-tag\">X,\t\t\t\t\ty\r\nexample input, \t\texample output\r\nexample input, \t\texample 
output\r\nexample input, \t\texample output\r\n...<\/pre>\n<p>The model will learn how to map inputs to outputs from the provided examples.<\/p>\n<pre class=\"crayon-plain-tag\">y = f(X)<\/pre>\n<p>A time series must be transformed into samples with input and output components. The transform determines both what the model will learn and how you intend to use the model in the future when making predictions, e.g. what is required to make a prediction (<em>X<\/em>) and what prediction is made (<em>y<\/em>).<\/p>\n<p>For a univariate time series problem where we are interested in one-step predictions, the observations at prior time steps, so-called lag observations, are used as input and the output is the observation at the current time step.<\/p>\n<p>For example, the above 10-step univariate series can be expressed as a supervised learning problem with three time steps for input and one step as output, as follows:<\/p>\n<pre class=\"crayon-plain-tag\">X,\t\t\ty\r\n[1, 2, 3],\t[4]\r\n[2, 3, 4],\t[5]\r\n[3, 4, 5],\t[6]\r\n...<\/pre>\n<p>You can write code to perform this transform yourself; for example, see the post:<\/p>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/convert-time-series-supervised-learning-problem-python\/\">How to Convert a Time Series to a Supervised Learning Problem in Python<\/a><\/li>\n<\/ul>\n<p>Alternately, when you are interested in training neural network models with Keras, you can use the TimeseriesGenerator class.<\/p>\n<\/p>\n<h2>How to Use the TimeseriesGenerator<\/h2>\n<p>Keras provides the <a href=\"https:\/\/keras.io\/preprocessing\/sequence\/\">TimeseriesGenerator<\/a> that can be used to automatically transform a univariate or multivariate time series dataset into a supervised learning problem.<\/p>\n<p>There are two parts to using the TimeseriesGenerator: defining it and using it to train models.<\/p>\n<h3>Defining a TimeseriesGenerator<\/h3>\n<p>You can create an instance of the class and specify the input and output aspects of your time series problem and it will provide an instance of a <a href=\"https:\/\/keras.io\/utils\/#sequence\">Sequence class<\/a> that can then be used to iterate across the inputs and outputs of the series.<\/p>\n<p>In most time series prediction problems, the input and output series will be the same series.<\/p>\n<p>For example:<\/p>\n<pre class=\"crayon-plain-tag\"># load data\r\ninputs = ...\r\noutputs = ...\r\n# define generator\r\ngenerator = TimeseriesGenerator(inputs, outputs, ...)\r\n# iterate over the generator\r\nfor i in range(len(generator)):\r\n\t...<\/pre>\n<p>Technically, the class is not a generator in the sense that it is not a <a href=\"https:\/\/wiki.python.org\/moin\/Generators\">Python Generator<\/a> and you cannot use the <em>next()<\/em> function on it.<\/p>\n<p>In addition to 
specifying the input and output aspects of your time series problem, there are some additional parameters that you should configure; for example:<\/p>\n<ul>\n<li><strong>length<\/strong>: The number of lag observations to use in the input portion of each sample (e.g. 3).<\/li>\n<li><strong>batch_size<\/strong>: The number of samples to return on each iteration (e.g. 32).<\/li>\n<\/ul>\n<p>You must define a length argument based on your chosen framing of the problem. That is, the desired number of lag observations to use as input.<\/p>\n<p>You must also define the batch size as the batch size of your model during training. If the number of samples in your dataset is less than your batch size, you can set the batch size in the generator and in your model to the total number of samples in your generator. The length of the generator is the number of batches it provides, so with a batch size of 1 it gives the number of samples; for example:<\/p>\n<pre class=\"crayon-plain-tag\">print(len(generator))<\/pre>\n<p>There are also other arguments, such as the start and end offsets into your data, the sampling rate, the stride, and more. You are less likely to use these features, but you can see the <a href=\"https:\/\/keras.io\/preprocessing\/sequence\/\">full API<\/a> for more details.<\/p>\n<p>The samples are not shuffled by default. This is useful for some recurrent neural networks like LSTMs that maintain state across samples within a batch.<\/p>\n<p>It can benefit other neural networks, such as CNNs and MLPs, to shuffle the samples when training. Shuffling can be enabled by setting the \u2018<em>shuffle<\/em>\u2019 argument to True. This will have the effect of shuffling samples returned for each batch.<\/p>\n<p>At the time of writing, the TimeseriesGenerator is limited to one-step outputs. 
Multi-step time series forecasting is not directly supported.<\/p>\n<h3>Training a Model with a TimeseriesGenerator<\/h3>\n<p>Once a TimeseriesGenerator instance has been defined, it can be used to train a neural network model.<\/p>\n<p>A model can be trained using the TimeseriesGenerator as a data generator. This can be achieved by fitting the defined model using the <em>fit_generator()<\/em> function.<\/p>\n<p>This function takes the generator as an argument. It also takes a <em>steps_per_epoch<\/em> argument that defines the number of batches (steps) to draw from the generator in each epoch. This can be set to the length of the TimeseriesGenerator instance to use all samples in the generator.<\/p>\n<p>For example:<\/p>\n<pre class=\"crayon-plain-tag\"># define generator\r\ngenerator = TimeseriesGenerator(...)\r\n# define model\r\nmodel = ...\r\n# fit model\r\nmodel.fit_generator(generator, steps_per_epoch=len(generator), ...)<\/pre>\n<p>Similarly, the generator can be used to evaluate a fit model by calling the <em>evaluate_generator()<\/em> function, and to make predictions on new data with a fit model by calling the <em>predict_generator()<\/em> function.<\/p>\n<p>A model fit with the data generator does not have to use the generator versions of the evaluate and predict functions. They are only needed if you wish to have the data generator prepare your data for the model.<\/p>\n<h2>Univariate Time Series Example<\/h2>\n<p>We can make the TimeseriesGenerator concrete with a worked example with a small contrived univariate time series dataset.<\/p>\n<p>First, let\u2019s define our dataset.<\/p>\n<pre class=\"crayon-plain-tag\"># define dataset\r\nseries = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])<\/pre>\n<p>We will choose to frame the problem where the last two lag observations will be used to predict the next value in the sequence. 
For example:<\/p>\n<pre class=\"crayon-plain-tag\">X,\t\t\ty\r\n[1, 2]\t\t3<\/pre>\n<p>For now, we will use a batch size of 1, so that we can explore the data in the generator.<\/p>\n<pre class=\"crayon-plain-tag\"># define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(series, series, length=n_input, batch_size=1)<\/pre>\n<p>Next, we can see how many samples will be prepared by the data generator for this time series.<\/p>\n<pre class=\"crayon-plain-tag\"># number of samples\r\nprint('Samples: %d' % len(generator))<\/pre>\n<p>Finally, we can print the input and output components of each sample, to confirm that the data was prepared as we expected.<\/p>\n<pre class=\"crayon-plain-tag\">for i in range(len(generator)):\r\n\tx, y = generator[i]\r\n\tprint('%s => %s' % (x, y))<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate one step problem\r\nfrom numpy import array\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nseries = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\r\n# define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(series, series, length=n_input, batch_size=1)\r\n# number of samples\r\nprint('Samples: %d' % len(generator))\r\n# print each sample\r\nfor i in range(len(generator)):\r\n\tx, y = generator[i]\r\n\tprint('%s => %s' % (x, y))<\/pre>\n<p>Running the example first prints the total number of samples in the generator, which is eight.<\/p>\n<p>We can then see that each input array has the shape [1, 2] and each output has the shape [1,].<\/p>\n<p>The observations are prepared as we expected, with two lag observations that will be used as input and the subsequent value in the sequence as the output.<\/p>\n<pre class=\"crayon-plain-tag\">Samples: 8\r\n\r\n[[1. 2.]] => [3.]\r\n[[2. 3.]] => [4.]\r\n[[3. 4.]] => [5.]\r\n[[4. 5.]] => [6.]\r\n[[5. 6.]] => [7.]\r\n[[6. 7.]] => [8.]\r\n[[7. 8.]] => [9.]\r\n[[8. 
9.]] => [10.]<\/pre>\n<p>Now we can fit a model on this data so that it learns to map the input sequence to the output value.<\/p>\n<p>We will start with a simple Multilayer Perceptron, or MLP, model.<\/p>\n<p>The generator will be defined so that all samples will be used in each batch, given the small number of samples.<\/p>\n<pre class=\"crayon-plain-tag\"># define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(series, series, length=n_input, batch_size=8)<\/pre>\n<p>We can define a simple model with one hidden layer with 100 nodes and an output layer that will make the prediction.<\/p>\n<pre class=\"crayon-plain-tag\"># define model\r\nmodel = Sequential()\r\nmodel.add(Dense(100, activation='relu', input_dim=n_input))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')<\/pre>\n<p>We can then fit the model with the generator using the <em>fit_generator()<\/em> function. We only have one batch\u2019s worth of data in the generator, so we\u2019ll set the <em>steps_per_epoch<\/em> to 1. The model will be fit for 200 epochs.<\/p>\n<pre class=\"crayon-plain-tag\"># fit model\r\nmodel.fit_generator(generator, steps_per_epoch=1, epochs=200, verbose=0)<\/pre>\n<p>Once fit, we will make an out of sample prediction.<\/p>\n<p>Given the inputs [9, 10], we will make a prediction and expect the model to predict [11], or close to it. 
The model is not tuned; this is just an example of how to use the generator.<\/p>\n<pre class=\"crayon-plain-tag\"># make a one step prediction out of sample\r\nx_input = array([9, 10]).reshape((1, n_input))\r\nyhat = model.predict(x_input, verbose=0)<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate one step problem with mlp\r\nfrom numpy import array\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nseries = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\r\n# define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(series, series, length=n_input, batch_size=8)\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(Dense(100, activation='relu', input_dim=n_input))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit_generator(generator, steps_per_epoch=1, epochs=200, verbose=0)\r\n# make a one step prediction out of sample\r\nx_input = array([9, 10]).reshape((1, n_input))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the generator, fits the model, and makes the out of sample prediction, correctly predicting a value close to 11.<\/p>\n<pre class=\"crayon-plain-tag\">[[11.510406]]<\/pre>\n<p>We can also use the generator to fit a recurrent neural network, such as a Long Short-Term Memory network, or LSTM.<\/p>\n<p>The LSTM expects input data to have the shape [<em>samples, timesteps, features<\/em>], whereas the generator described so far provides lag observations as features, with the shape [<em>samples, features<\/em>].<\/p>\n<p>We can reshape the univariate time series prior to preparing the generator from [10, ] to [10, 1] for 10 time steps and 1 feature; for example:<\/p>\n<pre class=\"crayon-plain-tag\"># reshape to [10, 1]\r\nn_features = 1\r\nseries = series.reshape((len(series), 
n_features))<\/pre>\n<p>The TimeseriesGenerator will then split the series into samples with the shape [<em>batch, n_input, 1<\/em>] or [8, 2, 1] for all eight samples in the generator and the two lag observations used as time steps.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate one step problem with lstm\r\nfrom numpy import array\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import LSTM\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nseries = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\r\n# reshape to [10, 1]\r\nn_features = 1\r\nseries = series.reshape((len(series), n_features))\r\n# define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(series, series, length=n_input, batch_size=8)\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(LSTM(100, activation='relu', input_shape=(n_input, n_features)))\r\nmodel.add(Dense(1))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit_generator(generator, steps_per_epoch=1, epochs=500, verbose=0)\r\n# make a one step prediction out of sample\r\nx_input = array([9, 10]).reshape((1, n_input, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Again, running the example prepares the data, fits the model, and predicts the next out of sample value in the sequence.<\/p>\n<pre class=\"crayon-plain-tag\">[[11.092189]]<\/pre>\n<\/p>\n<h2>Multivariate Time Series Example<\/h2>\n<p>The TimeseriesGenerator also supports multivariate time series problems.<\/p>\n<p>These are problems where you have multiple parallel series, with observations at the same time step in each series.<\/p>\n<p>We can demonstrate this with an example.<\/p>\n<p>First, we can contrive a dataset of two parallel series.<\/p>\n<pre class=\"crayon-plain-tag\"># define dataset\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])\r\nin_seq2 = array([15, 25, 35, 
45, 55, 65, 75, 85, 95, 105])<\/pre>\n<p>It is standard to format multivariate time series so that each time series is a separate column and the rows are the observations at each time step.<\/p>\n<p>The series we have defined are vectors, but we can convert them into columns. We can reshape each series into an array with the shape [10, 1] for the 10 time steps and 1 feature.<\/p>\n<pre class=\"crayon-plain-tag\"># reshape series\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))<\/pre>\n<p>We can now horizontally stack the columns into a dataset by calling the <em>hstack()<\/em> NumPy function.<\/p>\n<pre class=\"crayon-plain-tag\"># horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2))<\/pre>\n<p>We can now provide this dataset to the TimeseriesGenerator directly. We will use the prior two observations of each series as input and the next observation of each series as output.<\/p>\n<pre class=\"crayon-plain-tag\"># define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(dataset, dataset, length=n_input, batch_size=1)<\/pre>\n<p>Each sample will then be a three-dimensional array with the shape [1, 2, 2] for the 1 sample, 2 time steps, and 2 features or parallel series. The output will be a two-dimensional array with the shape [1, 2] for the 1 sample and 2 features. 
The first sample will be:<\/p>\n<pre class=\"crayon-plain-tag\">X, \t\t\t\t\t\t\ty\r\n[[10, 15], [20, 25]]\t\t[[30, 35]]<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate one step problem\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95, 105])\r\n# reshape series\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2))\r\nprint(dataset)\r\n# define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(dataset, dataset, length=n_input, batch_size=1)\r\n# number of samples\r\nprint('Samples: %d' % len(generator))\r\n# print each sample\r\nfor i in range(len(generator)):\r\n\tx, y = generator[i]\r\n\tprint('%s => %s' % (x, y))<\/pre>\n<p>Running the example will first print the prepared dataset, followed by the total number of samples in the dataset.<\/p>\n<p>Next, the input and output portion of each sample is printed, confirming our intended structure.<\/p>\n<pre class=\"crayon-plain-tag\">[[ 10  15]\r\n [ 20  25]\r\n [ 30  35]\r\n [ 40  45]\r\n [ 50  55]\r\n [ 60  65]\r\n [ 70  75]\r\n [ 80  85]\r\n [ 90  95]\r\n [100 105]]\r\n\r\nSamples: 8\r\n\r\n[[[10. 15.]\r\n  [20. 25.]]] => [[30. 35.]]\r\n[[[20. 25.]\r\n  [30. 35.]]] => [[40. 45.]]\r\n[[[30. 35.]\r\n  [40. 45.]]] => [[50. 55.]]\r\n[[[40. 45.]\r\n  [50. 55.]]] => [[60. 65.]]\r\n[[[50. 55.]\r\n  [60. 65.]]] => [[70. 75.]]\r\n[[[60. 65.]\r\n  [70. 75.]]] => [[80. 85.]]\r\n[[[70. 75.]\r\n  [80. 85.]]] => [[90. 95.]]\r\n[[[80. 85.]\r\n  [90. 95.]]] => [[100. 
105.]]<\/pre>\n<p>The three-dimensional structure of the samples means that the generator cannot be used directly for simple models like MLPs.<\/p>\n<p>An MLP could be supported by first flattening the time series dataset to a one-dimensional vector prior to providing it to the TimeseriesGenerator and setting length to the number of steps to use as input multiplied by the number of columns in the series (<em>n_steps * n_features<\/em>).<\/p>\n<p>A limitation of this approach is that the generator will only allow you to predict one variable. You will almost certainly be better off writing your own function to prepare multivariate time series for an MLP than using the TimeseriesGenerator.<\/p>\n<p>The three-dimensional structure of the samples can be used directly by CNN and LSTM models. A complete example for multivariate time series forecasting with the TimeseriesGenerator is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate one step problem with lstm\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.layers import LSTM\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95, 105])\r\n# reshape series\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2))\r\n# define generator\r\nn_features = dataset.shape[1]\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(dataset, dataset, length=n_input, batch_size=8)\r\n# define model\r\nmodel = Sequential()\r\nmodel.add(LSTM(100, activation='relu', input_shape=(n_input, n_features)))\r\nmodel.add(Dense(2))\r\nmodel.compile(optimizer='adam', loss='mse')\r\n# fit model\r\nmodel.fit_generator(generator, steps_per_epoch=1, epochs=500, verbose=0)\r\n# make a one step prediction 
out of sample\r\nx_input = array([[90, 95], [100, 105]]).reshape((1, n_input, n_features))\r\nyhat = model.predict(x_input, verbose=0)\r\nprint(yhat)<\/pre>\n<p>Running the example prepares the data, fits the model, and makes a prediction for the next value in each of the input time series, which we expect to be [110, 115].<\/p>\n<pre class=\"crayon-plain-tag\">[[111.03207 116.58153]]<\/pre>\n<\/p>\n<h2>Multivariate Inputs and Dependent Series Example<\/h2>\n<p>There are multivariate time series problems where there are one or more input series and a separate output series to be forecast that is dependent upon the input series.<\/p>\n<p>To make this concrete, we can contrive one example with two input time series and an output series that is the sum of the input series.<\/p>\n<pre class=\"crayon-plain-tag\"># define dataset\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95, 105])\r\nout_seq = array([25, 45, 65, 85, 105, 125, 145, 165, 185, 205])<\/pre>\n<p>Each value in the output sequence is the sum of the values at the same time step in the input time series.<\/p>\n<pre class=\"crayon-plain-tag\">10 + 15 = 25<\/pre>\n<p>This is different from prior examples: given the inputs, we wish to predict the value in the target time series for the same time step as the input, not the next time step.<\/p>\n<p>For example, we want samples like:<\/p>\n<pre class=\"crayon-plain-tag\">X, \t\t\ty\r\n[10, 15],\t25\r\n[20, 25],\t45\r\n[30, 35],\t65\r\n...<\/pre>\n<p>We don\u2019t want samples like the following:<\/p>\n<pre class=\"crayon-plain-tag\">X, \t\t\ty\r\n[10, 15],\t45\r\n[20, 25],\t65\r\n[30, 35],\t85\r\n...<\/pre>\n<p>Nevertheless, the TimeseriesGenerator class assumes that we are predicting the next time step and will provide data as in the second case above.<\/p>\n<p>For example:<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate one step problem\r\nfrom numpy import array\r\nfrom numpy import 
hstack\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95, 105])\r\nout_seq = array([25, 45, 65, 85, 105, 125, 145, 165, 185, 205])\r\n# reshape series\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2))\r\n# define generator\r\nn_input = 1\r\ngenerator = TimeseriesGenerator(dataset, out_seq, length=n_input, batch_size=1)\r\n# print each sample\r\nfor i in range(len(generator)):\r\n\tx, y = generator[i]\r\n\tprint('%s => %s' % (x, y))<\/pre>\n<p>Running the example prints the input and output portions of the samples with the output values for the next time step rather than the current time step as we may desire for this type of problem.<\/p>\n<pre class=\"crayon-plain-tag\">[[[10. 15.]]] => [[45.]]\r\n[[[20. 25.]]] => [[65.]]\r\n[[[30. 35.]]] => [[85.]]\r\n[[[40. 45.]]] => [[105.]]\r\n[[[50. 55.]]] => [[125.]]\r\n[[[60. 65.]]] => [[145.]]\r\n[[[70. 75.]]] => [[165.]]\r\n[[[80. 85.]]] => [[185.]]\r\n[[[90. 
95.]]] => [[205.]]<\/pre>\n<p>We can therefore modify the target series (<em>out_seq<\/em>) and insert an additional value at the beginning in order to push all observations down by one time step.<\/p>\n<p>This artificial shift will allow the preferred framing of the problem.<\/p>\n<pre class=\"crayon-plain-tag\"># shift the target sample by one step\r\nout_seq = insert(out_seq, 0, 0)<\/pre>\n<p>The complete example with this shift is provided below.<\/p>\n<pre class=\"crayon-plain-tag\"># multivariate one step problem\r\nfrom numpy import array\r\nfrom numpy import hstack\r\nfrom numpy import insert\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nin_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])\r\nin_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95, 105])\r\nout_seq = array([25, 45, 65, 85, 105, 125, 145, 165, 185, 205])\r\n# reshape series\r\nin_seq1 = in_seq1.reshape((len(in_seq1), 1))\r\nin_seq2 = in_seq2.reshape((len(in_seq2), 1))\r\nout_seq = out_seq.reshape((len(out_seq), 1))\r\n# horizontally stack columns\r\ndataset = hstack((in_seq1, in_seq2))\r\n# shift the target sample by one step\r\nout_seq = insert(out_seq, 0, 0)\r\n# define generator\r\nn_input = 1\r\ngenerator = TimeseriesGenerator(dataset, out_seq, length=n_input, batch_size=1)\r\n# print each sample\r\nfor i in range(len(generator)):\r\n\tx, y = generator[i]\r\n\tprint('%s => %s' % (x, y))<\/pre>\n<p>Running the example shows the preferred framing of the problem.<\/p>\n<p>This approach will work regardless of the length of the input sample.<\/p>\n<pre class=\"crayon-plain-tag\">[[[10. 15.]]] => [25.]\r\n[[[20. 25.]]] => [45.]\r\n[[[30. 35.]]] => [65.]\r\n[[[40. 45.]]] => [85.]\r\n[[[50. 55.]]] => [105.]\r\n[[[60. 65.]]] => [125.]\r\n[[[70. 75.]]] => [145.]\r\n[[[80. 85.]]] => [165.]\r\n[[[90. 
95.]]] => [185.]<\/pre>\n<\/p>\n<h2>Multi-step Forecasts Example<\/h2>\n<p>A benefit of neural network models over many other types of classical and machine learning models is that they can make multi-step forecasts.<\/p>\n<p>That is, the model can learn to map an input pattern of one or more features to an output pattern of more than one feature. This can be used in time series forecasting to directly forecast multiple future time steps.<\/p>\n<p>This can be achieved either by directly outputting a vector from the model (specifying the desired number of outputs as the number of nodes in the output layer) or by using specialized sequence prediction models such as an encoder-decoder model.<\/p>\n<p>A limitation of the TimeseriesGenerator is that it does not directly support multi-step outputs. Specifically, it will not create the multiple steps that may be required in the target sequence.<\/p>\n<p>Nevertheless, if you prepare your target sequence to have multiple steps, it will honor and use them as the output portion of each sample. This means the onus is on you to prepare the expected output for each time step.<\/p>\n<p>We can demonstrate this with a simple univariate time series with two time steps in the output sequence.<\/p>\n<p>Note that you must have the same number of rows in the target sequence as you do in the input sequence. 
In this case, we must know values beyond the values in the input sequence, or trim the input sequence to the length of the target sequence.<\/p>\n<pre class=\"crayon-plain-tag\"># define dataset\r\nseries = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\r\ntarget = array([[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,9],[9,10],[10,11]])<\/pre>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># univariate multi-step problem\r\nfrom numpy import array\r\nfrom keras.preprocessing.sequence import TimeseriesGenerator\r\n# define dataset\r\nseries = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])\r\ntarget = array([[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,9],[9,10],[10,11]])\r\n# define generator\r\nn_input = 2\r\ngenerator = TimeseriesGenerator(series, target, length=n_input, batch_size=1)\r\n# print each sample\r\nfor i in range(len(generator)):\r\n\tx, y = generator[i]\r\n\tprint('%s => %s' % (x, y))<\/pre>\n<p>Running the example prints the input and output portions of the samples showing the two lag observations as input and the two steps as output in the multi-step forecasting problem.<\/p>\n<pre class=\"crayon-plain-tag\">[[1. 2.]] => [[3. 4.]]\r\n[[2. 3.]] => [[4. 5.]]\r\n[[3. 4.]] => [[5. 6.]]\r\n[[4. 5.]] => [[6. 7.]]\r\n[[5. 6.]] => [[7. 8.]]\r\n[[6. 7.]] => [[8. 9.]]\r\n[[7. 8.]] => [[ 9. 10.]]\r\n[[8. 9.]] => [[10. 
11.]]<\/pre>\n<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/convert-time-series-supervised-learning-problem-python\/\">How to Convert a Time Series to a Supervised Learning Problem in Python<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/preprocessing\/sequence\/\">TimeseriesGenerator Keras API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/utils\/#sequence\">Sequence Keras API<\/a><\/li>\n<li><a href=\"https:\/\/keras.io\/models\/sequential\/\">Sequential Model Keras API<\/a><\/li>\n<li><a href=\"https:\/\/wiki.python.org\/moin\/Generators\">Python Generator<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to use the Keras TimeseriesGenerator for preparing time series data for modeling with deep learning methods.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to define the TimeseriesGenerator generator and use it to fit deep learning models.<\/li>\n<li>How to prepare a generator for univariate time series and fit MLP and LSTM models.<\/li>\n<li>How to prepare a generator for multivariate time series and fit an LSTM model.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras\/\">How to Use the TimeseriesGenerator for Time Series Forecasting in Keras<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Time series data must be transformed into a structure of samples with 
input and output components before it can be used to [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/11\/06\/how-to-use-the-timeseriesgenerator-for-time-series-forecasting-in-keras\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":1264,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1263"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1263"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1263\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/1264"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1263"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1263"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1263"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}