{"id":1205,"date":"2018-10-23T18:00:46","date_gmt":"2018-10-23T18:00:46","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/10\/23\/how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python\/"},"modified":"2018-10-23T18:00:46","modified_gmt":"2018-10-23T18:00:46","slug":"how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/10\/23\/how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python\/","title":{"rendered":"How to Grid Search SARIMA Model Hyperparameters for Time Series Forecasting in Python"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>The Seasonal Autoregressive Integrated Moving Average, or SARIMA, model is an approach for modeling univariate time series data that may contain trend and seasonal components.<\/p>\n<p>It is an effective approach for time series forecasting, although it requires careful analysis and domain expertise in order to configure the seven or more model hyperparameters.<\/p>\n<p>An alternative approach to configuring the model that makes use of fast and parallel modern hardware is to grid search a suite of hyperparameter configurations in order to discover what works best. 
Often, this process can reveal non-intuitive model configurations that result in lower forecast error than those configurations specified through careful analysis.<\/p>\n<p>In this tutorial, you will discover how to develop a framework for grid searching all of the SARIMA model hyperparameters for univariate time series forecasting.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to develop a framework for grid searching SARIMA models from scratch using walk-forward validation.<\/li>\n<li>How to grid search SARIMA model hyperparameters for daily time series data for births.<\/li>\n<li>How to grid search SARIMA model hyperparameters for monthly time series data for shampoo sales, car sales, and temperature.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_6358\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6358\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/10\/How-to-Grid-Search-SARIMA-Model-Hyperparameters-for-Time-Series-Forecasting-in-Python.jpg\" alt=\"How to Grid Search SARIMA Model Hyperparameters for Time Series Forecasting in Python\" width=\"640\" height=\"360\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/10\/How-to-Grid-Search-SARIMA-Model-Hyperparameters-for-Time-Series-Forecasting-in-Python.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/10\/How-to-Grid-Search-SARIMA-Model-Hyperparameters-for-Time-Series-Forecasting-in-Python-300x169.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p class=\"wp-caption-text\">How to Grid Search SARIMA Model Hyperparameters for Time Series Forecasting in Python<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/photommo\/17832992898\/\">Thomas<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into 
six parts; they are:<\/p>\n<ol>\n<li>SARIMA for Time Series Forecasting<\/li>\n<li>Develop a Grid Search Framework<\/li>\n<li>Case Study 1: No Trend or Seasonality<\/li>\n<li>Case Study 2: Trend<\/li>\n<li>Case Study 3: Seasonality<\/li>\n<li>Case Study 4: Trend and Seasonality<\/li>\n<\/ol>\n<h2>SARIMA for Time Series Forecasting<\/h2>\n<p>Seasonal Autoregressive Integrated Moving Average, SARIMA or Seasonal ARIMA, is an extension of ARIMA that explicitly supports univariate time series data with a seasonal component.<\/p>\n<p>It adds three new hyperparameters to specify the autoregression (AR), differencing (I), and moving average (MA) for the seasonal component of the series, as well as an additional parameter for the period of the seasonality.<\/p>\n<blockquote>\n<p>A seasonal ARIMA model is formed by including additional seasonal terms in the ARIMA [\u2026] The seasonal part of the model consists of terms that are very similar to the non-seasonal components of the model, but they involve backshifts of the seasonal period.<\/p>\n<\/blockquote>\n<p>\u2014 Page 242, <a href=\"https:\/\/amzn.to\/2xlJsfV\">Forecasting: principles and practice<\/a>, 2013.<\/p>\n<p>Configuring a SARIMA requires selecting hyperparameters for both the trend and seasonal elements of the series.<\/p>\n<p>There are three trend elements that require configuration.<\/p>\n<p>They are the same as the ARIMA model; specifically:<\/p>\n<ul>\n<li><strong>p<\/strong>: Trend autoregression order.<\/li>\n<li><strong>d<\/strong>: Trend difference order.<\/li>\n<li><strong>q<\/strong>: Trend moving average order.<\/li>\n<\/ul>\n<p>There are four seasonal elements that are not part of ARIMA that must be configured; they are:<\/p>\n<ul>\n<li><strong>P<\/strong>: Seasonal autoregressive order.<\/li>\n<li><strong>D<\/strong>: Seasonal difference order.<\/li>\n<li><strong>Q<\/strong>: Seasonal moving average order.<\/li>\n<li><strong>m<\/strong>: The number of time steps for a single seasonal 
period.<\/li>\n<\/ul>\n<p>Together, the notation for a SARIMA model is specified as:<\/p>\n<pre class=\"crayon-plain-tag\">SARIMA(p,d,q)(P,D,Q)m<\/pre>\n<p>The SARIMA model can subsume the ARIMA, ARMA, AR, and MA models via model configuration parameters.<\/p>\n<p>The trend and seasonal hyperparameters of the model can be configured by analyzing autocorrelation and partial autocorrelation plots, and this can take some expertise.<\/p>\n<p>An alternative approach is to grid search a suite of model configurations and discover which configurations work best for a specific univariate time series.<\/p>\n<blockquote>\n<p>Seasonal ARIMA models can potentially have a large number of parameters and combinations of terms. Therefore, it is appropriate to try out a wide range of models when fitting to data and choose a best fitting model using an appropriate criterion \u2026<\/p>\n<\/blockquote>\n<p>\u2014 Pages 143-144, <a href=\"https:\/\/amzn.to\/2smB9LR\">Introductory Time Series with R<\/a>, 2009.<\/p>\n<p>This approach can be faster on modern computers than a manual analysis process and can reveal surprising findings that might not be obvious and that result in lower forecast error.<\/p>\n<h2>Develop a Grid Search Framework<\/h2>\n<p>In this section, we will develop a framework for grid searching SARIMA model hyperparameters for a given univariate time series forecasting problem.<\/p>\n<p>We will use the implementation of <a href=\"http:\/\/www.statsmodels.org\/dev\/generated\/statsmodels.tsa.statespace.sarimax.SARIMAX.html\">SARIMA<\/a> provided by the statsmodels library.<\/p>\n<p>This model has hyperparameters that control the nature of the model fit for the trend and seasonality of the series; specifically:<\/p>\n<ul>\n<li><strong>order<\/strong>: A tuple of p, d, and q parameters for the modeling of the trend.<\/li>\n<li><strong>seasonal_order<\/strong>: A tuple of P, D, Q, and m parameters for the modeling of the seasonality.<\/li>\n<li><strong>trend<\/strong>: A parameter for controlling a model of the deterministic trend as one of \u2018n\u2019,\u2019c\u2019,\u2019t\u2019,\u2019ct\u2019 for no trend, constant, linear, and constant with linear trend, respectively.<\/li>\n<\/ul>\n<p>If you know enough about your problem to specify one or more of these parameters, then you should specify them. 
If not, you can try grid searching these parameters.<\/p>\n<p>We can start off by defining a function that will fit a model with a given configuration and make a one-step forecast.<\/p>\n<p>The <em>sarima_forecast()<\/em> function below implements this behavior.<\/p>\n<p>The function takes an array or list of contiguous prior observations and a list of configuration parameters used to configure the model, specifically two tuples and a string for the trend order, seasonal order, and trend parameter.<\/p>\n<p>We also try to make the model robust by relaxing constraints, such as that the data must be stationary and that the MA transform be invertible.<\/p>\n<pre class=\"crayon-plain-tag\"># one-step sarima forecast\r\ndef sarima_forecast(history, config):\r\n\torder, sorder, trend = config\r\n\t# define model\r\n\tmodel = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)\r\n\t# fit model\r\n\tmodel_fit = model.fit(disp=False)\r\n\t# make one step forecast\r\n\tyhat = model_fit.predict(len(history), len(history))\r\n\treturn yhat[0]<\/pre>\n<p>Next, we need to build up some functions for fitting and evaluating a model repeatedly via walk-forward validation, including splitting a dataset into train and test sets and evaluating one-step forecasts.<\/p>\n<p>We can split a list or NumPy array of data using a slice given a specified size of the split, e.g. 
the number of time steps to use from the data in the test set.<\/p>\n<p>The <em>train_test_split()<\/em> function below implements this for a provided dataset and a specified number of time steps to use in the test set.<\/p>\n<pre class=\"crayon-plain-tag\"># split a univariate dataset into train\/test sets\r\ndef train_test_split(data, n_test):\r\n\treturn data[:-n_test], data[-n_test:]<\/pre>\n<p>After forecasts have been made for each step in the test dataset, they need to be compared to the test set in order to calculate an error score.<\/p>\n<p>There are many popular error scores for time series forecasting. In this case, we will use root mean squared error (RMSE), but you can change this to your preferred measure, e.g. MAPE, MAE, etc.<\/p>\n<p>The <em>measure_rmse()<\/em> function below will calculate the RMSE given a list of actual (the test set) and predicted values.<\/p>\n<pre class=\"crayon-plain-tag\"># root mean squared error or rmse\r\ndef measure_rmse(actual, predicted):\r\n\treturn sqrt(mean_squared_error(actual, predicted))<\/pre>\n<p>We can now implement the walk-forward validation scheme. This is a standard approach to evaluating a time series forecasting model that respects the temporal ordering of observations.<\/p>\n<p>First, a provided univariate time series dataset is split into train and test sets using the <em>train_test_split()<\/em> function. Then the observations in the test set are enumerated one at a time. For each, we fit a model on all of the history and make a one-step forecast. The true observation for the time step is then added to the history and the process is repeated. The <em>sarima_forecast()<\/em> function is called in order to fit a model and make a prediction. 
Finally, an error score is calculated by comparing all one-step forecasts to the actual test set by calling the <em>measure_rmse()<\/em> function.<\/p>\n<p>The <em>walk_forward_validation()<\/em> function below implements this, taking a univariate time series, a number of time steps to use in the test set, and a model configuration.<\/p>\n<pre class=\"crayon-plain-tag\"># walk-forward validation for univariate data\r\ndef walk_forward_validation(data, n_test, cfg):\r\n\tpredictions = list()\r\n\t# split dataset\r\n\ttrain, test = train_test_split(data, n_test)\r\n\t# seed history with training dataset\r\n\thistory = [x for x in train]\r\n\t# step over each time-step in the test set\r\n\tfor i in range(len(test)):\r\n\t\t# fit model and make forecast for history\r\n\t\tyhat = sarima_forecast(history, cfg)\r\n\t\t# store forecast in list of predictions\r\n\t\tpredictions.append(yhat)\r\n\t\t# add actual observation to history for the next loop\r\n\t\thistory.append(test[i])\r\n\t# estimate prediction error\r\n\terror = measure_rmse(test, predictions)\r\n\treturn error<\/pre>\n<p>If you are interested in making multi-step predictions, you can change the call to <em>predict()<\/em> in the <em>sarima_forecast()<\/em> function and also change the calculation of error in the <em>measure_rmse()<\/em> function.<\/p>\n<p>We can call <em>walk_forward_validation()<\/em> repeatedly with different lists of model configurations.<\/p>\n<p>One possible issue is that some model configurations may not be valid for the data and will throw an exception, e.g. specifying some but not all aspects of the seasonal structure in the data.<\/p>\n<p>Further, some models may also raise warnings on some data, e.g. from the linear algebra libraries called by the statsmodels library.<\/p>\n<p>We can trap exceptions and ignore warnings during the grid search by wrapping all calls to <em>walk_forward_validation()<\/em> with a try-except and a block to ignore warnings. 
We can also add debugging support to disable these protections in case we want to see what is really going on. Finally, if an error does occur, we can return a None result; otherwise, we can print some information about the skill of each model evaluated. This is helpful when a large number of models are evaluated.<\/p>\n<p>The <em>score_model()<\/em> function below implements this and returns a tuple of (key, result), where the key is a string version of the tested model configuration.<\/p>\n<pre class=\"crayon-plain-tag\"># score a model, return None on failure\r\ndef score_model(data, n_test, cfg, debug=False):\r\n\tresult = None\r\n\t# convert config to a key\r\n\tkey = str(cfg)\r\n\t# show all warnings and fail on exception if debugging\r\n\tif debug:\r\n\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\telse:\r\n\t\t# one failure during model validation suggests an unstable config\r\n\t\ttry:\r\n\t\t\t# never show warnings when grid searching, too noisy\r\n\t\t\twith catch_warnings():\r\n\t\t\t\tfilterwarnings(\"ignore\")\r\n\t\t\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\t\texcept:\r\n\t\t\terror = None\r\n\t# check for an interesting result\r\n\tif result is not None:\r\n\t\tprint(' > Model[%s] %.3f' % (key, result))\r\n\treturn (key, result)<\/pre>\n<p>Next, we need a loop to test a list of different model configurations.<\/p>\n<p>This is the main function that drives the grid search process and will call the <em>score_model()<\/em> function for each model configuration.<\/p>\n<p>We can dramatically speed up the grid search process by evaluating model configurations in parallel. 
One way to do that is to use the <a href=\"https:\/\/pythonhosted.org\/joblib\/\">Joblib library<\/a>.<\/p>\n<p>We can define a Parallel object with the number of cores to use and set it to the number of cores detected in your hardware.<\/p>\n<pre class=\"crayon-plain-tag\">executor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')<\/pre>\n<p>We can then create a list of tasks to execute in parallel, which will be one call to the <em>score_model()<\/em> function for each model configuration we have.<\/p>\n<pre class=\"crayon-plain-tag\">tasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)<\/pre>\n<p>Finally, we can use the Parallel object to execute the list of tasks in parallel.<\/p>\n<pre class=\"crayon-plain-tag\">scores = executor(tasks)<\/pre>\n<p>That\u2019s it.<\/p>\n<p>We can also provide a non-parallel version of evaluating all model configurations in case we want to debug something.<\/p>\n<pre class=\"crayon-plain-tag\">scores = [score_model(data, n_test, cfg) for cfg in cfg_list]<\/pre>\n<p>The result of evaluating a list of configurations will be a list of tuples, each with a name that summarizes a specific model configuration and the error of the model evaluated with that configuration, as either the RMSE or None if there was an error.<\/p>\n<p>We can filter out all scores with a None.<\/p>\n<pre class=\"crayon-plain-tag\">scores = [r for r in scores if r[1] != None]<\/pre>\n<p>We can then sort all tuples in the list by the score in ascending order (best are first), then return this list of scores for review.<\/p>\n<p>The <em>grid_search()<\/em> function below implements this behavior given a univariate time series dataset, a list of model configurations (list of lists), and the number of time steps to use in the test set. 
An optional <em>parallel<\/em> argument allows the evaluation of models across all cores to be turned on or off, and is on by default.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search configs\r\ndef grid_search(data, cfg_list, n_test, parallel=True):\r\n\tscores = None\r\n\tif parallel:\r\n\t\t# execute configs in parallel\r\n\t\texecutor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')\r\n\t\ttasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)\r\n\t\tscores = executor(tasks)\r\n\telse:\r\n\t\tscores = [score_model(data, n_test, cfg) for cfg in cfg_list]\r\n\t# remove empty results\r\n\tscores = [r for r in scores if r[1] != None]\r\n\t# sort configs by error, asc\r\n\tscores.sort(key=lambda tup: tup[1])\r\n\treturn scores<\/pre>\n<p>We\u2019re nearly done.<\/p>\n<p>The only thing left to do is to define a list of model configurations to try for a dataset.<\/p>\n<p>We can define this generically. The only parameter we may want to specify is the periodicity of the seasonal component in the series, if one exists. By default, we will assume no seasonal component.<\/p>\n<p>The <em>sarima_configs()<\/em> function below will create a list of model configurations to evaluate.<\/p>\n<p>The configurations assume each of the AR, MA, and I components for trend and seasonality are low order, e.g. off (0) or in [1,2]. You may want to extend these ranges if you believe the order may be higher. 
An optional list of seasonal periods can be specified, and you could even change the function to specify other elements that you may know about your time series.<\/p>\n<p>In theory, there are 1,296 possible model configurations to evaluate, but in practice, many will not be valid and will result in an error that we will trap and ignore.<\/p>\n<pre class=\"crayon-plain-tag\"># create a set of sarima configs to try\r\ndef sarima_configs(seasonal=[0]):\r\n\tmodels = list()\r\n\t# define config lists\r\n\tp_params = [0, 1, 2]\r\n\td_params = [0, 1]\r\n\tq_params = [0, 1, 2]\r\n\tt_params = ['n','c','t','ct']\r\n\tP_params = [0, 1, 2]\r\n\tD_params = [0, 1]\r\n\tQ_params = [0, 1, 2]\r\n\tm_params = seasonal\r\n\t# create config instances\r\n\tfor p in p_params:\r\n\t\tfor d in d_params:\r\n\t\t\tfor q in q_params:\r\n\t\t\t\tfor t in t_params:\r\n\t\t\t\t\tfor P in P_params:\r\n\t\t\t\t\t\tfor D in D_params:\r\n\t\t\t\t\t\t\tfor Q in Q_params:\r\n\t\t\t\t\t\t\t\tfor m in m_params:\r\n\t\t\t\t\t\t\t\t\tcfg = [(p,d,q), (P,D,Q,m), t]\r\n\t\t\t\t\t\t\t\t\tmodels.append(cfg)\r\n\treturn models<\/pre>\n<p>We now have a framework for grid searching SARIMA model hyperparameters via one-step walk-forward validation.<\/p>\n<p>It is generic and will work for any in-memory univariate time series provided as a list or NumPy array.<\/p>\n<p>We can make sure all the pieces work together by testing it on a contrived 10-step dataset.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search sarima hyperparameters\r\nfrom math import sqrt\r\nfrom multiprocessing import cpu_count\r\nfrom joblib import Parallel\r\nfrom joblib import delayed\r\nfrom warnings import catch_warnings\r\nfrom warnings import filterwarnings\r\nfrom statsmodels.tsa.statespace.sarimax import SARIMAX\r\nfrom sklearn.metrics import mean_squared_error\r\n\r\n# one-step sarima forecast\r\ndef sarima_forecast(history, config):\r\n\torder, sorder, trend = config\r\n\t# define 
model\r\n\tmodel = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)\r\n\t# fit model\r\n\tmodel_fit = model.fit(disp=False)\r\n\t# make one step forecast\r\n\tyhat = model_fit.predict(len(history), len(history))\r\n\treturn yhat[0]\r\n\r\n# root mean squared error or rmse\r\ndef measure_rmse(actual, predicted):\r\n\treturn sqrt(mean_squared_error(actual, predicted))\r\n\r\n# split a univariate dataset into train\/test sets\r\ndef train_test_split(data, n_test):\r\n\treturn data[:-n_test], data[-n_test:]\r\n\r\n# walk-forward validation for univariate data\r\ndef walk_forward_validation(data, n_test, cfg):\r\n\tpredictions = list()\r\n\t# split dataset\r\n\ttrain, test = train_test_split(data, n_test)\r\n\t# seed history with training dataset\r\n\thistory = [x for x in train]\r\n\t# step over each time-step in the test set\r\n\tfor i in range(len(test)):\r\n\t\t# fit model and make forecast for history\r\n\t\tyhat = sarima_forecast(history, cfg)\r\n\t\t# store forecast in list of predictions\r\n\t\tpredictions.append(yhat)\r\n\t\t# add actual observation to history for the next loop\r\n\t\thistory.append(test[i])\r\n\t# estimate prediction error\r\n\terror = measure_rmse(test, predictions)\r\n\treturn error\r\n\r\n# score a model, return None on failure\r\ndef score_model(data, n_test, cfg, debug=False):\r\n\tresult = None\r\n\t# convert config to a key\r\n\tkey = str(cfg)\r\n\t# show all warnings and fail on exception if debugging\r\n\tif debug:\r\n\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\telse:\r\n\t\t# one failure during model validation suggests an unstable config\r\n\t\ttry:\r\n\t\t\t# never show warnings when grid searching, too noisy\r\n\t\t\twith catch_warnings():\r\n\t\t\t\tfilterwarnings(\"ignore\")\r\n\t\t\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\t\texcept:\r\n\t\t\terror = None\r\n\t# check for an interesting result\r\n\tif result is not 
None:\r\n\t\tprint(' > Model[%s] %.3f' % (key, result))\r\n\treturn (key, result)\r\n\r\n# grid search configs\r\ndef grid_search(data, cfg_list, n_test, parallel=True):\r\n\tscores = None\r\n\tif parallel:\r\n\t\t# execute configs in parallel\r\n\t\texecutor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')\r\n\t\ttasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)\r\n\t\tscores = executor(tasks)\r\n\telse:\r\n\t\tscores = [score_model(data, n_test, cfg) for cfg in cfg_list]\r\n\t# remove empty results\r\n\tscores = [r for r in scores if r[1] != None]\r\n\t# sort configs by error, asc\r\n\tscores.sort(key=lambda tup: tup[1])\r\n\treturn scores\r\n\r\n# create a set of sarima configs to try\r\ndef sarima_configs(seasonal=[0]):\r\n\tmodels = list()\r\n\t# define config lists\r\n\tp_params = [0, 1, 2]\r\n\td_params = [0, 1]\r\n\tq_params = [0, 1, 2]\r\n\tt_params = ['n','c','t','ct']\r\n\tP_params = [0, 1, 2]\r\n\tD_params = [0, 1]\r\n\tQ_params = [0, 1, 2]\r\n\tm_params = seasonal\r\n\t# create config instances\r\n\tfor p in p_params:\r\n\t\tfor d in d_params:\r\n\t\t\tfor q in q_params:\r\n\t\t\t\tfor t in t_params:\r\n\t\t\t\t\tfor P in P_params:\r\n\t\t\t\t\t\tfor D in D_params:\r\n\t\t\t\t\t\t\tfor Q in Q_params:\r\n\t\t\t\t\t\t\t\tfor m in m_params:\r\n\t\t\t\t\t\t\t\t\tcfg = [(p,d,q), (P,D,Q,m), t]\r\n\t\t\t\t\t\t\t\t\tmodels.append(cfg)\r\n\treturn models\r\n\r\nif __name__ == '__main__':\r\n\t# define dataset\r\n\tdata = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]\r\n\tprint(data)\r\n\t# data split\r\n\tn_test = 4\r\n\t# model configs\r\n\tcfg_list = sarima_configs()\r\n\t# grid search\r\n\tscores = grid_search(data, cfg_list, n_test)\r\n\tprint('done')\r\n\t# list top 3 configs\r\n\tfor cfg, error in scores[:3]:\r\n\t\tprint(cfg, error)<\/pre>\n<p>Running the example first prints the contrived time series dataset.<\/p>\n<p>Next, the model configurations and their errors are reported as they are evaluated, 
truncated below for brevity.<\/p>\n<p>Finally, the configurations and the error for the top three configurations are reported. We can see that many models achieve perfect performance on this simple linearly increasing contrived time series problem.<\/p>\n<pre class=\"crayon-plain-tag\">[10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]\r\n\r\n...\r\n > Model[[(2, 0, 0), (2, 0, 0, 0), 'ct']] 0.001\r\n > Model[[(2, 0, 0), (2, 0, 1, 0), 'ct']] 0.000\r\n > Model[[(2, 0, 1), (0, 0, 0, 0), 'n']] 0.000\r\n > Model[[(2, 0, 1), (0, 0, 1, 0), 'n']] 0.000\r\ndone\r\n\r\n[(2, 1, 0), (1, 0, 0, 0), 'n'] 0.0\r\n[(2, 1, 0), (2, 0, 0, 0), 'n'] 0.0\r\n[(2, 1, 1), (1, 0, 1, 0), 'n'] 0.0<\/pre>\n<p>Now that we have a robust framework for grid searching SARIMA model hyperparameters, let\u2019s test it out on a suite of standard univariate time series datasets.<\/p>\n<p>The datasets were chosen for demonstration purposes; I am not suggesting that a SARIMA model is the best approach for each dataset; perhaps an ETS or something else would be more appropriate in some cases.<\/p>\n<h2>Case Study 1: No Trend or Seasonality<\/h2>\n<p>The \u2018daily female births\u2019 dataset summarizes the daily total female births in California, USA in 1959.<\/p>\n<p>The dataset has no obvious trend or seasonal component.<\/p>\n<div id=\"attachment_6350\" style=\"width: 1450px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6350\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Daily-Female-Births-Dataset.png\" alt=\"Line Plot of the Daily Female Births Dataset\" width=\"1440\" height=\"780\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Daily-Female-Births-Dataset.png 1440w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Daily-Female-Births-Dataset-300x163.png 
300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Daily-Female-Births-Dataset-768x416.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Daily-Female-Births-Dataset-1024x555.png 1024w\" sizes=\"(max-width: 1440px) 100vw, 1440px\"><\/p>\n<p class=\"wp-caption-text\">Line Plot of the Daily Female Births Dataset<\/p>\n<\/div>\n<p>You can learn more about the dataset from <a href=\"https:\/\/datamarket.com\/data\/set\/235k\/daily-total-female-births-in-california-1959#!ds=235k&#038;display=line\">DataMarket<\/a>.<\/p>\n<p>Download the dataset directly from here:<\/p>\n<ul>\n<li><a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/daily-total-female-births.csv\">daily-total-female-births.csv<\/a><\/li>\n<\/ul>\n<p>Save the file with the filename \u2018<em>daily-total-female-births.csv<\/em>\u2018 in your current working directory.<\/p>\n<p>We can load this dataset as a Pandas series using the function <em>read_csv()<\/em>.<\/p>\n<pre class=\"crayon-plain-tag\">series = read_csv('daily-total-female-births.csv', header=0, index_col=0)<\/pre>\n<p>The dataset has one year, or 365 observations. 
We will use the first 200 for training and the remaining 165 as the test set.<\/p>\n<p>The complete example grid searching the daily female univariate time series forecasting problem is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search sarima hyperparameters for daily female dataset\r\nfrom math import sqrt\r\nfrom multiprocessing import cpu_count\r\nfrom joblib import Parallel\r\nfrom joblib import delayed\r\nfrom warnings import catch_warnings\r\nfrom warnings import filterwarnings\r\nfrom statsmodels.tsa.statespace.sarimax import SARIMAX\r\nfrom sklearn.metrics import mean_squared_error\r\nfrom pandas import read_csv\r\n\r\n# one-step sarima forecast\r\ndef sarima_forecast(history, config):\r\n\torder, sorder, trend = config\r\n\t# define model\r\n\tmodel = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)\r\n\t# fit model\r\n\tmodel_fit = model.fit(disp=False)\r\n\t# make one step forecast\r\n\tyhat = model_fit.predict(len(history), len(history))\r\n\treturn yhat[0]\r\n\r\n# root mean squared error or rmse\r\ndef measure_rmse(actual, predicted):\r\n\treturn sqrt(mean_squared_error(actual, predicted))\r\n\r\n# split a univariate dataset into train\/test sets\r\ndef train_test_split(data, n_test):\r\n\treturn data[:-n_test], data[-n_test:]\r\n\r\n# walk-forward validation for univariate data\r\ndef walk_forward_validation(data, n_test, cfg):\r\n\tpredictions = list()\r\n\t# split dataset\r\n\ttrain, test = train_test_split(data, n_test)\r\n\t# seed history with training dataset\r\n\thistory = [x for x in train]\r\n\t# step over each time-step in the test set\r\n\tfor i in range(len(test)):\r\n\t\t# fit model and make forecast for history\r\n\t\tyhat = sarima_forecast(history, cfg)\r\n\t\t# store forecast in list of predictions\r\n\t\tpredictions.append(yhat)\r\n\t\t# add actual observation to history for the next loop\r\n\t\thistory.append(test[i])\r\n\t# estimate prediction 
error\r\n\terror = measure_rmse(test, predictions)\r\n\treturn error\r\n\r\n# score a model, return None on failure\r\ndef score_model(data, n_test, cfg, debug=False):\r\n\tresult = None\r\n\t# convert config to a key\r\n\tkey = str(cfg)\r\n\t# show all warnings and fail on exception if debugging\r\n\tif debug:\r\n\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\telse:\r\n\t\t# one failure during model validation suggests an unstable config\r\n\t\ttry:\r\n\t\t\t# never show warnings when grid searching, too noisy\r\n\t\t\twith catch_warnings():\r\n\t\t\t\tfilterwarnings(\"ignore\")\r\n\t\t\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\t\texcept:\r\n\t\t\terror = None\r\n\t# check for an interesting result\r\n\tif result is not None:\r\n\t\tprint(' > Model[%s] %.3f' % (key, result))\r\n\treturn (key, result)\r\n\r\n# grid search configs\r\ndef grid_search(data, cfg_list, n_test, parallel=True):\r\n\tscores = None\r\n\tif parallel:\r\n\t\t# execute configs in parallel\r\n\t\texecutor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')\r\n\t\ttasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)\r\n\t\tscores = executor(tasks)\r\n\telse:\r\n\t\tscores = [score_model(data, n_test, cfg) for cfg in cfg_list]\r\n\t# remove empty results\r\n\tscores = [r for r in scores if r[1] != None]\r\n\t# sort configs by error, asc\r\n\tscores.sort(key=lambda tup: tup[1])\r\n\treturn scores\r\n\r\n# create a set of sarima configs to try\r\ndef sarima_configs(seasonal=[0]):\r\n\tmodels = list()\r\n\t# define config lists\r\n\tp_params = [0, 1, 2]\r\n\td_params = [0, 1]\r\n\tq_params = [0, 1, 2]\r\n\tt_params = ['n','c','t','ct']\r\n\tP_params = [0, 1, 2]\r\n\tD_params = [0, 1]\r\n\tQ_params = [0, 1, 2]\r\n\tm_params = seasonal\r\n\t# create config instances\r\n\tfor p in p_params:\r\n\t\tfor d in d_params:\r\n\t\t\tfor q in q_params:\r\n\t\t\t\tfor t in t_params:\r\n\t\t\t\t\tfor P in P_params:\r\n\t\t\t\t\t\tfor D in 
D_params:\r\n\t\t\t\t\t\t\tfor Q in Q_params:\r\n\t\t\t\t\t\t\t\tfor m in m_params:\r\n\t\t\t\t\t\t\t\t\tcfg = [(p,d,q), (P,D,Q,m), t]\r\n\t\t\t\t\t\t\t\t\tmodels.append(cfg)\r\n\treturn models\r\n\r\nif __name__ == '__main__':\r\n\t# load dataset\r\n\tseries = read_csv('daily-total-female-births.csv', header=0, index_col=0)\r\n\tdata = series.values\r\n\tprint(data.shape)\r\n\t# data split\r\n\tn_test = 165\r\n\t# model configs\r\n\tcfg_list = sarima_configs()\r\n\t# grid search\r\n\tscores = grid_search(data, cfg_list, n_test)\r\n\tprint('done')\r\n\t# list top 3 configs\r\n\tfor cfg, error in scores[:3]:\r\n\t\tprint(cfg, error)<\/pre>\n<p>Running the example may take a few minutes on modern hardware.<\/p>\n<p>Model configurations and the RMSE are printed as the models are evaluated. The top three model configurations and their error are reported at the end of the run.<\/p>\n<p>We can see that the best result was an RMSE of about 6.77 births with the following configuration:<\/p>\n<ul>\n<li><strong>Order<\/strong>: (1, 0, 2)<\/li>\n<li><strong>Seasonal Order<\/strong>: (1, 0, 1, 0)<\/li>\n<li><strong>Trend Parameter<\/strong>: \u2018t\u2019 (linear trend)<\/li>\n<\/ul>\n<p>It is surprising that a configuration with some seasonal elements resulted in the lowest error. 
I would not have guessed at this configuration and would have likely stuck with an ARIMA model.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n> Model[[(2, 1, 2), (1, 0, 1, 0), 'ct']] 6.905\r\n> Model[[(2, 1, 2), (2, 0, 0, 0), 'ct']] 7.031\r\n> Model[[(2, 1, 2), (2, 0, 1, 0), 'ct']] 6.985\r\n> Model[[(2, 1, 2), (1, 0, 2, 0), 'ct']] 6.941\r\n> Model[[(2, 1, 2), (2, 0, 2, 0), 'ct']] 7.056\r\ndone\r\n\r\n[(1, 0, 2), (1, 0, 1, 0), 't'] 6.770349800255089\r\n[(0, 1, 2), (1, 0, 2, 0), 'ct'] 6.773217122759515\r\n[(2, 1, 1), (2, 0, 2, 0), 'ct'] 6.886633191752254<\/pre>\n<\/p>\n<h2>Case Study 2: Trend<\/h2>\n<p>The \u2018shampoo\u2019 dataset summarizes the monthly sales of shampoo over a three-year period.<\/p>\n<p>The dataset contains an obvious trend but no obvious seasonal component.<\/p>\n<div id=\"attachment_6351\" style=\"width: 1448px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6351\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Shampoo-Sales-Dataset.png\" alt=\"Line Plot of the Monthly Shampoo Sales Dataset\" width=\"1438\" height=\"776\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Shampoo-Sales-Dataset.png 1438w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Shampoo-Sales-Dataset-300x162.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Shampoo-Sales-Dataset-768x414.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Shampoo-Sales-Dataset-1024x553.png 1024w\" sizes=\"(max-width: 1438px) 100vw, 1438px\"><\/p>\n<p class=\"wp-caption-text\">Line Plot of the Monthly Shampoo Sales Dataset<\/p>\n<\/div>\n<p>You can learn more about the dataset from <a 
href=\"https:\/\/datamarket.com\/data\/set\/22r0\/sales-of-shampoo-over-a-three-year-period#!ds=22r0&#038;display=line\">DataMarket<\/a>.<\/p>\n<p>Download the dataset directly from here:<\/p>\n<ul>\n<li><a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/shampoo.csv\">shampoo.csv<\/a><\/li>\n<\/ul>\n<p>Save the file with the filename \u2018shampoo.csv\u2019 in your current working directory.<\/p>\n<p>We can load this dataset as a Pandas series using the function <em>read_csv()<\/em>.<\/p>\n<pre class=\"crayon-plain-tag\">from datetime import datetime\r\n\r\n# parse dates\r\ndef custom_parser(x):\r\n\treturn datetime.strptime('195'+x, '%Y-%m')\r\n\r\n# load dataset\r\nseries = read_csv('shampoo.csv', header=0, index_col=0, date_parser=custom_parser)<\/pre>\n<p>The dataset has three years, or 36 observations. We will use the first 24 for training and the remaining 12 as the test set.<\/p>\n<p>The complete example grid searching the shampoo sales univariate time series forecasting problem is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search sarima hyperparameters for monthly shampoo sales dataset\r\nfrom math import sqrt\r\nfrom multiprocessing import cpu_count\r\nfrom joblib import Parallel\r\nfrom joblib import delayed\r\nfrom warnings import catch_warnings\r\nfrom warnings import filterwarnings\r\nfrom statsmodels.tsa.statespace.sarimax import SARIMAX\r\nfrom sklearn.metrics import mean_squared_error\r\nfrom pandas import read_csv\r\nfrom datetime import datetime\r\n\r\n# one-step sarima forecast\r\ndef sarima_forecast(history, config):\r\n\torder, sorder, trend = config\r\n\t# define model\r\n\tmodel = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)\r\n\t# fit model\r\n\tmodel_fit = model.fit(disp=False)\r\n\t# make one step forecast\r\n\tyhat = model_fit.predict(len(history), len(history))\r\n\treturn yhat[0]\r\n\r\n# root mean squared error or rmse\r\ndef measure_rmse(actual, 
predicted):\r\n\treturn sqrt(mean_squared_error(actual, predicted))\r\n\r\n# split a univariate dataset into train\/test sets\r\ndef train_test_split(data, n_test):\r\n\treturn data[:-n_test], data[-n_test:]\r\n\r\n# walk-forward validation for univariate data\r\ndef walk_forward_validation(data, n_test, cfg):\r\n\tpredictions = list()\r\n\t# split dataset\r\n\ttrain, test = train_test_split(data, n_test)\r\n\t# seed history with training dataset\r\n\thistory = [x for x in train]\r\n\t# step over each time-step in the test set\r\n\tfor i in range(len(test)):\r\n\t\t# fit model and make forecast for history\r\n\t\tyhat = sarima_forecast(history, cfg)\r\n\t\t# store forecast in list of predictions\r\n\t\tpredictions.append(yhat)\r\n\t\t# add actual observation to history for the next loop\r\n\t\thistory.append(test[i])\r\n\t# estimate prediction error\r\n\terror = measure_rmse(test, predictions)\r\n\treturn error\r\n\r\n# score a model, return None on failure\r\ndef score_model(data, n_test, cfg, debug=False):\r\n\tresult = None\r\n\t# convert config to a key\r\n\tkey = str(cfg)\r\n\t# show all warnings and fail on exception if debugging\r\n\tif debug:\r\n\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\telse:\r\n\t\t# one failure during model validation suggests an unstable config\r\n\t\ttry:\r\n\t\t\t# never show warnings when grid searching, too noisy\r\n\t\t\twith catch_warnings():\r\n\t\t\t\tfilterwarnings(\"ignore\")\r\n\t\t\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\t\texcept:\r\n\t\t\terror = None\r\n\t# check for an interesting result\r\n\tif result is not None:\r\n\t\tprint(' > Model[%s] %.3f' % (key, result))\r\n\treturn (key, result)\r\n\r\n# grid search configs\r\ndef grid_search(data, cfg_list, n_test, parallel=True):\r\n\tscores = None\r\n\tif parallel:\r\n\t\t# execute configs in parallel\r\n\t\texecutor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')\r\n\t\ttasks = (delayed(score_model)(data, n_test, cfg) for cfg 
in cfg_list)\r\n\t\tscores = executor(tasks)\r\n\telse:\r\n\t\tscores = [score_model(data, n_test, cfg) for cfg in cfg_list]\r\n\t# remove empty results\r\n\tscores = [r for r in scores if r[1] != None]\r\n\t# sort configs by error, asc\r\n\tscores.sort(key=lambda tup: tup[1])\r\n\treturn scores\r\n\r\n# create a set of sarima configs to try\r\ndef sarima_configs(seasonal=[0]):\r\n\tmodels = list()\r\n\t# define config lists\r\n\tp_params = [0, 1, 2]\r\n\td_params = [0, 1]\r\n\tq_params = [0, 1, 2]\r\n\tt_params = ['n','c','t','ct']\r\n\tP_params = [0, 1, 2]\r\n\tD_params = [0, 1]\r\n\tQ_params = [0, 1, 2]\r\n\tm_params = seasonal\r\n\t# create config instances\r\n\tfor p in p_params:\r\n\t\tfor d in d_params:\r\n\t\t\tfor q in q_params:\r\n\t\t\t\tfor t in t_params:\r\n\t\t\t\t\tfor P in P_params:\r\n\t\t\t\t\t\tfor D in D_params:\r\n\t\t\t\t\t\t\tfor Q in Q_params:\r\n\t\t\t\t\t\t\t\tfor m in m_params:\r\n\t\t\t\t\t\t\t\t\tcfg = [(p,d,q), (P,D,Q,m), t]\r\n\t\t\t\t\t\t\t\t\tmodels.append(cfg)\r\n\treturn models\r\n\r\n\r\n# parse dates\r\ndef custom_parser(x):\r\n\treturn datetime.strptime('195'+x, '%Y-%m')\r\n\r\nif __name__ == '__main__':\r\n\t# load dataset\r\n\tseries = read_csv('shampoo.csv', header=0, index_col=0, date_parser=custom_parser)\r\n\tdata = series.values\r\n\tprint(data.shape)\r\n\t# data split\r\n\tn_test = 12\r\n\t# model configs\r\n\tcfg_list = sarima_configs()\r\n\t# grid search\r\n\tscores = grid_search(data, cfg_list, n_test)\r\n\tprint('done')\r\n\t# list top 3 configs\r\n\tfor cfg, error in scores[:3]:\r\n\t\tprint(cfg, error)<\/pre>\n<p>Running the example may take a few minutes on modern hardware.<\/p>\n<p>Model configurations and the RMSE are printed as the models are evaluated. The top three model configurations and their error are reported at the end of the run.<\/p>\n<p>We can see that the best result was an RMSE of about 54.76 sales with the following configuration:<\/p>\n<ul>\n<li><strong>Trend Order<\/strong>: (0, 1, 
2)<\/li>\n<li><strong>Seasonal Order<\/strong>: (2, 0, 2, 0)<\/li>\n<li><strong>Trend Parameter<\/strong>: \u2018t\u2019 (linear trend)<\/li>\n<\/ul>\n<pre class=\"crayon-plain-tag\">...\r\n> Model[[(2, 1, 2), (1, 0, 1, 0), 'ct']] 68.891\r\n> Model[[(2, 1, 2), (2, 0, 0, 0), 'ct']] 75.406\r\n> Model[[(2, 1, 2), (1, 0, 2, 0), 'ct']] 80.908\r\n> Model[[(2, 1, 2), (2, 0, 1, 0), 'ct']] 78.734\r\n> Model[[(2, 1, 2), (2, 0, 2, 0), 'ct']] 82.958\r\ndone\r\n[(0, 1, 2), (2, 0, 2, 0), 't'] 54.767582003072874\r\n[(0, 1, 1), (2, 0, 2, 0), 'ct'] 58.69987083057107\r\n[(1, 1, 2), (0, 0, 1, 0), 't'] 58.709089340600094<\/pre>\n<\/p>\n<h2>Case Study 3: Seasonality<\/h2>\n<p>The \u2018monthly mean temperatures\u2019 dataset summarizes the monthly average air temperatures in Nottingham Castle, England from 1920 to 1939 in degrees Fahrenheit.<\/p>\n<p>The dataset has an obvious seasonal component and no obvious trend.<\/p>\n<div id=\"attachment_6352\" style=\"width: 1464px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6352\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Mean-Temperatures-Dataset.png\" alt=\"Line Plot of the Monthly Mean Temperatures Dataset\" width=\"1454\" height=\"766\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Mean-Temperatures-Dataset.png 1454w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Mean-Temperatures-Dataset-300x158.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Mean-Temperatures-Dataset-768x405.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Mean-Temperatures-Dataset-1024x539.png 1024w\" sizes=\"(max-width: 1454px) 100vw, 
1454px\"><\/p>\n<p class=\"wp-caption-text\">Line Plot of the Monthly Mean Temperatures Dataset<\/p>\n<\/div>\n<p>You can learn more about the dataset from <a href=\"https:\/\/datamarket.com\/data\/set\/22li\/mean-monthly-air-temperature-deg-f-nottingham-castle-1920-1939#!ds=22li&#038;display=line\">DataMarket<\/a>.<\/p>\n<p>Download the dataset directly from here:<\/p>\n<ul>\n<li><a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/monthly-mean-temp.csv\">monthly-mean-temp.csv<\/a><\/li>\n<\/ul>\n<p>Save the file with the filename \u2018<em>monthly-mean-temp.csv<\/em>\u2018 in your current working directory.<\/p>\n<p>We can load this dataset as a Pandas series using the function <em>read_csv()<\/em>.<\/p>\n<pre class=\"crayon-plain-tag\">series = read_csv('monthly-mean-temp.csv', header=0, index_col=0)<\/pre>\n<p>The dataset has 20 years, or 240 observations. We will trim the dataset to the last five years of data (60 observations) in order to speed up the model evaluation process and use the last year, or 12 observations, for the test set.<\/p>\n<pre class=\"crayon-plain-tag\"># trim dataset to 5 years\r\ndata = data[-(5*12):]<\/pre>\n<p>The period of the seasonal component is about one year, or 12 observations. 
We will use this as the seasonal period in the call to the <em>sarima_configs()<\/em> function when preparing the model configurations.<\/p>\n<pre class=\"crayon-plain-tag\"># model configs\r\ncfg_list = sarima_configs(seasonal=[0, 12])<\/pre>\n<p>The complete example grid searching the monthly mean temperature time series forecasting problem is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search sarima hyperparameters for monthly mean temp dataset\r\nfrom math import sqrt\r\nfrom multiprocessing import cpu_count\r\nfrom joblib import Parallel\r\nfrom joblib import delayed\r\nfrom warnings import catch_warnings\r\nfrom warnings import filterwarnings\r\nfrom statsmodels.tsa.statespace.sarimax import SARIMAX\r\nfrom sklearn.metrics import mean_squared_error\r\nfrom pandas import read_csv\r\n\r\n# one-step sarima forecast\r\ndef sarima_forecast(history, config):\r\n\torder, sorder, trend = config\r\n\t# define model\r\n\tmodel = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)\r\n\t# fit model\r\n\tmodel_fit = model.fit(disp=False)\r\n\t# make one step forecast\r\n\tyhat = model_fit.predict(len(history), len(history))\r\n\treturn yhat[0]\r\n\r\n# root mean squared error or rmse\r\ndef measure_rmse(actual, predicted):\r\n\treturn sqrt(mean_squared_error(actual, predicted))\r\n\r\n# split a univariate dataset into train\/test sets\r\ndef train_test_split(data, n_test):\r\n\treturn data[:-n_test], data[-n_test:]\r\n\r\n# walk-forward validation for univariate data\r\ndef walk_forward_validation(data, n_test, cfg):\r\n\tpredictions = list()\r\n\t# split dataset\r\n\ttrain, test = train_test_split(data, n_test)\r\n\t# seed history with training dataset\r\n\thistory = [x for x in train]\r\n\t# step over each time-step in the test set\r\n\tfor i in range(len(test)):\r\n\t\t# fit model and make forecast for history\r\n\t\tyhat = sarima_forecast(history, cfg)\r\n\t\t# store forecast in list 
of predictions\r\n\t\tpredictions.append(yhat)\r\n\t\t# add actual observation to history for the next loop\r\n\t\thistory.append(test[i])\r\n\t# estimate prediction error\r\n\terror = measure_rmse(test, predictions)\r\n\treturn error\r\n\r\n# score a model, return None on failure\r\ndef score_model(data, n_test, cfg, debug=False):\r\n\tresult = None\r\n\t# convert config to a key\r\n\tkey = str(cfg)\r\n\t# show all warnings and fail on exception if debugging\r\n\tif debug:\r\n\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\telse:\r\n\t\t# one failure during model validation suggests an unstable config\r\n\t\ttry:\r\n\t\t\t# never show warnings when grid searching, too noisy\r\n\t\t\twith catch_warnings():\r\n\t\t\t\tfilterwarnings(\"ignore\")\r\n\t\t\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\t\texcept:\r\n\t\t\terror = None\r\n\t# check for an interesting result\r\n\tif result is not None:\r\n\t\tprint(' > Model[%s] %.3f' % (key, result))\r\n\treturn (key, result)\r\n\r\n# grid search configs\r\ndef grid_search(data, cfg_list, n_test, parallel=True):\r\n\tscores = None\r\n\tif parallel:\r\n\t\t# execute configs in parallel\r\n\t\texecutor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')\r\n\t\ttasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)\r\n\t\tscores = executor(tasks)\r\n\telse:\r\n\t\tscores = [score_model(data, n_test, cfg) for cfg in cfg_list]\r\n\t# remove empty results\r\n\tscores = [r for r in scores if r[1] != None]\r\n\t# sort configs by error, asc\r\n\tscores.sort(key=lambda tup: tup[1])\r\n\treturn scores\r\n\r\n# create a set of sarima configs to try\r\ndef sarima_configs(seasonal=[0]):\r\n\tmodels = list()\r\n\t# define config lists\r\n\tp_params = [0, 1, 2]\r\n\td_params = [0, 1]\r\n\tq_params = [0, 1, 2]\r\n\tt_params = ['n','c','t','ct']\r\n\tP_params = [0, 1, 2]\r\n\tD_params = [0, 1]\r\n\tQ_params = [0, 1, 2]\r\n\tm_params = seasonal\r\n\t# create config instances\r\n\tfor p in 
p_params:\r\n\t\tfor d in d_params:\r\n\t\t\tfor q in q_params:\r\n\t\t\t\tfor t in t_params:\r\n\t\t\t\t\tfor P in P_params:\r\n\t\t\t\t\t\tfor D in D_params:\r\n\t\t\t\t\t\t\tfor Q in Q_params:\r\n\t\t\t\t\t\t\t\tfor m in m_params:\r\n\t\t\t\t\t\t\t\t\tcfg = [(p,d,q), (P,D,Q,m), t]\r\n\t\t\t\t\t\t\t\t\tmodels.append(cfg)\r\n\treturn models\r\n\r\nif __name__ == '__main__':\r\n\t# load dataset\r\n\tseries = read_csv('monthly-mean-temp.csv', header=0, index_col=0)\r\n\tdata = series.values\r\n\t# trim dataset to 5 years\r\n\tdata = data[-(5*12):]\r\n\t# data split\r\n\tn_test = 12\r\n\t# model configs\r\n\tcfg_list = sarima_configs(seasonal=[0, 12])\r\n\t# grid search\r\n\tscores = grid_search(data, cfg_list, n_test)\r\n\tprint('done')\r\n\t# list top 3 configs\r\n\tfor cfg, error in scores[:3]:\r\n\t\tprint(cfg, error)<\/pre>\n<p>Running the example may take a few minutes on modern hardware.<\/p>\n<p>Model configurations and the RMSE are printed as the models are evaluated. The top three model configurations and their error are reported at the end of the run.<\/p>\n<p>We can see that the best result was an RMSE of about 1.5 degrees with the following configuration:<\/p>\n<ul>\n<li><strong>Trend Order<\/strong>: (0, 0, 0)<\/li>\n<li><strong>Seasonal Order<\/strong>: (1, 0, 1, 12)<\/li>\n<li><strong>Trend Parameter<\/strong>: \u2018n\u2019 (no trend)<\/li>\n<\/ul>\n<p>As we would expect, the model has no trend component and a 12-month seasonal ARMA component.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n> Model[[(2, 1, 2), (2, 1, 0, 12), 't']] 4.599\r\n> Model[[(2, 1, 2), (1, 1, 0, 12), 'ct']] 2.477\r\n> Model[[(2, 1, 2), (2, 0, 0, 12), 'ct']] 2.548\r\n> Model[[(2, 1, 2), (2, 0, 1, 12), 'ct']] 2.893\r\n> Model[[(2, 1, 2), (2, 1, 0, 12), 'ct']] 5.404\r\ndone\r\n\r\n[(0, 0, 0), (1, 0, 1, 12), 'n'] 1.5577613610905712\r\n[(0, 0, 0), (1, 1, 0, 12), 'n'] 1.6469530713847962\r\n[(0, 0, 0), (2, 0, 0, 12), 'n'] 1.7314448163607488<\/pre>\n<\/p>\n<h2>Case Study 4: Trend and 
Seasonality<\/h2>\n<p>The \u2018monthly car sales\u2019 dataset summarizes the monthly car sales in Quebec, Canada between 1960 and 1968.<\/p>\n<p>The dataset has an obvious trend and seasonal component.<\/p>\n<div id=\"attachment_6353\" style=\"width: 1472px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6353\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Car-Sales-Dataset.png\" alt=\"Line Plot of the Monthly Car Sales Dataset\" width=\"1462\" height=\"768\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Car-Sales-Dataset.png 1462w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Car-Sales-Dataset-300x158.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Car-Sales-Dataset-768x403.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/07\/Line-Plot-of-the-Monthly-Car-Sales-Dataset-1024x538.png 1024w\" sizes=\"(max-width: 1462px) 100vw, 1462px\"><\/p>\n<p class=\"wp-caption-text\">Line Plot of the Monthly Car Sales Dataset<\/p>\n<\/div>\n<p>You can learn more about the dataset from <a href=\"https:\/\/datamarket.com\/data\/set\/22n4\/monthly-car-sales-in-quebec-1960-1968#!ds=22n4&#038;display=line\">DataMarket<\/a>.<\/p>\n<p>Download the dataset directly from here:<\/p>\n<ul>\n<li><a href=\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/monthly-car-sales.csv\">monthly-car-sales.csv<\/a><\/li>\n<\/ul>\n<p>Save the file with the filename \u2018<em>monthly-car-sales.csv<\/em>\u2018 in your current working directory.<\/p>\n<p>We can load this dataset as a Pandas series using the function <em>read_csv()<\/em>.<\/p>\n<pre class=\"crayon-plain-tag\">series = 
read_csv('monthly-car-sales.csv', header=0, index_col=0)<\/pre>\n<p>The dataset has 9 years, or 108 observations. We will use the last year, or 12 observations, as the test set.<\/p>\n<p>The period of the seasonal component could be six months or 12 months. We will try both as the seasonal period in the call to the <em>sarima_configs()<\/em> function when preparing the model configurations.<\/p>\n<pre class=\"crayon-plain-tag\"># model configs\r\ncfg_list = sarima_configs(seasonal=[0,6,12])<\/pre>\n<p>The complete example grid searching the monthly car sales time series forecasting problem is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># grid search sarima hyperparameters for monthly car sales dataset\r\nfrom math import sqrt\r\nfrom multiprocessing import cpu_count\r\nfrom joblib import Parallel\r\nfrom joblib import delayed\r\nfrom warnings import catch_warnings\r\nfrom warnings import filterwarnings\r\nfrom statsmodels.tsa.statespace.sarimax import SARIMAX\r\nfrom sklearn.metrics import mean_squared_error\r\nfrom pandas import read_csv\r\n\r\n# one-step sarima forecast\r\ndef sarima_forecast(history, config):\r\n\torder, sorder, trend = config\r\n\t# define model\r\n\tmodel = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)\r\n\t# fit model\r\n\tmodel_fit = model.fit(disp=False)\r\n\t# make one step forecast\r\n\tyhat = model_fit.predict(len(history), len(history))\r\n\treturn yhat[0]\r\n\r\n# root mean squared error or rmse\r\ndef measure_rmse(actual, predicted):\r\n\treturn sqrt(mean_squared_error(actual, predicted))\r\n\r\n# split a univariate dataset into train\/test sets\r\ndef train_test_split(data, n_test):\r\n\treturn data[:-n_test], data[-n_test:]\r\n\r\n# walk-forward validation for univariate data\r\ndef walk_forward_validation(data, n_test, cfg):\r\n\tpredictions = list()\r\n\t# split dataset\r\n\ttrain, test = train_test_split(data, n_test)\r\n\t# seed history with 
training dataset\r\n\thistory = [x for x in train]\r\n\t# step over each time-step in the test set\r\n\tfor i in range(len(test)):\r\n\t\t# fit model and make forecast for history\r\n\t\tyhat = sarima_forecast(history, cfg)\r\n\t\t# store forecast in list of predictions\r\n\t\tpredictions.append(yhat)\r\n\t\t# add actual observation to history for the next loop\r\n\t\thistory.append(test[i])\r\n\t# estimate prediction error\r\n\terror = measure_rmse(test, predictions)\r\n\treturn error\r\n\r\n# score a model, return None on failure\r\ndef score_model(data, n_test, cfg, debug=False):\r\n\tresult = None\r\n\t# convert config to a key\r\n\tkey = str(cfg)\r\n\t# show all warnings and fail on exception if debugging\r\n\tif debug:\r\n\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\telse:\r\n\t\t# one failure during model validation suggests an unstable config\r\n\t\ttry:\r\n\t\t\t# never show warnings when grid searching, too noisy\r\n\t\t\twith catch_warnings():\r\n\t\t\t\tfilterwarnings(\"ignore\")\r\n\t\t\t\tresult = walk_forward_validation(data, n_test, cfg)\r\n\t\texcept:\r\n\t\t\terror = None\r\n\t# check for an interesting result\r\n\tif result is not None:\r\n\t\tprint(' > Model[%s] %.3f' % (key, result))\r\n\treturn (key, result)\r\n\r\n# grid search configs\r\ndef grid_search(data, cfg_list, n_test, parallel=True):\r\n\tscores = None\r\n\tif parallel:\r\n\t\t# execute configs in parallel\r\n\t\texecutor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')\r\n\t\ttasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)\r\n\t\tscores = executor(tasks)\r\n\telse:\r\n\t\tscores = [score_model(data, n_test, cfg) for cfg in cfg_list]\r\n\t# remove empty results\r\n\tscores = [r for r in scores if r[1] != None]\r\n\t# sort configs by error, asc\r\n\tscores.sort(key=lambda tup: tup[1])\r\n\treturn scores\r\n\r\n# create a set of sarima configs to try\r\ndef sarima_configs(seasonal=[0]):\r\n\tmodels = list()\r\n\t# define config 
lists\r\n\tp_params = [0, 1, 2]\r\n\td_params = [0, 1]\r\n\tq_params = [0, 1, 2]\r\n\tt_params = ['n','c','t','ct']\r\n\tP_params = [0, 1, 2]\r\n\tD_params = [0, 1]\r\n\tQ_params = [0, 1, 2]\r\n\tm_params = seasonal\r\n\t# create config instances\r\n\tfor p in p_params:\r\n\t\tfor d in d_params:\r\n\t\t\tfor q in q_params:\r\n\t\t\t\tfor t in t_params:\r\n\t\t\t\t\tfor P in P_params:\r\n\t\t\t\t\t\tfor D in D_params:\r\n\t\t\t\t\t\t\tfor Q in Q_params:\r\n\t\t\t\t\t\t\t\tfor m in m_params:\r\n\t\t\t\t\t\t\t\t\tcfg = [(p,d,q), (P,D,Q,m), t]\r\n\t\t\t\t\t\t\t\t\tmodels.append(cfg)\r\n\treturn models\r\n\r\nif __name__ == '__main__':\r\n\t# load dataset\r\n\tseries = read_csv('monthly-car-sales.csv', header=0, index_col=0)\r\n\tdata = series.values\r\n\tprint(data.shape)\r\n\t# data split\r\n\tn_test = 12\r\n\t# model configs\r\n\tcfg_list = sarima_configs(seasonal=[0,6,12])\r\n\t# grid search\r\n\tscores = grid_search(data, cfg_list, n_test)\r\n\tprint('done')\r\n\t# list top 3 configs\r\n\tfor cfg, error in scores[:3]:\r\n\t\tprint(cfg, error)<\/pre>\n<p>Running the example may take a few minutes on modern hardware.<\/p>\n<p>Model configurations and the RMSE are printed as the models are evaluated. The top three model configurations and their error are reported at the end of the run.<\/p>\n<p>We can see that the best result was an RMSE of about 1,551 sales with the following configuration:<\/p>\n<ul>\n<li><strong>Trend Order<\/strong>: (0, 0, 0)<\/li>\n<li><strong>Seasonal Order<\/strong>: (1, 1, 0, 12)<\/li>\n<li><strong>Trend Parameter<\/strong>: \u2018t\u2019 (linear trend)<\/li>\n<\/ul>\n<pre class=\"crayon-plain-tag\">> Model[[(2, 1, 2), (2, 1, 1, 6), 'ct']] 2246.248\r\n> Model[[(2, 1, 2), (2, 0, 2, 12), 'ct']] 10710.462\r\n> Model[[(2, 1, 2), (2, 1, 2, 6), 'ct']] 2183.568\r\n> Model[[(2, 1, 2), (2, 1, 0, 12), 'ct']] 2105.800\r\n> Model[[(2, 1, 2), (2, 1, 1, 12), 'ct']] 2330.361\r\n> Model[[(2, 1, 2), (2, 1, 2, 12), 'ct']] 31580326686.803\r\ndone\r\n[(0, 0, 0), 
(1, 1, 0, 12), 't'] 1551.8423920342414\r\n[(0, 0, 0), (2, 1, 1, 12), 'c'] 1557.334614575545\r\n[(0, 0, 0), (1, 1, 0, 12), 'c'] 1559.3276311282675<\/pre>\n<\/p>\n<h2>Extensions<\/h2>\n<p>This section lists some ideas for extending the tutorial that you may wish to explore.<\/p>\n<ul>\n<li><strong>Data Transforms<\/strong>. Update the framework to support configurable data transforms such as normalization and standardization.<\/li>\n<li><strong>Plot Forecast<\/strong>. Update the framework to re-fit a model with the best configuration and forecast the entire test dataset, then plot the forecast compared to the actual observations in the test set.<\/li>\n<li><strong>Tune Amount of History<\/strong>. Update the framework to tune the amount of historical data used to fit the model (e.g. in the case of the 10 years of max temperature data).<\/li>\n<\/ul>\n<p>If you explore any of these extensions, I\u2019d love to know.<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Posts<\/h3>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/arima-for-time-series-forecasting-with-python\/\">How to Create an ARIMA Model for Time Series Forecasting with Python<\/a><\/li>\n<li><a href=\"https:\/\/machinelearningmastery.com\/grid-search-arima-hyperparameters-with-python\/\">How to Grid Search ARIMA Model Hyperparameters with Python<\/a><\/li>\n<li><a href=\"https:\/\/machinelearningmastery.com\/gentle-introduction-autocorrelation-partial-autocorrelation\/\">A Gentle Introduction to Autocorrelation and Partial Autocorrelation<\/a><\/li>\n<\/ul>\n<h3>Books<\/h3>\n<ul>\n<li>Chapter 8 ARIMA models, <a href=\"https:\/\/amzn.to\/2xlJsfV\">Forecasting: principles and practice<\/a>, 2013.<\/li>\n<li>Chapter 7, Non-stationary Models, <a href=\"https:\/\/amzn.to\/2smB9LR\">Introductory Time Series with R<\/a>, 2009.<\/li>\n<\/ul>\n<h3>API<\/h3>\n<ul>\n<li><a 
href=\"http:\/\/www.statsmodels.org\/dev\/statespace.html\">Statsmodels Time Series Analysis by State Space Methods<\/a><\/li>\n<li><a href=\"http:\/\/www.statsmodels.org\/dev\/generated\/statsmodels.tsa.statespace.sarimax.SARIMAX.html\">statsmodels.tsa.statespace.sarimax.SARIMAX API<\/a><\/li>\n<li><a href=\"http:\/\/www.statsmodels.org\/dev\/generated\/statsmodels.tsa.statespace.sarimax.SARIMAXResults.html\">statsmodels.tsa.statespace.sarimax.SARIMAXResults API<\/a><\/li>\n<li><a href=\"http:\/\/www.statsmodels.org\/dev\/examples\/notebooks\/generated\/statespace_sarimax_stata.html\">Statsmodels SARIMAX Notebook<\/a><\/li>\n<li><a href=\"https:\/\/pythonhosted.org\/joblib\/\">Joblib: running Python functions as pipeline jobs<\/a><\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Autoregressive_integrated_moving_average\">Autoregressive integrated moving average on Wikipedia<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered how to develop a framework for grid searching all of the SARIMA model hyperparameters for univariate time series forecasting.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to develop a framework for grid searching SARIMA models from scratch using walk-forward validation.<\/li>\n<li>How to grid search SARIMA model hyperparameters for daily time series data for births.<\/li>\n<li>How to grid search SARIMA model hyperparameters for monthly time series data for shampoo sales, car sales and temperature.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python\/\">How to Grid Search SARIMA Model Hyperparameters for Time Series Forecasting in Python<\/a> appeared first on <a rel=\"nofollow\" 
href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee The Seasonal Autoregressive Integrated Moving Average, or SARIMA, model is an approach for modeling univariate time series data that may contain trend [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/10\/23\/how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":1206,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1205"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1205"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1205\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/1206"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=12
05"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}