{"id":1033,"date":"2018-09-09T19:00:53","date_gmt":"2018-09-09T19:00:53","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/09\/09\/indoor-movement-time-series-classification-with-machine-learning-algorithms\/"},"modified":"2018-09-09T19:00:53","modified_gmt":"2018-09-09T19:00:53","slug":"indoor-movement-time-series-classification-with-machine-learning-algorithms","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/09\/09\/indoor-movement-time-series-classification-with-machine-learning-algorithms\/","title":{"rendered":"Indoor Movement Time Series Classification with Machine Learning Algorithms"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>Indoor movement prediction involves using wireless sensor strength data to predict the location and motion of subjects within a building.<\/p>\n<p>It is a challenging problem as there is no direct analytical model to translate the variable length traces of signal strength data from multiple sensors into user behavior.<\/p>\n<p>The \u2018<em>indoor user movement<\/em>\u2018 dataset is a standard and freely available time series classification problem.<\/p>\n<p>In this tutorial, you will discover the indoor movement prediction time series classification problem and how to engineer features and evaluate machine learning algorithms for the problem.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>The time series classification problem of predicting the movement between rooms based on sensor strength.<\/li>\n<li>How to investigate the data in order to better understand the problem and how to engineer features from the raw data for predictive modeling.<\/li>\n<li>How to spot check a suite of classification algorithms and tune one algorithm to further lift performance on the problem.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_6141\" style=\"width: 650px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" 
decoding=\"async\" class=\"size-full wp-image-6141\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/09\/Indoor-Movement-Time-Series-Classification-with-Machine-Learning-Algorithms.jpg\" alt=\"Indoor Movement Time Series Classification with Machine Learning Algorithms\" width=\"640\" height=\"428\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/09\/Indoor-Movement-Time-Series-Classification-with-Machine-Learning-Algorithms.jpg 640w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/09\/Indoor-Movement-Time-Series-Classification-with-Machine-Learning-Algorithms-300x201.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\"><\/p>\n<p class=\"wp-caption-text\">Indoor Movement Time Series Classification with Machine Learning Algorithms<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/nolatularosa\/3238977118\/\">Nola Tularosa<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into five parts; they are:<\/p>\n<ol>\n<li>Indoor User Movement Prediction<\/li>\n<li>Indoor Movement Prediction Dataset<\/li>\n<li>Model Evaluation<\/li>\n<li>Data Preparation<\/li>\n<li>Algorithm Spot-Check<\/li>\n<\/ol>\n<h2>Indoor User Movement Prediction<\/h2>\n<p>The \u2018<em>indoor user movement<\/em>\u2018 prediction problem involves determining whether an individual has moved between rooms based on the change in signal strength measured by wireless detectors in the environment.<\/p>\n<p>The dataset was collected and made available by Davide Bacciu, et al. 
from the University of Pisa in Italy and first described in their 2011 paper \u201c<a href=\"https:\/\/pdfs.semanticscholar.org\/40c2\/393e1874c3fd961fdfff02402c24ccf1c3d7.pdf#page=13\">Predicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing<\/a>\u201d as a dataset for exploring a methodology from the family of recurrent neural networks called \u2018reservoir computing.\u2019<\/p>\n<p>The problem is a special case of the more general problem of predicting indoor user localization and movement patterns.<\/p>\n<p>Data was collected by positioning four wireless sensors in the environment and one on the subject. The subject moved through the environment while the four wireless sensors detected and recorded a time series of sensor strength.<\/p>\n<p>The result is a dataset comprised of variable-length time series with four variates describing the trajectory through a well-defined static environment, and a classification of whether the movement led to the subject changing rooms.<\/p>\n<p>It is a challenging problem because there is no obvious and generic way to relate signal strength data to subject location in an environment.<\/p>\n<blockquote>\n<p>The relationship between the RSS and the location of the tracked object cannot be easily formulated into an analytical model, as it strongly depends on the characteristics of the environment as well as on the wireless devices involved.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/pdfs.semanticscholar.org\/40c2\/393e1874c3fd961fdfff02402c24ccf1c3d7.pdf#page=13\">Predicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing<\/a>, 2011.<\/p>\n<p>The data was collected under controlled experimental conditions.<\/p>\n<p>Sensors were placed in three pairs of two connected rooms containing typical office furniture. Two sensors were placed in the corners of each of the two rooms and the subject walked one of six predefined paths through the rooms. 
Predictions are made at a point along each path that may or may not lead to a change of room.<\/p>\n<p>The cartoon below makes this clear, showing the sensor locations (A1-A4), the six possible paths that may be walked, and the two points (M) where a prediction will be made.<\/p>\n<div id=\"attachment_6131\" style=\"width: 802px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6131\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Overview-of-two-rooms-sensor-locations-and-the-6-pre-defined-paths.png\" alt=\"Overview of two rooms, sensor locations and the 6 pre-defined paths\" width=\"792\" height=\"458\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Overview-of-two-rooms-sensor-locations-and-the-6-pre-defined-paths.png 792w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Overview-of-two-rooms-sensor-locations-and-the-6-pre-defined-paths-300x173.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Overview-of-two-rooms-sensor-locations-and-the-6-pre-defined-paths-768x444.png 768w\" sizes=\"(max-width: 792px) 100vw, 792px\"><\/p>\n<p class=\"wp-caption-text\">Overview of two rooms, sensor locations and the 6 pre-defined paths.<br \/>Taken from \u201cPredicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing.\u201d<\/p>\n<\/div>\n<p>Three datasets were collected from the three pairs of two rooms in which the paths were walked and sensor measurements taken, referred to as Dataset 1, Dataset 2, and Dataset 3.<\/p>\n<p>The table below, taken from the paper, summarizes the number of paths walked in each of the three datasets, the total number of room changes and non-room-changes (class label), and the lengths of the time series inputs.<\/p>\n<div id=\"attachment_6132\" style=\"width: 848px\" class=\"wp-caption 
aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6132\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Summary-of-sensor-data-collected-from-the-three-pairs-of-two-rooms.png\" alt=\"Summary of sensor data collected from the three pairs of two rooms\" width=\"838\" height=\"466\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Summary-of-sensor-data-collected-from-the-three-pairs-of-two-rooms.png 838w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Summary-of-sensor-data-collected-from-the-three-pairs-of-two-rooms-300x167.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Summary-of-sensor-data-collected-from-the-three-pairs-of-two-rooms-768x427.png 768w\" sizes=\"(max-width: 838px) 100vw, 838px\"><\/p>\n<p class=\"wp-caption-text\">Summary of sensor data collected from the three pairs of two rooms.<br \/>Taken from \u201cPredicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing.\u201d<\/p>\n<\/div>\n<p>Technically, the data is comprised of multivariate time series inputs and a classification output and may be described as a time series classification problem.<\/p>\n<blockquote>\n<p>The RSS values from the four anchors are organized into sequences of varying length corresponding to trajectory measurements from the starting point until marker M. 
A target classification label is associated to each input sequence to indicate whether the user is about to change its location (room) or not.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/pdfs.semanticscholar.org\/40c2\/393e1874c3fd961fdfff02402c24ccf1c3d7.pdf#page=13\">Predicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing<\/a>, 2011.<\/p>\n<h2>Indoor Movement Prediction Dataset<\/h2>\n<p>The dataset is freely available from the UCI Machine Learning Repository:<\/p>\n<ul>\n<li><a href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Indoor+User+Movement+Prediction+from+RSS+data\">Indoor User Movement Prediction from RSS data Data Set, UCI Machine Learning 
Repository<\/a><\/li>\n<\/ul>\n<p>The data can be downloaded as a .zip file that contains the following salient files:<\/p>\n<ul>\n<li><strong>dataset\/MovementAAL_RSS_???.csv<\/strong>: The RSS traces for each movement, where \u2018???\u2019 in the filename marks the trace number from 1 to 314.<\/li>\n<li><strong>dataset\/MovementAAL_target.csv<\/strong>: The mapping of trace number to the output class value or target.<\/li>\n<li><strong>groups\/MovementAAL_DatasetGroup.csv<\/strong>: The mapping of trace number to the dataset group 1, 2, or 3, marking the pair of rooms from which the trace was recorded.<\/li>\n<li><strong>groups\/MovementAAL_Paths.csv<\/strong>: The mapping of trace number to the path type, 1-6, marked in the cartoon diagram above.<\/li>\n<\/ul>\n<p>The provided data is already normalized.<\/p>\n<p>Specifically, each input variable is normalized into the range [-1,1] per dataset (pair of rooms), and the output class variable is marked -1 for no transition between rooms and +1 for a transition through the rooms.<\/p>\n<blockquote>\n<p>[\u2026] input data comprises time series of 4 dimensional RSS measurements (NU = 4) corresponding to the 4 anchors [\u2026] normalized in the range [\u22121, 1] independently for each dataset<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/pdfs.semanticscholar.org\/40c2\/393e1874c3fd961fdfff02402c24ccf1c3d7.pdf#page=13\">Predicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing<\/a>, 2011.<\/p>\n<p>The scaling of data by dataset may (or may not) introduce additional challenges when combining observations across datasets if the pre-normalized distributions differ greatly.<\/p>\n<p>The time series for one trace in a given trace file are provided in temporal order, where one row records the observations for a single time step. 
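<\/p>
<p>As a quick illustration of this layout, a single trace can be loaded with Pandas and its shape inspected. The snippet below is a minimal sketch that uses a small inline stand-in for a trace file (via <em>StringIO<\/em>) so it is self-contained; with the real data, you would pass the path of one of the trace files described above to <em>read_csv()<\/em>.<\/p>

```python
# sketch: load one RSS trace and inspect its shape
from io import StringIO
from pandas import read_csv

# inline stand-in for the contents of a trace file (first rows of trace 1)
data = StringIO("""#RSS_anchor1, RSS_anchor2, RSS_anchor3, RSS_anchor4
-0.90476,-0.48,0.28571,0.3
-0.57143,-0.32,0.14286,0.3""")
df = read_csv(data, header=0)
# one row per time step, one column per anchor
print(df.shape)  # (2, 4)
```

<p>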
The data is recorded at 8Hz, meaning that one second of clock time elapses for eight time steps in the data.<\/p>\n<p>Below is an example of a trace, taken from \u2018<em>dataset\/MovementAAL_RSS_1.csv<\/em>\u2018, which has the output target \u20181\u2019 (a room transition occurred), from group 1 (the first pair of rooms) and is the path 1 (a straight shot from left to right between the rooms).<\/p>\n<pre class=\"crayon-plain-tag\">#RSS_anchor1, RSS_anchor2, RSS_anchor3, RSS_anchor4\r\n-0.90476,-0.48,0.28571,0.3\r\n-0.57143,-0.32,0.14286,0.3\r\n-0.38095,-0.28,-0.14286,0.35\r\n-0.28571,-0.2,-0.47619,0.35\r\n-0.14286,-0.2,0.14286,-0.2\r\n-0.14286,-0.2,0.047619,0\r\n-0.14286,-0.16,-0.38095,0.2\r\n-0.14286,-0.04,-0.61905,-0.2\r\n-0.095238,-0.08,0.14286,-0.55\r\n-0.047619,0.04,-0.095238,0.05\r\n-0.19048,-0.04,0.095238,0.4\r\n-0.095238,-0.04,-0.14286,0.35\r\n-0.33333,-0.08,-0.28571,-0.2\r\n-0.2381,0.04,0.14286,0.35\r\n0,0.08,0.14286,0.05\r\n-0.095238,0.04,0.095238,0.1\r\n-0.14286,-0.2,0.14286,0.5\r\n-0.19048,0.04,-0.42857,0.3\r\n-0.14286,-0.08,-0.2381,0.15\r\n-0.33333,0.16,-0.14286,-0.8\r\n-0.42857,0.16,-0.28571,-0.1\r\n-0.71429,0.16,-0.28571,0.2\r\n-0.095238,-0.08,0.095238,0.35\r\n-0.28571,0.04,0.14286,0.2\r\n0,0.04,0.14286,0.1\r\n0,0.04,-0.047619,-0.05\r\n-0.14286,-0.6,-0.28571,-0.1<\/pre>\n<p>The datasets were used in two specific ways (experimental settings or ES) to evaluate predictive models on the problem, designated ES1 and ES2, as described in the first paper.<\/p>\n<ul>\n<li><strong>ES1<\/strong>: Combines datasets 1 and 2, which is split into train (80%) and test (20%) sets to evaluate a model.<\/li>\n<li><strong>ES2<\/strong>: Combines datasets 1 and 2 which are used as a training set (66%) and dataset 3 is used as a test set (34%) to evaluate a model.<\/li>\n<\/ul>\n<p>The ES1 case evaluates a model to generalize movement within two pairs of known rooms, that is, rooms with known geometry. 
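<\/p>
<p>As a sketch of how these two experimental settings can be assembled from the per-trace group labels (the synthetic <em>groups<\/em> array below stands in for the loaded labels so the snippet is self-contained; with the real data, use the loaded arrays):<\/p>

```python
# sketch: construct the ES1 and ES2 splits from dataset group labels
from numpy import array
from sklearn.model_selection import train_test_split

# synthetic stand-in for the loaded per-trace group labels
groups = array([1]*10 + [2]*10 + [3]*10)

# ES1: traces from groups 1 and 2, split 80/20 into train and test
es1 = [i for i in range(len(groups)) if groups[i] in (1, 2)]
train1, test1 = train_test_split(es1, test_size=0.2, random_state=1)

# ES2: train on groups 1 and 2, test on the unseen group 3
es2_train = [i for i in range(len(groups)) if groups[i] in (1, 2)]
es2_test = [i for i in range(len(groups)) if groups[i] == 3]
print(len(train1), len(test1), len(es2_train), len(es2_test))  # 16 4 20 10
```

<p>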
The ES2 case attempts to generalize movement from two rooms to a third unseen room: a harder problem.<\/p>\n<p>The 2011 paper reports classification accuracy of about 95% on ES1 and about 89% on ES2, which, having tested a suite of algorithms on the problem myself, I find very impressive.<\/p>\n<h2>Load and Explore Dataset<\/h2>\n<p>In this section, we will load the data into memory and explore it with summarization and visualization to help better understand how the problem might be modeled.<\/p>\n<p>First, download the dataset and unzip the downloaded archive into your current working directory.<\/p>\n<ul>\n<li><a href=\"https:\/\/archive.ics.uci.edu\/ml\/machine-learning-databases\/00348\/MovementAAL.zip\">MovementAAL.zip<\/a>, from the UCI Machine Learning Repository<\/li>\n<\/ul>\n<h3>Load Dataset<\/h3>\n<p>The targets, groups, and paths files can be loaded directly as Pandas DataFrames.<\/p>\n<pre class=\"crayon-plain-tag\"># load mapping files\r\nfrom pandas import read_csv\r\ntarget_mapping = read_csv('dataset\/MovementAAL_target.csv', header=0)\r\ngroup_mapping = read_csv('groups\/MovementAAL_DatasetGroup.csv', header=0)\r\npaths_mapping = read_csv('groups\/MovementAAL_Paths.csv', header=0)<\/pre>\n<p>The signal strength traces are stored in separate files in the <em>dataset\/<\/em> directory.<\/p>\n<p>These can be loaded by iterating over all files in the directory and loading the sequences directly. 
Because each sequence has a variable length (variable number of rows), we can store the NumPy array for each trace in a list.<\/p>\n<pre class=\"crayon-plain-tag\"># load sequences and targets into memory\r\nfrom pandas import read_csv\r\nfrom os import listdir\r\nsequences = list()\r\ndirectory = 'dataset'\r\ntarget_mapping = None\r\nfor name in listdir(directory):\r\n\tfilename = directory + '\/' + name\r\n\tif filename.endswith('_target.csv'):\r\n\t\tcontinue\r\n\tdf = read_csv(filename, header=0)\r\n\tvalues = df.values\r\n\tsequences.append(values)<\/pre>\n<p>We can tie all of this together into a function named <em>load_dataset()<\/em> and load the data into memory.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># load user movement dataset into memory\r\nfrom pandas import read_csv\r\nfrom os import listdir\r\n\r\n# return list of traces, and arrays for targets, groups and paths\r\ndef load_dataset(prefix=''):\r\n\tgrps_dir, data_dir = prefix+'groups\/', prefix+'dataset\/'\r\n\t# load mapping files\r\n\ttargets = read_csv(data_dir + 'MovementAAL_target.csv', header=0)\r\n\tgroups = read_csv(grps_dir + 'MovementAAL_DatasetGroup.csv', header=0)\r\n\tpaths = read_csv(grps_dir + 'MovementAAL_Paths.csv', header=0)\r\n\t# load traces\r\n\tsequences = list()\r\n\ttarget_mapping = None\r\n\tfor name in listdir(data_dir):\r\n\t\tfilename = data_dir + name\r\n\t\tif filename.endswith('_target.csv'):\r\n\t\t\tcontinue\r\n\t\tdf = read_csv(filename, header=0)\r\n\t\tvalues = df.values\r\n\t\tsequences.append(values)\r\n\treturn sequences, targets.values[:,1], groups.values[:,1], paths.values[:,1]\r\n\r\n# load dataset\r\nsequences, targets, groups, paths = load_dataset()\r\n# summarize shape of the loaded data\r\nprint(len(sequences), targets.shape, groups.shape, paths.shape)<\/pre>\n<p>Running the example loads the data and shows that 314 traces were correctly loaded from disk along with their associated outputs (targets as -1 or 
+1), dataset number (group as 1, 2, or 3), and path number (path as 1-6).<\/p>\n<pre class=\"crayon-plain-tag\">314 (314,) (314,) (314,)<\/pre>\n<h3>Basic Information<\/h3>\n<p>We can now take a closer look at the loaded data to confirm our understanding of the problem.<\/p>\n<p>We know from the paper that the dataset is reasonably balanced in terms of the two classes. We can confirm this by summarizing the class breakdown of all observations.<\/p>\n<pre class=\"crayon-plain-tag\"># summarize class breakdown\r\nclass1,class2 = len(targets[targets==-1]), len(targets[targets==1])\r\nprint('Class=-1: %d %.3f%%' % (class1, class1\/len(targets)*100))\r\nprint('Class=+1: %d %.3f%%' % (class2, class2\/len(targets)*100))<\/pre>\n<p>Next, we can review the distribution of the sensor strength values for each of the four anchor points by plotting a histogram of the raw values.<\/p>\n<p>This requires that we create one array with all rows of observations so that we can plot the distribution of each column. 
The <em>vstack()<\/em> NumPy function will do this job for us.<\/p>\n<pre class=\"crayon-plain-tag\"># histogram for each anchor point\r\nall_rows = vstack(sequences)\r\npyplot.figure()\r\nvariables = [0, 1, 2, 3]\r\nfor v in variables:\r\n\tpyplot.subplot(len(variables), 1, v+1)\r\n\tpyplot.hist(all_rows[:, v], bins=20)\r\npyplot.show()<\/pre>\n<p>Finally, another interesting aspect to look at is the distribution of the length of the traces.<\/p>\n<p>We can summarize this distribution using a histogram.<\/p>\n<pre class=\"crayon-plain-tag\"># histogram for trace lengths\r\ntrace_lengths = [len(x) for x in sequences]\r\npyplot.hist(trace_lengths, bins=50)\r\npyplot.show()<\/pre>\n<p>Putting this all together, the complete example of loading and summarizing the data is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># summarize simple information about user movement data\r\nfrom os import listdir\r\nfrom numpy import array\r\nfrom numpy import vstack\r\nfrom pandas import read_csv\r\nfrom matplotlib import pyplot\r\n\r\n# return list of traces, and arrays for targets, groups and paths\r\ndef load_dataset(prefix=''):\r\n\tgrps_dir, data_dir = prefix+'groups\/', prefix+'dataset\/'\r\n\t# load mapping files\r\n\ttargets = read_csv(data_dir + 'MovementAAL_target.csv', header=0)\r\n\tgroups = read_csv(grps_dir + 'MovementAAL_DatasetGroup.csv', header=0)\r\n\tpaths = read_csv(grps_dir + 'MovementAAL_Paths.csv', header=0)\r\n\t# load traces\r\n\tsequences = list()\r\n\ttarget_mapping = None\r\n\tfor name in listdir(data_dir):\r\n\t\tfilename = data_dir + name\r\n\t\tif filename.endswith('_target.csv'):\r\n\t\t\tcontinue\r\n\t\tdf = read_csv(filename, header=0)\r\n\t\tvalues = df.values\r\n\t\tsequences.append(values)\r\n\treturn sequences, targets.values[:,1], groups.values[:,1], paths.values[:,1]\r\n\r\n# load dataset\r\nsequences, targets, groups, paths = load_dataset()\r\n# summarize class breakdown\r\nclass1,class2 = len(targets[targets==-1]), 
len(targets[targets==1])\r\nprint('Class=-1: %d %.3f%%' % (class1, class1\/len(targets)*100))\r\nprint('Class=+1: %d %.3f%%' % (class2, class2\/len(targets)*100))\r\n# histogram for each anchor point\r\nall_rows = vstack(sequences)\r\npyplot.figure()\r\nvariables = [0, 1, 2, 3]\r\nfor v in variables:\r\n\tpyplot.subplot(len(variables), 1, v+1)\r\n\tpyplot.hist(all_rows[:, v], bins=20)\r\npyplot.show()\r\n# histogram for trace lengths\r\ntrace_lengths = [len(x) for x in sequences]\r\npyplot.hist(trace_lengths, bins=50)\r\npyplot.show()<\/pre>\n<p>Running the example first summarizes the class distribution for the observations.<\/p>\n<p>The results confirm our expectation that the full dataset is nearly perfectly balanced between the two class outcomes.<\/p>\n<pre class=\"crayon-plain-tag\">Class=-1: 156 49.682%\r\nClass=+1: 158 50.318%<\/pre>\n<p>Next, a histogram of the sensor strength for each anchor point is created, summarizing the data distributions.<\/p>\n<p>We can see that the distributions for each variable are roughly Gaussian in shape. We can also see what appears to be a surplus of observations at -1. 
This might indicate a generic \u201cno strength\u201d observation that could be marked or even filtered out from the sequences.<\/p>\n<p>It might be interesting to investigate whether the distributions change by path type or even dataset number.<\/p>\n<div id=\"attachment_6133\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6133\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Histograms-for-the-sensor-strength-values-for-each-anchor-point.png\" alt=\"Histograms for the sensor strength values for each anchor point\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histograms-for-the-sensor-strength-values-for-each-anchor-point.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histograms-for-the-sensor-strength-values-for-each-anchor-point-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histograms-for-the-sensor-strength-values-for-each-anchor-point-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histograms-for-the-sensor-strength-values-for-each-anchor-point-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Histograms for the sensor strength values for each anchor point<\/p>\n<\/div>\n<p>Finally, a histogram of the sequence lengths is created.<\/p>\n<p>We can see clusters of sequences with lengths around 25, 40, and 60. We can also see that if we wanted to trim long sequences, a maximum length of around 70 time steps might be appropriate. 
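<\/p>
<p>If a fixed-length representation is needed later, one option is to pad shorter traces and truncate longer ones to a chosen cutoff. The function below is a sketch of that idea; the name <em>pad_or_truncate<\/em> and the cutoff of 70 are illustrative only, the latter taken from a rough reading of the histogram.<\/p>

```python
# sketch: pad with zeros or truncate traces to a fixed length
from numpy import zeros

def pad_or_truncate(trace, max_len=70, n_vars=4):
    # keep at most the last max_len time steps (the prediction point is at the end)
    trimmed = trace[-max_len:]
    out = zeros((max_len, n_vars))
    out[:len(trimmed)] = trimmed
    return out

# synthetic stand-ins for a short and a long trace
short, long_trace = zeros((19, 4)), zeros((90, 4))
print(pad_or_truncate(short).shape, pad_or_truncate(long_trace).shape)  # (70, 4) (70, 4)
```

<p>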
The smallest length appears to be 19.<\/p>\n<div id=\"attachment_6134\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6134\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Histogram-of-sensor-strength-sequence-lengths.png\" alt=\"Histogram of sensor strength sequence lengths\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histogram-of-sensor-strength-sequence-lengths.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histogram-of-sensor-strength-sequence-lengths-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histogram-of-sensor-strength-sequence-lengths-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Histogram-of-sensor-strength-sequence-lengths-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Histogram of sensor strength sequence lengths<\/p>\n<\/div>\n<h3>Time Series Plots<\/h3>\n<p>We are working with time series data, so it is important that we actually review some examples of the sequences.<\/p>\n<p>We can group traces by their path and plot an example of one trace for each path. 
The expectation is that traces for different paths may look different in some way.<\/p>\n<pre class=\"crayon-plain-tag\"># group sequences by paths, keeping the loaded per-trace path labels intact\r\npath_ids = [1,2,3,4,5,6]\r\nseq_paths = dict()\r\nfor path in path_ids:\r\n\tseq_paths[path] = [sequences[j] for j in range(len(paths)) if paths[j]==path]\r\n# plot one example of a trace for each path\r\npyplot.figure()\r\nfor i in path_ids:\r\n\tpyplot.subplot(len(path_ids), 1, i)\r\n\t# line plot each variable\r\n\tfor j in [0, 1, 2, 3]:\r\n\t\tpyplot.plot(seq_paths[i][0][:, j], label='Anchor ' + str(j+1))\r\n\tpyplot.title('Path ' + str(i), y=0, loc='left')\r\npyplot.show()<\/pre>\n<p>Note that the list of candidate path ids is assigned to a new variable, <em>path_ids<\/em>, rather than overwriting the loaded <em>paths<\/em> array, which is needed to look up each trace\u2019s path label.<\/p>\n<p>We can also plot each series from one trace along with the trend predicted by a linear regression model. This will make any trends in the series obvious.<\/p>\n<p>We can fit a linear regression for a given series using the <a href=\"https:\/\/docs.scipy.org\/doc\/numpy\/reference\/generated\/numpy.linalg.lstsq.html\">lstsq() NumPy function<\/a>.<\/p>\n<p>The function <em>regress()<\/em> below takes a series as a single variable, fits a linear regression model via least squares, and predicts the output for each time step, returning a sequence that captures the trend in the data.<\/p>\n<pre class=\"crayon-plain-tag\"># fit a linear regression function and return the predicted values for the series\r\ndef regress(y):\r\n\t# define input as the time step\r\n\tX = array([i for i in range(len(y))]).reshape(len(y), 1)\r\n\t# fit linear regression via least squares\r\n\tb = lstsq(X, y, rcond=None)[0][0]\r\n\t# predict trend on time step\r\n\tyhat = b * X[:,0]\r\n\treturn yhat<\/pre>\n<p>We can use the function to plot the trend for the time series for each variable in a single trace.<\/p>\n<pre class=\"crayon-plain-tag\"># plot series for a single trace with trend\r\nseq = sequences[0]\r\nvariables = [0, 1, 2, 3]\r\npyplot.figure()\r\nfor i in variables:\r\n\tpyplot.subplot(len(variables), 1, i+1)\r\n\t# plot the series\r\n\tpyplot.plot(seq[:,i])\r\n\t# plot the 
trend\r\n\tpyplot.plot(regress(seq[:,i]))\r\npyplot.show()<\/pre>\n<p>Tying all of this together, the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># plot series data\r\nfrom os import listdir\r\nfrom numpy import array\r\nfrom numpy import vstack\r\nfrom numpy.linalg import lstsq\r\nfrom pandas import read_csv\r\nfrom matplotlib import pyplot\r\n\r\n# return list of traces, and arrays for targets, groups and paths\r\ndef load_dataset(prefix=''):\r\n\tgrps_dir, data_dir = prefix+'groups\/', prefix+'dataset\/'\r\n\t# load mapping files\r\n\ttargets = read_csv(data_dir + 'MovementAAL_target.csv', header=0)\r\n\tgroups = read_csv(grps_dir + 'MovementAAL_DatasetGroup.csv', header=0)\r\n\tpaths = read_csv(grps_dir + 'MovementAAL_Paths.csv', header=0)\r\n\t# load traces\r\n\tsequences = list()\r\n\ttarget_mapping = None\r\n\tfor name in listdir(data_dir):\r\n\t\tfilename = data_dir + name\r\n\t\tif filename.endswith('_target.csv'):\r\n\t\t\tcontinue\r\n\t\tdf = read_csv(filename, header=0)\r\n\t\tvalues = df.values\r\n\t\tsequences.append(values)\r\n\treturn sequences, targets.values[:,1], groups.values[:,1], paths.values[:,1]\r\n\r\n# fit a linear regression function and return the predicted values for the series\r\ndef regress(y):\r\n\t# define input as the time step\r\n\tX = array([i for i in range(len(y))]).reshape(len(y), 1)\r\n\t# fit linear regression via least squares\r\n\tb = lstsq(X, y, rcond=None)[0][0]\r\n\t# predict trend on time step\r\n\tyhat = b * X[:,0]\r\n\treturn yhat\r\n\r\n# load dataset\r\nsequences, targets, groups, paths = load_dataset()\r\n# group sequences by paths, keeping the loaded per-trace path labels intact\r\npath_ids = [1,2,3,4,5,6]\r\nseq_paths = dict()\r\nfor path in path_ids:\r\n\tseq_paths[path] = [sequences[j] for j in range(len(paths)) if paths[j]==path]\r\n# plot one example of a trace for each path\r\npyplot.figure()\r\nfor i in path_ids:\r\n\tpyplot.subplot(len(path_ids), 1, i)\r\n\t# line plot each variable\r\n\tfor j in [0, 1, 2, 3]:\r\n\t\tpyplot.plot(seq_paths[i][0][:, 
j], label='Anchor ' + str(j+1))\r\n\tpyplot.title('Path ' + str(i), y=0, loc='left')\r\npyplot.show()\r\n# plot series for a single trace with trend\r\nseq = sequences[0]\r\nvariables = [0, 1, 2, 3]\r\npyplot.figure()\r\nfor i in variables:\r\n\tpyplot.subplot(len(variables), 1, i+1)\r\n\t# plot the series\r\n\tpyplot.plot(seq[:,i])\r\n\t# plot the trend\r\n\tpyplot.plot(regress(seq[:,i]))\r\npyplot.show()<\/pre>\n<p>Running the example creates a figure containing six subplots, one for each of the six paths. Each subplot shows the line plots of a single trace\u2019s four variables, one for each anchor point.<\/p>\n<p>Perhaps the chosen traces are representative of each path, perhaps not.<\/p>\n<p>We can see some clear differences with regard to:<\/p>\n<ul>\n<li><strong>The grouping of variables over time<\/strong>. Pairs of variables may be grouped together, or all variables may be grouped together at a given time.<\/li>\n<li><strong>The trend of variables over time<\/strong>. 
Variables bunch together towards the middle or spread apart towards the extremes.<\/li>\n<\/ul>\n<p>Ideally, if these changes in behavior are predictive, a predictive model must extract these features, or be presented with a summary of these features as input.<\/p>\n<div id=\"attachment_6135\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6135\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Line-plots-of-one-trace-4-variables-for-each-of-the-six-paths.png\" alt=\"Line plots of one trace (4 variables) for each of the six paths\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-of-one-trace-4-variables-for-each-of-the-six-paths.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-of-one-trace-4-variables-for-each-of-the-six-paths-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-of-one-trace-4-variables-for-each-of-the-six-paths-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-of-one-trace-4-variables-for-each-of-the-six-paths-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Line plots of one trace (4 variables) for each of the six paths.<\/p>\n<\/div>\n<p>A second plot is created showing the line plots for the four series in a single trace along with the trend lines.<\/p>\n<p>We can see that, at least for this trace, there is a clear trend in the sensor strength data as the user moves around the environment. 
This may suggest the opportunity to make the data stationary prior to modeling or using the trend for each series in a trace (observations or coefficients) as inputs to a predictive model.<\/p>\n<div id=\"attachment_6136\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6136\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Line-plots-for-the-time-series-in-a-single-trace-with-trend-lines.png\" alt=\"Line plots for the time series in a single trace with trend lines\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-for-the-time-series-in-a-single-trace-with-trend-lines.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-for-the-time-series-in-a-single-trace-with-trend-lines-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-for-the-time-series-in-a-single-trace-with-trend-lines-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Line-plots-for-the-time-series-in-a-single-trace-with-trend-lines-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Line plots for the time series in a single trace with trend lines<\/p>\n<\/div>\n<h2>Model Evaluation<\/h2>\n<p>There are many ways to fit and evaluate a model on this data.<\/p>\n<p>Classification accuracy seems like a good first-cut evaluation metric given the balance of the classes. 
More nuance can be sought in the future by predicting probabilities and exploring thresholds on an ROC curve.<\/p>\n<p>I see two main themes in using this data:<\/p>\n<ul>\n<li><strong>Same Room<\/strong>: Can a model trained on traces in a room predict the outcome of new traces in that room?<\/li>\n<li><strong>Different Room<\/strong>: Can a model trained on traces in one or two rooms predict the outcome of new traces in a different room?<\/li>\n<\/ul>\n<p>The ES1 and ES2 cases described in the paper and summarized above explore these questions and provide a useful starting point.<\/p>\n<p>First, we must partition the loaded traces and targets into the three groups.<\/p>\n<pre class=\"crayon-plain-tag\"># separate traces\r\nseq1 = [sequences[i] for i in range(len(groups)) if groups[i]==1]\r\nseq2 = [sequences[i] for i in range(len(groups)) if groups[i]==2]\r\nseq3 = [sequences[i] for i in range(len(groups)) if groups[i]==3]\r\nprint(len(seq1),len(seq2),len(seq3))\r\n# separate target\r\ntargets1 = [targets[i] for i in range(len(groups)) if groups[i]==1]\r\ntargets2 = [targets[i] for i in range(len(groups)) if groups[i]==2]\r\ntargets3 = [targets[i] for i in range(len(groups)) if groups[i]==3]\r\nprint(len(targets1),len(targets2),len(targets3))<\/pre>\n<p>In the case of ES1, we can use k-fold cross-validation with k=5 to match the train\/test ratio used in the paper; evaluating across folds also lends some robustness to the estimate.<\/p>\n<p>We can use the <a href=\"http:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.model_selection.cross_val_score.html\">cross_val_score() function<\/a> from scikit-learn to evaluate a model and then calculate the mean and standard deviation of the scores.<\/p>\n<pre class=\"crayon-plain-tag\"># evaluate model for ES1\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom sklearn.model_selection import cross_val_score\r\n...\r\nscores = cross_val_score(model, X, y, scoring='accuracy', cv=5, n_jobs=-1)\r\nm, s = mean(scores), 
std(scores)<\/pre>\n<p>In the case of ES2, we can fit the model on datasets 1 and 2 and test model skill on dataset 3 directly.<\/p>\n<h2>Data Preparation<\/h2>\n<p>There is flexibility in how the input data is framed for the prediction problem.<\/p>\n<p>Two approaches come to mind:<\/p>\n<ul>\n<li><strong>Automatic Feature Learning<\/strong>. Deep neural networks are capable of automatic feature learning, and recurrent neural networks can directly support multivariate multi-step input data. A recurrent neural network such as an LSTM could be used, or alternatively a 1D CNN. The sequences could be padded to the same length, such as 70 time steps, and a Masking layer could be used to ignore the padded time steps.<\/li>\n<li><strong>Feature Engineering<\/strong>. Alternatively, the variable-length sequences could be summarized as a single fixed-length vector and provided to standard machine learning models for prediction. This would require careful feature engineering in order to provide a sufficient description of the trace for the model to learn a mapping to the output class.<\/li>\n<\/ul>\n<p>Both are interesting approaches.<\/p>\n<p>As a first pass, we will prepare the more traditional fixed-length vector input via manual feature engineering.<\/p>\n<p>Below are some ideas on features that could be included in the vector:<\/p>\n<ul>\n<li>First, middle, or last n observations for a variable.<\/li>\n<li>Mean or standard deviation of the first, middle, or last n observations for a variable.<\/li>\n<li>Difference between the last and first n observations for a variable.<\/li>\n<li>Differenced first, middle, or last n observations for a variable.<\/li>\n<li>Linear regression coefficients of all, first, middle, or last n observations for a variable.<\/li>\n<li>Linear regression predicted trend of the first, middle, or last n observations for a variable.<\/li>\n<\/ul>\n<p>Additionally, data scaling is probably not required for the raw values as the data has already been scaled to the range -1 to 1. 
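<p>To make the feature ideas listed above concrete, the sketch below computes a handful of them (mean, standard deviation, and net change over the last <em>n<\/em> observations) for each variable in a trace. The function name and the exact choice of features are illustrative, not taken from the tutorial:<\/p>

```python
from numpy import array

# summarize the last n observations of each variable in a trace
# (a 2D array of shape [time steps, variables]) as a fixed-length vector
def summary_features(seq, n=19):
	features = list()
	for col in range(seq.shape[1]):
		tail = seq[-n:, col]
		features.append(tail.mean())         # average level
		features.append(tail.std())          # spread
		features.append(tail[-1] - tail[0])  # net change, a crude trend
	return array(features)
```

<p>A vector like this could be used in place of, or alongside, the flattened raw observations.<\/p>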
Scaling may be required if new features are added with different units.<\/p>\n<p>Some of the variables do show a trend, suggesting that differencing them may help in teasing out a signal.<\/p>\n<p>The distribution of each variable is nearly Gaussian, so some algorithms may benefit from standardization, or perhaps even a Box-Cox transform.<\/p>\n<h2>Algorithm Spot-Check<\/h2>\n<p>In this section, we will spot-check the default configuration for a suite of standard machine learning algorithms with different sets of engineered features.<\/p>\n<p>Spot-checking is a useful technique for quickly discovering whether there is any learnable signal in the mapping between the engineered input features and the outputs, as most of the tested methods will pick up on a signal if one exists. It can also highlight algorithms that might be worth investigating further.<\/p>\n<p>A downside is that no method is given its best chance (configuration) to show what it can do on the problem, meaning any methods that are further investigated will be biased by the first results.<\/p>\n<p>In these tests, we will look at a suite of six different types of algorithms, specifically:<\/p>\n<ul>\n<li>Logistic Regression.<\/li>\n<li>k-Nearest Neighbors.<\/li>\n<li>Decision Tree.<\/li>\n<li>Support Vector Machine.<\/li>\n<li>Random Forest.<\/li>\n<li>Gradient Boosting Machine.<\/li>\n<\/ul>\n<p>We will test the default configurations of these methods on features that focus on the end of the time series variables, as they are likely the most predictive of whether a room transition will occur.<\/p>\n<h3>Last <em>n<\/em> Observations<\/h3>\n<p>The last <em>n<\/em> observations are likely to be predictive of whether the movement leads to a transition in rooms.<\/p>\n<p>The smallest number of time steps in the trace data is 19; therefore, we will use <em>n=19<\/em> as a starting point.<\/p>\n<p>The function below named <em>create_dataset()<\/em> will create a fixed-length vector using 
the last <em>n<\/em> observations from each trace in a flat one-dimensional vector, then add the target as the last element of the vector.<\/p>\n<p>This flattening of the trace data is required for simple machine learning algorithms.<\/p>\n<pre class=\"crayon-plain-tag\"># create a fixed 1d vector for each trace with output variable\r\ndef create_dataset(sequences, targets):\r\n\t# create the transformed dataset\r\n\ttransformed = list()\r\n\tn_vars = 4\r\n\tn_steps = 19\r\n\t# process each trace in turn\r\n\tfor i in range(len(sequences)):\r\n\t\tseq = sequences[i]\r\n\t\tvector = list()\r\n\t\t# last n observations\r\n\t\tfor row in range(1, n_steps+1):\r\n\t\t\tfor col in range(n_vars):\r\n\t\t\t\tvector.append(seq[-row, col])\r\n\t\t# add output\r\n\t\tvector.append(targets[i])\r\n\t\t# store\r\n\t\ttransformed.append(vector)\r\n\t# prepare array\r\n\ttransformed = array(transformed)\r\n\ttransformed = transformed.astype('float32')\r\n\treturn transformed<\/pre>\n<p>We can load the dataset as before and sort it into the datasets 1, 2, and 3 as described in the \u201c<em>Model Evaluation<\/em>\u201d section.<\/p>\n<p>We can then call the <em>create_dataset()<\/em> function to create the datasets required for the ES1 and ES2 cases, specifically ES1 combines datasets 1 and 2, whereas ES2 uses datasets 1 and 2 as a training set and dataset 3 as a test set.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># prepare fixed length vector dataset\r\nfrom os import listdir\r\nfrom numpy import array\r\nfrom numpy import savetxt\r\nfrom pandas import read_csv\r\n\r\n# return list of traces, and arrays for targets, groups and paths\r\ndef load_dataset(prefix=''):\r\n\tgrps_dir, data_dir = prefix+'groups\/', prefix+'dataset\/'\r\n\t# load mapping files\r\n\ttargets = read_csv(data_dir + 'MovementAAL_target.csv', header=0)\r\n\tgroups = read_csv(grps_dir + 'MovementAAL_DatasetGroup.csv', header=0)\r\n\tpaths = read_csv(grps_dir + 
'MovementAAL_Paths.csv', header=0)\r\n\t# load traces\r\n\tsequences = list()\r\n\tfor name in listdir(data_dir):\r\n\t\tfilename = data_dir + name\r\n\t\tif filename.endswith('_target.csv'):\r\n\t\t\tcontinue\r\n\t\tdf = read_csv(filename, header=0)\r\n\t\tvalues = df.values\r\n\t\tsequences.append(values)\r\n\treturn sequences, targets.values[:,1], groups.values[:,1], paths.values[:,1]\r\n\r\n# create a fixed 1d vector for each trace with output variable\r\ndef create_dataset(sequences, targets):\r\n\t# create the transformed dataset\r\n\ttransformed = list()\r\n\tn_vars = 4\r\n\tn_steps = 19\r\n\t# process each trace in turn\r\n\tfor i in range(len(sequences)):\r\n\t\tseq = sequences[i]\r\n\t\tvector = list()\r\n\t\t# last n observations\r\n\t\tfor row in range(1, n_steps+1):\r\n\t\t\tfor col in range(n_vars):\r\n\t\t\t\tvector.append(seq[-row, col])\r\n\t\t# add output\r\n\t\tvector.append(targets[i])\r\n\t\t# store\r\n\t\ttransformed.append(vector)\r\n\t# prepare array\r\n\ttransformed = array(transformed)\r\n\ttransformed = transformed.astype('float32')\r\n\treturn transformed\r\n\r\n# load dataset\r\nsequences, targets, groups, paths = load_dataset()\r\n# separate traces\r\nseq1 = [sequences[i] for i in range(len(groups)) if groups[i]==1]\r\nseq2 = [sequences[i] for i in range(len(groups)) if groups[i]==2]\r\nseq3 = [sequences[i] for i in range(len(groups)) if groups[i]==3]\r\n# separate target\r\ntargets1 = [targets[i] for i in range(len(groups)) if groups[i]==1]\r\ntargets2 = [targets[i] for i in range(len(groups)) if groups[i]==2]\r\ntargets3 = [targets[i] for i in range(len(groups)) if groups[i]==3]\r\n# create ES1 dataset\r\nes1 = create_dataset(seq1+seq2, targets1+targets2)\r\nprint('ES1: %s' % str(es1.shape))\r\nsavetxt('es1.csv', es1, delimiter=',')\r\n# create ES2 dataset\r\nes2_train = create_dataset(seq1+seq2, targets1+targets2)\r\nes2_test = create_dataset(seq3, targets3)\r\nprint('ES2 Train: %s' % 
str(es2_train.shape))\r\nprint('ES2 Test: %s' % str(es2_test.shape))\r\nsavetxt('es2_train.csv', es2_train, delimiter=',')\r\nsavetxt('es2_test.csv', es2_test, delimiter=',')<\/pre>\n<p>Running the example creates three new CSV files: \u2018<em>es1.csv<\/em>\u2018 for the ES1 case, and \u2018<em>es2_train.csv<\/em>\u2018 and \u2018<em>es2_test.csv<\/em>\u2018 for the ES2 case.<\/p>\n<p>The shapes of these datasets are also summarized.<\/p>\n<pre class=\"crayon-plain-tag\">ES1: (210, 77)\r\nES2 Train: (210, 77)\r\nES2 Test: (104, 77)<\/pre>\n<p>Next, we can evaluate models on the ES1 dataset.<\/p>\n<p>After some testing, it appears that standardizing the dataset results in better model skill for those methods that rely on distance values (KNN and SVM) and generally has no effect on other methods. Therefore, each algorithm is evaluated via a Pipeline that first standardizes the dataset.<\/p>\n<p>The complete example of spot-checking algorithms on the new dataset is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># spot check for ES1\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom pandas import read_csv\r\nfrom matplotlib import pyplot\r\nfrom sklearn.model_selection import cross_val_score\r\nfrom sklearn.pipeline import Pipeline\r\nfrom sklearn.preprocessing import StandardScaler\r\nfrom sklearn.linear_model import LogisticRegression\r\nfrom sklearn.neighbors import KNeighborsClassifier\r\nfrom sklearn.tree import DecisionTreeClassifier\r\nfrom sklearn.svm import SVC\r\nfrom sklearn.ensemble import RandomForestClassifier\r\nfrom sklearn.ensemble import GradientBoostingClassifier\r\n# load dataset\r\ndataset = read_csv('es1.csv', header=None)\r\n# split into inputs and outputs\r\nvalues = dataset.values\r\nX, y = values[:, :-1], values[:, -1]\r\n# create a list of models to evaluate\r\nmodels, names = list(), list()\r\n# logistic\r\nmodels.append(LogisticRegression())\r\nnames.append('LR')\r\n# 
knn\r\nmodels.append(KNeighborsClassifier())\r\nnames.append('KNN')\r\n# cart\r\nmodels.append(DecisionTreeClassifier())\r\nnames.append('CART')\r\n# svm\r\nmodels.append(SVC())\r\nnames.append('SVM')\r\n# random forest\r\nmodels.append(RandomForestClassifier())\r\nnames.append('RF')\r\n# gbm\r\nmodels.append(GradientBoostingClassifier())\r\nnames.append('GBM')\r\n# evaluate models\r\nall_scores = list()\r\nfor i in range(len(models)):\r\n\t# create a pipeline for the model\r\n\ts = StandardScaler()\r\n\tp = Pipeline(steps=[('s',s), ('m',models[i])])\r\n\tscores = cross_val_score(p, X, y, scoring='accuracy', cv=5, n_jobs=-1)\r\n\tall_scores.append(scores)\r\n\t# summarize\r\n\tm, s = mean(scores)*100, std(scores)*100\r\n\tprint('%s %.3f%% +\/-%.3f' % (names[i], m, s))\r\n# plot\r\npyplot.boxplot(all_scores, labels=names)\r\npyplot.show()<\/pre>\n<p>Running the example prints the estimated performance of each algorithm, including the mean and standard deviation over 5-fold cross-validation.<\/p>\n<p>The results suggest SVM might be worth looking at in more detail at 58% accuracy.<\/p>\n<pre class=\"crayon-plain-tag\">LR 55.285% +\/-5.518\r\nKNN 50.897% +\/-5.310\r\nCART 50.501% +\/-10.922\r\nSVM 58.551% +\/-7.707\r\nRF 50.442% +\/-6.355\r\nGBM 55.749% +\/-5.423<\/pre>\n<p>The results are also presented as box-and-whisker plots showing the distribution of scores.<\/p>\n<p>Again, SVM appears to have good average performance and tight variance.<\/p>\n<div id=\"attachment_6137\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6137\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-19-observations.png\" alt=\"Spot-check Algorithms on ES1 with last 19 observations\" width=\"1280\" height=\"960\" 
srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-19-observations.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-19-observations-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-19-observations-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-19-observations-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Spot-check Algorithms on ES1 with last 19 observations<\/p>\n<\/div>\n<h3>Last <em>n<\/em> Observations With Padding<\/h3>\n<p>We can pad each trace to a fixed length.<\/p>\n<p>This will then provide the flexibility to include more of the prior <em>n<\/em> observations in each sequence. The choice of <em>n<\/em> must also be balanced with the increase in padded values added to shorter sequences that in turn may negatively impact the performance of the model on those sequences.<\/p>\n<p>We can pad each sequence by adding the 0.0 value to the beginning of each variable sequence until a maximum length, e.g. 200 time steps, is reached. We can do this using the <a href=\"https:\/\/docs.scipy.org\/doc\/numpy\/reference\/generated\/numpy.pad.html\">pad() NumPy function<\/a>.<\/p>\n<pre class=\"crayon-plain-tag\">from numpy import pad\r\n...\r\n# pad sequences\r\nmax_length = 200\r\nseq = pad(seq, ((max_length-len(seq),0),(0,0)), 'constant', constant_values=(0.0))<\/pre>\n<p>The updated version of the <em>create_dataset()<\/em> function with padding support is below.<\/p>\n<p>We will try <em>n=25<\/em> to include 25 of the last observations in each sequence in each vector. 
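<p>As a quick sanity check of the pre-padding behaviour, the toy example below (not the tutorial data) pads a short 5-step trace of four variables out to 8 time steps by prepending rows of zeros, mirroring the pad widths used above:<\/p>

```python
from numpy import ones, pad

# a short trace: 5 time steps, 4 variables, all ones
seq = ones((5, 4))
max_length = 8
# prepend (max_length - len(seq)) rows of zeros; leave the columns alone
padded = pad(seq, ((max_length - len(seq), 0), (0, 0)), 'constant', constant_values=0.0)
print(padded.shape)  # three all-zero rows followed by the original trace
```

<p>In the tutorial code, <em>max_length=200<\/em> plays the same role for the real traces.<\/p>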
This value was found with a little trial and error, although you may want to explore whether other configurations result in better skill.<\/p>\n<pre class=\"crayon-plain-tag\"># create a fixed 1d vector for each trace with output variable\r\ndef create_dataset(sequences, targets):\r\n\t# create the transformed dataset\r\n\ttransformed = list()\r\n\tn_vars, n_steps, max_length = 4, 25, 200\r\n\t# process each trace in turn\r\n\tfor i in range(len(sequences)):\r\n\t\tseq = sequences[i]\r\n\t\t# pad sequences\r\n\t\tseq = pad(seq, ((max_length-len(seq),0),(0,0)), 'constant', constant_values=(0.0))\r\n\t\tvector = list()\r\n\t\t# last n observations\r\n\t\tfor row in range(1, n_steps+1):\r\n\t\t\tfor col in range(n_vars):\r\n\t\t\t\tvector.append(seq[-row, col])\r\n\t\t# add output\r\n\t\tvector.append(targets[i])\r\n\t\t# store\r\n\t\ttransformed.append(vector)\r\n\t# prepare array\r\n\ttransformed = array(transformed)\r\n\ttransformed = transformed.astype('float32')\r\n\treturn transformed<\/pre>\n<p>Running the script again with the new function creates updated CSV files.<\/p>\n<pre class=\"crayon-plain-tag\">ES1: (210, 101)\r\nES2 Train: (210, 101)\r\nES2 Test: (104, 101)<\/pre>\n<p>Again, re-running the spot-check script on the data results in a small lift in model skill for SVM and also suggests that KNN might be worth investigating further.<\/p>\n<pre class=\"crayon-plain-tag\">LR 54.344% +\/-6.195\r\nKNN 58.562% +\/-4.456\r\nCART 52.837% +\/-7.650\r\nSVM 59.515% +\/-6.054\r\nRF 50.396% +\/-7.069\r\nGBM 50.873% +\/-5.416<\/pre>\n<p>The box plots for KNN and SVM show good performance and relatively tight standard deviations.<\/p>\n<div id=\"attachment_6138\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6138\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-25-observations.png\" alt=\"Spot-check Algorithms on ES1 with 
last 25 observations\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-25-observations.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-25-observations-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-25-observations-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-Algorithms-on-ES1-with-last-25-observations-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Spot-check Algorithms on ES1 with last 25 observations<\/p>\n<\/div>\n<p>We can update the spot-check to grid search a suite of k values for the KNN algorithm to see if the skill of the model can be further improved with a little tuning.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># spot check for ES1\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom pandas import read_csv\r\nfrom matplotlib import pyplot\r\nfrom sklearn.model_selection import cross_val_score\r\nfrom sklearn.neighbors import KNeighborsClassifier\r\nfrom sklearn.pipeline import Pipeline\r\nfrom sklearn.preprocessing import StandardScaler\r\n\r\n# load dataset\r\ndataset = read_csv('es1.csv', header=None)\r\n# split into inputs and outputs\r\nvalues = dataset.values\r\nX, y = values[:, :-1], values[:, -1]\r\n# try a range of k values\r\nall_scores, names = list(), list()\r\nfor k in range(1,22):\r\n\t# evaluate\r\n\tscaler = StandardScaler()\r\n\tmodel = KNeighborsClassifier(n_neighbors=k)\r\n\tpipeline = Pipeline(steps=[('s',scaler), ('m',model)])\r\n\tnames.append(str(k))\r\n\tscores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=5, 
n_jobs=-1)\r\n\tall_scores.append(scores)\r\n\t# summarize\r\n\tm, s = mean(scores)*100, std(scores)*100\r\n\tprint('k=%d %.3f%% +\/-%.3f' % (k, m, s))\r\n# plot\r\npyplot.boxplot(all_scores, labels=names)\r\npyplot.show()<\/pre>\n<p>Running the example prints the mean and standard deviation of the accuracy with k values from 1 to 21.<\/p>\n<p>We can see that a <em>k=7<\/em> results in the best skill of 62.872%.<\/p>\n<pre class=\"crayon-plain-tag\">k=1 49.534% +\/-4.407\r\nk=2 49.489% +\/-4.201\r\nk=3 56.599% +\/-6.923\r\nk=4 55.660% +\/-6.600\r\nk=5 58.562% +\/-4.456\r\nk=6 59.991% +\/-7.901\r\nk=7 62.872% +\/-8.261\r\nk=8 59.538% +\/-5.528\r\nk=9 57.633% +\/-4.723\r\nk=10 59.074% +\/-7.164\r\nk=11 58.097% +\/-7.583\r\nk=12 58.097% +\/-5.294\r\nk=13 57.179% +\/-5.101\r\nk=14 57.644% +\/-3.175\r\nk=15 59.572% +\/-5.481\r\nk=16 59.038% +\/-1.881\r\nk=17 59.027% +\/-2.981\r\nk=18 60.490% +\/-3.368\r\nk=19 60.014% +\/-2.497\r\nk=20 58.562% +\/-2.018\r\nk=21 58.131% +\/-3.084<\/pre>\n<p>The box and whisker plots of accuracy scores for <em>k<\/em> values show that <em>k<\/em> values around seven, such as five and six, also produce stable and well-performing models on the dataset.<\/p>\n<div id=\"attachment_6139\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6139\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Spot-check-KNN-neighbors-on-ES1-with-last-25-observations.png\" alt=\"Spot-check KNN neighbors on ES1 with last 25 observations\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-KNN-neighbors-on-ES1-with-last-25-observations.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-KNN-neighbors-on-ES1-with-last-25-observations-300x225.png 300w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-KNN-neighbors-on-ES1-with-last-25-observations-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Spot-check-KNN-neighbors-on-ES1-with-last-25-observations-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Spot-check KNN neighbors on ES1 with last 25 observations<\/p>\n<\/div>\n<h3>Evaluate KNN on ES2<\/h3>\n<p>Now that we have some idea of a representation (<em>n=25<\/em>) and a model (KNN, <em>k=7<\/em>) that have some skill over a random prediction, we can test the approach on the harder ES2 dataset.<\/p>\n<p>Each model is trained on the combination of dataset 1 and 2, then evaluated on dataset 3. The k-fold cross-validation procedure is not used, so we would expect the scores to be noisy.<\/p>\n<p>The complete spot checking of algorithms for ES2 is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># spot check for ES2\r\nfrom pandas import read_csv\r\nfrom matplotlib import pyplot\r\nfrom sklearn.metrics import accuracy_score\r\nfrom sklearn.linear_model import LogisticRegression\r\nfrom sklearn.neighbors import KNeighborsClassifier\r\nfrom sklearn.tree import DecisionTreeClassifier\r\nfrom sklearn.svm import SVC\r\nfrom sklearn.ensemble import RandomForestClassifier\r\nfrom sklearn.ensemble import GradientBoostingClassifier\r\nfrom sklearn.pipeline import Pipeline\r\nfrom sklearn.preprocessing import StandardScaler\r\n# load dataset\r\ntrain = read_csv('es2_train.csv', header=None)\r\ntest = read_csv('es2_test.csv', header=None)\r\n# split into inputs and outputs\r\ntrainX, trainy = train.values[:, :-1], train.values[:, -1]\r\ntestX, testy = test.values[:, :-1], test.values[:, -1]\r\n# create a list of models to evaluate\r\nmodels, names = list(), list()\r\n# logistic\r\nmodels.append(LogisticRegression())\r\nnames.append('LR')\r\n# 
knn\r\nmodels.append(KNeighborsClassifier())\r\nnames.append('KNN')\r\n# knn with k=7\r\nmodels.append(KNeighborsClassifier(n_neighbors=7))\r\nnames.append('KNN-7')\r\n# cart\r\nmodels.append(DecisionTreeClassifier())\r\nnames.append('CART')\r\n# svm\r\nmodels.append(SVC())\r\nnames.append('SVM')\r\n# random forest\r\nmodels.append(RandomForestClassifier())\r\nnames.append('RF')\r\n# gbm\r\nmodels.append(GradientBoostingClassifier())\r\nnames.append('GBM')\r\n# evaluate models\r\nall_scores = list()\r\nfor i in range(len(models)):\r\n\t# create a pipeline for the model\r\n\tscaler = StandardScaler()\r\n\tmodel = Pipeline(steps=[('s',scaler), ('m',models[i])])\r\n\t# fit\r\n\tmodel.fit(trainX, trainy)\r\n\t# predict\r\n\tyhat = model.predict(testX)\r\n\t# evaluate\r\n\tscore = accuracy_score(testy, yhat) * 100\r\n\tall_scores.append(score)\r\n\t# summarize\r\n\tprint('%s %.3f%%' % (names[i], score))\r\n# plot\r\npyplot.bar(names, all_scores)\r\npyplot.show()<\/pre>\n<p>Running the example reports the model accuracy on the ES2 scenario.<\/p>\n<p>We can see that KNN does well, and that the KNN with seven neighbors that was found to perform well on ES1 also performs well on ES2.<\/p>\n<pre class=\"crayon-plain-tag\">LR 45.192%\r\nKNN 54.808%\r\nKNN-7 57.692%\r\nCART 53.846%\r\nSVM 51.923%\r\nRF 53.846%\r\nGBM 52.885%<\/pre>\n<p>A bar chart of the accuracy scores helps to make the relative difference in performance between the methods clearer.<\/p>\n<div id=\"attachment_6140\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6140\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2018\/06\/Bar-chart-of-model-accuracy-on-ES2.png\" alt=\"Bar chart of model accuracy on ES2\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Bar-chart-of-model-accuracy-on-ES2.png 1280w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Bar-chart-of-model-accuracy-on-ES2-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Bar-chart-of-model-accuracy-on-ES2-768x576.png 768w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2018\/06\/Bar-chart-of-model-accuracy-on-ES2-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p class=\"wp-caption-text\">Bar chart of model accuracy on ES2<\/p>\n<\/div>\n<p>The chosen representation and model configurations do have skill over a naive prediction of 50% accuracy.<\/p>\n<p>Further tuning may result in models with better skill, but we are still a long way from the 95% and 89% accuracy reported in the paper on ES1 and ES2 respectively.<\/p>\n<h3>Extensions<\/h3>\n<p>This section lists some ideas for extending the tutorial that you may wish to explore.<\/p>\n<ul>\n<li><strong>Data Preparation<\/strong>. There is a lot of opportunity to explore further data preparation methods such as normalization, differencing, and power transforms.<\/li>\n<li><strong>Feature Engineering<\/strong>. Further feature engineering may result in better-performing models, such as statistics for the start, middle, and end of each sequence as well as trend information.<\/li>\n<li><strong>Tuning<\/strong>. Only the KNN algorithm was given the opportunity for tuning; other models, such as gradient boosting, may benefit from fine-tuning of hyperparameters.<\/li>\n<li><strong>RNNs<\/strong>. This sequence classification task seems well suited to recurrent neural networks such as LSTMs that support variable-length multivariate inputs. 
Some preliminary testing on this dataset (by myself) showed highly unstable results, but more extensive investigation may give better and even superior results.<\/li>\n<\/ul>\n<p>If you explore any of these extensions, I\u2019d love to know.<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Papers<\/h3>\n<ul>\n<li><a href=\"https:\/\/pdfs.semanticscholar.org\/40c2\/393e1874c3fd961fdfff02402c24ccf1c3d7.pdf#page=13\">Predicting user movements in heterogeneous indoor environments by reservoir computing<\/a>, 2011.<\/li>\n<li><a href=\"https:\/\/link.springer.com\/article\/10.1007\/s00521-013-1364-4\">An experimental characterization of reservoir computing in ambient assisted living applications<\/a>, 2014.<\/li>\n<\/ul>\n<h3>API<\/h3>\n<ul>\n<li><a href=\"http:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.model_selection.cross_val_score.html\">sklearn.model_selection.cross_val_score API<\/a><\/li>\n<li><a href=\"https:\/\/docs.scipy.org\/doc\/numpy\/reference\/generated\/numpy.linalg.lstsq.html\">numpy.linalg.lstsq API<\/a><\/li>\n<li><a href=\"https:\/\/docs.scipy.org\/doc\/numpy\/reference\/generated\/numpy.pad.html\">numpy.pad API<\/a><\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/archive.ics.uci.edu\/ml\/datasets\/Indoor+User+Movement+Prediction+from+RSS+data\">Indoor User Movement Prediction from RSS data Data Set, UCI Machine Learning Repository<\/a><\/li>\n<li><a href=\"http:\/\/wnlab.isti.cnr.it\/paolo\/index.php\/dataset\/6rooms\">Predicting User Movements in Heterogeneous Indoor Environments by Reservoir Computing, Paolo Barsocchi homepage<\/a>.<\/li>\n<li><a href=\"https:\/\/github.com\/Laurae2\/Indoor_Prediction\">Indoor User Movement Prediction from RSS data Data Set<\/a>, Laurae [French].<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered the indoor movement prediction time series classification problem and how to engineer 
features and evaluate machine learning algorithms for the problem.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>The time series classification problem of predicting the movement between rooms based on sensor strength.<\/li>\n<li>How to investigate the data in order to better understand the problem and how to engineer features from the raw data for predictive modeling.<\/li>\n<li>How to spot check a suite of classification algorithms and tune one algorithm to further lift performance on the problem.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/indoor-movement-time-series-classification-with-machine-learning-algorithms\/\">Indoor Movement Time Series Classification with Machine Learning Algorithms<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Indoor movement prediction involves using wireless sensor strength data to predict the location and motion of subjects within a building. 
It is [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/09\/09\/indoor-movement-time-series-classification-with-machine-learning-algorithms\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":1034,"comment_status":"registered_only","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1033"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1033"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1033\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/1034"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1033"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1033"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}