{"id":3254,"date":"2020-03-19T18:00:22","date_gmt":"2020-03-19T18:00:22","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2020\/03\/19\/basic-data-cleaning-for-machine-learning-that-you-must-perform\/"},"modified":"2020-03-19T18:00:22","modified_gmt":"2020-03-19T18:00:22","slug":"basic-data-cleaning-for-machine-learning-that-you-must-perform","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2020\/03\/19\/basic-data-cleaning-for-machine-learning-that-you-must-perform\/","title":{"rendered":"Basic Data Cleaning for Machine Learning (That You Must Perform)"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>Data cleaning is a critically important step in any machine learning project.<\/p>\n<p>In tabular data, there are many different statistical analysis and data visualization techniques you can use to explore your data in order to identify data cleaning operations you may want to perform.<\/p>\n<p>Before jumping to the sophisticated methods, there are some very basic data cleaning operations that you probably should perform on every single machine learning project. 
These are so basic that they are often overlooked by seasoned machine learning practitioners, yet are so critical that if skipped, models may break or report overly optimistic performance results.<\/p>\n<p>In this tutorial, you will discover basic data cleaning you should always perform on your dataset.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>How to identify and remove column variables that only have a single value.<\/li>\n<li>How to identify and consider column variables with very few unique values.<\/li>\n<li>How to identify and remove rows that contain duplicate observations.<\/li>\n<\/ul>\n<p>Let&rsquo;s get started.<\/p>\n<div id=\"attachment_9849\" style=\"width: 809px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9849\" class=\"size-full wp-image-9849\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/03\/Basic-Data-Cleaning-You-Must-Perform-in-Machine-Learning.jpg\" alt=\"Basic Data Cleaning You Must Perform in Machine Learning\" width=\"799\" height=\"533\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/03\/Basic-Data-Cleaning-You-Must-Perform-in-Machine-Learning.jpg 799w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/03\/Basic-Data-Cleaning-You-Must-Perform-in-Machine-Learning-300x200.jpg 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/03\/Basic-Data-Cleaning-You-Must-Perform-in-Machine-Learning-768x512.jpg 768w\" sizes=\"(max-width: 799px) 100vw, 799px\"><\/p>\n<p id=\"caption-attachment-9849\" class=\"wp-caption-text\">Basic Data Cleaning You Must Perform in Machine Learning<br \/>Photo by <a href=\"https:\/\/flickr.com\/photos\/allenmcgregor\/5322599282\/\">Allen McGregor<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into five parts; they 
are:<\/p>\n<ol>\n<li>Identify Columns That Contain a Single Value<\/li>\n<li>Delete Columns That Contain a Single Value<\/li>\n<li>Consider Columns That Have Very Few Values<\/li>\n<li>Identify Rows That Contain Duplicate Data<\/li>\n<li>Delete Rows That Contain Duplicate Data<\/li>\n<\/ol>\n<h2>Identify Columns That Contain a Single Value<\/h2>\n<p>Columns that have a single observation or value are probably useless for modeling.<\/p>\n<p>Here, a single value means that each row for that column has the same value. For example, the column <em>X1<\/em> has the value 1.0 for all rows in the dataset:<\/p>\n<pre class=\"crayon-plain-tag\">X1\r\n1.0\r\n1.0\r\n1.0\r\n1.0\r\n1.0\r\n...<\/pre>\n<p>Columns that have a single value for all rows do not contain any information for modeling.<\/p>\n<p>Depending on the choice of data preparation and modeling algorithms, variables with a single value can also cause errors or unexpected results.<\/p>\n<p>You can detect columns that have this property using the <a href=\"https:\/\/docs.scipy.org\/doc\/numpy\/reference\/generated\/numpy.unique.html\">unique() NumPy function<\/a>, applied to each column in turn to count its unique values.<\/p>\n<p>The example below loads the oil-spill classification dataset that contains 50 variables and summarizes the number of unique values for each column.<\/p>\n<pre class=\"crayon-plain-tag\"># summarize the number of unique values for each column using numpy\r\nfrom urllib.request import urlopen\r\nfrom numpy import loadtxt\r\nfrom numpy import unique\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/oil-spill.csv'\r\n# load the dataset\r\ndata = loadtxt(urlopen(path), delimiter=',')\r\n# summarize the number of unique values in each column\r\nfor i in range(data.shape[1]):\r\n\tprint(i, len(unique(data[:, i])))<\/pre>\n<p>Running the example loads the dataset directly from the URL and prints the number of unique values for each 
column.<\/p>\n<p>We can see that column index 22 only has a single value and should be removed.<\/p>\n<pre class=\"crayon-plain-tag\">0 238\r\n1 297\r\n2 927\r\n3 933\r\n4 179\r\n5 375\r\n6 820\r\n7 618\r\n8 561\r\n9 57\r\n10 577\r\n11 59\r\n12 73\r\n13 107\r\n14 53\r\n15 91\r\n16 893\r\n17 810\r\n18 170\r\n19 53\r\n20 68\r\n21 9\r\n22 1\r\n23 92\r\n24 9\r\n25 8\r\n26 9\r\n27 308\r\n28 447\r\n29 392\r\n30 107\r\n31 42\r\n32 4\r\n33 45\r\n34 141\r\n35 110\r\n36 3\r\n37 758\r\n38 9\r\n39 9\r\n40 388\r\n41 220\r\n42 644\r\n43 649\r\n44 499\r\n45 2\r\n46 937\r\n47 169\r\n48 286\r\n49 2<\/pre>\n<p>A simpler approach is to use the <a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.nunique.html\">nunique() Pandas function<\/a> that does the hard work for you.<\/p>\n<p>Below is the same example using the Pandas function.<\/p>\n<pre class=\"crayon-plain-tag\"># summarize the number of unique values for each column using pandas\r\nfrom pandas import read_csv\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/oil-spill.csv'\r\n# load the dataset\r\ndf = read_csv(path, header=None)\r\n# summarize the number of unique values in each column\r\nprint(df.nunique())<\/pre>\n<p>Running the example, we get the same result: the column index and the number of unique values for each column.<\/p>\n<pre class=\"crayon-plain-tag\">0     238\r\n1     297\r\n2     927\r\n3     933\r\n4     179\r\n5     375\r\n6     820\r\n7     618\r\n8     561\r\n9      57\r\n10    577\r\n11     59\r\n12     73\r\n13    107\r\n14     53\r\n15     91\r\n16    893\r\n17    810\r\n18    170\r\n19     53\r\n20     68\r\n21      9\r\n22      1\r\n23     92\r\n24      9\r\n25      8\r\n26      9\r\n27    308\r\n28    447\r\n29    392\r\n30    107\r\n31     42\r\n32      4\r\n33     45\r\n34    141\r\n35    110\r\n36      3\r\n37    758\r\n38      9\r\n39      9\r\n40    388\r\n41    220\r\n42    644\r\n43   
 649\r\n44    499\r\n45      2\r\n46    937\r\n47    169\r\n48    286\r\n49      2\r\ndtype: int64<\/pre>\n<\/p>\n<h2>Delete Columns That Contain a Single Value<\/h2>\n<p>Variables or columns that have a single value should probably be removed from your dataset.<\/p>\n<p>Columns are relatively easy to remove from a NumPy array or Pandas DataFrame.<\/p>\n<p>One approach is to record all columns that have a single unique value, then delete them from the Pandas DataFrame by calling the <a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.drop.html\">drop() function<\/a>.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># delete columns with a single unique value\r\nfrom pandas import read_csv\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/oil-spill.csv'\r\n# load the dataset\r\ndf = read_csv(path, header=None)\r\nprint(df.shape)\r\n# get number of unique values for each column\r\ncounts = df.nunique()\r\n# record columns to delete\r\nto_del = [i for i,v in enumerate(counts) if v == 1]\r\nprint(to_del)\r\n# drop useless columns\r\ndf.drop(to_del, axis=1, inplace=True)\r\nprint(df.shape)<\/pre>\n<p>Running the example first loads the dataset and reports the number of rows and columns.<\/p>\n<p>The number of unique values for each column is calculated, and those columns that have a single unique value are identified. In this case, that is column index 22.<\/p>\n<p>The identified columns are then removed from the DataFrame, and the number of rows and columns in the DataFrame is reported to confirm the change.<\/p>\n<pre class=\"crayon-plain-tag\">(937, 50)\r\n[22]\r\n(937, 49)<\/pre>\n<\/p>\n<h2>Consider Columns That Have Very Few Values<\/h2>\n<p>In the previous section, we saw that some columns in the example dataset had very few unique values.<\/p>\n<p>For example, there were columns that only had 2, 4, and 9 unique values. 
This might make sense for ordinal or categorical variables. In this case, the dataset only contains numerical variables. As such, only having 2, 4, or 9 unique numerical values in a column might be surprising.<\/p>\n<p>These columns may or may not contribute to the skill of a model.<\/p>\n<p>Depending on the choice of data preparation and modeling algorithms, variables with very few numerical values can also cause errors or unexpected results. For example, I have seen them cause errors when using power transforms for data preparation and when fitting linear models that assume a &ldquo;<em>sensible<\/em>&rdquo; data probability distribution.<\/p>\n<p>To help highlight columns of this type, you can calculate the number of unique values for each variable as a percentage of the total number of rows in the dataset.<\/p>\n<p>Let&rsquo;s do this manually using NumPy. The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># summarize the percentage of unique values for each column using numpy\r\nfrom urllib.request import urlopen\r\nfrom numpy import loadtxt\r\nfrom numpy import unique\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/oil-spill.csv'\r\n# load the dataset\r\ndata = loadtxt(urlopen(path), delimiter=',')\r\n# summarize the number of unique values in each column\r\nfor i in range(data.shape[1]):\r\n\tnum = len(unique(data[:, i]))\r\n\tpercentage = float(num) \/ data.shape[0] * 100\r\n\tprint('%d, %d, %.1f%%' % (i, num, percentage))<\/pre>\n<p>Running the example reports the column index and the number of unique values for each column, followed by the percentage of unique values out of all rows in the dataset.<\/p>\n<p>Here, we can see that some columns have a very low percentage of unique values, such as below 1 percent.<\/p>\n<pre class=\"crayon-plain-tag\">0, 238, 25.4%\r\n1, 297, 31.7%\r\n2, 927, 98.9%\r\n3, 933, 99.6%\r\n4, 179, 19.1%\r\n5, 375, 40.0%\r\n6, 820, 
87.5%\r\n7, 618, 66.0%\r\n8, 561, 59.9%\r\n9, 57, 6.1%\r\n10, 577, 61.6%\r\n11, 59, 6.3%\r\n12, 73, 7.8%\r\n13, 107, 11.4%\r\n14, 53, 5.7%\r\n15, 91, 9.7%\r\n16, 893, 95.3%\r\n17, 810, 86.4%\r\n18, 170, 18.1%\r\n19, 53, 5.7%\r\n20, 68, 7.3%\r\n21, 9, 1.0%\r\n22, 1, 0.1%\r\n23, 92, 9.8%\r\n24, 9, 1.0%\r\n25, 8, 0.9%\r\n26, 9, 1.0%\r\n27, 308, 32.9%\r\n28, 447, 47.7%\r\n29, 392, 41.8%\r\n30, 107, 11.4%\r\n31, 42, 4.5%\r\n32, 4, 0.4%\r\n33, 45, 4.8%\r\n34, 141, 15.0%\r\n35, 110, 11.7%\r\n36, 3, 0.3%\r\n37, 758, 80.9%\r\n38, 9, 1.0%\r\n39, 9, 1.0%\r\n40, 388, 41.4%\r\n41, 220, 23.5%\r\n42, 644, 68.7%\r\n43, 649, 69.3%\r\n44, 499, 53.3%\r\n45, 2, 0.2%\r\n46, 937, 100.0%\r\n47, 169, 18.0%\r\n48, 286, 30.5%\r\n49, 2, 0.2%<\/pre>\n<p>We can update the example to only summarize those variables whose number of unique values is less than 1 percent of the number of rows.<\/p>\n<pre class=\"crayon-plain-tag\"># summarize the percentage of unique values for each column using numpy\r\nfrom urllib.request import urlopen\r\nfrom numpy import loadtxt\r\nfrom numpy import unique\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/oil-spill.csv'\r\n# load the dataset\r\ndata = loadtxt(urlopen(path), delimiter=',')\r\n# summarize the number of unique values in each column\r\nfor i in range(data.shape[1]):\r\n\tnum = len(unique(data[:, i]))\r\n\tpercentage = float(num) \/ data.shape[0] * 100\r\n\tif percentage &lt; 1:\r\n\t\tprint('%d, %d, %.1f%%' % (i, num, percentage))<\/pre>\n<p>Running the example, we can see that 11 of the 50 variables have a number of unique values that is less than 1 percent of the number of rows.<\/p>\n<p>This does not mean that these columns should be deleted, but they require further attention.<\/p>\n<p>For example:<\/p>\n<ul>\n<li>Perhaps the unique values can be encoded as ordinal values?<\/li>\n<li>Perhaps the unique values can be encoded as categorical 
values?<\/li>\n<li>Perhaps compare model skill with each variable removed from the dataset?<\/li>\n<\/ul>\n<pre class=\"crayon-plain-tag\">21, 9, 1.0%\r\n22, 1, 0.1%\r\n24, 9, 1.0%\r\n25, 8, 0.9%\r\n26, 9, 1.0%\r\n32, 4, 0.4%\r\n36, 3, 0.3%\r\n38, 9, 1.0%\r\n39, 9, 1.0%\r\n45, 2, 0.2%\r\n49, 2, 0.2%<\/pre>\n<p>For example, we might want to delete all 11 columns whose number of unique values is less than 1 percent of the number of rows. The example below demonstrates this.<\/p>\n<pre class=\"crayon-plain-tag\"># delete columns where number of unique values is less than 1% of the rows\r\nfrom pandas import read_csv\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/oil-spill.csv'\r\n# load the dataset\r\ndf = read_csv(path, header=None)\r\nprint(df.shape)\r\n# get number of unique values for each column\r\ncounts = df.nunique()\r\n# record columns to delete\r\nto_del = [i for i,v in enumerate(counts) if (float(v)\/df.shape[0]*100) &lt; 1]\r\nprint(to_del)\r\n# drop useless columns\r\ndf.drop(to_del, axis=1, inplace=True)\r\nprint(df.shape)<\/pre>\n<p>Running the example first loads the dataset and reports the number of rows and columns.<\/p>\n<p>The number of unique values for each column is calculated, and those columns that have a number of unique values less than 1 percent of the rows are identified. 
In this case, there are 11 such columns.<\/p>\n<p>The identified columns are then removed from the DataFrame, and the number of rows and columns in the DataFrame is reported to confirm the change.<\/p>\n<pre class=\"crayon-plain-tag\">(937, 50)\r\n[21, 22, 24, 25, 26, 32, 36, 38, 39, 45, 49]\r\n(937, 39)<\/pre>\n<\/p>\n<h2>Identify Rows That Contain Duplicate Data<\/h2>\n<p>Rows that have identical data are probably useless, if not dangerously misleading during model evaluation.<\/p>\n<p>Here, a duplicate row is a row whose values match those of another row in every column.<\/p>\n<p>From a probabilistic perspective, you can think of duplicate data as adjusting the priors for a class label or data distribution. This may help an algorithm like Naive Bayes if you wish to purposefully bias the priors. Typically, this is not the case and machine learning algorithms will perform better by identifying and removing rows with duplicate data.<\/p>\n<p>From an algorithm evaluation perspective, duplicate rows will result in misleading performance estimates. For example, if you are using a train\/test split or <a href=\"https:\/\/machinelearningmastery.com\/k-fold-cross-validation\/\">k-fold cross-validation<\/a>, then it is possible for a duplicate row or rows to appear in both train and test datasets and any evaluation of the model on these rows will be (or should be) correct. This will result in an optimistically biased estimate of performance on unseen data.<\/p>\n<p>If you think this is not the case for your dataset or chosen model, design a controlled experiment to test it. This could be achieved by evaluating model skill with the raw dataset and the dataset with duplicates removed and comparing performance. 
Another experiment might involve augmenting the dataset with different numbers of randomly selected duplicate examples.<\/p>\n<p>The pandas function <a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.duplicated.html\">duplicated()<\/a> will report whether a given row is duplicated or not. Each row is marked as either False to indicate that it is not a duplicate or True to indicate that it is a duplicate. If there are duplicates, the first occurrence of the row is marked False (by default), as we might expect.<\/p>\n<p>The example below checks for duplicates.<\/p>\n<pre class=\"crayon-plain-tag\"># locate rows of duplicate data\r\nfrom pandas import read_csv\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/iris.csv'\r\n# load the dataset\r\ndf = read_csv(path, header=None)\r\n# calculate duplicates\r\ndups = df.duplicated()\r\n# report if there are any duplicates\r\nprint(dups.any())\r\n# list all duplicate rows\r\nprint(df[dups])<\/pre>\n<p>Running the example first loads the dataset, then calculates row duplicates.<\/p>\n<p>First, the presence of any duplicate rows is reported, and in this case, we can see that there are duplicates (True).<\/p>\n<p>Then all duplicate rows are reported. 
In this case, three duplicate rows were identified and are printed.<\/p>\n<pre class=\"crayon-plain-tag\">True\r\n       0    1    2    3               4\r\n34   4.9  3.1  1.5  0.1     Iris-setosa\r\n37   4.9  3.1  1.5  0.1     Iris-setosa\r\n142  5.8  2.7  5.1  1.9  Iris-virginica<\/pre>\n<\/p>\n<h2>Delete Rows That Contain Duplicate Data<\/h2>\n<p>Rows of duplicate data should probably be deleted from your dataset prior to modeling.<\/p>\n<p>There are many ways to achieve this, although Pandas provides the <a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.drop_duplicates.html\">drop_duplicates() function<\/a> that achieves exactly this.<\/p>\n<p>The example below demonstrates deleting duplicate rows from a dataset.<\/p>\n<pre class=\"crayon-plain-tag\"># delete rows of duplicate data from the dataset\r\nfrom pandas import read_csv\r\n# define the location of the dataset\r\npath = 'https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/iris.csv'\r\n# load the dataset\r\ndf = read_csv(path, header=None)\r\nprint(df.shape)\r\n# delete duplicate rows\r\ndf.drop_duplicates(inplace=True)\r\nprint(df.shape)<\/pre>\n<p>Running the example first loads the dataset and reports the number of rows and columns.<\/p>\n<p>Next, the rows of duplicated data are identified and removed from the DataFrame. 
Then the shape of the DataFrame is reported to confirm the change.<\/p>\n<pre class=\"crayon-plain-tag\">(150, 5)\r\n(147, 5)<\/pre>\n<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Tutorials<\/h3>\n<ul>\n<li><a href=\"https:\/\/machinelearningmastery.com\/load-machine-learning-data-python\/\">How To Load Machine Learning Data in Python<\/a><\/li>\n<li><a href=\"https:\/\/machinelearningmastery.com\/data-cleaning-turn-messy-data-into-tidy-data\/\">Data Cleaning: Turn Messy Data into Tidy Data<\/a><\/li>\n<\/ul>\n<h3>APIs<\/h3>\n<ul>\n<li><a href=\"https:\/\/docs.scipy.org\/doc\/numpy\/reference\/generated\/numpy.unique.html\">numpy.unique API<\/a>.<\/li>\n<li><a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.nunique.html\">pandas.DataFrame.nunique API<\/a>.<\/li>\n<li><a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.drop.html\">pandas.DataFrame.drop API<\/a>.<\/li>\n<li><a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.duplicated.html\">pandas.DataFrame.duplicated API<\/a>.<\/li>\n<li><a href=\"https:\/\/pandas.pydata.org\/pandas-docs\/stable\/reference\/api\/pandas.DataFrame.drop_duplicates.html\">pandas.DataFrame.drop_duplicates API<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered basic data cleaning you should always perform on your dataset.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to identify and remove column variables that only have a single value.<\/li>\n<li>How to identify and consider column variables with very few unique values.<\/li>\n<li>How to identify and remove rows that contain duplicate observations.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" 
href=\"https:\/\/machinelearningmastery.com\/basic-data-cleaning-for-machine-learning\/\">Basic Data Cleaning for Machine Learning (That You Must Perform)<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/basic-data-cleaning-for-machine-learning\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Data cleaning is a critically important step in any machine learning project. In tabular data, there are many different statistical analysis and [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2020\/03\/19\/basic-data-cleaning-for-machine-learning-that-you-must-perform\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":3255,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/3254"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=3254"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/3254\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-jso
n\/wp\/v2\/media\/3255"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=3254"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=3254"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=3254"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}