{"id":2759,"date":"2019-10-31T18:00:06","date_gmt":"2019-10-31T18:00:06","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/10\/31\/a-gentle-introduction-to-expectation-maximization-em-algorithm\/"},"modified":"2019-10-31T18:00:06","modified_gmt":"2019-10-31T18:00:06","slug":"a-gentle-introduction-to-expectation-maximization-em-algorithm","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/10\/31\/a-gentle-introduction-to-expectation-maximization-em-algorithm\/","title":{"rendered":"A Gentle Introduction to Expectation-Maximization (EM Algorithm)"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p>Maximum likelihood estimation is an approach to density estimation for a dataset by searching across probability distributions and their parameters.<\/p>\n<p>It is a general and effective approach that underlies many machine learning algorithms, although it requires that the training dataset is complete, i.e. all relevant interacting random variables are present. Maximum likelihood becomes intractable if there are variables that interact with those in the dataset but are hidden or not observed, so-called latent variables.<\/p>\n<p>The expectation-maximization algorithm is an approach for performing maximum likelihood estimation in the presence of latent variables. It does this by first estimating the values for the latent variables, then optimizing the model, then repeating these two steps until convergence. 
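The estimate-then-optimize loop just described can be sketched for a simple mixture of two one-dimensional Gaussians. This is a minimal illustrative sketch, not the post's own code: the data, initial guesses, and fixed iteration count are assumptions made here.

```python
# A minimal sketch of the EM loop described above for two 1-D Gaussians.
# Data, initial guesses, and the iteration count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
x = np.hstack([rng.normal(20, 5, 300), rng.normal(40, 5, 700)])

mu = np.array([10.0, 50.0])    # initial mean guesses
sigma = np.array([5.0, 5.0])   # initial standard deviations
weight = np.array([0.5, 0.5])  # initial mixing weights

for _ in range(50):
    # E-step: estimate the latent variable as the posterior probability
    # (responsibility) of each component for each point.
    dens = weight * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-fit each component's parameters by weighted maximum likelihood.
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    weight = nk / len(x)

print(sorted(mu))  # the means should converge near 20 and 40
```

Each pass re-estimates the responsibilities from the current parameters, then the parameters from the responsibilities, which is exactly the alternation the paragraph above describes.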
It is an effective and general approach and is most commonly used for density estimation with missing data, such as clustering algorithms like the Gaussian Mixture Model.<\/p>\n<p>In this post, you will discover the expectation-maximization algorithm.<\/p>\n<p>After reading this post, you will know:<\/p>\n<ul>\n<li>Maximum likelihood estimation is challenging on data in the presence of latent variables.<\/li>\n<li>Expectation maximization provides an iterative solution to maximum likelihood estimation with latent variables.<\/li>\n<li>Gaussian mixture models are an approach to density estimation where the parameters of the distributions are fit using the expectation-maximization algorithm.<\/li>\n<\/ul>\n<p>Discover Bayes optimization, Naive Bayes, maximum likelihood, distributions, cross entropy, and much more <a href=\"https:\/\/machinelearningmastery.com\/probability-for-machine-learning\/\" rel=\"nofollow\">in my new book<\/a>, with 28 step-by-step tutorials and full Python source code.<\/p>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_8937\" style=\"width: 649px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8937\" class=\"size-full wp-image-8937\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/11\/A-Gentle-Introduction-to-Expectation-Maximization-EM-Algorithm.jpg\" alt=\"A Gentle Introduction to Expectation Maximization (EM Algorithm)\" width=\"639\" height=\"427\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/11\/A-Gentle-Introduction-to-Expectation-Maximization-EM-Algorithm.jpg 639w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/11\/A-Gentle-Introduction-to-Expectation-Maximization-EM-Algorithm-300x200.jpg 300w\" sizes=\"(max-width: 639px) 100vw, 639px\"><\/p>\n<p id=\"caption-attachment-8937\" class=\"wp-caption-text\">A Gentle Introduction to Expectation 
Maximization (EM Algorithm)<br \/>Photo by <a href=\"https:\/\/www.flickr.com\/photos\/valcker\/36015631850\/\">valcker<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Overview<\/h2>\n<p>This tutorial is divided into four parts; they are:<\/p>\n<ol>\n<li>Problem of Latent Variables for Maximum Likelihood<\/li>\n<li>Expectation-Maximization Algorithm<\/li>\n<li>Gaussian Mixture Model and the EM Algorithm<\/li>\n<li>Example of Gaussian Mixture Model<\/li>\n<\/ol>\n<h2>Problem of Latent Variables for Maximum Likelihood<\/h2>\n<p>A common modeling problem involves how to estimate a joint probability distribution for a dataset.<\/p>\n<p><a href=\"https:\/\/machinelearningmastery.com\/probability-density-estimation\/\">Density estimation<\/a> involves selecting a probability distribution function and the parameters of that distribution that best explain the joint probability distribution of the observed data.<\/p>\n<p>There are many techniques for solving this problem, although a common approach is called maximum likelihood estimation, or simply \u201c<em>maximum likelihood<\/em>.\u201d<\/p>\n<p><a href=\"https:\/\/machinelearningmastery.com\/what-is-maximum-likelihood-estimation-in-machine-learning\/\">Maximum Likelihood Estimation<\/a> involves treating the problem as an optimization or search problem, where we seek a set of parameters that results in the best fit for the joint probability of the data sample.<\/p>\n<p>A limitation of maximum likelihood estimation is that it assumes that the dataset is complete, or fully observed. This does not mean that the model has access to all data; instead, it assumes that all variables that are relevant to the problem are present.<\/p>\n<p>This is not always the case. 
There may be datasets where only some of the relevant variables can be observed, and some cannot, and although they influence other random variables in the dataset, they remain hidden.<\/p>\n<p>More generally, these unobserved or hidden variables are referred to as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Latent_variable\">latent variables<\/a>.<\/p>\n<blockquote>\n<p>Many real-world problems have hidden variables (sometimes called latent variables), which are not observable in the data that are available for learning.<\/p>\n<\/blockquote>\n<p>\u2014 Page 816, <a href=\"https:\/\/amzn.to\/2Y7yCpO\">Artificial Intelligence: A Modern Approach<\/a>, 3rd edition, 2009.<\/p>\n<p>Conventional maximum likelihood estimation does not work well in the presence of latent variables.<\/p>\n<blockquote>\n<p>\u2026 if we have missing data and\/or latent variables, then computing the [maximum likelihood] estimate becomes hard.<\/p>\n<\/blockquote>\n<p>\u2014 Page 349, <a href=\"https:\/\/amzn.to\/2xKSTCP\">Machine Learning: A Probabilistic Perspective<\/a>, 2012.<\/p>\n<p>Instead, an alternate formulation of maximum likelihood is required for searching for the appropriate model parameters in the presence of latent variables.<\/p>\n<p>The Expectation-Maximization algorithm is one such approach.<\/p>\n<h2>Expectation-Maximization Algorithm<\/h2>\n<p>The Expectation-Maximization Algorithm, or EM algorithm for short, is an approach for maximum likelihood estimation in the presence of latent variables.<\/p>\n<blockquote>\n<p>A general technique for finding maximum likelihood estimators in latent variable models is the expectation-maximization (EM) algorithm.<\/p>\n<\/blockquote>\n<p>\u2014 Page 424, <a href=\"https:\/\/amzn.to\/2JwHE7I\">Pattern Recognition and Machine Learning<\/a>, 2006.<\/p>\n<p>The EM algorithm is an iterative approach that cycles between two modes. The first mode attempts to estimate the missing or latent variables, called the estimation-step or E-step. The second mode attempts to optimize the parameters of the model to best explain the data, called the maximization-step or M-step.<\/p>\n<ul>\n<li><strong>E-Step<\/strong>. Estimate the missing variables in the dataset.<\/li>\n<li><strong>M-Step<\/strong>. 
Optimize the parameters of the model given the estimated data.<\/li>\n<\/ul>\n<p>The EM algorithm can be applied quite widely, although it is perhaps most well known in machine learning for use in unsupervised learning problems, such as density estimation and clustering.<\/p>\n<p>Perhaps the most discussed application of the EM algorithm is for clustering with a mixture model.<\/p>\n<h2>Gaussian Mixture Model and the EM Algorithm<\/h2>\n<p>A <a href=\"https:\/\/en.wikipedia.org\/wiki\/Mixture_model\">mixture model<\/a> is a model comprised of an unspecified combination of multiple probability distribution functions.<\/p>\n<p>A statistical procedure or learning algorithm is used to estimate the parameters of the probability distributions to best fit the density of a given training dataset.<\/p>\n<p>The Gaussian Mixture Model, or GMM for short, is a mixture model that uses a combination of Gaussian (Normal) probability distributions and requires the estimation of the mean and standard deviation parameters for each.<\/p>\n<p>There are many techniques for estimating the parameters for a GMM, although a maximum likelihood estimate is perhaps the most common.<\/p>\n<p>Consider the case where a dataset is comprised of many points that happen to be generated by two different processes. The points for each process have a Gaussian probability distribution, but the data is combined and the distributions are similar enough that it is not obvious to which distribution a given point may belong.<\/p>\n<p>The process used to generate each data point represents a latent variable, e.g. process 0 and process 1. It influences the data but is not observable. 
As such, the EM algorithm is an appropriate approach for estimating the parameters of the distributions.<\/p>\n<p>In the EM algorithm, the estimation-step would estimate a value for the process latent variable for each data point, and the maximization-step would optimize the parameters of the probability distributions in an attempt to best capture the density of the data. The process is repeated until a good set of latent values is found and a maximum likelihood estimate that fits the data is achieved.<\/p>\n<ul>\n<li><strong>E-Step<\/strong>. Estimate the expected value for each latent variable.<\/li>\n<li><strong>M-Step<\/strong>. Optimize the parameters of the distribution using maximum likelihood.<\/li>\n<\/ul>\n<p>We can imagine how this optimization procedure could be constrained to just the distribution means, or generalized to a mixture of many different Gaussian distributions.<\/p>\n<h2>Example of Gaussian Mixture Model<\/h2>\n<p>We can make the application of the EM algorithm to a Gaussian Mixture Model concrete with a worked example.<\/p>\n<p>First, let\u2019s contrive a problem in which we have a dataset whose points are generated from one of two Gaussian processes. The points are one-dimensional, the mean of the first distribution is 20, the mean of the second distribution is 40, and both distributions have a standard deviation of 5.<\/p>\n<p>We will draw 3,000 points from the first process and 7,000 points from the second process and mix them together.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# generate a sample\r\nX1 = normal(loc=20, scale=5, size=3000)\r\nX2 = normal(loc=40, scale=5, size=7000)\r\nX = hstack((X1, X2))<\/pre>\n<p>We can then plot a histogram of the points to give an intuition for the dataset. 
We expect to see a bimodal distribution with a peak for each of the means of the two distributions.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of a bimodal dataset constructed from two gaussian processes\r\nfrom numpy import hstack\r\nfrom numpy.random import normal\r\nfrom matplotlib import pyplot\r\n# generate a sample\r\nX1 = normal(loc=20, scale=5, size=3000)\r\nX2 = normal(loc=40, scale=5, size=7000)\r\nX = hstack((X1, X2))\r\n# plot the histogram\r\npyplot.hist(X, bins=50, density=True)\r\npyplot.show()<\/pre>\n<p>Running the example creates the dataset and then creates a histogram plot for the data points.<\/p>\n<p>The plot clearly shows the expected bimodal distribution with a peak for the first process around 20 and a peak for the second process around 40.<\/p>\n<p>We can see that for many of the points in the middle of the two peaks, it is ambiguous which distribution they were drawn from.<\/p>\n<div id=\"attachment_8936\" style=\"width: 1290px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-8936\" class=\"size-full wp-image-8936\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2019\/08\/Histogram-of-Dataset-Constructed-from-Two-Different-Gaussian-Processes.png\" alt=\"Histogram of Dataset Constructed From Two Different Gaussian Processes\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/Histogram-of-Dataset-Constructed-from-Two-Different-Gaussian-Processes.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/Histogram-of-Dataset-Constructed-from-Two-Different-Gaussian-Processes-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/Histogram-of-Dataset-Constructed-from-Two-Different-Gaussian-Processes-768x576.png 768w, 
http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2019\/08\/Histogram-of-Dataset-Constructed-from-Two-Different-Gaussian-Processes-1024x768.png 1024w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-8936\" class=\"wp-caption-text\">Histogram of Dataset Constructed From Two Different Gaussian Processes<\/p>\n<\/div>\n<p>We can model the problem of estimating the density of this dataset using a Gaussian Mixture Model.<\/p>\n<p>The <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.mixture.GaussianMixture.html\">GaussianMixture<\/a> scikit-learn class can be used to model this problem and estimate the parameters of the distributions using the expectation-maximization algorithm.<\/p>\n<p>The class allows us to specify the suspected number of underlying processes used to generate the data via the <em>n_components<\/em> argument when defining the model. We will set this to 2 for the two processes or distributions.<\/p>\n<p>If the number of processes was not known, a range of different numbers of components could be tested and the model with the best fit could be chosen, where models could be evaluated using scores such as Akaike or Bayesian Information Criterion (AIC or BIC).<\/p>\n<p>There are also many ways we can configure the model to incorporate other information we may know about the data, such as how to estimate initial values for the distributions. 
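Where the number of processes is not known in advance, the AIC/BIC model selection described above can be sketched as follows. This is a hedged sketch, not part of the original post: the candidate range of one to four components and the random_state are assumptions added here.

```python
# Sketch of selecting the number of mixture components by BIC.
# Candidate range 1..4 and random_state are illustrative assumptions.
from numpy import hstack
from numpy.random import normal
from sklearn.mixture import GaussianMixture

# same two-process sample as in the post
X1 = normal(loc=20, scale=5, size=3000)
X2 = normal(loc=40, scale=5, size=7000)
X = hstack((X1, X2)).reshape(-1, 1)

# fit one model per candidate component count, keep the lowest-BIC model
models = [GaussianMixture(n_components=k, random_state=1).fit(X) for k in range(1, 5)]
best = min(models, key=lambda m: m.bic(X))
print(best.n_components)  # the two-component model typically scores best
```

Swapping `m.bic(X)` for `m.aic(X)` selects by AIC instead; BIC penalizes extra components more strongly.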
In this case, we will randomly guess the initial parameters by setting the <em>init_params<\/em> argument to \u2018random\u2019.<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# fit model\r\nmodel = GaussianMixture(n_components=2, init_params='random')\r\nmodel.fit(X)<\/pre>\n<p>Once the model is fit, we can access the learned parameters via attributes on the model, such as the means, covariances, mixing weights, and more.<\/p>\n<p>More usefully, we can use the fit model to estimate the latent variable values for existing and new data points.<\/p>\n<p>For example, we can estimate the latent variable for the points in the training dataset and we would expect the first 3,000 points to belong to the first process (<em>value=0<\/em>) and the next 7,000 data points to belong to the second process (<em>value=1<\/em>).<\/p>\n<pre class=\"crayon-plain-tag\">...\r\n# predict latent values\r\nyhat = model.predict(X)\r\n# check latent value for first few points, expect 0\r\nprint(yhat[:100])\r\n# check latent value for last few points, expect 1\r\nprint(yhat[-100:])<\/pre>\n<p>Tying all of this together, the complete example is listed below.<\/p>\n<pre class=\"crayon-plain-tag\"># example of fitting a gaussian mixture model with expectation maximization\r\nfrom numpy import hstack\r\nfrom numpy.random import normal\r\nfrom sklearn.mixture import GaussianMixture\r\n# generate a sample\r\nX1 = normal(loc=20, scale=5, size=3000)\r\nX2 = normal(loc=40, scale=5, size=7000)\r\nX = hstack((X1, X2))\r\n# reshape into a table with one column\r\nX = X.reshape((len(X), 1))\r\n# fit model\r\nmodel = GaussianMixture(n_components=2, init_params='random')\r\nmodel.fit(X)\r\n# predict latent values\r\nyhat = model.predict(X)\r\n# check latent value for first few points, expect 0\r\nprint(yhat[:100])\r\n# check latent value for last few points, expect 1\r\nprint(yhat[-100:])<\/pre>\n<p>Running the example fits the Gaussian mixture model on the prepared dataset using the EM algorithm. 
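The learned parameters mentioned above can also be inspected directly after fitting. This is a minimal sketch assuming scikit-learn's GaussianMixture attribute names (means_, weights_, covariances_); the default initialization and a random_state are used here for reproducibility rather than the post's init_params='random'.

```python
# Sketch of inspecting the learned GMM parameters after fitting.
# Uses default initialization and a fixed random_state (assumptions).
from numpy import hstack
from numpy.random import normal
from sklearn.mixture import GaussianMixture

X1 = normal(loc=20, scale=5, size=3000)
X2 = normal(loc=40, scale=5, size=7000)
X = hstack((X1, X2)).reshape(-1, 1)

model = GaussianMixture(n_components=2, random_state=1).fit(X)
print(model.means_.ravel())        # component means, expected near 20 and 40
print(model.weights_)              # mixing weights, expected near 0.3 and 0.7
print(model.covariances_.ravel())  # variances, expected near 25 (std dev 5)
```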
Once fit, the model is used to predict the latent variable values for the examples in the training dataset.<\/p>\n<p>Your specific results may vary given the stochastic nature of the learning algorithm.<\/p>\n<p>In this case, we can see that, at least for the first few and last few examples in the dataset, the model mostly predicts the correct value for the latent variable. Note that which process is labeled 0 and which is labeled 1 is arbitrary and may be swapped from run to run. It\u2019s a generally challenging problem, and it is expected that the points between the peaks of the two distributions will remain ambiguous and may be assigned to either process.<\/p>\n<pre class=\"crayon-plain-tag\">[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\r\n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\r\n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]\r\n[0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1\r\n 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0\r\n 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]<\/pre>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Books<\/h3>\n<ul>\n<li>Section 8.5 The EM Algorithm, <a href=\"https:\/\/amzn.to\/2YVqu8s\">The Elements of Statistical Learning<\/a>, 2016.<\/li>\n<li>Chapter 9 Mixture Models and EM, <a href=\"https:\/\/amzn.to\/2JwHE7I\">Pattern Recognition and Machine Learning<\/a>, 2006.<\/li>\n<li>Section 6.12 The EM Algorithm, <a href=\"https:\/\/amzn.to\/2jWd51p\">Machine Learning<\/a>, 1997.<\/li>\n<li>Chapter 11 Mixture models and the EM algorithm, <a href=\"https:\/\/amzn.to\/2xKSTCP\">Machine Learning: A Probabilistic Perspective<\/a>, 2012.<\/li>\n<li>Section 9.3 Clustering And Probability Density Estimation, <a href=\"https:\/\/amzn.to\/2lnW5S7\">Data Mining: Practical Machine Learning Tools and Techniques<\/a>, 4th edition, 2016.<\/li>\n<li>Section 20.3 Learning With Hidden Variables: The EM Algorithm, <a 
href=\"https:\/\/amzn.to\/2Y7yCpO\">Artificial Intelligence: A Modern Approach<\/a>, 3rd edition, 2009.<\/li>\n<\/ul>\n<h3>API<\/h3>\n<ul>\n<li><a href=\"https:\/\/scikit-learn.org\/stable\/modules\/mixture.html\">Gaussian mixture models, scikit-learn API<\/a>.<\/li>\n<li><a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.mixture.GaussianMixture.html\">sklearn.mixture.GaussianMixture API<\/a>.<\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Maximum_likelihood_estimation\">Maximum likelihood estimation, Wikipedia<\/a>.<\/li>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Expectation%E2%80%93maximization_algorithm\">Expectation-maximization algorithm, Wikipedia<\/a>.<\/li>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Mixture_model\">Mixture model, Wikipedia<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this post, you discovered the expectation-maximization algorithm.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>Maximum likelihood estimation is challenging on data in the presence of latent variables.<\/li>\n<li>Expectation maximization provides an iterative solution to maximum likelihood estimation with latent variables.<\/li>\n<li>Gaussian mixture models are an approach to density estimation where the parameters of the distributions are fit using the expectation-maximization algorithm.<\/li>\n<\/ul>\n<p>Do you have any questions?<br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/expectation-maximization-em-algorithm\/\">A Gentle Introduction to Expectation-Maximization (EM Algorithm)<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/expectation-maximization-em-algorithm\/\">Go to 
Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Maximum likelihood estimation is an approach to density estimation for a dataset by searching across probability distributions and their parameters. It is [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/10\/31\/a-gentle-introduction-to-expectation-maximization-em-algorithm\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":2760,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2759"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2759"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2759\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/2760"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2759"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2759"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?pos
t=2759"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}