{"id":4959,"date":"2021-08-27T06:34:07","date_gmt":"2021-08-27T06:34:07","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2021\/08\/27\/the-inverse-problem-in-random-dynamical-systems\/"},"modified":"2021-08-27T06:34:07","modified_gmt":"2021-08-27T06:34:07","slug":"the-inverse-problem-in-random-dynamical-systems","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2021\/08\/27\/the-inverse-problem-in-random-dynamical-systems\/","title":{"rendered":"The Inverse Problem in Random Dynamical Systems"},"content":{"rendered":"<p>Author: Vincent Granville<\/p>\n<div>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9484111501?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9484111501?profile=RESIZE_710x\" width=\"500\" class=\"align-center\"><\/a><\/p>\n<p>We are dealing here with random variables recursively defined by <em>X<\/em><span style=\"font-size: 8pt;\"><em>n<\/em>+1<\/span> = <em>g<\/em>(<em>X<\/em><span style=\"font-size: 8pt;\">n<\/span>), with <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span> being the initial condition. The examples discussed here are simple, discrete and one-dimensional: the purpose is to illustrate the concepts so that it can be understood and useful to a large audience, not just to mathematicians. I wrote many articles about dynamical systems, see for example <a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/an-easy-way-to-solve-complex-optimization-problems\" target=\"_blank\" rel=\"noopener\">here<\/a>. The originality in this article is that the systems discussed are now random, as <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span> is a random variable. Applications include the design of non-periodic pseudorandom number generators, and cryptography. 
Also, such systems, especially more complex ones such as fully stochastic dynamical systems, are routinely used in financial modeling of commodity prices.<\/p>\n<p>We focus on mappings <em>g<\/em> on the fixed interval [0, 1]. That is, the support domain of <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em> is [0, 1], and <em>g<\/em> is a many-to-one mapping onto [0, 1]. The most trivial example, known as the dyadic or Bernoulli map, is when <em>g<\/em>(<em>x<\/em>) = 2<em>x<\/em> &#8211; INT(2<em>x<\/em>) = { 2<em>x<\/em> }, where the curly brackets represent the fractional part function (see <a href=\"https:\/\/en.wikipedia.org\/wiki\/Fractional_part\" target=\"_blank\" rel=\"noopener\">here<\/a>). This is sometimes denoted as <em>g<\/em>(<em>x<\/em>) = 2<em>x<\/em> mod 1. The best-known and possibly oldest example is the logistic map (see <a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/logistic-map-chaos-randomness-and-quantum-algorithms\" target=\"_blank\" rel=\"noopener\">here<\/a>) with <em>g<\/em>(<em>x<\/em>) = 4<em>x<\/em>(1 &#8211; <em>x<\/em>).<\/p>\n<p>We start with a simple exercise that requires very little mathematical knowledge, but a good amount of out-of-the-box thinking. The solution is provided. The discussion is about a specific, original problem, referred to as the inverse problem, and introduced in section 2. The reasons for being interested in the inverse problem are also discussed. Finally, I provide an Excel spreadsheet with all my simulations, for replication purposes.<\/p>\n<p><span style=\"font-size: 14pt;\"><strong>1. The standard problem<\/strong><\/span><\/p>\n<p>One of the main problems in dynamical systems is to determine whether the distribution of <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em> converges, and to find the limit, called the invariant measure, invariant distribution, fixed-point distribution, or attractor. 
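This convergence is easy to observe numerically. The sketch below is my own illustration in Python with NumPy (the article itself uses Excel): it iterates the logistic map on 10,000 uniform starting points. The invariant density of the logistic map is known in closed form (the arcsine density 1 / (π √(x(1 − x))), with mean 1/2 and most of its mass near the endpoints), so the empirical distribution after many iterations should reflect that.

```python
import numpy as np

rng = np.random.default_rng(42)

# 10,000 random initial conditions X_1, uniform on [0, 1)
x = rng.random(10_000)

# iterate X_{n+1} = g(X_n) with the logistic map g(x) = 4x(1 - x)
for _ in range(100):
    x = 4.0 * x * (1.0 - x)

# the empirical distribution has stabilized near the arcsine attractor:
# mean close to 1/2, with heavy concentration of mass near 0 and 1
print(round(float(x.mean()), 2))
print(round(float((x < 0.1).mean() + (x > 0.9).mean()), 2))
```

The second printed number (the fraction of mass within 0.1 of the endpoints) is large, which is the signature of the arcsine attractor rather than of the uniform starting distribution.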
The attractor, depending on <em>g<\/em>, is typically the same regardless of the initial condition <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>, except for some special initial conditions causing problems (this set of bad initial conditions has Lebesgue measure zero, and we ignore it here). As an example, with the Bernoulli map\u00a0<em>g<\/em>(<em>x<\/em>) = { 2<em>x<\/em> }, all rational numbers (and many other numbers) are bad initial conditions. They are, however, far outnumbered by good initial conditions. It is typically very difficult to determine whether a specific initial condition is a good one. Proving that\u00a0<span><em>\u03c0<\/em>\/4 is a good initial condition for the Bernoulli\u00a0map would be a major accomplishment: it would make you instantly famous in the mathematical community, as it would show that the digits of\u00a0<em>\u03c0<\/em> in base 2 behave exactly like independently\u00a0and identically distributed Bernoulli\u00a0random variables. Good initial conditions for the Bernoulli\u00a0map are called <a href=\"https:\/\/en.wikipedia.org\/wiki\/Normal_number\" target=\"_blank\" rel=\"noopener\">normal numbers<\/a> in base 2.<\/span><\/p>\n<p>It is also assumed that the dynamical system is <a href=\"https:\/\/en.wikipedia.org\/wiki\/Ergodicity\" target=\"_blank\" rel=\"noopener\">ergodic<\/a>: all systems investigated here are ergodic; I won&#8217;t elaborate on this concept, but the curious, math-savvy reader can check the meaning on Wikipedia. Finding the attractor is a difficult problem, and it usually requires solving a stochastic integral equation. 
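The remark that rational numbers are bad initial conditions for the Bernoulli map has a practical consequence worth demonstrating: floating-point numbers are themselves dyadic rationals (in NumPy's default generator, uniform deviates have the form k / 2^53), so a naive simulation of { 2x } collapses to the fixed point 0 within 53 iterations. The sketch below is my own Python illustration of this effect:

```python
import numpy as np

rng = np.random.default_rng(0)

# NumPy uniform deviates have the form k / 2^53, i.e. dyadic rationals,
# which are bad initial conditions for the Bernoulli map g(x) = { 2x }
x = rng.random(1000)

steps = 0
while x.max() > 0.0:
    x = 2.0 * x - np.floor(2.0 * x)  # the Bernoulli (dyadic) map, exact in binary
    steps += 1

# every trajectory has been absorbed at the fixed point 0
print(steps)  # at most 53, the number of mantissa bits of a double
```

This is the same phenomenon behind the Excel caveat mentioned in section 2.2: in binary floating point, iterating a base-2 shift map destroys one bit of the seed per step.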
Except on rare occasions (discussed\u00a0<a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/number-representation-systems-explained-in-one-picture\" target=\"_blank\" rel=\"noopener\">here<\/a>\u00a0and in my book, <a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/fee-book-applied-stochastic-processes\" target=\"_blank\" rel=\"noopener\">here<\/a>), no exact solution is known, and one needs to use numerical methods to find an approximation. This is illustrated in section 1.1, with the attractor found (approximately) using simulations in Excel. In section 2, we focus on the much easier inverse problem, which is the main topic of this article.<\/p>\n<p><strong>1.1. Standard problem: example<\/strong><\/p>\n<p>Let&#8217;s start with <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span> defined as follows: <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span> = <em>U<\/em> \/ (1 &#8211; <em>U<\/em>)^<em><span>\u03b1<\/span><\/em>, where <em>U<\/em> is a uniform deviate on [0, 1], <em>\u03b1<\/em>\u00a0= 0.25, and ^ denotes the power operator (2^3 = 8). We use <em>g<\/em>(<em>x<\/em>) = { 4<em>x<\/em>(1 &#8211; <em>x<\/em>) }, where { } denotes the fractional part function. Essentially, this is the logistic map. I produced 10,000 deviates for <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>, and then applied the mapping <em>g<\/em> iteratively to each of these deviates, up to <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em> with <em>n<\/em> = 53. The scatterplot below represents the empirical percentile distribution function (PDF) for <em>X<\/em><span style=\"font-size: 8pt;\">3<\/span> in blue and for <em>X<\/em><span style=\"font-size: 8pt;\">53<\/span> in orange. These PDFs, for <em>X<\/em><span style=\"font-size: 8pt;\">2<\/span>, <em>X<\/em><span style=\"font-size: 8pt;\">3<\/span>, and so on, slowly converge to a limit, corresponding to the attractor. 
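For readers who prefer code to spreadsheets, the experiment of section 1.1 can be reproduced as follows. This is my own Python/NumPy transcription of the Excel setup described above (same X1 distribution, same map, 10,000 deviates, iterated up to n = 53):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.25

# X_1 = U / (1 - U)^alpha, with U uniform on [0, 1), as in section 1.1
u = rng.random(10_000)
x = u / (1.0 - u) ** alpha

# iterate g(x) = { 4x(1 - x) }, i.e. the logistic map followed by the
# fractional-part function, taking X_1 up to X_53
for _ in range(52):
    y = 4.0 * x * (1.0 - x)
    x = y - np.floor(y)   # fractional part, valid for negative y as well

# the empirical percentile curve of X_53 (the orange S-curve in the figure)
pdf_53 = np.percentile(x, np.arange(1, 100))
print(round(float(np.median(x)), 2))
```

The first application of g folds the unbounded X1 into [0, 1); from then on the dynamics are essentially those of the logistic map, and the percentile curve stabilizes after a few dozen iterations.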
The orange S-curve (<em>n<\/em> = 53) is extremely close to the limiting PDF, and additional iterations (that is, increasing <em>n<\/em>) barely provide any change. So we found the limit (approximately) using simulations. Note that the cumulative distribution function (CDF) is the inverse of the PDF. All this was done with Excel alone.<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9483753458?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9483753458?profile=RESIZE_710x\" width=\"250\" class=\"align-center\"><\/a><\/p>\n<p><span style=\"font-size: 14pt;\"><strong>2. The inverse problem<\/strong><\/span><\/p>\n<p>The inverse problem consists of finding <em>g<\/em>, assuming the attractor distribution (the orange curve in the above figure) is known. Typically, there are many possible solutions. One of the possible reasons for solving the inverse problem is to get a sequence of random variables <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>, <em>X<\/em><span style=\"font-size: 8pt;\">2<\/span>, and so on, that exhibits little or no auto-correlation. For instance, the lag-1 auto-correlation (between <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em> and <em>X<\/em><span style=\"font-size: 8pt;\"><em>n<\/em>+1<\/span>) for the Bernoulli map is 1\/2, which may be far too high, depending on the application you have in mind. It is important in cryptography applications to remove these auto-correlations. 
The solution proposed here also satisfies the following property: <em>X<\/em><span style=\"font-size: 8pt;\">2<\/span> = <em>g<\/em>(<em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>), <em>X<\/em><span style=\"font-size: 8pt;\">3<\/span> = <em>g<\/em>(<em>X<\/em><span style=\"font-size: 8pt;\">2<\/span>), <em>X<\/em><span style=\"font-size: 8pt;\">4<\/span> = <em>g<\/em>(<em>X<\/em><span style=\"font-size: 8pt;\">3<\/span>), and so on, all have the same pre-specified attractor distribution, regardless of the (non-singular) distribution of <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>.\u00a0<\/p>\n<p><strong>2.1. Exercise<\/strong><\/p>\n<p>Before diving into a solution, if you have time, I ask you to solve the following simple inverse problem.\u00a0<\/p>\n<p>Find a mapping <em>g<\/em> such that if <em>X<\/em><span style=\"font-size: 8pt;\">n+1<\/span> = <em>g<\/em>(<em>X<\/em><span style=\"font-size: 8pt;\">n<\/span>), the attractor distribution is uniform on [0, 1]. Can you find one yielding very low auto-correlations between the successive <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em>&#8217;s? Hint: <em>g<\/em> need not be continuous.\u00a0<\/p>\n<p><strong>2.2. A general solution to the inverse problem<\/strong><\/p>\n<p>A potential solution to the problem in section 2.1 is <em>g<\/em>(<em>x<\/em>) = { <em>bx<\/em> }, where <em>b<\/em> is an integer larger than 1. This is because the uniform distribution on [0, 1] is the attractor for this map. The case <em>b<\/em> = 2 corresponds to the Bernoulli map discussed earlier. Regardless of <em>b<\/em>, INT(<em>bX<span style=\"font-size: 8pt;\">n<\/span><\/em>) represents the <em>n<\/em>-th digit of <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span> in base <em>b<\/em>. The lag-1 autocorrelation between <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em> and <em>X<\/em><span style=\"font-size: 8pt;\"><em>n<\/em>+1<\/span> is then equal to 1 \/ <em>b<\/em>. Thus, the higher <em>b<\/em>, the better. 
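The 1/b value is easy to verify by simulation. Rather than iterating the map trajectory-wise (which, as noted for Excel, degrades in floating point), the sketch below (my own, in Python) samples X_n directly from the uniform attractor and pairs it with X_{n+1} = { b X_n }, which gives the stationary lag-1 autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
results = {}

for b in (2, 3, 5, 10):
    x = rng.random(n)              # X_n drawn from the uniform attractor
    y = b * x - np.floor(b * x)    # X_{n+1} = { b * X_n }
    results[b] = np.corrcoef(x, y)[0, 1]

# the empirical lag-1 autocorrelation is close to 1/b for each b
print({b: round(r, 3) for b, r in results.items()})
```

With one million samples, each empirical correlation matches 1/b to about three decimal places, confirming that larger values of b yield weaker serial dependence.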
Note that if you use Excel for simulations, avoid even integer values for <em>b<\/em>, as Excel has an internal glitch that will make your simulations meaningless after <em>n<\/em> = 45 iterations or so.\u00a0<\/p>\n<p>Now, a general solution offered here, for any pre-specified attractor and any non-singular distribution for <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>, is based on a result proved <a href=\"https:\/\/mathoverflow.net\/questions\/402341\/invariant-distributions-for-iterated-random-variables-stochastic-dynamical-syst\" target=\"_blank\" rel=\"noopener\">here<\/a>. If\u00a0<em>g<\/em>\u00a0is the solution in question, then all <em>X<span style=\"font-size: 8pt;\">n<\/span><\/em> (with <em>n<\/em>\u00a0 &gt;\u00a0 1) have the same distribution as the pre-specified attractor. I provide an Excel spreadsheet showing how it works for a specific example.<\/p>\n<p>First, let&#8217;s assume that <em>g<\/em>* is a solution when the attractor is the uniform distribution on [0, 1]. For instance <em>g<\/em>*(<em>x<\/em>) = { <em>bx<\/em> } as discussed earlier. Let <em>F<\/em> be the CDF of the target attractor, and assume its support domain is [0, 1]. Then a solution <em>g<\/em> is given by<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9484094872?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9484094872?profile=RESIZE_710x\" width=\"150\" class=\"align-center\"><\/a><\/p>\n<p>For instance, if <em>F<\/em>(<em>x<\/em>) = <em>x<\/em>^2, with <em>x<\/em> in [0, 1], then <em>g<\/em>(<em>x<\/em>) = SQRT( {\u00a0<em>bx<\/em>^2 } ) works, assuming <em>b<\/em> is an integer larger than 1. 
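The pictured formula is not spelled out in the text, but the worked example suggests it reads g(x) = F^(-1)({ b F(x) }), that is, the uniform solution g* conjugated by the target CDF (this reading is my own inference from the F(x) = x^2 example). One property is then immediate: if X_n already follows F, then F(X_n) is uniform, { b F(X_n) } is again exactly uniform, and so X_{n+1} = g(X_n) follows F. The sketch below (my own, in Python) checks this invariance for F(x) = x^2 and b = 3:

```python
import numpy as np

rng = np.random.default_rng(3)
b, n = 3, 100_000

def F(x):        # target attractor CDF on [0, 1]: F(x) = x^2
    return x ** 2

def F_inv(u):    # quantile function of F
    return np.sqrt(u)

def g(x):        # g(x) = F_inv({ b * F(x) }) = sqrt({ b * x^2 }) here
    y = b * F(x)
    return F_inv(y - np.floor(y))

# draw X_n from F itself, then apply g once
x = F_inv(rng.random(n))
x_next = g(x)

# Kolmogorov distance between the empirical CDF of X_{n+1} and F
grid = np.linspace(0.0, 1.0, 101)
emp = np.searchsorted(np.sort(x_next), grid, side="right") / n
print(round(float(np.abs(emp - F(grid)).max()), 3))
```

The distance is at the level of pure sampling noise, i.e. F is invariant under g. The article's figure makes the stronger empirical point that X_2 is already close to F even when X_1 has a different (non-singular) distribution.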
The scatterplot below shows the empirical CDF of <em>X<\/em><span style=\"font-size: 8pt;\">2<\/span> (blue dots, based on 10,000 deviates) versus the CDF of the target attractor with distribution <em>F<\/em> (red curve): they are almost indistinguishable. I used <em>b<\/em> = 3, and for <em>X<\/em><span style=\"font-size: 8pt;\">1<\/span>, I used the same distribution as in section 1.1. The detailed computations are available in my spreadsheet, <a href=\"http:\/\/datashaping.com\/Recur3.xlsx\" target=\"_blank\" rel=\"noopener\">here<\/a>\u00a0(13 MB download).<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9484100060?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/9484100060?profile=RESIZE_710x\" width=\"300\" class=\"align-center\"><\/a><\/p>\n<p>The summary statistics and the above plot are found in columns BD to BH of my spreadsheet.<\/p>\n<p><span><em>To receive a weekly digest of our new articles, subscribe to our newsletter,\u00a0<a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/check-out-our-dsc-newsletter\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/em><\/span><\/p>\n<p><span><em><strong>About the author<\/strong>:\u00a0 Vincent Granville is a d<span class=\"lt-line-clamp__raw-line\">ata science pioneer, mathematician, book author (Wiley), patent owner, former post-doc at Cambridge University, former VC-funded executive, with 20+ years of corporate experience including CNET, NBC, Visa, Wells Fargo, Microsoft, eBay. 
Vincent is also self-publisher at\u00a0<a href=\"http:\/\/datashaping.com\/\" target=\"_blank\" rel=\"noopener\">DataShaping.com<\/a>, and founded and co-founded a few start-ups, including one with a successful exit (Data Science Central acquired by Tech Target).<\/span>\u00a0You can access Vincent&#8217;s articles and books,\u00a0<a href=\"http:\/\/datashaping.com\/\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/em><\/span><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:1064897\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Vincent Granville We are dealing here with random variables recursively defined by Xn+1 = g(Xn), with X1 being the initial condition. The examples discussed [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2021\/08\/27\/the-inverse-problem-in-random-dynamical-systems\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":461,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4959"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=4959"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-jso
n\/wp\/v2\/posts\/4959\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/459"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=4959"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=4959"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=4959"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}