{"id":1395,"date":"2018-12-14T06:32:45","date_gmt":"2018-12-14T06:32:45","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2018\/12\/14\/a-generalized-stochastic-calculus\/"},"modified":"2018-12-14T06:32:45","modified_gmt":"2018-12-14T06:32:45","slug":"a-generalized-stochastic-calculus","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2018\/12\/14\/a-generalized-stochastic-calculus\/","title":{"rendered":"A generalized stochastic calculus"},"content":{"rendered":"<p>Author: David Harris<\/p>\n<div>\n<p>In 1963 Benoit Mandelbrot published an article called \u201cThe Variation of Certain Speculative Prices.\u201d\u00a0 It is a response to the forming theory that would become Modern Portfolio Theory.\u00a0 Oversimplified, Mandelbrot\u2019s argument could be summarized as \u201cif this is your theory, then this cannot be your data, and this is your data.\u201d\u00a0 This issue has haunted models such as Black-Scholes, the CAPM, the APT and Fama-French.\u00a0 None of them have survived validation tests.\u00a0 Indeed, a good argument can be made that the test by Fama and MacBeth in 1973 should have brought this class of discussion to an end, but it didn\u2019t.\u00a0 I am going to argue that each of these models shares a mathematical problem that I believe has previously gone unnoticed.\u00a0 The solution I have found to this problem has been to construct a new stochastic calculus for this class of problems.<\/p>\n<p>\u00a0<\/p>\n<p>The structure of this paper is first to explore the properties of estimators that may not be obvious to an economist or even a data scientist who is not working in their own domain.\u00a0 The second part is to examine why models such as the CAPM lack a Bayesian counterpart.\u00a0 The third is to investigate the issue of information and its role in creating a new calculus.<\/p>\n<p>\u00a0<\/p>\n<p>Each of the mean-variance models, as well as the APT and the Fama-French model, is built on top of 
Frequentist axioms.\u00a0 This article is not an attack on Frequentism, but simply an observation about it.\u00a0 I say this because there are cases in probability and statistics where the choice of axioms will also determine the asserted value of the solution from the data.\u00a0 That, in part, appears to be the case here.\u00a0 Before getting into the math in any technical sense, I will provide two examples.\u00a0 One of these contrasts Bayesian and Frequentist methods; the other is purely Frequentist but with differing assumed loss functions.\u00a0 The purpose is to illustrate the sensitivity of a result to the axioms and assumptions in use.<\/p>\n<p>\u00a0<\/p>\n<p>For the first illustration, consider a wheel marked with numbers; a roulette wheel will do.\u00a0 Think of this as an inverse game of roulette with information.\u00a0 The gamblers cannot see where the ball falls, and they place their gambles only after the croupier observes the number.<\/p>\n<p>\u00a0<\/p>\n<p>This example is relatively common in decision theory.\u00a0 After the ball lands and a number is chosen, two fair coins are tossed.\u00a0 If a coin comes up heads, then the croupier will reveal to the gamblers the value one unit to the left of where the ball landed.\u00a0 If a coin comes up tails, then the croupier will reveal the value one unit to the right of where the ball landed.\u00a0 This creates a sample space with three possible outcomes, {L,L}, {R,R} and {L,R}, with probabilities of one quarter, one quarter and one half respectively.\u00a0 Our concern is the optimal choice of a number on which to place a gamble.<\/p>\n<p>\u00a0<\/p>\n<p>Let us assume the number came up 17.\u00a0 The signals from the croupier will be either {16,16}, {18,18} or {16,18}.\u00a0 It is obvious that if it is {16,18}, then you must bet on 17.\u00a0 However, the Bayesian solution and the Frequentist solution do not match otherwise.\u00a0 Both would choose 17 in the case of {16,18}, but they differ in the case where the numbers given by the croupier are the 
same.<\/p>\n<p>\u00a0<\/p>\n<p>The minimum variance unbiased estimators in the cases of {16,16} and {18,18} are 16 and 18 respectively.\u00a0 In the case of {16,16}, the Bayesian probability model supports either element of the solution set {15,17} equally.\u00a0 For the {18,18} case the same support is found for {17,19}.\u00a0 The Bayesian choice does not minimize the variance, but it does maximize the winning frequency.\u00a0 In expectation, the Bayesian gambler would win 75% of the time, while the Frequentist would win 50% of the time.\u00a0 Both win in the half of the games where the two signals differ; when the signals match, the Bayesian guess is right half of the time, while the MVUE is never right because the ball cannot have landed on the revealed number.\u00a0 Note how close this discussion is to a discussion of a\u00a0<a href=\"https:\/\/www.wolframalpha.com\/input\/?i=martingale&#038;assumption=%7B%22C%22,+%22martingale%22%7D+-%3E+%7B%22MathWorld%22%7D\" target=\"_blank\" rel=\"noopener\">martingale.<\/a><\/p>\n<p>\u00a0<\/p>\n<p>This illustration shows two crucial facts.\u00a0 The first is that the choice of axiom system can determine the course of action, and the resulting decision need not agree with the decision function of the other system.\u00a0 The second is an illustration of the Dutch Book Theorem.<\/p>\n<p>\u00a0<\/p>\n<p>The Dutch Book Theorem is similar to the no-arbitrage assumption but weaker in its base assumptions.\u00a0 However, it has an unexpected result.\u00a0 You can always use Bayesian methods for gambling, and you cannot use Frequentist methods for gambling.\u00a0 Models such as Black-Scholes and the Capital Asset Pricing Model are instructions on how to gamble in a specific type and set of lotteries.\u00a0 They are built on Frequentist axioms.\u00a0 Now imagine an economist testing the above game where nobody is behaving as he or she is supposed to act under the MVUE.\u00a0 Economists may reject the model or may argue that people are behaving irrationally, but really it is because of the axioms used, not the behavior.\u00a0 Any large-scale test of behavior would come out \u201cwrong.\u201d<\/p>\n<p>\u00a0<\/p>\n<p>A market-maker or bookie using Frequentist rules would also suffer, not just the 
economists.\u00a0 Long-run optimal behavior would grant 1:1 odds.\u00a0 Market makers using Ito models should, from time to time, have people eat their lunch.\u00a0 It is no wonder hedge funds abound.\u00a0 Dangerously, the eleven trillion dollars in outstanding over-the-counter options premiums are mispriced if they are built on Ito methodologies.\u00a0 By theorem, anything built on an Ito methodology will be systematically mispriced, even if every assumption is valid.<\/p>\n<p>\u00a0<\/p>\n<p>The second example originated in a paper by Welch in 1939 and was expanded on by Morey et al., in their article \u201cThe Fallacy of Placing Confidence in Confidence Intervals.\u201d\u00a0 Their paper was on intellectual fallacies people make, serious and common ones, in using confidence intervals.\u00a0 They also describe Bayesian credible intervals, although the Bayesian case is not essential here.<\/p>\n<p>\u00a0<\/p>\n<p>In the story, a submarine sinks to the bottom of the ocean and rescuers mount an effort to save the crew.\u00a0 Unfortunately, time is running out, and there is only going to be one chance at a successful rescue.\u00a0 Fortunately, there is a statistician present, and that statistician can construct a confidence interval as to where the rescue hatch is.\u00a0 Unfortunately, there exist an infinite number of possible confidence intervals to choose from.\u00a0 The fact that they don\u2019t match raises questions about which procedure to choose.\u00a0 Morey et al. describe three procedures plus a Bayesian one.\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>Their description of the problem is:<\/p>\n<p><em>A 10-meter-long research submersible with several people on board has lost contact with its surface support vessel. The submersible has a rescue hatch exactly halfway along its length, to which the support vessel will drop a rescue line. 
Because the rescuers only get one rescue attempt, it is crucial that when the line is dropped to the craft in the deep water that the line<\/em> be <em>as close as possible to this hatch. The researchers on the support vessel do not know where the submersible is, but they do know that it forms two distinctive bubbles. These bubbles could form anywhere along the craft\u2019s length, independently, with equal probability, and float to the surface where they can be seen by the support vessel.<\/em><\/p>\n<p>\u00a0<\/p>\n<p>One thing to note is that if the bubbles are precisely 10 meters apart, the location of the rescue hatch is known with perfect certainty, while if there is no distance between the bubbles, then the hatch can only be located to within plus or minus five meters of them.\u00a0 The possibility of using an xy-plane rather than just an x-axis is ignored in this example.<\/p>\n<p>\u00a0<\/p>\n<p>Several facts are relevant here.\u00a0 The first is that the definition of a confidence interval is that it covers the parameter at least a certain fixed percentage of the time upon infinite repetition.\u00a0 The second is that any function that covers the parameter at least that often is a valid confidence procedure.\u00a0 The third is that the procedure needs only good long-run properties; it does not consider the likelihood, so it is not conditioned on the specifics of the current sample.\u00a0 Fourth, the width of the likelihood is 10-d, where d is the distance between the bubbles.\u00a0 Finally, because the sample size will only ever be two (n=2) and a narrow interval is desirable, the statistician chooses a fifty percent interval rather than the more traditional ninety-five percent interval.<\/p>\n<p>\u00a0<\/p>\n<p>The first procedure considered by the statistician is to take the average location of the bubbles plus or minus approximately 1.46 meters.\u00a0 Because the length of the submarine is fixed and the sampling distribution of the mean is the triangular distribution, adding or subtracting five minus five 
divided by the square root of two guarantees that coverage will occur at least 50% of the time.<\/p>\n<p>\u00a0<\/p>\n<p>The second procedure is non-parametric.\u00a0 Noting that the two bubbles fall on opposite sides of the hatch half of the time, taking the median plus or minus d\/2, which is the interval between the two bubbles, is also a fifty percent confidence procedure.\u00a0 That also coincides with the solution under the Student t-distribution for n=2.\u00a0 It is therefore also the procedure most commonly taught to undergraduates.<\/p>\n<p>\u00a0<\/p>\n<p>The third confidence procedure is to take the inverse of the uniformly most powerful test.\u00a0 If d &lt; 5, then use the non-parametric interval of the mean plus or minus d\/2; otherwise use the mean plus or minus (5 - d\/2).<\/p>\n<p>\u00a0<\/p>\n<p>Each of these procedures covers the hatch fifty percent of the time, but are any of these procedures appropriate?\u00a0<\/p>\n<p>\u00a0<\/p>\n<p>If the bubbles are nine meters apart, then there is only a one-meter range the hatch could be in, which is also the likelihood function.\u00a0 The first procedure covers it with a width of 2.92 meters, the second at nine meters, and the third at one meter.\u00a0 All three cover the entire likelihood, though the first two are wider than necessary and, depending on the precision required, may cause failure unnecessarily.<\/p>\n<p>\u00a0<\/p>\n<p>On the other hand, if the bubbles are one meter apart, then the interval from the first procedure is still 2.92 meters wide, while the second and third are one meter wide.\u00a0 The likelihood is nine meters wide.\u00a0 From a Bayesian perspective, the first procedure has roughly a thirty-two percent chance of covering the hatch, while the latter two procedures have roughly an eleven percent chance of covering the hatch.\u00a0 The final two procedures look the most accurate when they have the least amount of information on the true location.<\/p>\n<p>\u00a0<\/p>\n<p>The lesson, however, isn\u2019t that one should 
use a Bayesian procedure.\u00a0 The lesson is that confidence intervals are the result of minimizing some loss function.\u00a0 Each of these procedures minimizes a different type of loss.\u00a0 Confidence procedures do not measure the accuracy of a result, nor do they give a probability that a result is in a range.\u00a0 They provide a frequency with which a parameter will be covered as the number of repetitions becomes arbitrarily large.<\/p>\n<p>\u00a0<\/p>\n<p>Loss functions are vital to this newly proposed calculus.\u00a0 Losses are also subjective.\u00a0 From the perspective of the submersible\u2019s crew, this is an all-or-nothing loss function, and the properties upon repetition do not matter to them.\u00a0 They do not care that at least fifty of one hundred crews are saved; they only care whether they are saved this one time.\u00a0 On the other hand, the financial outcomes of the corporation running the rescue may create differing considerations beyond the immediate risk to life and limb.\u00a0 The corporation does, rationally, care about the long-run properties of the procedure used as well as the loss of life, any specific compensation scheme such as insurance and the loss structure created by a failed estimate.<\/p>\n<p>\u00a0<\/p>\n<p>It is unlikely that any of the stated procedures meets the needs of the corporation.\u00a0 Because the long run applies to the crew only if they are rescued, and provided that any spouses they may have allow them to go to sea again, a Frequentist procedure may not meet the needs of the crew.\u00a0 When building financial models, the subjective loss structure is usually swept under the rug.\u00a0 The proposed calculus forces a review of the outcomes when the statistician or economist makes an incorrect estimate.<\/p>\n<p>\u00a0<\/p>\n<p>The failure to consider the purpose of models such as Black-Scholes in gambling and the failure to account for a proper loss function make the utility and the appropriate evaluation of these 
financial models doubtful.\u00a0 A good part of the problem may be that these models became more important than the authors likely intended.\u00a0 In a sense, economics took them too seriously, especially since they lack validation.<\/p>\n<p>\u00a0<\/p>\n<p>The second overall purpose of this essay is to look at why there is no Bayesian counterpart to these models.\u00a0 While researchers have from time to time used Bayesian methods to test these models, there is a problem with doing this with Frequentist models.\u00a0 As with the roulette example above, when constructed in a different paradigm, the two models make two different predictions.\u00a0 To examine a Frequentist model with Bayesian methods may be to check the wrong predictions.\u00a0 After all, a Bayesian check that the bookies should offer 1:1 odds would fail for the Bayesian just as readily as for the Frequentist, but the proper Bayesian prediction isn\u2019t for 1:1 odds.\u00a0 A Bayesian methodology requires a complete rebuilding of the model.\u00a0 That is where a calculus problem begins to appear.<\/p>\n<p>\u00a0<\/p>\n<p>Ito methods are Frequentist methods and assume the parameters are known with certainty.\u00a0 This follows the thinking of the null hypothesis.\u00a0 When one asserts a null hypothesis, one is asserting, with perfect certainty, the true value of the parameters.\u00a0 The difference between modeling and testing is that the experiment is built with the intent to reject the null.\u00a0 For the modeling to hold, the assumption of complete information on the parameters turns out to be very important.<\/p>\n<p>\u00a0<\/p>\n<p>The importance shows up when it is realized that parameters are random variables in Bayesian thinking, so one must assert that one does not know the parameters.\u00a0 Knowledge of the true value of the parameters is a big deal.\u00a0 That is a lot of information about the world.\u00a0 If that assumption is dropped, the calculations also have 
to account for the uncertainty in the parameters and not just the uncertainty from the chance variable.\u00a0 That is a different class of problem.\u00a0 The Capital Market Line vanishes from existence with that added uncertainty.<\/p>\n<p>\u00a0<\/p>\n<p>To see why, consider the intertemporal relationship between the present value of wealth and the future value of wealth in the CAPM.\u00a0 Its equation is that the future value is equal to the present value times a reward for investing plus a random shock to the appraised future value.\u00a0 While it is commonly presumed to be a normal shock, it doesn\u2019t matter as long as the shock has a center of location of zero and a finite variance.\u00a0 It doesn\u2019t matter because it is known in Frequentist statistics that the value of the reward in this equation has no estimator that both converges to the population parameter and is consistent with mean-variance finance.\u00a0 If the CAPM parameters are not known with certainty, then they cannot be estimated with an estimator consistent with the theory.\u00a0 Median or quantile regression will produce an estimator, but not a mean-based estimator.<\/p>\n<p>\u00a0<\/p>\n<p>On the other hand, a different problem exists for a Bayesian model.\u00a0 It is common for economists using a Frequentist model to treat returns as data.\u00a0 Indeed, they are studied as data.\u00a0 However, in the Bayesian paradigm, they are a function of prices and volumes.\u00a0 In reality, in the Frequentist paradigm, they are as well, but the models treat them as primitive constructions.<\/p>\n<p>\u00a0<\/p>\n<p>So, if a return is the ratio of prices times the ratio of volumes, minus one, then returns are a statistic.\u00a0 As such, their distribution needs to be derived from the distribution of the prices and the distribution of the volumes.\u00a0 This also puts the analysis in line with the rest of economics, where the discussion is on prices and quantities and not 
returns.<\/p>\n<p>\u00a0<\/p>\n<p>Because stocks are sold in a double auction, there is no winner\u2019s curse, and as such the rational behavior is to bid the expected value for going concerns.\u00a0 Using the standard mean-variance assumption of many buyers and sellers, the distribution of expectations will converge to the Gaussian distribution as the number becomes large enough.\u00a0 If the equilibrium price is treated as (0,0), then the ratio of two normally distributed variables centered there follows the Cauchy distribution, which has no first moment.\u00a0 As a consequence, models like the CAPM are impossible since there is no mean or variance for mean-variance finance to operate on.\u00a0 It also takes apart the Fama-French model because beta does not exist and least squares regression never converges to the population parameter.<\/p>\n<p>\u00a0<\/p>\n<p>Because the density is truncated at -100%, the center of location is the mode.\u00a0 That is a very different conceptualization of regression.\u00a0 It also has implications for artificial intelligence.\u00a0 AI models are function approximators, but the danger is that they approximate the least squares solution rather than a Bayesian modal solution.\u00a0 Many standard minimizations in machine learning and AI are guaranteed to miss the Bayesian solution by their construction.<\/p>\n<p>\u00a0<\/p>\n<p>That also brings up an information problem.\u00a0 The Cauchy distribution and the truncated Cauchy distribution lack a sufficient point statistic for either parameter.\u00a0 While their pivotal quantity is sufficient and normally distributed for the Cauchy distribution, it is possibly not for the truncated Cauchy distribution, and while conditioning on an ancillary statistic makes the inference valid, this is not useful for projection.<\/p>\n<p>The proposed calculus solves this in the Bayesian model by noting that the posterior predictive density contains the effect of the entire posterior at each point via marginalization. 
\u00a0There is no information loss possible using the predictive distribution.\u00a0 The paper also conjectures two possible solutions for the Frequentist calculus.\u00a0 They are conjectures for two reasons.<\/p>\n<p>\u00a0<\/p>\n<p>The first is that they are built on Frequentist predictive intervals, which are built on confidence procedures.\u00a0 As seen above, confidence procedures are not unique, and a change in assumptions would intrinsically change the distribution of predictions, even though the data is unchanged.\u00a0 I left them as conjectures so that measure theorists could tear them apart.\u00a0 The Bayesian method is not a conjecture, however.\u00a0 The second is that the initial conjectured method depends upon open intervals, and many standard results collapse without compactness.<\/p>\n<p>\u00a0<\/p>\n<p>The new calculus differs from the Bayesian decision theory that it is constructed on in that it constructs an objective estimator using the indirect utility function.\u00a0 Whereas Bayesian decision theory proper is purely subjective, this provides a way to arrive at an objective solution, subject to the information in the prior density.\u00a0 It can also be gambled upon.\u00a0 I hope to put an options pricing model in this blog soon as well.\u00a0 It is complete, but I am editing it to fit the calculus better.<\/p>\n<p>\u00a0<\/p>\n<p>The full paper can be found at <a href=\"https:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=3197451\" target=\"_blank\" rel=\"noopener\">the Social Science Research Network.<\/a><\/p>\n<p>\u00a0<\/p>\n<p>All criticism of the paper is welcome.\u00a0 I hope you enjoyed my first ever blog post, if it is possible to enjoy a post on economics.<\/p>\n<p><strong>Bibliography<\/strong><\/p>\n<p>Fama, E. F. and MacBeth, J. D. (1973). Risk, return, and equilibrium:<br \/>Empirical tests. The Journal of Political Economy, 81(3):607-636.<\/p>\n<p>Mandelbrot, B. (1963). The variation of certain speculative prices. 
The<br \/>Journal of Business, 36(4):394-419.<\/p>\n<p><span>Morey, R., Hoekstra, R., Rouder, J., Lee, M., &#038; Wagenmakers, E.-J. (2015). The fallacy of placing confidence in confidence intervals. Psychonomic Bulletin &#038; Review, 1\u201321.<\/span><\/p>\n<p><span>Welch, B. L. (1939). On confidence limits and sufficiency, with particular reference to parameters of location.\u00a0The Annals of Mathematical Statistics,\u00a010(1), 58\u201369.<\/span><\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:784642\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: David Harris In 1963 Benoit Mandelbrot published an article called \u201cThe Variation of Certain Speculative Prices.\u201d\u00a0 It is a response to the forming theory [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2018\/12\/14\/a-generalized-stochastic-calculus\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":464,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/1395"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=1395"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog
.com\/index.php\/wp-json\/wp\/v2\/posts\/1395\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/457"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=1395"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=1395"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=1395"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}