{"id":4411,"date":"2021-02-18T18:00:11","date_gmt":"2021-02-18T18:00:11","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2021\/02\/18\/simulated-annealing-from-scratch-in-python\/"},"modified":"2021-02-18T18:00:11","modified_gmt":"2021-02-18T18:00:11","slug":"simulated-annealing-from-scratch-in-python","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2021\/02\/18\/simulated-annealing-from-scratch-in-python\/","title":{"rendered":"Simulated Annealing From Scratch in Python"},"content":{"rendered":"<p>Author: Jason Brownlee<\/p>\n<div>\n<p><strong>Simulated Annealing<\/strong> is a stochastic global search optimization algorithm.<\/p>\n<p>This means that it makes use of randomness as part of the search process. This makes the algorithm appropriate for nonlinear objective functions where other local search algorithms do not operate well.<\/p>\n<p>Like the stochastic hill climbing local search algorithm, it modifies a single solution and searches the relatively local area of the search space until a local optimum is located. 
Unlike the hill climbing algorithm, it may accept worse solutions as the current working solution.<\/p>\n<p>The likelihood of accepting worse solutions starts high at the beginning of the search and decreases as the search progresses, giving the algorithm the opportunity to first locate the region of the global optimum, escaping local optima, and then hill climb to the optimum itself.<\/p>\n<p>In this tutorial, you will discover the simulated annealing optimization algorithm for function optimization.<\/p>\n<p>After completing this tutorial, you will know:<\/p>\n<ul>\n<li>Simulated annealing is a stochastic global search algorithm for function optimization.<\/li>\n<li>How to implement the simulated annealing algorithm from scratch in Python.<\/li>\n<li>How to use the simulated annealing algorithm and inspect the results of the algorithm.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_11813\" style=\"width: 809px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11813\" loading=\"lazy\" class=\"size-full wp-image-11813\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2021\/02\/Simulated-Annealing-From-Scratch-in-Python.jpg\" alt=\"Simulated Annealing From Scratch in Python\" width=\"799\" height=\"533\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2021\/02\/Simulated-Annealing-From-Scratch-in-Python.jpg 799w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2021\/02\/Simulated-Annealing-From-Scratch-in-Python-300x200.jpg 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2021\/02\/Simulated-Annealing-From-Scratch-in-Python-768x512.jpg 768w\" sizes=\"(max-width: 799px) 100vw, 799px\"><\/p>\n<p id=\"caption-attachment-11813\" class=\"wp-caption-text\">Simulated Annealing From Scratch in Python<br \/>Photo by <a 
href=\"https:\/\/www.flickr.com\/photos\/infomastern\/14170987191\/\">Susanne Nilsson<\/a>, some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is divided into three parts; they are:<\/p>\n<ol>\n<li>Simulated Annealing<\/li>\n<li>Implement Simulated Annealing<\/li>\n<li>Simulated Annealing Worked Example<\/li>\n<\/ol>\n<h2>Simulated Annealing<\/h2>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Simulated_annealing\">Simulated Annealing<\/a> is a stochastic global search optimization algorithm.<\/p>\n<p>The algorithm is inspired by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Annealing_(metallurgy)\">annealing in metallurgy<\/a> where metal is heated to a high temperature quickly, then cooled slowly, which increases its strength and makes it easier to work with.<\/p>\n<p>The annealing process works by first exciting the atoms in the material at a high temperature, allowing the atoms to move around a lot, then decreasing their excitement slowly, allowing the atoms to fall into a new, more stable configuration.<\/p>\n<blockquote>\n<p>When hot, the atoms in the material are more free to move around, and, through random motion, tend to settle into better positions. A slow cooling brings the material to an ordered, crystalline state.<\/p>\n<\/blockquote>\n<p>\u2014 Page 128, <a href=\"https:\/\/amzn.to\/34Nb7nv\">Algorithms for Optimization<\/a>, 2019.<\/p>\n<p>The simulated annealing optimization algorithm can be thought of as a modified version of stochastic hill climbing.<\/p>\n<p>Stochastic hill climbing maintains a single candidate solution and takes steps of a random but constrained size from the candidate in the search space. If the new point is better than the current point, then the current point is replaced with the new point. This process continues for a fixed number of iterations.<\/p>\n<p>Simulated annealing executes the search in the same way. 
The main difference is that new points that are not as good as the current point (worse points) are sometimes accepted.<\/p>\n<p>A worse point is accepted probabilistically, where the likelihood of accepting a solution worse than the current solution is a function of the temperature of the search and of how much worse the solution is than the current solution.<\/p>\n<blockquote>\n<p>The algorithm varies from Hill-Climbing in its decision of when to replace S, the original candidate solution, with R, its newly tweaked child. Specifically: if R is better than S, we\u2019ll always replace S with R as usual. But if R is worse than S, we may still replace S with R with a certain probability<\/p>\n<\/blockquote>\n<p>\u2014 Page 23, <a href=\"https:\/\/amzn.to\/3iQkCaw\">Essentials of Metaheuristics<\/a>, 2011.<\/p>\n<p>The initial temperature for the search is provided as a hyperparameter and decreases with the progress of the search. A number of different schemes (annealing schedules) may be used to decrease the temperature during the search from the initial value to a very low value, although it is common to calculate temperature as a function of the iteration number.<\/p>\n<p>A popular example for calculating temperature is the so-called \u201c<em>fast simulated annealing<\/em>,\u201d calculated as follows:<\/p>\n<ul>\n<li>temperature = initial_temperature \/ (iteration_number + 1)<\/li>\n<\/ul>\n<p>We add one to the iteration number because iteration numbers start at zero, which avoids a divide-by-zero error on the first iteration.<\/p>\n<p>The acceptance of worse solutions uses the temperature as well as the difference between the objective function evaluation of the worse solution and the current solution. A value between 0 and 1 is calculated using this information, indicating the likelihood of accepting the worse solution. 
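The fast annealing schedule above is easy to check numerically. As a quick sketch (the initial temperature of 10 is an arbitrary choice for illustration):

```python
# fast simulated annealing schedule:
# temperature = initial_temperature / (iteration_number + 1)
initial_temperature = 10.0  # arbitrary value for illustration

# temperature for the first few iterations (numbered from zero)
temperatures = [initial_temperature / (i + 1) for i in range(5)]
# the schedule decays quickly: 10.0, 5.0, then roughly 3.33, 2.5, 2.0
print(temperatures)
```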
A uniform random number is then drawn; if it is less than this value, the worse solution is accepted.<\/p>\n<blockquote>\n<p>It is this acceptance probability, known as the Metropolis criterion, that allows the algorithm to escape from local minima when the temperature is high.<\/p>\n<\/blockquote>\n<p>\u2014 Page 128, <a href=\"https:\/\/amzn.to\/34Nb7nv\">Algorithms for Optimization<\/a>, 2019.<\/p>\n<p>This is called the Metropolis acceptance criterion and, for minimization, is calculated as follows:<\/p>\n<ul>\n<li>criterion = exp( -(objective(new) - objective(current)) \/ temperature)<\/li>\n<\/ul>\n<p>Where <em>exp()<\/em> is <a href=\"https:\/\/en.wikipedia.org\/wiki\/E_(mathematical_constant)\">e<\/a> (the mathematical constant) raised to the power of the provided argument, and <em>objective(new)<\/em> and <em>objective(current)<\/em> are the objective function evaluations of the new (worse) and current candidate solutions.<\/p>\n<p>The effect is that worse solutions are more likely to be accepted early in the search and less likely to be accepted later in the search. The intent is that the high temperature at the beginning of the search will help the search locate the basin of the global optimum and the low temperature later in the search will help the algorithm home in on the global optimum.<\/p>\n<blockquote>\n<p>The temperature starts high, allowing the process to freely move about the search space, with the hope that in this phase the process will find a good region with the best local minimum. 
The temperature is then slowly brought down, reducing the stochasticity and forcing the search to converge to a minimum<\/p>\n<\/blockquote>\n<p>\u2014 Page 128, <a href=\"https:\/\/amzn.to\/34Nb7nv\">Algorithms for Optimization<\/a>, 2019.<\/p>\n<p>Now that we are familiar with the simulated annealing algorithm, let\u2019s look at how to implement it from scratch.<\/p>\n<h2>Implement Simulated Annealing<\/h2>\n<p>In this section, we will explore how we might implement the simulated annealing optimization algorithm from scratch.<\/p>\n<p>First, we must define our objective function and the bounds on each input variable to the objective function. The objective function is just a Python function we will name <em>objective()<\/em>. The bounds will be a 2D array with one dimension for each input variable that defines the minimum and maximum for the variable.<\/p>\n<p>For example, a one-dimensional objective function and bounds would be defined as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># objective function\r\ndef objective(x):\r\n\treturn 0\r\n\r\n# define range for input\r\nbounds = asarray([[-5.0, 5.0]])<\/pre>\n<p>Next, we can generate our initial point as a random point within the bounds of the problem, then evaluate it using the objective function.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# generate an initial point\r\nbest = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])\r\n# evaluate the initial point\r\nbest_eval = objective(best)<\/pre>\n<p>We need to maintain the \u201c<em>current<\/em>\u201d solution that is the focus of the search and that may be replaced with better solutions.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# current working solution\r\ncurr, curr_eval = best, best_eval<\/pre>\n<p>Now we can loop over a predefined number of iterations of the algorithm defined as \u201c<em>n_iterations<\/em>\u201c, such as 100 or 1,000.<\/p>\n<pre 
class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# run the algorithm\r\nfor i in range(n_iterations):\r\n\t...<\/pre>\n<p>The first step of the algorithm iteration is to generate a new candidate solution from the current working solution, i.e. take a step.<\/p>\n<p>This requires a predefined \u201c<em>step_size<\/em>\u201d parameter, which is relative to the bounds of the search space. We will take a random step with a Gaussian distribution where the mean is our current point and the standard deviation is defined by the \u201c<em>step_size<\/em>\u201d. That means that about 99 percent of the steps taken will be within <em>3 * step_size<\/em> of the current point.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# take a step\r\ncandidate = curr + randn(len(bounds)) * step_size<\/pre>\n<p>We don\u2019t have to take steps in this way. You may wish to use a uniform distribution between 0 and the step size instead (note that this variation only ever steps in the positive direction). For example:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# take a step\r\ncandidate = curr + rand(len(bounds)) * step_size<\/pre>\n<p>Next, we need to evaluate the candidate point.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# evaluate candidate point\r\ncandidate_eval = objective(candidate)<\/pre>\n<p>We then need to check if the evaluation of this new point is as good as or better than the current best point, and if it is, replace our current best point with this new point.<\/p>\n<p>This is separate from the current working solution that is the focus of the search.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# check for new best solution\r\nif candidate_eval &lt; best_eval:\r\n\t# store new best point\r\n\tbest, best_eval = candidate, candidate_eval\r\n\t# report progress\r\n\tprint('&gt;%d f(%s) = %.5f' % (i, best, best_eval))<\/pre>\n<p>Next, we need to prepare to replace the current working solution.<\/p>\n<p>The first step is to calculate the difference between the objective function 
evaluation of the candidate point and that of the current working solution.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# difference between candidate and current point evaluation\r\ndiff = candidate_eval - curr_eval<\/pre>\n<p>Next, we need to calculate the current temperature, using the fast annealing schedule, where \u201c<em>temp<\/em>\u201d is the initial temperature provided as an argument.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# calculate temperature for current epoch\r\nt = temp \/ float(i + 1)<\/pre>\n<p>We can then calculate the likelihood of accepting a solution with worse performance than our current working solution.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# calculate metropolis acceptance criterion\r\nmetropolis = exp(-diff \/ t)<\/pre>\n<p>Finally, we can accept the new point as the current working solution if it has a better objective function evaluation (the difference is negative) or, if its evaluation is worse, we may still probabilistically decide to accept it.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# check if we should keep the new point\r\nif diff &lt; 0 or rand() &lt; metropolis:\r\n\t# store the new current point\r\n\tcurr, curr_eval = candidate, candidate_eval<\/pre>\n<p>And that\u2019s it.<\/p>\n<p>We can implement this simulated annealing algorithm as a reusable function that takes the name of the objective function, the bounds of each input variable, the total iterations, step size, and initial temperature as arguments, and returns the best solution found and its evaluation.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># simulated annealing algorithm\r\ndef simulated_annealing(objective, bounds, n_iterations, step_size, temp):\r\n\t# generate an initial point\r\n\tbest = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])\r\n\t# evaluate the initial point\r\n\tbest_eval = objective(best)\r\n\t# current working 
solution\r\n\tcurr, curr_eval = best, best_eval\r\n\t# run the algorithm\r\n\tfor i in range(n_iterations):\r\n\t\t# take a step\r\n\t\tcandidate = curr + randn(len(bounds)) * step_size\r\n\t\t# evaluate candidate point\r\n\t\tcandidate_eval = objective(candidate)\r\n\t\t# check for new best solution\r\n\t\tif candidate_eval &lt; best_eval:\r\n\t\t\t# store new best point\r\n\t\t\tbest, best_eval = candidate, candidate_eval\r\n\t\t\t# report progress\r\n\t\t\tprint('&gt;%d f(%s) = %.5f' % (i, best, best_eval))\r\n\t\t# difference between candidate and current point evaluation\r\n\t\tdiff = candidate_eval - curr_eval\r\n\t\t# calculate temperature for current epoch\r\n\t\tt = temp \/ float(i + 1)\r\n\t\t# calculate metropolis acceptance criterion\r\n\t\tmetropolis = exp(-diff \/ t)\r\n\t\t# check if we should keep the new point\r\n\t\tif diff &lt; 0 or rand() &lt; metropolis:\r\n\t\t\t# store the new current point\r\n\t\t\tcurr, curr_eval = candidate, candidate_eval\r\n\treturn [best, best_eval]<\/pre>\n<p>Now that we know how to implement the simulated annealing algorithm in Python, let\u2019s look at how we might use it to optimize an objective function.<\/p>\n<h2>Simulated Annealing Worked Example<\/h2>\n<p>In this section, we will apply the simulated annealing optimization algorithm to an objective function.<\/p>\n<p>First, let\u2019s define our objective function.<\/p>\n<p>We will use a simple one-dimensional x^2 objective function with the bounds [-5, 5].<\/p>\n<p>The example below defines the function, then creates a line plot of the response surface of the function for a grid of input values, and marks the optima at f(0.0) = 0.0 with a red line<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># convex unimodal optimization function\r\nfrom numpy import arange\r\nfrom matplotlib import pyplot\r\n\r\n# objective function\r\ndef objective(x):\r\n\treturn x[0]**2.0\r\n\r\n# define range for input\r\nr_min, r_max = -5.0, 5.0\r\n# sample input range 
uniformly at 0.1 increments\r\ninputs = arange(r_min, r_max, 0.1)\r\n# compute targets\r\nresults = [objective([x]) for x in inputs]\r\n# create a line plot of input vs result\r\npyplot.plot(inputs, results)\r\n# define optimal input value\r\nx_optima = 0.0\r\n# draw a vertical line at the optimal input\r\npyplot.axvline(x=x_optima, ls='--', color='red')\r\n# show the plot\r\npyplot.show()<\/pre>\n<p>Running the example creates a line plot of the objective function and clearly marks the function optima.<\/p>\n<div id=\"attachment_11809\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11809\" loading=\"lazy\" class=\"size-full wp-image-11809\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-with-Optima-Marked-with-a-Dashed-Red-Line.png\" alt=\"Line Plot of Objective Function With Optima Marked With a Dashed Red Line\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-with-Optima-Marked-with-a-Dashed-Red-Line.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-with-Optima-Marked-with-a-Dashed-Red-Line-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-with-Optima-Marked-with-a-Dashed-Red-Line-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-with-Optima-Marked-with-a-Dashed-Red-Line-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11809\" class=\"wp-caption-text\">Line Plot of Objective Function With Optima Marked With a Dashed Red Line<\/p>\n<\/div>\n<p>Before we apply the optimization algorithm to the problem, 
let\u2019s take a moment to understand the acceptance criterion a little better.<\/p>\n<p>First, the fast annealing schedule is a rapidly decaying (reciprocal) function of the number of iterations. We can make this clear by creating a plot of the temperature for each algorithm iteration.<\/p>\n<p>We will use an initial temperature of 10 and 100 algorithm iterations, both arbitrarily chosen.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># explore temperature vs algorithm iteration for simulated annealing\r\nfrom matplotlib import pyplot\r\n# total iterations of algorithm\r\nn_iterations = 100\r\n# initial temperature\r\ninitial_temp = 10\r\n# array of iterations from 0 to n_iterations - 1\r\niterations = [i for i in range(n_iterations)]\r\n# temperature for each iteration\r\ntemperatures = [initial_temp\/float(i + 1) for i in iterations]\r\n# plot iterations vs temperatures\r\npyplot.plot(iterations, temperatures)\r\npyplot.xlabel('Iteration')\r\npyplot.ylabel('Temperature')\r\npyplot.show()<\/pre>\n<p>Running the example calculates the temperature for each algorithm iteration and creates a plot of algorithm iteration (x-axis) vs. temperature (y-axis).<\/p>\n<p>We can see that the temperature drops rapidly and nonlinearly, such that after about 10 iterations it is below 1 and stays low for the remainder of the search.<\/p>\n<div id=\"attachment_11810\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11810\" loading=\"lazy\" class=\"size-full wp-image-11810\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Temperature-vs-Algorithm-Iteration-for-Fast-Annealing.png\" alt=\"Line Plot of Temperature vs. 
Algorithm Iteration for Fast Annealing\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Temperature-vs-Algorithm-Iteration-for-Fast-Annealing.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Temperature-vs-Algorithm-Iteration-for-Fast-Annealing-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Temperature-vs-Algorithm-Iteration-for-Fast-Annealing-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Temperature-vs-Algorithm-Iteration-for-Fast-Annealing-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11810\" class=\"wp-caption-text\">Line Plot of Temperature vs. Algorithm Iteration for Fast Annealing<\/p>\n<\/div>\n<p>Next, we can get a better idea of how the metropolis acceptance criterion changes over time with the temperature.<\/p>\n<p>Recall that the criterion is a function of temperature, but is also a function of how different the objective evaluation of the new point is compared to the current working solution. 
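Before plotting, the criterion can be checked numerically for a few hand-picked values (a small sketch; the specific difference and temperature values are arbitrary choices for illustration):

```python
from math import exp

# Metropolis acceptance probability for a worse candidate (diff > 0)
def acceptance(diff, temperature):
	return exp(-diff / temperature)

# same difference, falling temperature: acceptance probability shrinks
hot = acceptance(1.0, 10.0)   # early in the search, high temperature
cold = acceptance(1.0, 0.1)   # late in the search, low temperature
print(hot, cold)

# same temperature, larger difference: acceptance probability shrinks
small = acceptance(0.01, 1.0)
large = acceptance(1.0, 1.0)
print(small, large)
```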
As such, we will plot the criterion for a few different \u201c<em>differences in objective function value<\/em>\u201d to see the effect it has on acceptance probability.<\/p>\n<p>The complete example is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># explore metropolis acceptance criterion for simulated annealing\r\nfrom math import exp\r\nfrom matplotlib import pyplot\r\n# total iterations of algorithm\r\nn_iterations = 100\r\n# initial temperature\r\ninitial_temp = 10\r\n# array of iterations from 0 to n_iterations - 1\r\niterations = [i for i in range(n_iterations)]\r\n# temperature for each iteration\r\ntemperatures = [initial_temp\/float(i + 1) for i in iterations]\r\n# metropolis acceptance criterion\r\ndifferences = [0.01, 0.1, 1.0]\r\nfor d in differences:\r\n\tmetropolis = [exp(-d\/t) for t in temperatures]\r\n\t# plot iterations vs metropolis\r\n\tlabel = 'diff=%.2f' % d\r\n\tpyplot.plot(iterations, metropolis, label=label)\r\n# finalize plot\r\npyplot.xlabel('Iteration')\r\npyplot.ylabel('Metropolis Criterion')\r\npyplot.legend()\r\npyplot.show()<\/pre>\n<p>Running the example calculates the Metropolis acceptance criterion for each algorithm iteration using the temperature shown for each iteration (shown in the previous section).<\/p>\n<p>The plot has three lines for three differences between the new worse solution and the current working solution.<\/p>\n<p>We can see that the worse the solution is (the larger the difference), the less likely the algorithm is to accept the worse solution regardless of the algorithm iteration, as we might expect. 
We can also see that in all cases, the likelihood of accepting worse solutions decreases with algorithm iteration.<\/p>\n<div id=\"attachment_11811\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11811\" loading=\"lazy\" class=\"size-full wp-image-11811\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Metropolis-Acceptance-Criterion-vs-Algorithm-Iteration-for-Simulated-Annealing.png\" alt=\"Line Plot of Metropolis Acceptance Criterion vs. Algorithm Iteration for Simulated Annealing\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Metropolis-Acceptance-Criterion-vs-Algorithm-Iteration-for-Simulated-Annealing.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Metropolis-Acceptance-Criterion-vs-Algorithm-Iteration-for-Simulated-Annealing-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Metropolis-Acceptance-Criterion-vs-Algorithm-Iteration-for-Simulated-Annealing-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Metropolis-Acceptance-Criterion-vs-Algorithm-Iteration-for-Simulated-Annealing-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11811\" class=\"wp-caption-text\">Line Plot of Metropolis Acceptance Criterion vs. 
Algorithm Iteration for Simulated Annealing<\/p>\n<\/div>\n<p>Now that we are more familiar with the behavior of the temperature and metropolis acceptance criterion over time, let\u2019s apply simulated annealing to our test problem.<\/p>\n<p>First, we will seed the pseudorandom number generator.<\/p>\n<p>This is not required in general, but in this case, I want to ensure we get the same results (same sequence of random numbers) each time we run the algorithm so we can plot the results later.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# seed the pseudorandom number generator\r\nseed(1)<\/pre>\n<p>Next, we can define the configuration of the search.<\/p>\n<p>In this case, we will search for 1,000 iterations of the algorithm and use a step size of 0.1. Given that we are using a Gaussian function for generating the step, this means that about 99 percent of all steps taken will be within a distance of (0.1 * 3) of a given point, e.g. three standard deviations.<\/p>\n<p>We will also use an initial temperature of 10.0. 
The search procedure is more sensitive to the annealing schedule than to the initial temperature; as such, initial temperature values are somewhat arbitrary.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# define the total iterations\r\nn_iterations = 1000\r\n# define the maximum step size\r\nstep_size = 0.1\r\n# initial temperature\r\ntemp = 10<\/pre>\n<p>Next, we can perform the search and report the results.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# perform the simulated annealing search\r\nbest, score = simulated_annealing(objective, bounds, n_iterations, step_size, temp)\r\nprint('Done!')\r\nprint('f(%s) = %f' % (best, score))<\/pre>\n<p>Tying this all together, the complete example is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># simulated annealing search of a one-dimensional objective function\r\nfrom numpy import asarray\r\nfrom numpy import exp\r\nfrom numpy.random import randn\r\nfrom numpy.random import rand\r\nfrom numpy.random import seed\r\n\r\n# objective function\r\ndef objective(x):\r\n\treturn x[0]**2.0\r\n\r\n# simulated annealing algorithm\r\ndef simulated_annealing(objective, bounds, n_iterations, step_size, temp):\r\n\t# generate an initial point\r\n\tbest = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])\r\n\t# evaluate the initial point\r\n\tbest_eval = objective(best)\r\n\t# current working solution\r\n\tcurr, curr_eval = best, best_eval\r\n\t# run the algorithm\r\n\tfor i in range(n_iterations):\r\n\t\t# take a step\r\n\t\tcandidate = curr + randn(len(bounds)) * step_size\r\n\t\t# evaluate candidate point\r\n\t\tcandidate_eval = objective(candidate)\r\n\t\t# check for new best solution\r\n\t\tif candidate_eval &lt; best_eval:\r\n\t\t\t# store new best point\r\n\t\t\tbest, best_eval = candidate, candidate_eval\r\n\t\t\t# report progress\r\n\t\t\tprint('&gt;%d f(%s) = %.5f' % (i, best, best_eval))\r\n\t\t# difference between candidate and current point evaluation\r\n\t\tdiff = 
candidate_eval - curr_eval\r\n\t\t# calculate temperature for current epoch\r\n\t\tt = temp \/ float(i + 1)\r\n\t\t# calculate metropolis acceptance criterion\r\n\t\tmetropolis = exp(-diff \/ t)\r\n\t\t# check if we should keep the new point\r\n\t\tif diff &lt; 0 or rand() &lt; metropolis:\r\n\t\t\t# store the new current point\r\n\t\t\tcurr, curr_eval = candidate, candidate_eval\r\n\treturn [best, best_eval]\r\n\r\n# seed the pseudorandom number generator\r\nseed(1)\r\n# define range for input\r\nbounds = asarray([[-5.0, 5.0]])\r\n# define the total iterations\r\nn_iterations = 1000\r\n# define the maximum step size\r\nstep_size = 0.1\r\n# initial temperature\r\ntemp = 10\r\n# perform the simulated annealing search\r\nbest, score = simulated_annealing(objective, bounds, n_iterations, step_size, temp)\r\nprint('Done!')\r\nprint('f(%s) = %f' % (best, score))<\/pre>\n<p>Running the example reports the progress of the search including the iteration number, the input to the function, and the response from the objective function each time an improvement was detected.<\/p>\n<p>At the end of the search, the best solution is found and its evaluation is reported.<\/p>\n<p><strong>Note<\/strong>: Your <a href=\"https:\/\/machinelearningmastery.com\/different-results-each-time-in-machine-learning\/\">results may vary<\/a> given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. 
Consider running the example a few times and comparing the average outcome.<\/p>\n<p>In this case, we can see about 20 improvements over the 1,000 iterations of the algorithm and a solution that is very close to the optimal input of 0.0, which evaluates to f(0.0) = 0.0.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;34 f([-0.78753544]) = 0.62021\r\n&gt;35 f([-0.76914239]) = 0.59158\r\n&gt;37 f([-0.68574854]) = 0.47025\r\n&gt;39 f([-0.64797564]) = 0.41987\r\n&gt;40 f([-0.58914623]) = 0.34709\r\n&gt;41 f([-0.55446029]) = 0.30743\r\n&gt;42 f([-0.41775702]) = 0.17452\r\n&gt;43 f([-0.35038542]) = 0.12277\r\n&gt;50 f([-0.15799045]) = 0.02496\r\n&gt;66 f([-0.11089772]) = 0.01230\r\n&gt;67 f([-0.09238208]) = 0.00853\r\n&gt;72 f([-0.09145261]) = 0.00836\r\n&gt;75 f([-0.05129162]) = 0.00263\r\n&gt;93 f([-0.02854417]) = 0.00081\r\n&gt;144 f([0.00864136]) = 0.00007\r\n&gt;149 f([0.00753953]) = 0.00006\r\n&gt;167 f([-0.00640394]) = 0.00004\r\n&gt;225 f([-0.00044965]) = 0.00000\r\n&gt;503 f([-0.00036261]) = 0.00000\r\n&gt;512 f([0.00013605]) = 0.00000\r\nDone!\r\nf([0.00013605]) = 0.000000<\/pre>\n<p>It can be interesting to review the progress of the search as a line plot that shows the change in the evaluation of the best solution each time there is an improvement.<\/p>\n<p>We can update the <em>simulated_annealing()<\/em> function to keep track of the objective function evaluations each time there is an improvement and return this list of scores.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># simulated annealing algorithm\r\ndef simulated_annealing(objective, bounds, n_iterations, step_size, temp):\r\n\t# generate an initial point\r\n\tbest = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])\r\n\t# evaluate the initial point\r\n\tbest_eval = objective(best)\r\n\t# current working solution\r\n\tcurr, curr_eval = best, best_eval\r\n\tscores = list()\r\n\t# run the algorithm\r\n\tfor i in range(n_iterations):\r\n\t\t# take a step\r\n\t\tcandidate = curr + 
randn(len(bounds)) * step_size\r\n\t\t# evaluate candidate point\r\n\t\tcandidate_eval = objective(candidate)\r\n\t\t# check for new best solution\r\n\t\tif candidate_eval &lt; best_eval:\r\n\t\t\t# store new best point\r\n\t\t\tbest, best_eval = candidate, candidate_eval\r\n\t\t\t# keep track of scores\r\n\t\t\tscores.append(best_eval)\r\n\t\t\t# report progress\r\n\t\t\tprint('&gt;%d f(%s) = %.5f' % (i, best, best_eval))\r\n\t\t# difference between candidate and current point evaluation\r\n\t\tdiff = candidate_eval - curr_eval\r\n\t\t# calculate temperature for current epoch\r\n\t\tt = temp \/ float(i + 1)\r\n\t\t# calculate metropolis acceptance criterion\r\n\t\tmetropolis = exp(-diff \/ t)\r\n\t\t# check if we should keep the new point\r\n\t\tif diff &lt; 0 or rand() &lt; metropolis:\r\n\t\t\t# store the new current point\r\n\t\t\tcurr, curr_eval = candidate, candidate_eval\r\n\treturn [best, best_eval, scores]<\/pre>\n<p>We can then create a line plot of these scores to see the relative change in objective function for each improvement found during the search.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n# line plot of best scores\r\npyplot.plot(scores, '.-')\r\npyplot.xlabel('Improvement Number')\r\npyplot.ylabel('Evaluation f(x)')\r\npyplot.show()<\/pre>\n<p>Tying this together, the complete example of performing the search and plotting the objective function scores of the improved solutions during the search is listed below.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># simulated annealing search of a one-dimensional objective function\r\nfrom numpy import asarray\r\nfrom numpy import exp\r\nfrom numpy.random import randn\r\nfrom numpy.random import rand\r\nfrom numpy.random import seed\r\nfrom matplotlib import pyplot\r\n\r\n# objective function\r\ndef objective(x):\r\n\treturn x[0]**2.0\r\n\r\n# simulated annealing algorithm\r\ndef simulated_annealing(objective, bounds, n_iterations, step_size, temp):\r\n\t# generate an 
initial point\r\n\tbest = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])\r\n\t# evaluate the initial point\r\n\tbest_eval = objective(best)\r\n\t# current working solution\r\n\tcurr, curr_eval = best, best_eval\r\n\tscores = list()\r\n\t# run the algorithm\r\n\tfor i in range(n_iterations):\r\n\t\t# take a step\r\n\t\tcandidate = curr + randn(len(bounds)) * step_size\r\n\t\t# evaluate candidate point\r\n\t\tcandidate_eval = objective(candidate)\r\n\t\t# check for new best solution\r\n\t\tif candidate_eval &lt; best_eval:\r\n\t\t\t# store new best point\r\n\t\t\tbest, best_eval = candidate, candidate_eval\r\n\t\t\t# keep track of scores\r\n\t\t\tscores.append(best_eval)\r\n\t\t\t# report progress\r\n\t\t\tprint('&gt;%d f(%s) = %.5f' % (i, best, best_eval))\r\n\t\t# difference between candidate and current point evaluation\r\n\t\tdiff = candidate_eval - curr_eval\r\n\t\t# calculate temperature for current epoch\r\n\t\tt = temp \/ float(i + 1)\r\n\t\t# calculate metropolis acceptance criterion\r\n\t\tmetropolis = exp(-diff \/ t)\r\n\t\t# check if we should keep the new point\r\n\t\tif diff &lt; 0 or rand() &lt; metropolis:\r\n\t\t\t# store the new current point\r\n\t\t\tcurr, curr_eval = candidate, candidate_eval\r\n\treturn [best, best_eval, scores]\r\n\r\n# seed the pseudorandom number generator\r\nseed(1)\r\n# define range for input\r\nbounds = asarray([[-5.0, 5.0]])\r\n# define the total iterations\r\nn_iterations = 1000\r\n# define the maximum step size\r\nstep_size = 0.1\r\n# initial temperature\r\ntemp = 10\r\n# perform the simulated annealing search\r\nbest, score, scores = simulated_annealing(objective, bounds, n_iterations, step_size, temp)\r\nprint('Done!')\r\nprint('f(%s) = %f' % (best, score))\r\n# line plot of best scores\r\npyplot.plot(scores, '.-')\r\npyplot.xlabel('Improvement Number')\r\npyplot.ylabel('Evaluation f(x)')\r\npyplot.show()<\/pre>\n<p>Running the example performs the search and reports the results as before.<\/p>\n<p>A 
line plot is created showing the objective function evaluation for each improvement during the simulated annealing search. We can see about 20 changes to the objective function evaluation during the search, with large changes initially and very small to imperceptible changes towards the end of the search as the algorithm converged on the optima.<\/p>\n<div id=\"attachment_11812\" style=\"width: 1290px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-11812\" loading=\"lazy\" class=\"size-full wp-image-11812\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-Evaluation-for-Each-Improvement-During-the-Simulated-Annealing-Search.png\" alt=\"Line Plot of Objective Function Evaluation for Each Improvement During the Simulated Annealing Search\" width=\"1280\" height=\"960\" srcset=\"http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-Evaluation-for-Each-Improvement-During-the-Simulated-Annealing-Search.png 1280w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-Evaluation-for-Each-Improvement-During-the-Simulated-Annealing-Search-300x225.png 300w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-Evaluation-for-Each-Improvement-During-the-Simulated-Annealing-Search-1024x768.png 1024w, http:\/\/3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com\/wp-content\/uploads\/2020\/11\/Line-Plot-of-Objective-Function-Evaluation-for-Each-Improvement-During-the-Simulated-Annealing-Search-768x576.png 768w\" sizes=\"(max-width: 1280px) 100vw, 1280px\"><\/p>\n<p id=\"caption-attachment-11812\" class=\"wp-caption-text\">Line Plot of Objective Function Evaluation for Each Improvement During the Simulated Annealing Search<\/p>\n<\/div>\n<h2>Further 
Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h3>Papers<\/h3>\n<ul>\n<li>\n<a href=\"https:\/\/science.sciencemag.org\/content\/220\/4598\/671.abstract\">Optimization by Simulated Annealing<\/a>, 1983.<\/li>\n<\/ul>\n<h3>Books<\/h3>\n<ul>\n<li>\n<a href=\"https:\/\/amzn.to\/34Nb7nv\">Algorithms for Optimization<\/a>, 2019.<\/li>\n<li>\n<a href=\"https:\/\/amzn.to\/3iQkCaw\">Essentials of Metaheuristics<\/a>, 2011.<\/li>\n<\/ul>\n<h3>Articles<\/h3>\n<ul>\n<li>\n<a href=\"https:\/\/en.wikipedia.org\/wiki\/Simulated_annealing\">Simulated annealing, Wikipedia<\/a>.<\/li>\n<li>\n<a href=\"https:\/\/en.wikipedia.org\/wiki\/Annealing_(metallurgy)\">Annealing (metallurgy), Wikipedia<\/a>.<\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you discovered the simulated annealing optimization algorithm for function optimization.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>Simulated annealing is a stochastic global search algorithm for function optimization.<\/li>\n<li>How to implement the simulated annealing algorithm from scratch in Python.<\/li>\n<li>How to use the simulated annealing algorithm and inspect the results of the algorithm.<\/li>\n<\/ul>\n<p><strong>Do you have any questions?<\/strong><br \/>\nAsk your questions in the comments below and I will do my best to answer.<\/p>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/simulated-annealing-from-scratch-in-python\/\">Simulated Annealing From Scratch in Python<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/simulated-annealing-from-scratch-in-python\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Jason Brownlee Simulated Annealing is a stochastic global search optimization algorithm. 
This means that it makes use of randomness as part of the search [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2021\/02\/18\/simulated-annealing-from-scratch-in-python\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":4412,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4411"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=4411"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4411\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/4412"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=4411"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=4411"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=4411"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}