{"id":5408,"date":"2022-02-11T06:28:06","date_gmt":"2022-02-11T06:28:06","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/02\/11\/calculating-derivatives-in-pytorch\/"},"modified":"2022-02-11T06:28:06","modified_gmt":"2022-02-11T06:28:06","slug":"calculating-derivatives-in-pytorch","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/02\/11\/calculating-derivatives-in-pytorch\/","title":{"rendered":"Calculating Derivatives in PyTorch"},"content":{"rendered":"<p>Author: Muhammad Asad Iqbal Khan<\/p>\n<div>\n<p>Derivatives are one of the most fundamental concepts in calculus. They describe how changes in a function\u2019s inputs affect its outputs. The objective of this article is to provide a high-level introduction to calculating derivatives in PyTorch for those who are new to the framework. PyTorch offers a convenient way to calculate derivatives for user-defined functions.<\/p>\n<p>Neural networks rely on backpropagation \u2013 the algorithm that forms their backbone \u2013 which optimizes the parameters to minimize the error and achieve higher classification accuracy. The concepts learned in this article will be used in later posts on deep learning for image processing and other computer vision problems.<\/p>\n<p>After going through this tutorial, you\u2019ll learn:<\/p>\n<ul>\n<li>How to calculate derivatives in PyTorch.<\/li>\n<li>How to use autograd in PyTorch to perform auto differentiation on tensors.<\/li>\n<li>About the computation graph that involves different nodes and leaves, allowing you to calculate the gradients in the simplest possible manner (using the chain rule).<\/li>\n<li>How to calculate partial derivatives in PyTorch.<\/li>\n<li>How to implement the derivative of functions with respect to multiple values.<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_13162\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<img 
decoding=\"async\" aria-describedby=\"caption-attachment-13162\" class=\"aligncenter size-full wp-image-13199\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash.jpg\" alt=\"\" width=\"800\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash.jpg 1920w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash-300x200.jpg 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash-1024x683.jpg 1024w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash-768x512.jpg 768w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash-1536x1024.jpg 1536w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/jossuha-theophile-H-CZjCQfsFw-unsplash-600x400.jpg 600w\" sizes=\"(max-width: 1920px) 100vw, 1920px\"><\/p>\n<p id=\"caption-attachment-13162\" class=\"wp-caption-text\">Calculating Derivatives in PyTorch<br \/>Picture by <a href=\"https:\/\/unsplash.com\/photos\/H-CZjCQfsFw\">Jossuha Th\u00e9ophile<\/a>. Some rights reserved.<\/p>\n<\/div>\n<h2><strong>Differentiation in Autograd<\/strong><\/h2>\n<p>The autograd \u2013 an auto differentiation module in PyTorch \u2013 is used to calculate the derivatives and optimize the parameters in neural networks. It is intended primarily for gradient computations.<\/p>\n<p>Before we start, let\u2019s load up some necessary libraries we\u2019ll use in this tutorial.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import matplotlib.pyplot as plt\r\nimport torch<\/pre>\n<p>Now, let\u2019s use a simple tensor and set the <code>requires_grad<\/code> parameter to true. 
This allows us to perform automatic differentiation and lets PyTorch evaluate the derivatives using the given value which, in this case, is 3.0.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">x = torch.tensor(3.0, requires_grad = True)\r\nprint(\"creating a tensor x: \", x)<\/pre>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">creating a tensor x:  tensor(3., requires_grad=True)<\/pre>\n<p>We\u2019ll use a simple equation $y=3x^2$ as an example and take the derivative with respect to variable <code>x<\/code>. So, let\u2019s create another tensor according to the given equation. Also, we\u2019ll call a neat method <code>.backward<\/code> on the variable <code>y<\/code>, which builds an acyclic graph storing the computation history, and then evaluate the result with <code>.grad<\/code> for the given value.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">y = 3 * x ** 2\r\nprint(\"Result of the equation is: \", y)\r\ny.backward()\r\nprint(\"Derivative of the equation at x = 3 is: \", x.grad)<\/pre>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Result of the equation is:  tensor(27., grad_fn=&lt;MulBackward0&gt;)\r\nDerivative of the equation at x = 3 is:  tensor(18.)<\/pre>\n<p>As you can see, we have obtained a value of 18, which is correct, since $\\frac{dy}{dx} = 6x = 18$ at $x = 3$.<\/p>\n<h2><strong>Computational Graph<\/strong><\/h2>\n<p>PyTorch generates derivatives by building a backwards graph behind the scenes, while tensors and backwards functions are the graph\u2019s nodes. In a graph, PyTorch computes the derivative of a tensor depending on whether it is a leaf or not.<\/p>\n<p>PyTorch stores the computed gradient in a tensor\u2019s <code>grad<\/code> attribute only if the tensor is a leaf; gradients of intermediate (non-leaf) tensors are not retained by default. 
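<\/p>
<p>Since gradients of intermediate tensors are not kept by default, PyTorch provides the <code>retain_grad<\/code> method to opt in to storing them. Here is a minimal sketch of our own (the function and values are illustrative, not part of the tutorial\u2019s examples):<\/p>

```python
import torch

x = torch.tensor(3.0, requires_grad=True)   # leaf tensor
y = 3 * x ** 2                              # intermediate (non-leaf) tensor
y.retain_grad()                             # ask autograd to keep y's gradient
z = 2 * y
z.backward()

print(x.grad)   # tensor(36.) since dz/dx = 12x
print(y.grad)   # tensor(2.)  since dz/dy = 2
```

<p>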
We won\u2019t go into much detail about how the backwards graph is created and utilized, because the goal here is to give you a high-level understanding of how PyTorch makes use of the graph to calculate derivatives.<\/p>\n<p>So, let\u2019s check how the tensors <code>x<\/code> and <code>y<\/code> look internally once they are created. For <code>x<\/code>:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">print('data attribute of the tensor:',x.data)\r\nprint('grad attribute of the tensor:',x.grad)\r\nprint('grad_fn attribute of the tensor:',x.grad_fn)\r\nprint(\"is_leaf attribute of the tensor:\",x.is_leaf)\r\nprint(\"requires_grad attribute of the tensor:\",x.requires_grad)<\/pre>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">data attribute of the tensor: tensor(3.)\r\ngrad attribute of the tensor: tensor(18.)\r\ngrad_fn attribute of the tensor: None\r\nis_leaf attribute of the tensor: True\r\nrequires_grad attribute of the tensor: True<\/pre>\n<p>and for <code>y<\/code>:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">print('data attribute of the tensor:',y.data)\r\nprint('grad attribute of the tensor:',y.grad)\r\nprint('grad_fn attribute of the tensor:',y.grad_fn)\r\nprint(\"is_leaf attribute of the tensor:\",y.is_leaf)\r\nprint(\"requires_grad attribute of the tensor:\",y.requires_grad)<\/pre>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">data attribute of the tensor: tensor(27.)\r\ngrad attribute of the tensor: None\r\ngrad_fn attribute of the tensor: &lt;MulBackward0 object at 0x...&gt;\r\nis_leaf attribute of the tensor: False\r\nrequires_grad attribute of the tensor: True<\/pre>\n<p>As you can see, each tensor has been assigned a particular set of attributes. Note that <code>y<\/code> is not a leaf tensor, so its <code>grad<\/code> attribute is <code>None<\/code>, and PyTorch warns if you access it.<\/p>\n<p>The <code>data<\/code> attribute stores the tensor\u2019s data while the <code>grad_fn<\/code> attribute holds a reference to the backward function in the graph that created the tensor. 
Likewise, the <code>.grad<\/code> attribute holds the result of the derivative. Now that you have learnt some basics about the autograd and computational graph in PyTorch, let\u2019s take a slightly more complicated equation $y=6x^2+2x+4$ and calculate the derivative. The derivative of the equation is given by:<\/p>\n<p>$$\\frac{dy}{dx} = 12x+2$$<\/p>\n<p>Evaluating the derivative at $x = 3$,<\/p>\n<p>$$\\left.\\frac{dy}{dx}\\right\\vert_{x=3} = 12\\times 3+2 = 38$$<\/p>\n<p>Now, let\u2019s see how PyTorch does that,<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">x = torch.tensor(3.0, requires_grad = True)\r\ny = 6 * x ** 2 + 2 * x + 4\r\nprint(\"Result of the equation is: \", y)\r\ny.backward()\r\nprint(\"Derivative of the equation at x = 3 is: \", x.grad)<\/pre>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Result of the equation is:  tensor(64., grad_fn=&lt;AddBackward0&gt;)\r\nDerivative of the equation at x = 3 is:  tensor(38.)<\/pre>\n<p>The derivative of the equation is 38, which is correct.<\/p>\n<h2><strong>Implementing Partial Derivatives of Functions<\/strong><\/h2>\n<p>PyTorch also allows us to calculate partial derivatives of functions. For example, take the following function,<\/p>\n<p>$$f(u,v) = u^3+v^2+4uv$$<\/p>\n<p>Its derivative with respect to $u$ is,<\/p>\n<p>$$\\frac{\\partial f}{\\partial u} = 3u^2 + 4v$$<\/p>\n<p>Similarly, the derivative with respect to $v$ will be,<\/p>\n<p>$$\\frac{\\partial f}{\\partial v} = 2v + 4u$$<\/p>\n<p>Now, let\u2019s do it the PyTorch way, where $u = 3$ and $v = 4$.<\/p>\n<p>We\u2019ll create <code>u<\/code>, <code>v<\/code> and <code>f<\/code> tensors and call the <code>.backward<\/code> method on <code>f<\/code> in order to compute the derivative. 
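<\/p>
<p>As an aside, the same partial derivatives can also be obtained with <code>torch.autograd.grad<\/code>, which returns the gradients directly as a tuple instead of storing them in <code>.grad<\/code>. A minimal sketch of our own, using the same $f(u,v)$:<\/p>

```python
import torch

u = torch.tensor(3.0, requires_grad=True)
v = torch.tensor(4.0, requires_grad=True)
f = u**3 + v**2 + 4*u*v

# returns one gradient per input, without mutating u.grad or v.grad
df_du, df_dv = torch.autograd.grad(f, (u, v))
print(df_du)   # tensor(43.)
print(df_dv)   # tensor(20.)
```

<p>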
Finally, we\u2019ll evaluate the derivatives using the <code>.grad<\/code> attribute of <code>u<\/code> and <code>v<\/code>.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">u = torch.tensor(3., requires_grad=True)\r\nv = torch.tensor(4., requires_grad=True)\r\n\r\nf = u**3 + v**2 + 4*u*v\r\n\r\nprint(u)\r\nprint(v)\r\nprint(f)\r\n\r\nf.backward()\r\nprint(\"Partial derivative with respect to u: \", u.grad)\r\nprint(\"Partial derivative with respect to v: \", v.grad)<\/pre>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">tensor(3., requires_grad=True)\r\ntensor(4., requires_grad=True)\r\ntensor(91., grad_fn=&lt;AddBackward0&gt;)\r\nPartial derivative with respect to u:  tensor(43.)\r\nPartial derivative with respect to v:  tensor(20.)<\/pre>\n<h2><strong>Derivative of Functions with Multiple Values<\/strong><\/h2>\n<p>What if we have a function with multiple values and we need to calculate the derivative with respect to each of those values? For this, we\u2019ll make use of the <code>sum<\/code> function to (1) produce a scalar-valued function, and then (2) take the derivative. This is how we can see the \u2018function vs. 
derivative\u2019 plot:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># compute the derivative of the function with multiple values\r\nx = torch.linspace(-20, 20, 20, requires_grad = True)\r\nY = x ** 2\r\ny = torch.sum(Y)\r\ny.backward()\r\n\r\n# plotting the function and derivative\r\nfunction_line, = plt.plot(x.detach().numpy(), Y.detach().numpy(), label = 'Function')\r\nfunction_line.set_color(\"red\")\r\nderivative_line, = plt.plot(x.detach().numpy(), x.grad.detach().numpy(), label = 'Derivative')\r\nderivative_line.set_color(\"green\")\r\nplt.xlabel('x')\r\nplt.legend()\r\nplt.show()<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13200\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/pytorch-deriv.png\" alt=\"\" width=\"375\" height=\"262\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/pytorch-deriv.png 375w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/01\/pytorch-deriv-300x210.png 300w\" sizes=\"(max-width: 375px) 100vw, 375px\"><\/p>\n<p>In the two <code>plot()<\/code> function calls above, we extract the values from PyTorch tensors so we can visualize them. The <code>.detach<\/code> method stops the graph from tracking any further operations on the tensor. 
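<\/p>
<p>To see why <code>.detach<\/code> is needed here: calling <code>.numpy()<\/code> directly on a tensor that requires gradients raises an error, so the tensor must be detached from the graph first. A minimal sketch of our own:<\/p>

```python
import torch

x = torch.linspace(-20, 20, 20, requires_grad=True)

try:
    x.numpy()                  # fails: x is still part of the autograd graph
except RuntimeError as err:
    print('direct conversion failed:', err)

arr = x.detach().numpy()       # detach first, then convert
print(arr.shape)               # (20,)
```

<p>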
This makes it easy for us to convert a tensor to a numpy array.<\/p>\n<h2><strong>Summary<\/strong><\/h2>\n<p>In this tutorial, you learned how to compute derivatives of various functions in PyTorch.<\/p>\n<p>Particularly, you learned:<\/p>\n<ul>\n<li>How to calculate derivatives in PyTorch.<\/li>\n<li>How to use autograd in PyTorch to perform auto differentiation on tensors.<\/li>\n<li>About the computation graph that involves different nodes and leaves, allowing you to calculate the gradients in the simplest possible manner (using the chain rule).<\/li>\n<li>How to calculate partial derivatives in PyTorch.<\/li>\n<li>How to implement the derivative of functions with respect to multiple values.<\/li>\n<\/ul>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/calculating-derivatives-in-pytorch\/\">Calculating Derivatives in PyTorch<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/calculating-derivatives-in-pytorch\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Muhammad Asad Iqbal Khan Derivatives are one of the most fundamental concepts in calculus. 
They describe how changes in the variable inputs affect the [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/02\/11\/calculating-derivatives-in-pytorch\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":5409,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5408"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5408"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5408\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/5409"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5408"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5408"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5408"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}