{"id":5558,"date":"2022-04-13T06:29:02","date_gmt":"2022-04-13T06:29:02","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/04\/13\/scientific-functions-in-numpy-and-scipy\/"},"modified":"2022-04-13T06:29:02","modified_gmt":"2022-04-13T06:29:02","slug":"scientific-functions-in-numpy-and-scipy","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/04\/13\/scientific-functions-in-numpy-and-scipy\/","title":{"rendered":"Scientific Functions in NumPy and SciPy"},"content":{"rendered":"<p>Author: Adrian Tam<\/p>\n<div>\n<p>Python is a general-purpose computation language, but it is very welcomed in scientific computing. It can replace R and Matlab in many cases, thanks to some libraries in the Python ecosystem. In machine learning, we use some mathematical or statistical functions extensively, and often, we will find NumPy and SciPy useful. In the following, we will have a brief overview of what NumPy and SciPy provide and some tips for using them.<\/p>\n<p>After finishing this tutorial, you will know:<\/p>\n<ul>\n<li>What NumPy and SciPy provide for your project<\/li>\n<li>How to quickly speed up NumPy code using numba<\/li>\n<\/ul>\n<p>Let\u2019s get started!<\/p>\n<div id=\"attachment_13446\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-13446\" class=\"size-full wp-image-13446\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-scaled.jpg\" alt=\"\" width=\"800\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-scaled.jpg 2560w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-300x200.jpg 300w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-1024x683.jpg 1024w, 
https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-768x512.jpg 768w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-1536x1024.jpg 1536w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-2048x1365.jpg 2048w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/pexels-nothing-ahead-4494641-600x400.jpg 600w\" sizes=\"(max-width: 2560px) 100vw, 2560px\"><\/p>\n<p id=\"caption-attachment-13446\" class=\"wp-caption-text\">Scientific Functions in NumPy and SciPy<br \/>Photo by <a href=\"https:\/\/www.pexels.com\/photo\/magnifying-glass-on-textbook-4494641\/\">Nothing Ahead<\/a>. Some rights reserved.<\/p>\n<\/div>\n<h2>Overview<\/h2>\n<p>This tutorial is divided into three parts:<\/p>\n<ul>\n<li>NumPy as a tensor library<\/li>\n<li>Functions from SciPy<\/li>\n<li>Speeding up with numba<\/li>\n<\/ul>\n<h2>NumPy as a Tensor Library<\/h2>\n<p>While the list and tuple in Python are how we manage arrays natively, NumPy provides us the array capabilities closer to C or Java in the sense that we can enforce all elements of the same data type and, in the case of high dimensional arrays, in a regular shape in each dimension. Moreover, carrying out the same operation in the NumPy array is usually faster than in Python natively because the code in NumPy is highly optimized.<\/p>\n<p>There are a thousand functions provided by NumPy, and you should consult NumPy\u2019s documentation for the details. 
Some common usage can be found in the following cheat sheet:<\/p>\n<div id=\"attachment_13443\" style=\"width: 610px\" class=\"wp-caption aligncenter\">\n<a href=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/cheatsheet.png\"><img decoding=\"async\" aria-describedby=\"caption-attachment-13443\" loading=\"lazy\" class=\"size-mpcs-course-thumbnail wp-image-13443\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/cheatsheet-600x400.png\" alt=\"\" width=\"600\" height=\"400\"><\/a><\/p>\n<p id=\"caption-attachment-13443\" class=\"wp-caption-text\">NumPy Cheat Sheet. Copyright 2022 MachineLearningMastery.com<\/p>\n<\/div>\n<p>Some NumPy features are worth mentioning because they are helpful for machine learning projects.<\/p>\n<p>For instance, if we want to plot a 3D surface, we would compute $z=f(x,y)$ for a range of $x$ and $y$ and then plot the result in the $xyz$-space. We can generate the ranges with:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\nx = np.linspace(-1, 1, 100)\r\ny = np.linspace(-2, 2, 100)<\/pre>\n<p>For $z=f(x,y)=\\sqrt{1-x^2-(y\/2)^2}$, we would normally need a nested for-loop to scan each value in arrays <code>x<\/code> and <code>y<\/code> and do the computation. 
But in NumPy, we can use <code>meshgrid<\/code> to expand two 1D arrays into two 2D arrays in the sense that by matching the indices, we get all the combinations as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import matplotlib.pyplot as plt \r\nimport numpy as np\r\n\r\nx = np.linspace(-1, 1, 100)\r\ny = np.linspace(-2, 2, 100)\r\n\r\n# convert vector into 2D arrays\r\nxx, yy = np.meshgrid(x,y)\r\n# computation on matching\r\nz = np.sqrt(1 - xx**2 - (yy\/2)**2)\r\n\r\nfig = plt.figure(figsize=(8,8))\r\nax = plt.axes(projection='3d')\r\nax.set_xlim([-2,2])\r\nax.set_ylim([-2,2])\r\nax.set_zlim([0,2])\r\nax.plot_surface(xx, yy, z, cmap=\"cividis\")\r\nax.view_init(45, 35)\r\nplt.show()<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13444\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/numpy1.png\" alt=\"\" width=\"461\" height=\"467\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/numpy1.png 461w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/numpy1-296x300.png 296w\" sizes=\"(max-width: 461px) 100vw, 461px\"><\/p>\n<p>In the above, the 2D array <code>xx<\/code> produced by <code>meshgrid()<\/code> has identical values on the same column, and <code>yy<\/code> has identical values on the same row. Hence element-wise operations on <code>xx<\/code> and <code>yy<\/code> are essentially operations on the $xy$-plane. This is why it works and why we can plot the ellipsoid above.<\/p>\n<p>Another nice feature in NumPy is a function to expand the dimension. Convolutional layers in the neural network usually expect 3D images, namely, pixels in 2D, and the different color channels as the third dimension. It works for color images using RGB channels, but we have only one channel in grayscale images. 
For example, the digits dataset in scikit-learn:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\nfrom sklearn.datasets import load_digits\r\nimages = load_digits()[\"images\"]\r\nprint(images.shape)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(1797, 8, 8)<\/pre>\n<p>This shows that there are 1797 images in this dataset, and each is 8\u00d78 pixels. It is a grayscale dataset in which each pixel holds a single intensity value. We add a fourth axis to this array (i.e., convert the 3D array into a 4D array) so each image has shape 8\u00d78\u00d71:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\n\r\n# image has axes 0, 1, and 2, adding axis 3\r\nimages = np.expand_dims(images, 3)\r\nprint(images.shape)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(1797, 8, 8, 1)<\/pre>\n<p>Two handy features when working with NumPy arrays are Boolean indexing and fancy indexing. For example, if we have a 2D array:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\n\r\nX = np.array([\r\n    [ 1.299,  0.332,  0.594, -0.047,  0.834],\r\n    [ 0.842,  0.441, -0.705, -1.086, -0.252],\r\n    [ 0.785,  0.478, -0.665, -0.532, -0.673],\r\n    [ 0.062,  1.228, -0.333,  0.867,  0.371]\r\n])<\/pre>\n<p>we can check if all values in a column are positive:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\ny = (X &gt; 0).all(axis=0)\r\nprint(y)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">[ True  True False False False]<\/pre>\n<p>This shows that only the first two columns contain all positive values. Note that the result is a length-5 one-dimensional array, which is the same size as axis 1 of array <code>X<\/code>. 
If we use this Boolean array as an index on axis 1, we select only the columns where the Boolean value is <code>True<\/code>:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\ny = X[:, (X &gt; 0).all(axis=0)]\r\nprint(y)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">[[1.299 0.332]\r\n [0.842 0.441]\r\n [0.785 0.478]\r\n [0.062 1.228]]<\/pre>\n<p>If a list of integers is used in lieu of the Boolean array above, we select from <code>X<\/code> the entries whose indices appear in the list, in the order given. NumPy calls this fancy indexing. So below, we can select the first two columns twice and form a new array:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\ny = X[:, [0,1,1,0]]\r\nprint(y)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">[[1.299 0.332 0.332 1.299]\r\n [0.842 0.441 0.441 0.842]\r\n [0.785 0.478 0.478 0.785]\r\n [0.062 1.228 1.228 0.062]]<\/pre>\n<\/p>\n<h2>Functions from SciPy<\/h2>\n<p>SciPy is a sister project of NumPy. Hence, you will mostly see SciPy functions expecting NumPy arrays as arguments or returning them. SciPy provides many more functions that are less commonly used or more advanced.<\/p>\n<p>SciPy functions are organized under submodules. 
Some common submodules are:<\/p>\n<ul>\n<li>\n<code>scipy.cluster.hierarchy<\/code>: Hierarchical clustering<\/li>\n<li>\n<code>scipy.fft<\/code>: Fast Fourier transform<\/li>\n<li>\n<code>scipy.integrate<\/code>: Numerical integration<\/li>\n<li>\n<code>scipy.interpolate<\/code>: Interpolation and spline functions<\/li>\n<li>\n<code>scipy.linalg<\/code>: Linear algebra<\/li>\n<li>\n<code>scipy.optimize<\/code>: Numerical optimization<\/li>\n<li>\n<code>scipy.signal<\/code>: Signal processing<\/li>\n<li>\n<code>scipy.sparse<\/code>: Sparse matrix representation<\/li>\n<li>\n<code>scipy.special<\/code>: Some exotic mathematical functions<\/li>\n<li>\n<code>scipy.stats<\/code>: Statistics, including probability distributions<\/li>\n<\/ul>\n<p>But never assume SciPy can cover everything. For time series analysis, for example, it is better to depend on the <code>statsmodels<\/code>\u00a0module instead.<\/p>\n<p>We have covered a lot of examples using\u00a0<code>scipy.optimize<\/code> in other posts. It is a great tool to find the minimum of a function using, for example, Newton\u2019s method. Both NumPy and SciPy have the <code>linalg<\/code> submodule for linear algebra, but those in SciPy are more advanced, such as the function to do QR decomposition or matrix exponentials.<\/p>\n<p>Maybe the most used feature of SciPy is the\u00a0<code>stats<\/code>\u00a0module. 
In both NumPy and SciPy, we can generate multivariate Gaussian random numbers with non-zero correlation.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\nfrom scipy.stats import multivariate_normal\r\nimport matplotlib.pyplot as plt\r\n\r\nmean = [0, 0]             # zero mean\r\ncov = [[1, 0.8],[0.8, 1]] # covariance matrix\r\nX1 = np.random.default_rng().multivariate_normal(mean, cov, 5000)\r\nX2 = multivariate_normal.rvs(mean, cov, 5000)\r\n\r\nfig = plt.figure(figsize=(12,6))\r\nax = plt.subplot(121)\r\nax.scatter(X1[:,0], X1[:,1], s=1)\r\nax.set_xlim([-4,4])\r\nax.set_ylim([-4,4])\r\nax.set_title(\"NumPy\")\r\n\r\nax = plt.subplot(122)\r\nax.scatter(X2[:,0], X2[:,1], s=1)\r\nax.set_xlim([-4,4])\r\nax.set_ylim([-4,4])\r\nax.set_title(\"SciPy\")\r\n\r\nplt.show()<\/pre>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-13445\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/numpy2.png\" alt=\"\" width=\"708\" height=\"373\" srcset=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/numpy2.png 708w, https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/04\/numpy2-300x158.png 300w\" sizes=\"(max-width: 708px) 100vw, 708px\"><\/p>\n<p>But if we want to reference the distribution function itself, it is best to depend on SciPy. 
For example, the famous 68-95-99.7 rule refers to the standard normal distribution, and we can get the exact percentages from SciPy\u2019s cumulative distribution function:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from scipy.stats import norm\r\nn = norm.cdf([1,2,3,-1,-2,-3])\r\nprint(n)\r\nprint(n[:3] - n[-3:])<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">[0.84134475 0.97724987 0.9986501  0.15865525 0.02275013 0.0013499 ]\r\n[0.68268949 0.95449974 0.9973002 ]<\/pre>\n<p>So there is a 68.269% probability that a value falls within one standard deviation of the mean in a normal distribution. Conversely, we have the percentage point function as the inverse function of the cumulative distribution function:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">...\r\nprint(norm.ppf(0.99))<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">2.3263478740408408<\/pre>\n<p>So this means if the values are in a normal distribution, we expect a 99% probability (one-tailed probability) that the value will be no more than about 2.33 standard deviations above the mean.<\/p>\n<p>These are examples of how SciPy can go the extra mile beyond what NumPy offers.<\/p>\n<h2>Speeding Up with numba<\/h2>\n<p>NumPy is faster than native Python because many of the operations are implemented in C and use optimized algorithms. But there are times when NumPy is still too slow for what we want to do.<\/p>\n<p>It may help if you ask\u00a0<code>numba<\/code>\u00a0to further optimize it by parallelizing or moving the operation to a GPU if you have one. You need to install the\u00a0<code>numba<\/code>\u00a0module first:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">pip install numba<\/pre>\n<p>Installation may take a while if <code>numba<\/code> needs to be compiled into a Python module. 
Afterward, if you have a function that consists purely of NumPy operations, you can add the <code>numba<\/code>\u00a0decorator to speed it up:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numba\r\n\r\n@numba.jit(nopython=True)\r\ndef numpy_only_function(...):\r\n    ...<\/pre>\n<p>The decorator uses a just-in-time compiler to translate the function into optimized machine code so it can run faster. You will see the best performance improvement if your function runs many times in your program (e.g., the update function in gradient descent) because the overhead of running the compiler can be amortized.<\/p>\n<p>For example, below is an implementation of the t-SNE algorithm to transform 784-dimensional data into 2-dimensional data. We are not going to explain the t-SNE algorithm in detail, but it needs many iterations to converge. The following code shows how we can use <code>numba<\/code> to optimize the inner loop functions (and it demonstrates some NumPy usage as well). It takes a few minutes to finish. You may try to remove the <code>@numba.jit<\/code> decorators afterward. 
It will take a considerably longer time.<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import datetime\r\n\r\nimport tensorflow as tf\r\nimport matplotlib.pyplot as plt\r\nimport numpy as np\r\nimport numba\r\n\r\ndef tSNE(X, ndims=2, perplexity=30, seed=0, max_iter=500, stop_lying_iter=100, mom_switch_iter=400):\r\n    \"\"\"The t-SNE algorithm\r\n\r\n\tArgs:\r\n\t\tX: the high-dimensional coordinates\r\n\t\tndims: number of dimensions in output domain\r\n    Returns:\r\n        Points of X in low dimension\r\n    \"\"\"\r\n    momentum = 0.5\r\n    final_momentum = 0.8\r\n    eta = 200.0\r\n    N, _D = X.shape\r\n    np.random.seed(seed)\r\n\r\n    # normalize input\r\n    X -= X.mean(axis=0) # zero mean\r\n    X \/= np.abs(X).max() # min-max scaled\r\n\r\n    # compute input similarity for exact t-SNE\r\n    P = computeGaussianPerplexity(X, perplexity)\r\n    # symmetrize and normalize input similarities\r\n    P = P + P.T\r\n    P \/= P.sum()\r\n    # lie about the P-values\r\n    P *= 12.0\r\n    # initialize solution\r\n    Y = np.random.randn(N, ndims) * 0.0001\r\n    # perform main training loop\r\n    gains = np.ones_like(Y)\r\n    uY = np.zeros_like(Y)\r\n    for i in range(max_iter):\r\n        # compute gradient, update gains\r\n        dY = computeExactGradient(P, Y)\r\n        gains = np.where(np.sign(dY) != np.sign(uY), gains+0.2, gains*0.8).clip(0.1)\r\n        # gradient update with momentum and gains\r\n        uY = momentum * uY - eta * gains * dY\r\n        Y = Y + uY\r\n        # make the solution zero-mean\r\n        Y -= Y.mean(axis=0)\r\n        # Stop lying about the P-values after a while, and switch momentum\r\n        if i == stop_lying_iter:\r\n            P \/= 12.0\r\n        if i == mom_switch_iter:\r\n            momentum = final_momentum\r\n        # print progress\r\n        if (i % 50) == 0:\r\n            C = evaluateError(P, Y)\r\n            now = datetime.datetime.now()\r\n            print(f\"{now} - Iteration 
{i}: Error = {C}\")\r\n    return Y\r\n\r\n@numba.jit(nopython=True)\r\ndef computeExactGradient(P, Y):\r\n    \"\"\"Gradient of t-SNE cost function\r\n\r\n\tArgs:\r\n        P: similarity matrix\r\n        Y: low-dimensional coordinates\r\n    Returns:\r\n        dY, a numpy array of shape (N,D)\r\n\t\"\"\"\r\n    N, _D = Y.shape\r\n    # compute squared Euclidean distance matrix of Y, the Q matrix, and the normalization sum\r\n    DD = computeSquaredEuclideanDistance(Y)\r\n    Q = 1\/(1+DD)\r\n    sum_Q = Q.sum()\r\n    # compute gradient\r\n    mult = (P - (Q\/sum_Q)) * Q\r\n    dY = np.zeros_like(Y)\r\n    for n in range(N):\r\n        for m in range(N):\r\n            if n==m: continue\r\n            dY[n] += (Y[n] - Y[m]) * mult[n,m]\r\n    return dY\r\n\r\n@numba.jit(nopython=True)\r\ndef evaluateError(P, Y):\r\n    \"\"\"Evaluate t-SNE cost function\r\n\r\n    Args:\r\n        P: similarity matrix\r\n        Y: low-dimensional coordinates\r\n    Returns:\r\n        Total t-SNE error C\r\n    \"\"\"\r\n    DD = computeSquaredEuclideanDistance(Y)\r\n    # Compute Q-matrix and normalization sum\r\n    Q = 1\/(1+DD)\r\n    np.fill_diagonal(Q, np.finfo(np.float32).eps)\r\n    Q \/= Q.sum()\r\n    # Sum t-SNE error: sum P log(P\/Q)\r\n    error = P * np.log( (P + np.finfo(np.float32).eps) \/ (Q + np.finfo(np.float32).eps) )\r\n    return error.sum()\r\n\r\n@numba.jit(nopython=True)\r\ndef computeGaussianPerplexity(X, perplexity):\r\n    \"\"\"Compute Gaussian Perplexity\r\n\r\n    Args:\r\n        X: numpy array of shape (N,D)\r\n        perplexity: double\r\n    Returns:\r\n        Similarity matrix P\r\n    \"\"\"\r\n    # Compute the squared Euclidean distance matrix\r\n    N, _D = X.shape\r\n    DD = computeSquaredEuclideanDistance(X)\r\n    # Compute the Gaussian kernel row by row\r\n    P = np.zeros_like(DD)\r\n    for n in range(N):\r\n        found = False\r\n        beta = 1.0\r\n        min_beta = -np.inf\r\n        max_beta = np.inf\r\n        tol = 
1e-5\r\n\r\n        # iterate until we get a good perplexity\r\n        n_iter = 0\r\n        while not found and n_iter &lt; 200:\r\n            # compute Gaussian kernel row\r\n            P[n] = np.exp(-beta * DD[n])\r\n            P[n,n] = np.finfo(np.float32).eps\r\n            # compute entropy of current row\r\n            # Gaussians to be row-normalized to make it a probability\r\n            # then H = sum_i -P[i] log(P[i])\r\n            #        = sum_i -P[i] (-beta * DD[n] - log(sum_P))\r\n            #        = sum_i P[i] * beta * DD[n] + log(sum_P)\r\n            sum_P = P[n].sum()\r\n            H = beta * (DD[n] @ P[n]) \/ sum_P + np.log(sum_P)\r\n            # Evaluate if entropy within tolerance level\r\n            # (H uses the natural log, so compare against log(perplexity))\r\n            Hdiff = H - np.log(perplexity)\r\n            if -tol &lt; Hdiff &lt; tol:\r\n                found = True\r\n                break\r\n            if Hdiff &gt; 0:\r\n                min_beta = beta\r\n                if max_beta in (np.inf, -np.inf):\r\n                    beta *= 2\r\n                else:\r\n                    beta = (beta + max_beta) \/ 2\r\n            else:\r\n                max_beta = beta\r\n                if min_beta in (np.inf, -np.inf):\r\n                    beta \/= 2\r\n                else:\r\n                    beta = (beta + min_beta) \/ 2\r\n            n_iter += 1\r\n        # normalize this row\r\n        P[n] \/= P[n].sum()\r\n    assert not np.isnan(P).any()\r\n    return P\r\n\r\n@numba.jit(nopython=True)\r\ndef computeSquaredEuclideanDistance(X):\r\n    \"\"\"Compute squared distance\r\n    Args:\r\n        X: numpy array of shape (N,D)\r\n    Returns:\r\n        numpy array of shape (N,N) of squared distances\r\n    \"\"\"\r\n    N, _D = X.shape\r\n    DD = np.zeros((N,N))\r\n    for i in range(N-1):\r\n        for j in range(i+1, N):\r\n            diff = X[i] - X[j]\r\n            DD[j][i] = DD[i][j] = diff @ diff\r\n    return DD\r\n\r\n(X_train, y_train), (X_test, y_test) = 
tf.keras.datasets.mnist.load_data()\r\n# pick 1000 samples from the training set\r\nrows = np.random.choice(X_train.shape[0], 1000, replace=False)\r\nX_data = X_train[rows].reshape(1000, -1).astype(\"float\")\r\nX_label = y_train[rows]\r\n# run t-SNE to transform into 2D and visualize in scatter plot\r\nY = tSNE(X_data, 2, 30, 0, 500, 100, 400)\r\nplt.figure(figsize=(8,8))\r\nplt.scatter(Y[:,0], Y[:,1], c=X_label)\r\nplt.show()<\/pre>\n<\/p>\n<h2>Further Reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h4>API documentation<\/h4>\n<ul>\n<li><a href=\"https:\/\/numpy.org\/doc\/stable\/user\/index.html#user\">NumPy user guide<\/a><\/li>\n<li><a href=\"https:\/\/docs.scipy.org\/doc\/scipy\/tutorial\/index.html#user-guide\">SciPy user guide<\/a><\/li>\n<li><a href=\"https:\/\/numba.pydata.org\/numba-doc\/dev\/index.html\">Numba documentation<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you saw a brief overview of the functions provided by NumPy and SciPy.<\/p>\n<p>Specifically, you learned:<\/p>\n<ul>\n<li>How to work with NumPy arrays<\/li>\n<li>A few useful functions provided by SciPy<\/li>\n<li>How to make NumPy code faster by using the JIT compiler from numba<\/li>\n<\/ul>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/scientific-functions-in-numpy-and-scipy\/\">Scientific Functions in NumPy and SciPy<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/scientific-functions-in-numpy-and-scipy\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Adrian Tam Python is a general-purpose computation language, but it is very welcomed in scientific computing. 
It can replace R and Matlab in many [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/04\/13\/scientific-functions-in-numpy-and-scipy\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":5559,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5558"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5558"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5558\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/5559"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5558"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5558"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5558"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}