{"id":5423,"date":"2022-02-17T06:25:27","date_gmt":"2022-02-17T06:25:27","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/02\/17\/duck-typing-scope-and-investigative-functions-in-python\/"},"modified":"2022-02-17T06:25:27","modified_gmt":"2022-02-17T06:25:27","slug":"duck-typing-scope-and-investigative-functions-in-python","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/02\/17\/duck-typing-scope-and-investigative-functions-in-python\/","title":{"rendered":"Duck-typing, scope, and investigative functions in Python"},"content":{"rendered":"<p>Author: Adrian Tam<\/p>\n<div>\n<p>Python is a duck typing language. It means the data type of a variable can change as long as the syntax used on it remains compatible. Python is also a dynamic programming language, meaning we can change the program while it runs, including defining new functions and changing the scope of name resolution. These features give us not only a new paradigm for writing Python code but also a new set of tools for debugging. In the following, we will see what we can do in Python that cannot be done in many other languages. After finishing this tutorial, you will know:<\/p>\n<ul>\n<li>How Python manages the variables you define<\/li>\n<li>How Python code uses a variable and why we don\u2019t need to declare its type as in C or Java<\/li>\n<\/ul>\n<p>Let\u2019s get started.<\/p>\n<div id=\"attachment_13220\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-13220\" class=\"size-full wp-image-13220\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/02\/pexels-julissa-helmuth-4381480-scaled.jpg\" alt=\"\" width=\"800\"><\/p>\n<p id=\"caption-attachment-13220\" class=\"wp-caption-text\">Duck-typing, scope, and investigative functions in Python. Photo by <a href=\"https:\/\/www.pexels.com\/photo\/flock-of-yellow-baby-ducks-in-grass-4381480\/\">Julissa Helmuth<\/a>. 
Some rights reserved<\/p>\n<\/div>\n<h2 id=\"Overview\">Overview<\/h2>\n<p>This tutorial is in three parts:<\/p>\n<ul>\n<li>Duck typing in programming languages<\/li>\n<li>Scopes and name space in Python<\/li>\n<li>Investigating the type and scope<\/li>\n<\/ul>\n<h2 id=\"Duck-typing-in-programming-languages\">Duck typing in programming languages<\/h2>\n<p>Duck typing is a feature of some modern programming languages that allows data types to be dynamic.<\/p>\n<blockquote>\n<p>A programming style which does not look at an object\u2019s type to determine if it has the right interface; instead, the method or attribute is simply called or used (\u201cIf it looks like a duck and quacks like a duck, it must be a duck.\u201d) By emphasizing interfaces rather than specific types, well-designed code improves its flexibility by allowing polymorphic substitution.<\/p>\n<\/blockquote>\n<p>\u2014 <a href=\"https:\/\/docs.python.org\/3\/glossary.html\">Python Glossary<\/a><\/p>\n<p>Simply speaking, the program should allow you to swap data structures as long as the same syntax still makes sense. In C, for example, you have to define functions like the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">float fsquare(float x)\r\n{\r\n    return x * x;\r\n}\r\n\r\nint isquare(int x)\r\n{\r\n    return x * x;\r\n}<\/pre>\n<p>While the operation\u00a0<code>x * x<\/code>\u00a0is identical for integers and floating point numbers, a function taking an integer argument and a function taking a floating point argument are not the same. Because types are static in C, we must define two functions even though they perform the same logic. In Python, types are dynamic, hence we can define the corresponding function as<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">def square(x):\r\n    return x * x<\/pre>\n<p>This feature indeed gives us tremendous power and convenience. 
For example, from scikit-learn, we have a function to do cross validation<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># evaluate a perceptron model on the dataset\r\nfrom numpy import mean\r\nfrom numpy import std\r\nfrom sklearn.datasets import make_classification\r\nfrom sklearn.model_selection import cross_val_score\r\nfrom sklearn.model_selection import RepeatedStratifiedKFold\r\nfrom sklearn.linear_model import Perceptron\r\n# define dataset\r\nX, y = make_classification(n_samples=1000, n_features=10, n_informative=10, n_redundant=0, random_state=1)\r\n# define model\r\nmodel = Perceptron()\r\n# define model evaluation method\r\ncv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)\r\n# evaluate model\r\nscores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)\r\n# summarize result\r\nprint('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))<\/pre>\n<p>But in the above, the\u00a0<code>model<\/code>\u00a0is a variable of a scikit-learn model object. It doesn\u2019t matter if it is a perceptron model as in the above, or a decision tree, or a support vector machine model. What matters is that, inside\u00a0<code>cross_val_score()<\/code>\u00a0function the data will be passed onto the model with its\u00a0<code>fit()<\/code>\u00a0function. Therefore the model must implement the\u00a0<code>fit()<\/code>\u00a0member function and the\u00a0<code>fit()<\/code>\u00a0function behaves identically. The consequence is that,\u00a0<code>cross_val_score()<\/code>\u00a0function is not expecting any particular model type as long as it looks like one. 
If we are using Keras to build a neural network model, we can make the Keras model look like a scikit-learn model with a wrapper:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\"># MLP for Pima Indians Dataset with 10-fold cross validation via sklearn\r\nfrom keras.models import Sequential\r\nfrom keras.layers import Dense\r\nfrom keras.wrappers.scikit_learn import KerasClassifier\r\nfrom sklearn.model_selection import StratifiedKFold\r\nfrom sklearn.model_selection import cross_val_score\r\nimport numpy\r\n\r\n# Function to create model, required for KerasClassifier\r\ndef create_model():\r\n\t# create model\r\n\tmodel = Sequential()\r\n\tmodel.add(Dense(12, input_dim=8, activation='relu'))\r\n\tmodel.add(Dense(8, activation='relu'))\r\n\tmodel.add(Dense(1, activation='sigmoid'))\r\n\t# Compile model\r\n\tmodel.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\r\n\treturn model\r\n\r\n# fix random seed for reproducibility\r\nseed = 7\r\nnumpy.random.seed(seed)\r\n# load pima indians dataset\r\ndataset = numpy.loadtxt(\"https:\/\/raw.githubusercontent.com\/jbrownlee\/Datasets\/master\/pima-indians-diabetes.csv\", delimiter=\",\")\r\n# split into input (X) and output (Y) variables\r\nX = dataset[:,0:8]\r\nY = dataset[:,8]\r\n# create model\r\nmodel = KerasClassifier(build_fn=create_model, epochs=150, batch_size=10, verbose=0)\r\n# evaluate using 10-fold cross validation\r\nkfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)\r\nresults = cross_val_score(model, X, Y, cv=kfold)\r\nprint(results.mean())<\/pre>\n<p>In the above, we used the wrapper from TensorFlow. Other wrappers exist, such as scikeras. All the wrapper does is make sure the\u00a0<strong>interface<\/strong>\u00a0of the Keras model looks like that of a scikit-learn classifier so you can make use of the\u00a0<code>cross_val_score()<\/code>\u00a0function. 
If we replace the\u00a0<code>model<\/code>\u00a0above with<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">model = create_model()<\/pre>\n<p>then the scikit-learn function will complain as it cannot find the\u00a0<code>model.score()<\/code>\u00a0function.<\/p>\n<p>Similarly, because of duck typing, we can reuse a function that expects a list with a NumPy array or a pandas Series, because they all support the same indexing and slicing operations. For example, we can fit a time series with ARIMA as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from statsmodels.tsa.statespace.sarimax import SARIMAX\r\nimport numpy as np\r\nimport pandas as pd\r\n\r\ndata = [266.0,145.9,183.1,119.3,180.3,168.5,231.8,224.5,192.8,122.9,336.5,185.9,\r\n        194.3,149.5,210.1,273.3,191.4,287.0,226.0,303.6,289.9,421.6,264.5,342.3,\r\n        339.7,440.4,315.9,439.3,401.3,437.4,575.5,407.6,682.0,475.3,581.3,646.9]\r\nmodel = SARIMAX(data, order=(5,1,0))\r\nres = model.fit(disp=False)\r\nprint(\"AIC = \", res.aic)\r\n\r\ndata = np.array(data)\r\nmodel = SARIMAX(data, order=(5,1,0))\r\nres = model.fit(disp=False)\r\nprint(\"AIC = \", res.aic)\r\n\r\ndata = pd.Series(data)\r\nmodel = SARIMAX(data, order=(5,1,0))\r\nres = model.fit(disp=False)\r\nprint(\"AIC = \", res.aic)<\/pre>\n<p>The above should produce the same AIC scores for each fitting.<\/p>\n<h2 id=\"Scopes-and-name-space-in-Python\">Scopes and name space in Python<\/h2>\n<p>In most languages, variables are defined in a limited scope. 
For example, a variable defined inside a function is accessible only inside that function:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from math import sqrt\r\n\r\ndef quadratic(a,b,c):\r\n    discrim = b*b - 4*a*c\r\n    x = -b\/(2*a)\r\n    y = sqrt(discrim)\/(2*a)\r\n    return x-y, x+y<\/pre>\n<p>The\u00a0<strong>local variable<\/strong>\u00a0<code>discrim<\/code>\u00a0cannot be accessed from outside the function\u00a0<code>quadratic()<\/code>. Moreover, the following may be surprising to some:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">a = 1\r\n\r\ndef f(x):\r\n    a = 2 * x\r\n    return a\r\n\r\nb = f(3)\r\nprint(a, b)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">1 6<\/pre>\n<p>We defined the variable\u00a0<code>a<\/code>\u00a0outside function\u00a0<code>f<\/code>\u00a0but inside\u00a0<code>f<\/code>, variable\u00a0<code>a<\/code>\u00a0is assigned to be\u00a0<code>2 * x<\/code>. However, the\u00a0<code>a<\/code>\u00a0inside the function and the one outside are unrelated except for sharing a name. Therefore, as we exit from the function, the value of the outer\u00a0<code>a<\/code>\u00a0is untouched. To make it modifiable inside function\u00a0<code>f<\/code>, we need to declare the name\u00a0<code>a<\/code>\u00a0as\u00a0<code>global<\/code>\u00a0to make it clear that this name should come from the\u00a0<strong>global scope<\/strong>,\u00a0not the\u00a0<strong>local scope<\/strong>:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">a = 1\r\n\r\ndef f(x):\r\n    global a\r\n    a = 2 * x\r\n    return a\r\n\r\nb = f(3)\r\nprint(a, b)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">6 6<\/pre>\n<p>However, the issue becomes more complicated when we introduce\u00a0<strong>nested scopes<\/strong>\u00a0in functions. 
Consider the following example:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">a = 1\r\n\r\ndef f(x):\r\n    a = x\r\n    def g(x):\r\n        return a * x\r\n    return g(3)\r\n\r\nb = f(2)\r\nprint(b)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">6<\/pre>\n<p>The variable\u00a0<code>a<\/code>\u00a0inside function\u00a0<code>f<\/code>\u00a0is distinct from the global one. However, when inside\u00a0<code>g<\/code>, since there is never anything written to\u00a0<code>a<\/code>\u00a0but merely read from it, Python will see the same\u00a0<code>a<\/code>\u00a0from the nearest scope, i.e., from function\u00a0<code>f<\/code>. The variable <code>x<\/code> however, is defined as argument to the function <code>g<\/code> and it takes the value <code>3<\/code> when we called <code>g(3)<\/code> instead of assuming the value of <code>x<\/code> from function <code>f<\/code>.<\/p>\n<p><strong>NOTE:<\/strong>\u00a0If a variable has any value assigned to it\u00a0<strong>anywhere<\/strong>\u00a0in the function, it is defined in the local scope. And if that variable has its value read from it before the assignment, an error is raised rather than using the value from the variable of the same name from the outer or global scope.<\/p>\n<p>This property has many uses. Many implementations of memoization decorators in Python make clever use of the function scopes. 
Another example is the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\n\r\ndef datagen(X, y, batch_size, sampling_rate=0.7):\r\n    \"\"\"A generator to produce samples from input numpy arrays X and y\r\n    \"\"\"\r\n    # Select rows from arrays X and y randomly\r\n    indexing = np.random.random(len(X)) &lt; sampling_rate\r\n    Xsam, ysam = X[indexing], y[indexing]\r\n\r\n    # Actual logic to generate batches\r\n    def _gen(batch_size):\r\n        while True:\r\n            Xbatch, ybatch = [], []\r\n            for _ in range(batch_size):\r\n                i = np.random.randint(len(Xsam))\r\n                Xbatch.append(Xsam[i])\r\n                ybatch.append(ysam[i])\r\n            yield np.array(Xbatch), np.array(ybatch)\r\n    \r\n    # Create and return a generator\r\n    return _gen(batch_size)<\/pre>\n<p>This is a\u00a0<strong>generator function<\/strong>\u00a0that creates batches of samples from the input numpy arrays\u00a0<code>X<\/code>\u00a0and\u00a0<code>y<\/code>. Such generator is acceptable by Keras models in their training. However, for reasons such as cross validation, we do not want to sample from the entire input arrays\u00a0<code>X<\/code>\u00a0and\u00a0<code>y<\/code>\u00a0but a\u00a0<strong>fixed<\/strong>\u00a0subset of rows from them. The way we do it is to randomly select a portion of rows at the beginning of the\u00a0<code>datagen()<\/code>\u00a0function and keep them in\u00a0<code>Xsam<\/code>,\u00a0<code>ysam<\/code>. Then in the inner function\u00a0<code>_gen()<\/code>, rows are sampled from\u00a0<code>Xsam<\/code>\u00a0and\u00a0<code>ysam<\/code>\u00a0until a batch is created. While the lists\u00a0<code>Xbatch<\/code>\u00a0and\u00a0<code>ybatch<\/code>\u00a0are defined and created inside function\u00a0<code>_gen()<\/code>, the arrays\u00a0<code>Xsam<\/code>\u00a0and\u00a0<code>ysam<\/code>\u00a0are not local to\u00a0<code>_gen()<\/code>. 
What\u2019s more interesting is when the generator is created:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">X = np.random.random((100,3))\r\ny = np.random.random(100)\r\n\r\ngen1 = datagen(X, y, 3)\r\ngen2 = datagen(X, y, 4)\r\nprint(next(gen1))\r\nprint(next(gen2))<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(array([[0.89702235, 0.97516228, 0.08893787],\r\n       [0.26395301, 0.37674529, 0.1439478 ],\r\n       [0.24859104, 0.17448628, 0.41182877]]), array([0.2821138 , 0.87590954, 0.96646776]))\r\n(array([[0.62199772, 0.01442743, 0.4897467 ],\r\n       [0.41129379, 0.24600387, 0.53640666],\r\n       [0.02417213, 0.27637708, 0.65571031],\r\n       [0.15107433, 0.11331674, 0.67000849]]), array([0.91559533, 0.84886957, 0.30451455, 0.5144225 ]))<\/pre>\n<p>The function\u00a0<code>datagen()<\/code>\u00a0is called two times and therefore two different sets of\u00a0<code>Xsam<\/code>,\u00a0<code>ysam<\/code>\u00a0are created. But since the inner function\u00a0<code>_gen()<\/code>\u00a0depends on them, these two sets of\u00a0<code>Xsam<\/code>,\u00a0<code>ysam<\/code>\u00a0are in memory concurrently. Technically, we say that when\u00a0<code>datagen()<\/code>\u00a0is called, a\u00a0<strong>closure<\/strong>\u00a0is created with the specific\u00a0<code>Xsam<\/code>,\u00a0<code>ysam<\/code>\u00a0defined within, and the call to\u00a0<code>_gen()<\/code>\u00a0is accessing that closure. In other words, the scopes of the two invocations of\u00a0<code>datagen()<\/code>\u00a0coexist.<\/p>\n<p>In summary, whenever a line of code references a name (whether it is a variable, a function, or a module), the name is resolved in the order given by the LEGB rule:<\/p>\n<ol>\n<li>Local scope first, i.e., names defined in the same function<\/li>\n<li>Enclosing scope, also called the \u201cnonlocal\u201d scope. 
That\u2019s the outer function when we are inside a nested function<\/li>\n<li>Global scope, i.e., names defined at the top level of the same script (but not across different program files)<\/li>\n<li>Built-in scope, i.e., names provided by Python automatically, such as the functions <code>len()<\/code> and <code>list()<\/code>\n<\/li>\n<\/ol>\n<h2 id=\"Investigating-the-type-and-scope\">Investigating the type and scope<\/h2>\n<p>Because types are not static in Python, sometimes we would like to know what we are dealing with, but it is not trivial to tell from the code. One way to tell is using the\u00a0<code>type()<\/code>\u00a0or\u00a0<code>isinstance()<\/code>\u00a0functions. For example:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\n\r\nX = np.random.random((100,3))\r\nprint(type(X))\r\nprint(isinstance(X, np.ndarray))<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&lt;class 'numpy.ndarray'&gt;\r\nTrue<\/pre>\n<p>The\u00a0<code>type()<\/code>\u00a0function returns a type object. The\u00a0<code>isinstance()<\/code>\u00a0function returns a boolean that allows us to check if something matches a particular type. These are useful when we need to know the type of a variable, such as while debugging code. 
For example, if we pass on a pandas dataframe to the\u00a0<code>datagen()<\/code>\u00a0function that we defined above:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import pandas as pd\r\nimport numpy as np\r\n\r\ndef datagen(X, y, batch_size, sampling_rate=0.7):\r\n    \"\"\"A generator to produce samples from input numpy arrays X and y\r\n    \"\"\"\r\n    # Select rows from arrays X and y randomly\r\n    indexing = np.random.random(len(X)) &lt; sampling_rate\r\n    Xsam, ysam = X[indexing], y[indexing]\r\n\r\n    # Actual logic to generate batches\r\n    def _gen(batch_size):\r\n        while True:\r\n            Xbatch, ybatch = [], []\r\n            for _ in range(batch_size):\r\n                i = np.random.randint(len(Xsam))\r\n                Xbatch.append(Xsam[i])\r\n                ybatch.append(ysam[i])\r\n            yield np.array(Xbatch), np.array(ybatch)\r\n    \r\n    # Create and return a generator\r\n    return _gen(batch_size)\r\n\r\nX = pd.DataFrame(np.random.random((100,3)))\r\ny = pd.DataFrame(np.random.random(100))\r\n\r\ngen3 = datagen(X, y, 3)\r\nprint(next(gen3))<\/pre>\n<p>Running the above code under the Python\u2019s debugger\u00a0<code>pdb<\/code>\u00a0will give the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt; \/Users\/MLM\/ducktype.py(1)&lt;module&gt;()\r\n-&gt; import pandas as pd\r\n(Pdb) c\r\nTraceback (most recent call last):\r\n  File \"\/usr\/local\/lib\/python3.9\/site-packages\/pandas\/core\/indexes\/range.py\", line 385, in get_loc\r\n    return self._range.index(new_key)\r\nValueError: 1 is not in range\r\n\r\nThe above exception was the direct cause of the following exception:\r\n\r\nTraceback (most recent call last):\r\n  File \"\/usr\/local\/Cellar\/python@3.9\/3.9.9\/Frameworks\/Python.framework\/Versions\/3.9\/lib\/python3.9\/pdb.py\", line 1723, in main\r\n    pdb._runscript(mainpyfile)\r\n  File 
\"\/usr\/local\/Cellar\/python@3.9\/3.9.9\/Frameworks\/Python.framework\/Versions\/3.9\/lib\/python3.9\/pdb.py\", line 1583, in _runscript\r\n    self.run(statement)\r\n  File \"\/usr\/local\/Cellar\/python@3.9\/3.9.9\/Frameworks\/Python.framework\/Versions\/3.9\/lib\/python3.9\/bdb.py\", line 580, in run\r\n    exec(cmd, globals, locals)\r\n  File \"&lt;string&gt;\", line 1, in &lt;module&gt;\r\n  File \"\/Users\/MLM\/ducktype.py\", line 1, in &lt;module&gt;\r\n    import pandas as pd\r\n  File \"\/Users\/MLM\/ducktype.py\", line 18, in _gen\r\n    ybatch.append(ysam[i])\r\n  File \"\/usr\/local\/lib\/python3.9\/site-packages\/pandas\/core\/frame.py\", line 3458, in __getitem__\r\n    indexer = self.columns.get_loc(key)\r\n  File \"\/usr\/local\/lib\/python3.9\/site-packages\/pandas\/core\/indexes\/range.py\", line 387, in get_loc\r\n    raise KeyError(key) from err\r\nKeyError: 1\r\nUncaught exception. Entering post mortem debugging\r\nRunning 'cont' or 'step' will restart the program\r\n&gt; \/usr\/local\/lib\/python3.9\/site-packages\/pandas\/core\/indexes\/range.py(387)get_loc()\r\n-&gt; raise KeyError(key) from err\r\n(Pdb)<\/pre>\n<p>We see from the traceback that something is wrong because we cannot get\u00a0<code>ysam[i]<\/code>. We can use the following to verify that\u00a0<code>ysam<\/code>\u00a0is indeed a Pandas DataFrame instead of a NumPy array:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(Pdb) up\r\n&gt; \/usr\/local\/lib\/python3.9\/site-packages\/pandas\/core\/frame.py(3458)__getitem__()\r\n-&gt; indexer = self.columns.get_loc(key)\r\n(Pdb) up\r\n&gt; \/Users\/MLM\/ducktype.py(18)_gen()\r\n-&gt; ybatch.append(ysam[i])\r\n(Pdb) type(ysam)\r\n&lt;class 'pandas.core.frame.DataFrame'&gt;<\/pre>\n<p>Therefore we cannot use\u00a0<code>ysam[i]<\/code>\u00a0to select row\u00a0<code>i<\/code>\u00a0from\u00a0<code>ysam<\/code>. Now in the debugger, what can we do to verify how should we modify our code? 
There are several useful functions you can use to investigate the variables and the scope:<\/p>\n<ul>\n<li>\n<code>dir()<\/code>\u00a0to see the names defined in the scope or the attributes defined in an object<\/li>\n<li>\n<code>locals()<\/code>\u00a0and\u00a0<code>globals()<\/code>\u00a0to see the names and values defined locally and globally, respectively.<\/li>\n<\/ul>\n<p>For example, we can use\u00a0<code>dir(ysam)<\/code>\u00a0to see what attributes or functions are defined inside\u00a0<code>ysam<\/code>:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(Pdb) dir(ysam)\r\n['T', '_AXIS_LEN', '_AXIS_ORDERS', '_AXIS_REVERSED', '_AXIS_TO_AXIS_NUMBER', \r\n...\r\n'iat', 'idxmax', 'idxmin', 'iloc', 'index', 'infer_objects', 'info', 'insert',\r\n'interpolate', 'isin', 'isna', 'isnull', 'items', 'iteritems', 'iterrows',\r\n'itertuples', 'join', 'keys', 'kurt', 'kurtosis', 'last', 'last_valid_index',\r\n...\r\n'transform', 'transpose', 'truediv', 'truncate', 'tz_convert', 'tz_localize',\r\n'unstack', 'update', 'value_counts', 'values', 'var', 'where', 'xs']\r\n(Pdb)<\/pre>\n<p>Some of these are attributes, such as\u00a0<code>shape<\/code>, and some of these are functions, such as\u00a0<code>describe()<\/code>. You can read the attribute or invoke the function in\u00a0<code>pdb<\/code>. 
By carefully reading this output, we recall that the way to read row\u00a0<code>i<\/code>\u00a0from a DataFrame is through\u00a0<code>iloc<\/code>,\u00a0and hence we can verify the syntax with:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(Pdb) ysam.iloc[i]\r\n0    0.83794\r\nName: 2, dtype: float64\r\n(Pdb)<\/pre>\n<p>If we call\u00a0<code>dir()<\/code>\u00a0without any argument, it gives us all the names defined in the current scope, e.g.,<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(Pdb) dir()\r\n['Xbatch', 'Xsam', '_', 'batch_size', 'i', 'ybatch', 'ysam']\r\n(Pdb) up\r\n&gt; \/Users\/MLM\/ducktype.py(1)&lt;module&gt;()\r\n-&gt; import pandas as pd\r\n(Pdb) dir()\r\n['X', '__builtins__', '__file__', '__name__', 'datagen', 'gen3', 'np', 'pd', 'y']\r\n(Pdb)<\/pre>\n<p>Note that the scope changes as you move around the call stack. Similar to\u00a0<code>dir()<\/code>\u00a0without argument, we can call\u00a0<code>locals()<\/code>\u00a0to show all locally defined variables, e.g.,<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">(Pdb) locals()\r\n{'batch_size': 3, 'Xbatch': ...,\r\n 'ybatch': ..., '_': 0, 'i': 1, 'Xsam': ...,\r\n 'ysam': ...}\r\n(Pdb)<\/pre>\n<p>Indeed,\u00a0<code>locals()<\/code>\u00a0returns a\u00a0<code>dict<\/code>\u00a0that allows you to see all the names and values. Therefore, if we need to read the variable\u00a0<code>Xbatch<\/code>, we can get the same with\u00a0<code>locals()[\"Xbatch\"]<\/code>. Similarly, we can use\u00a0<code>globals()<\/code>\u00a0to get a dictionary of names defined in the global scope.<\/p>\n<p>This technique is sometimes useful outside the debugger as well. For example, we can check if a Keras model is \u201ccompiled\u201d or not by using\u00a0<code>dir(model)<\/code>. In Keras, compiling a model sets up the loss function for training and builds the flow for forward and backward propagation. 
Therefore, a compiled model will have an extra attribute\u00a0<code>loss<\/code>\u00a0defined:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">from tensorflow.keras import Sequential\r\nfrom tensorflow.keras.layers import Dense\r\n\r\nmodel = Sequential([\r\n    Dense(5, input_shape=(3,)),\r\n    Dense(1)\r\n])\r\n\r\nhas_loss = \"loss\" in dir(model)\r\nprint(\"Before compile, loss function defined:\", has_loss)\r\n\r\nmodel.compile()\r\nhas_loss = \"loss\" in dir(model)\r\nprint(\"After compile, loss function defined:\", has_loss)<\/pre>\n<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Before compile, loss function defined: False\r\nAfter compile, loss function defined: True<\/pre>\n<p>This allows us to put an extra guard in our code before we run into an error.<\/p>\n<h2 id=\"Further-reading\">Further reading<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h4 id=\"Articles\">Articles<\/h4>\n<ul>\n<li>Duck typing,\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Duck_typing\" target=\"_blank\" rel=\"noopener\">https:\/\/en.wikipedia.org\/wiki\/Duck_typing<\/a>\n<\/li>\n<li>Python Glossary (Duck-typing),\u00a0<a href=\"https:\/\/docs.python.org\/3\/glossary.html#term-duck-typing\" target=\"_blank\" rel=\"noopener\">https:\/\/docs.python.org\/3\/glossary.html#term-duck-typing<\/a>\n<\/li>\n<li>Python built-in functions,\u00a0<a href=\"https:\/\/docs.python.org\/3\/library\/functions.html\" target=\"_blank\" rel=\"noopener\">https:\/\/docs.python.org\/3\/library\/functions.html<\/a>\n<\/li>\n<\/ul>\n<h4 id=\"Books\">Books<\/h4>\n<ul>\n<li>Fluent Python, second edition, by Luciano Ramalho,\u00a0<a href=\"https:\/\/www.amazon.com\/dp\/1492056359\/\" target=\"_blank\" rel=\"noopener\">https:\/\/www.amazon.com\/dp\/1492056359\/<\/a>\n<\/li>\n<\/ul>\n<h2 id=\"Summary\">Summary<\/h2>\n<p>In this tutorial, you\u2019ve seen how Python organizes the naming scopes and how variables interact with the code. 
Specifically, you learned:<\/p>\n<ul>\n<li>Python code uses variables through their interfaces, therefore a variable\u2019s data type is usually unimportant<\/li>\n<li>Python variables are defined in their naming scope or closure, so variables of the same name can coexist in different scopes without interfering with each other<\/li>\n<li>Python provides some built-in functions that allow us to examine the names defined in the current scope or the data type of a variable<\/li>\n<\/ul>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/duck-typing-python\/\">Duck-typing, scope, and investigative functions in Python<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/duck-typing-python\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Adrian Tam Python is a duck typing language. It means the data types of variables can change as long as the syntax is compatible. 
[&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/02\/17\/duck-typing-scope-and-investigative-functions-in-python\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":5424,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5423"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5423"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5423\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/5424"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5423"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5423"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5423"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}