{"id":5660,"date":"2022-05-31T06:29:01","date_gmt":"2022-05-31T06:29:01","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2022\/05\/31\/developing-a-python-program-using-inspection-tools\/"},"modified":"2022-05-31T06:29:01","modified_gmt":"2022-05-31T06:29:01","slug":"developing-a-python-program-using-inspection-tools","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2022\/05\/31\/developing-a-python-program-using-inspection-tools\/","title":{"rendered":"Developing a Python Program Using Inspection Tools"},"content":{"rendered":"<p>Author: Adrian Tam<\/p>\n<div>\n<p>Python is an interpreted language: an interpreter runs our program rather than the code being compiled and run natively. In Python, a REPL (read-eval-print loop) can run commands line by line. Together with the inspection tools Python provides, it helps in developing code.<\/p>\n<p>In the following, you will see how to make use of the Python interpreter to inspect an object and develop a program.<\/p>\n<p>After finishing this tutorial, you will learn:<\/p>\n<ul>\n<li>How to work in the Python interpreter<\/li>\n<li>How to use the inspection functions in Python<\/li>\n<li>How to develop a solution step by step with the help of inspection functions<\/li>\n<\/ul>\n<p>Let\u2019s get started!<\/p>\n<div id=\"attachment_13187\" style=\"width: 810px\" class=\"wp-caption aligncenter\">\n<img decoding=\"async\" aria-describedby=\"caption-attachment-13187\" class=\"size-full wp-image-13187\" src=\"https:\/\/machinelearningmastery.com\/wp-content\/uploads\/2022\/05\/tekton-kzlxOJwD6i8-unsplash.jpg\" alt=\"\" width=\"800\"><\/p>\n<p id=\"caption-attachment-13187\" class=\"wp-caption-text\">Developing a Python Program Using Inspection Tools. <br \/>Photo by <a href=\"https:\/\/unsplash.com\/photos\/kzlxOJwD6i8\">Tekton<\/a>. 
Some rights reserved.<\/p>\n<\/div>\n<h2>Tutorial Overview<\/h2>\n<p>This tutorial is in four parts; they are:<\/p>\n<ul>\n<li>PyTorch and TensorFlow<\/li>\n<li>Looking for Clues<\/li>\n<li>Learning from the Weights<\/li>\n<li>Making a Copier<\/li>\n<\/ul>\n<h2>PyTorch and TensorFlow<\/h2>\n<p>PyTorch and TensorFlow are the two biggest neural network libraries in Python. Their code is different, but the things they can do are similar.<\/p>\n<p>Consider the classic MNIST handwritten digit recognition problem; you can build a LeNet-5 model to classify the digits as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\nimport torch\r\nimport torch.nn as nn\r\nimport torch.optim as optim\r\nimport torchvision\r\n\r\n# Load MNIST training data\r\ntransform = torchvision.transforms.Compose([\r\n    torchvision.transforms.ToTensor()\r\n])\r\ntrain = torchvision.datasets.MNIST('.\/datafiles\/', train=True, download=True, transform=transform)\r\ntrain_loader = torch.utils.data.DataLoader(train, batch_size=32, shuffle=True)\r\n\r\n# LeNet5 model\r\ntorch_model = nn.Sequential(\r\n    nn.Conv2d(1, 6, kernel_size=(5,5), stride=1, padding=2),\r\n    nn.Tanh(),\r\n    nn.AvgPool2d(kernel_size=2, stride=2),\r\n    nn.Conv2d(6, 16, kernel_size=5, stride=1, padding=0),\r\n    nn.Tanh(),\r\n    nn.AvgPool2d(kernel_size=2, stride=2),\r\n    nn.Conv2d(16, 120, kernel_size=5, stride=1, padding=0),\r\n    nn.Tanh(),\r\n    nn.Flatten(),\r\n    nn.Linear(120, 84),\r\n    nn.Tanh(),\r\n    nn.Linear(84, 10),\r\n    nn.Softmax(dim=1)\r\n)\r\n\r\n# Training loop\r\ndef training_loop(model, optimizer, loss_fn, train_loader, n_epochs=100):\r\n    model.train()\r\n    for epoch in range(n_epochs):\r\n        for data, target in train_loader:\r\n            output = model(data)\r\n            loss = loss_fn(output, target)\r\n            optimizer.zero_grad()\r\n            loss.backward()\r\n            optimizer.step()\r\n    model.eval()\r\n\r\n# Run 
training\r\noptimizer = optim.Adam(torch_model.parameters())\r\nloss_fn = nn.CrossEntropyLoss()\r\ntraining_loop(torch_model, optimizer, loss_fn, train_loader, n_epochs=20)\r\n\r\n# Save model\r\ntorch.save(torch_model, \"lenet5.pt\")<\/pre>\n<p>This is a simplified code that does not need any validation or testing. The counterpart in TensorFlow is the following:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\nimport tensorflow as tf\r\nfrom tensorflow.keras.models import Sequential\r\nfrom tensorflow.keras.layers import Conv2D, Dense, AveragePooling2D, Flatten\r\nfrom tensorflow.keras.datasets import mnist\r\n\r\n# LeNet5 model\r\nkeras_model = Sequential([\r\n    Conv2D(6, (5,5), input_shape=(28,28,1), padding=\"same\", activation=\"tanh\"),\r\n    AveragePooling2D((2,2), strides=2),\r\n    Conv2D(16, (5,5), activation=\"tanh\"),\r\n    AveragePooling2D((2,2), strides=2),\r\n    Conv2D(120, (5,5), activation=\"tanh\"),\r\n    Flatten(),\r\n    Dense(84, activation=\"tanh\"),\r\n    Dense(10, activation=\"softmax\")\r\n])\r\n\r\n# Reshape data to shape of (n_sample, height, width, n_channel)\r\n(X_train, y_train), (X_test, y_test) = mnist.load_data()\r\nX_train = np.expand_dims(X_train, axis=3).astype('float32')\r\n\r\n# Train\r\nkeras_model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"adam\", metrics=[\"accuracy\"])\r\nkeras_model.fit(X_train, y_train, epochs=20, batch_size=32)\r\n\r\n# Save\r\nkeras_model.save(\"lenet5.h5\")<\/pre>\n<p>Running this program would give you the file <code>lenet5.pt<\/code> from the PyTorch code and <code>lenet5.h5<\/code> from the TensorFlow code.<\/p>\n<h2>Looking for Clues<\/h2>\n<p>If you understand what the above neural networks are doing, you should be able to tell that there is nothing but many multiply and add calculations in each layer. 
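The multiply-and-add view of a fully-connected layer can be sketched in plain NumPy (an illustration only; the array sizes here are made up and not taken from the models above):

```python
import numpy as np

# A fully-connected layer is nothing but a matrix multiplication plus a bias
rng = np.random.default_rng(0)
x = rng.standard_normal(120)        # input vector: 120 features (illustrative size)
W = rng.standard_normal((84, 120))  # kernel, stored as (output x input)
b = rng.standard_normal(84)         # one bias per output unit

y = W @ x + b                       # the multiply and add
print(y.shape)                      # (84,)
```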
Mathematically, there is a matrix multiplication between the input and the <strong>kernel<\/strong> of each fully-connected layer before adding the <strong>bias<\/strong> to the result. In the convolutional layers, the kernel is multiplied element-wise with a portion of the input matrix; the results are then summed, and the bias is added to give one output element of the feature map.<\/p>\n<p>Since you developed the same LeNet-5 model in two different frameworks, it should be possible to make them behave identically if their weights are the same. So how can you copy the weights from one model to the other, given that their architectures are identical?<\/p>\n<p>You can load the saved models as follows:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import torch\r\nimport tensorflow as tf\r\ntorch_model = torch.load(\"lenet5.pt\")\r\nkeras_model = tf.keras.models.load_model(\"lenet5.h5\")<\/pre>\n<p>This probably does not tell you much. But if you run <code>python<\/code> in the command line without any parameters, you launch the REPL, in which you can type in the above code (you can leave the REPL with <code>quit()<\/code>):<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">Python 3.9.13 (main, May 19 2022, 13:48:47)\r\n[Clang 13.1.6 (clang-1316.0.21.2)] on darwin\r\nType \"help\", \"copyright\", \"credits\" or \"license\" for more information.\r\n&gt;&gt;&gt; import torch\r\n&gt;&gt;&gt; import tensorflow as tf\r\n&gt;&gt;&gt; torch_model = torch.load(\"lenet5.pt\")\r\n&gt;&gt;&gt; keras_model = tf.keras.models.load_model(\"lenet5.h5\")<\/pre>\n<p>Nothing is printed by the above. 
But you can check the two models that were loaded using the <code>type()<\/code> built-in function:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; type(torch_model)\r\n&lt;class 'torch.nn.modules.container.Sequential'&gt;\r\n&gt;&gt;&gt; type(keras_model)\r\n&lt;class 'keras.engine.sequential.Sequential'&gt;<\/pre>\n<p>So here you know they are neural network models from PyTorch and Keras, respectively. Since they are trained models, their weights must be stored inside. So how can you find the weights in these models? Since they are objects, the easiest way is to use the <code>dir()<\/code> built-in function to inspect their members:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; dir(torch_model)\r\n['T_destination', '__annotations__', '__call__', '__class__', '__delattr__', \r\n'__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', \r\n...\r\n'_slow_forward', '_state_dict_hooks', '_version', 'add_module', 'append', 'apply', \r\n'bfloat16', 'buffers', 'children', 'cpu', 'cuda', 'double', 'dump_patches', 'eval', \r\n'extra_repr', 'float', 'forward', 'get_buffer', 'get_extra_state', 'get_parameter', \r\n'get_submodule', 'half', 'load_state_dict', 'modules', 'named_buffers', \r\n'named_children', 'named_modules', 'named_parameters', 'parameters', \r\n'register_backward_hook', 'register_buffer', 'register_forward_hook', \r\n'register_forward_pre_hook', 'register_full_backward_hook', 'register_module', \r\n'register_parameter', 'requires_grad_', 'set_extra_state', 'share_memory', 'state_dict',\r\n'to', 'to_empty', 'train', 'training', 'type', 'xpu', 'zero_grad']\r\n&gt;&gt;&gt; dir(keras_model)\r\n['_SCALAR_UPRANKING_ON', '_TF_MODULE_IGNORED_PROPERTIES', '__call__', '__class__', \r\n'__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', \r\n...\r\n'activity_regularizer', 'add', 'add_loss', 'add_metric', 'add_update', 'add_variable', \r\n'add_weight', 'build', 'built', 
'call', 'compile', 'compiled_loss', 'compiled_metrics', \r\n'compute_dtype', 'compute_loss', 'compute_mask', 'compute_metrics', \r\n'compute_output_shape', 'compute_output_signature', 'count_params', \r\n'distribute_strategy', 'dtype', 'dtype_policy', 'dynamic', 'evaluate', \r\n'evaluate_generator', 'finalize_state', 'fit', 'fit_generator', 'from_config', \r\n'get_config', 'get_input_at', 'get_input_mask_at', 'get_input_shape_at', 'get_layer', \r\n'get_output_at', 'get_output_mask_at', 'get_output_shape_at', 'get_weights', 'history', \r\n'inbound_nodes', 'input', 'input_mask', 'input_names', 'input_shape', 'input_spec', \r\n'inputs', 'layers', 'load_weights', 'loss', 'losses', 'make_predict_function', \r\n'make_test_function', 'make_train_function', 'metrics', 'metrics_names', 'name', \r\n'name_scope', 'non_trainable_variables', 'non_trainable_weights', 'optimizer', \r\n'outbound_nodes', 'output', 'output_mask', 'output_names', 'output_shape', 'outputs', \r\n'pop', 'predict', 'predict_function', 'predict_generator', 'predict_on_batch', \r\n'predict_step', 'reset_metrics', 'reset_states', 'run_eagerly', 'save', 'save_spec', \r\n'save_weights', 'set_weights', 'state_updates', 'stateful', 'stop_training', \r\n'submodules', 'summary', 'supports_masking', 'test_function', 'test_on_batch', \r\n'test_step', 'to_json', 'to_yaml', 'train_function', 'train_on_batch', 'train_step', \r\n'train_tf_function', 'trainable', 'trainable_variables', 'trainable_weights', 'updates',\r\n'variable_dtype', 'variables', 'weights', 'with_name_scope']<\/pre>\n<p>There are a lot of members in each object. Some are attributes, and some are methods of the class. By convention, those that begin with an underscore are internal members that you are not supposed to access in normal circumstances. 
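To cut the noise, you can filter out the underscore-prefixed internal members from the <code>dir()<\/code> output — a stdlib-only sketch, using a made-up sample class rather than the actual models:

```python
class Sample:
    """A made-up stand-in for a model object."""
    def __init__(self):
        self.weights = [1, 2, 3]
        self._cache = None          # underscore prefix: internal by convention

    def get_weights(self):
        return self.weights

# Drop the underscore-prefixed members to see only the public interface
public = [n for n in dir(Sample()) if not n.startswith("_")]
print(public)                       # ['get_weights', 'weights']
```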
If you want to see more of each member, you can use the <code>getmembers()<\/code> function from the <code>inspect<\/code> module:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; import inspect\r\n&gt;&gt;&gt; inspect.getmembers(torch_model)\r\n[('T_destination', ~T_destination), ('__annotations__', {'_modules': typing.Dict[str, \r\ntorch.nn.modules.module.Module]}), ('__call__', &lt;bound method Module._call_impl of \r\nSequential(\r\n...<\/pre>\n<p>The output of the <code>getmembers()<\/code> function is a list of tuples, in which each tuple is the name of the member and the member itself. From the above, for example, you know that <code>__call__<\/code> is a \u201cbound method,\u201d i.e., a member method of a class.<\/p>\n<p>By carefully looking at the members\u2019 names, you can see that in the PyTorch model, the members mentioning \u201cstate\u201d should be of interest, while in the Keras model, there are members with \u201cweights\u201d in their names. To shortlist these names, you can do the following in the interpreter:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; [n for n in dir(torch_model) if 'state' in n]\r\n['__setstate__', '_load_from_state_dict', '_load_state_dict_pre_hooks', \r\n'_register_load_state_dict_pre_hook', '_register_state_dict_hook', \r\n'_save_to_state_dict', '_state_dict_hooks', 'get_extra_state', 'load_state_dict', \r\n'set_extra_state', 'state_dict']\r\n&gt;&gt;&gt; [n for n in dir(keras_model) if 'weight' in n]\r\n['_assert_weights_created', '_captured_weight_regularizer', \r\n'_check_sample_weight_warning', '_dedup_weights', '_handle_weight_regularization', \r\n'_initial_weights', '_non_trainable_weights', '_trainable_weights', \r\n'_undeduplicated_weights', 'add_weight', 'get_weights', 'load_weights', \r\n'non_trainable_weights', 'save_weights', 'set_weights', 'trainable_weights', 'weights']<\/pre>\n<p>This might take some trial and error. 
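The keyword search above can be wrapped in a small helper to reduce the trial and error — a hypothetical convenience function, not part of either framework, demonstrated here on built-in types instead of the models:

```python
def search_members(obj, keyword):
    """Return the member names of obj that contain keyword, case-insensitively."""
    return [n for n in dir(obj) if keyword.lower() in n.lower()]

# Trying it on built-in types
print(search_members(list, "sort"))  # ['sort']
print(search_members(dict, "key"))   # ['fromkeys', 'keys']
```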
But it\u2019s not too difficult, and you may discover that you can see the weight with <code>state_dict<\/code> in the torch model:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; torch_model.state_dict\r\n&lt;bound method Module.state_dict of Sequential(\r\n  (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))\r\n  (1): Tanh()\r\n  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)\r\n  (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))\r\n  (4): Tanh()\r\n  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)\r\n  (6): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))\r\n  (7): Tanh()\r\n  (8): Flatten(start_dim=1, end_dim=-1)\r\n  (9): Linear(in_features=120, out_features=84, bias=True)\r\n  (10): Tanh()\r\n  (11): Linear(in_features=84, out_features=10, bias=True)\r\n  (12): Softmax(dim=1)\r\n)&gt;\r\n&gt;&gt;&gt; torch_model.state_dict()\r\nOrderedDict([('0.weight', tensor([[[[ 0.1559,  0.1681,  0.2726,  0.3187,  0.4909],\r\n          [ 0.1179,  0.1340, -0.0815, -0.3253,  0.0904],\r\n          [ 0.2326, -0.2079, -0.8614, -0.8643, -0.0632],\r\n          [ 0.3874, -0.3490, -0.7957, -0.5873, -0.0638],\r\n          [ 0.2800,  0.0947,  0.0308,  0.4065,  0.6916]]],\r\n\r\n\r\n        [[[ 0.5116,  0.1798, -0.1062, -0.4099, -0.3307],\r\n          [ 0.1090,  0.0689, -0.1010, -0.9136, -0.5271],\r\n          [ 0.2910,  0.2096, -0.2442, -1.5576, -0.0305],\r\n...<\/pre>\n<p>For the TensorFlow\/Keras model, you can find the weights with <code>get_weights()<\/code>:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; keras_model.get_weights\r\n&lt;bound method Model.get_weights of &lt;keras.engine.sequential.Sequential object at 0x159d93eb0&gt;&gt;\r\n&gt;&gt;&gt; keras_model.get_weights()\r\n[array([[[[ 0.14078194,  0.04990018, -0.06204645, -0.03128023,\r\n          -0.22033708,  0.19721672]],\r\n\r\n        [[-0.06618818, -0.152075  ,  0.13130261,  0.22893831,\r\n           0.08880515,  0.01917628]],\r\n\r\n      
  [[-0.28716782, -0.23207009,  0.00505603,  0.2697424 ,\r\n          -0.1916888 , -0.25858143]],\r\n\r\n        [[-0.41863152, -0.20710683,  0.13254236,  0.18774481,\r\n          -0.14866787, -0.14398652]],\r\n\r\n        [[-0.25119543, -0.14405733, -0.048533  , -0.12108403,\r\n           0.06704573, -0.1196835 ]]],\r\n\r\n\r\n       [[[-0.2438466 ,  0.02499897, -0.1243961 , -0.20115352,\r\n          -0.0241346 ,  0.15888865]],\r\n\r\n        [[-0.20548582, -0.26495507,  0.21004884,  0.32183227,\r\n          -0.13990627, -0.02996112]],\r\n...<\/pre>\n<p>The same values are also available through the <code>weights<\/code> attribute:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; keras_model.weights\r\n[&lt;tf.Variable 'conv2d\/kernel:0' shape=(5, 5, 1, 6) dtype=float32, numpy=\r\narray([[[[ 0.14078194,  0.04990018, -0.06204645, -0.03128023,\r\n          -0.22033708,  0.19721672]],\r\n\r\n        [[-0.06618818, -0.152075  ,  0.13130261,  0.22893831,\r\n           0.08880515,  0.01917628]],\r\n...\r\n         8.25365111e-02, -1.72486171e-01,  3.16280037e-01,\r\n         4.12595004e-01]], dtype=float32)&gt;, &lt;tf.Variable 'dense_1\/bias:0' shape=(10,) dtype=float32, numpy=\r\narray([-0.19007775,  0.14427921,  0.0571407 , -0.24149619, -0.03247226,\r\n        0.18109408, -0.17159976,  0.21736498, -0.10254183,  0.02417901],\r\n      dtype=float32)&gt;]<\/pre>\n<p>Here, you can observe the following: In the PyTorch model, the function <code>state_dict()<\/code> gives an <code>OrderedDict<\/code>, which is a dictionary whose keys follow a specified order. There are keys such as <code>0.weight<\/code>, and they are mapped to a tensor value. In the Keras model, the <code>get_weights()<\/code> function returns a list. Each element in the list is a NumPy array. 
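The difference between the two containers — an ordered mapping of named tensors versus a flat list of arrays — can be made concrete with stdlib and NumPy stand-ins (these are mock structures for illustration, not the real framework objects):

```python
from collections import OrderedDict
import numpy as np

# Stand-in for a PyTorch state_dict: an ordered mapping from names to arrays
state_like = OrderedDict([
    ("0.weight", np.zeros((6, 1, 5, 5))),
    ("0.bias", np.zeros(6)),
])

# Stand-in for a Keras get_weights() result: a flat list of arrays
weights_like = [np.zeros((5, 5, 1, 6)), np.zeros(6)]

# The mapping keeps names and order; the list keeps order only
print(list(state_like.keys()))      # ['0.weight', '0.bias']
print(len(weights_like))            # 2
```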
The <code>weights<\/code> attribute also holds a list, but each element is a <code>tf.Variable<\/code>.<\/p>\n<p>You can learn more by checking the shape of each tensor or array:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; [(key, val.shape) for key, val in torch_model.state_dict().items()]\r\n[('0.weight', torch.Size([6, 1, 5, 5])), ('0.bias', torch.Size([6])), ('3.weight', \r\ntorch.Size([16, 6, 5, 5])), ('3.bias', torch.Size([16])), ('6.weight', torch.Size([120,\r\n16, 5, 5])), ('6.bias', torch.Size([120])), ('9.weight', torch.Size([84, 120])), \r\n('9.bias', torch.Size([84])), ('11.weight', torch.Size([10, 84])), ('11.bias', \r\ntorch.Size([10]))]\r\n&gt;&gt;&gt; [arr.shape for arr in keras_model.get_weights()]\r\n[(5, 5, 1, 6), (6,), (5, 5, 6, 16), (16,), (5, 5, 16, 120), (120,), (120, 84), (84,), \r\n(84, 10), (10,)]<\/pre>\n<p>While you do not see the names of the layers in the Keras model above, you can use similar reasoning to find the layers and get their names:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; keras_model.layers\r\n[&lt;keras.layers.convolutional.conv2d.Conv2D object at 0x159ddd850&gt;, \r\n&lt;keras.layers.pooling.average_pooling2d.AveragePooling2D object at 0x159ddd820&gt;, \r\n&lt;keras.layers.convolutional.conv2d.Conv2D object at 0x15a12b1c0&gt;, \r\n&lt;keras.layers.pooling.average_pooling2d.AveragePooling2D object at 0x15a1705e0&gt;, \r\n&lt;keras.layers.convolutional.conv2d.Conv2D object at 0x15a1812b0&gt;, \r\n&lt;keras.layers.reshaping.flatten.Flatten object at 0x15a194310&gt;, \r\n&lt;keras.layers.core.dense.Dense object at 0x15a1947c0&gt;, &lt;keras.layers.core.dense.Dense \r\nobject at 0x15a194910&gt;]\r\n&gt;&gt;&gt; [layer.name for layer in keras_model.layers]\r\n['conv2d', 'average_pooling2d', 'conv2d_1', 'average_pooling2d_1', 'conv2d_2', \r\n'flatten', 'dense', 'dense_1']\r\n&gt;&gt;&gt;<\/pre>\n<h2>Learning from the Weights<\/h2>\n<p>By comparing the 
result of <code>state_dict()<\/code> from the PyTorch model and that of <code>get_weights()<\/code> from the Keras model, you can see that they both contain 10 elements. From the shape of the PyTorch tensors and NumPy arrays, you can further notice that they are in similar shapes. This is probably because both frameworks recognize a model in the order from input to output. You can further confirm that from the key of the <code>state_dict()<\/code> output compared to the layer names from the Keras model.<\/p>\n<p>You can check how you can manipulate a PyTorch tensor by extracting one and inspecting:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; torch_states = torch_model.state_dict()\r\n&gt;&gt;&gt; torch_states.keys()\r\nodict_keys(['0.weight', '0.bias', '3.weight', '3.bias', '6.weight', '6.bias', '9.weight', '9.bias', '11.weight', '11.bias'])\r\n&gt;&gt;&gt; torch_states[\"0.weight\"]\r\ntensor([[[[ 0.1559,  0.1681,  0.2726,  0.3187,  0.4909],\r\n          [ 0.1179,  0.1340, -0.0815, -0.3253,  0.0904],\r\n          [ 0.2326, -0.2079, -0.8614, -0.8643, -0.0632],\r\n          [ 0.3874, -0.3490, -0.7957, -0.5873, -0.0638],\r\n          [ 0.2800,  0.0947,  0.0308,  0.4065,  0.6916]]],\r\n...\r\n        [[[ 0.0980,  0.0240,  0.3295,  0.4507,  0.4539],\r\n          [-0.1530, -0.3991, -0.3834, -0.2716,  0.0809],\r\n          [-0.4639, -0.5537, -1.0207, -0.8049, -0.4977],\r\n          [ 0.1825, -0.1284, -0.0669, -0.4652, -0.2961],\r\n          [ 0.3402,  0.4256,  0.4329,  0.1503,  0.4207]]]])\r\n&gt;&gt;&gt; dir(torch_states[\"0.weight\"])\r\n['H', 'T', '__abs__', '__add__', '__and__', '__array__', '__array_priority__', \r\n'__array_wrap__', '__bool__', '__class__', '__complex__', '__contains__', \r\n...\r\n'trunc', 'trunc_', 'type', 'type_as', 'unbind', 'unflatten', 'unfold', 'uniform_', \r\n'unique', 'unique_consecutive', 'unsafe_chunk', 'unsafe_split', \r\n'unsafe_split_with_sizes', 'unsqueeze', 'unsqueeze_', 'values', 'var', 'vdot', 'view', 
\r\n'view_as', 'vsplit', 'where', 'xlogy', 'xlogy_', 'xpu', 'zero_']\r\n&gt;&gt;&gt; torch_states[\"0.weight\"].numpy()\r\narray([[[[ 0.15587455,  0.16805592,  0.27259687,  0.31871665,\r\n           0.49091515],\r\n         [ 0.11791296,  0.13400094, -0.08148099, -0.32530317,\r\n           0.09039831],\r\n...\r\n         [ 0.18252987, -0.12838107, -0.0669101 , -0.4652463 ,\r\n          -0.2960882 ],\r\n         [ 0.34022188,  0.4256311 ,  0.4328527 ,  0.15025541,\r\n           0.4207182 ]]]], dtype=float32)\r\n&gt;&gt;&gt; torch_states[\"0.weight\"].shape\r\ntorch.Size([6, 1, 5, 5])\r\n&gt;&gt;&gt; torch_states[\"0.weight\"].numpy().shape\r\n(6, 1, 5, 5)<\/pre>\n<p>From the output of <code>dir()<\/code> on a PyTorch tensor, you found a member named <code>numpy<\/code>, and calling that method appears to convert the tensor into a NumPy array. You can be quite confident about that because the numbers and the shape both match. In fact, you can be more confident by looking at the documentation:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; help(torch_states[\"0.weight\"].numpy)<\/pre>\n<p>The <code>help()<\/code> function will show you the docstring of a function, which usually is its documentation.<\/p>\n<p>Since this is the kernel of the first convolution layer, by comparing the shape of this kernel to that of the Keras model, you can note their shapes are different:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; keras_weights = keras_model.get_weights()\r\n&gt;&gt;&gt; keras_weights[0].shape\r\n(5, 5, 1, 6)<\/pre>\n<p>Note that the input to the first layer is a 28\u00d728\u00d71 image array, while the output consists of 6 feature maps. It is natural to match the 1 and 6 in the kernel shape to the number of channels in the input and output. 
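Although the two kernel shapes differ, they hold the same number of elements — a quick NumPy check that supports the reordered-axes interpretation (shapes copied from the session above; the check itself is an illustration):

```python
import numpy as np

# Keras reports (5, 5, 1, 6) while PyTorch reports (6, 1, 5, 5):
# the same four axes in a different order, hence the same element count
keras_shape = (5, 5, 1, 6)
torch_shape = (6, 1, 5, 5)
print(np.prod(keras_shape), np.prod(torch_shape))  # 150 150
assert sorted(keras_shape) == sorted(torch_shape)
```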
Also, from our understanding of the mechanism of a convolutional layer, the kernel should be a 5\u00d75 matrix.<\/p>\n<p>At this point, you probably guessed that in the PyTorch convolutional layer, the kernel is represented as (output \u00d7 input \u00d7 height \u00d7 width), while in Keras, it is represented as (height \u00d7 width \u00d7 input \u00d7 output).<\/p>\n<p>Similarly, you also see in the fully-connected layers that PyTorch presents the kernel as (output \u00d7 input) while Keras is in (input \u00d7 output):<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; keras_weights[6].shape\r\n(120, 84)\r\n&gt;&gt;&gt; list(torch_states.values())[6].shape\r\ntorch.Size([84, 120])<\/pre>\n<p>Matching the weights and tensors and showing their shapes side by side should make these clearer:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; for k,t in zip(keras_weights, torch_states.values()):\r\n...     print(f\"Keras: {k.shape}, Torch: {t.shape}\")\r\n...\r\nKeras: (5, 5, 1, 6), Torch: torch.Size([6, 1, 5, 5])\r\nKeras: (6,), Torch: torch.Size([6])\r\nKeras: (5, 5, 6, 16), Torch: torch.Size([16, 6, 5, 5])\r\nKeras: (16,), Torch: torch.Size([16])\r\nKeras: (5, 5, 16, 120), Torch: torch.Size([120, 16, 5, 5])\r\nKeras: (120,), Torch: torch.Size([120])\r\nKeras: (120, 84), Torch: torch.Size([84, 120])\r\nKeras: (84,), Torch: torch.Size([84])\r\nKeras: (84, 10), Torch: torch.Size([10, 84])\r\nKeras: (10,), Torch: torch.Size([10])<\/pre>\n<p>And we can also match the name of the Keras weights and PyTorch tensors:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; for k, t in zip(keras_model.weights, torch_states.keys()):\r\n...     
print(f\"Keras: {k.name}, Torch: {t}\")\r\n...\r\nKeras: conv2d\/kernel:0, Torch: 0.weight\r\nKeras: conv2d\/bias:0, Torch: 0.bias\r\nKeras: conv2d_1\/kernel:0, Torch: 3.weight\r\nKeras: conv2d_1\/bias:0, Torch: 3.bias\r\nKeras: conv2d_2\/kernel:0, Torch: 6.weight\r\nKeras: conv2d_2\/bias:0, Torch: 6.bias\r\nKeras: dense\/kernel:0, Torch: 9.weight\r\nKeras: dense\/bias:0, Torch: 9.bias\r\nKeras: dense_1\/kernel:0, Torch: 11.weight\r\nKeras: dense_1\/bias:0, Torch: 11.bias<\/pre>\n<h2>Making a Copier<\/h2>\n<p>Since you learned what the weights look like in each model, it doesn\u2019t seem difficult to create a program to copy weights from one to another. The key is to answer:<\/p>\n<ol>\n<li>How to set the weights in each model<\/li>\n<li>What the weights are supposed to look like (shape and data type) in each model<\/li>\n<\/ol>\n<p>The first question can be answered from the previous inspection using the <code>dir()<\/code> built-in function. You saw the <code>load_state_dict<\/code> member in the PyTorch model, and it seems to be the right tool. Similarly, in the Keras model, you saw a member named <code>set_weights<\/code> that is exactly the counterpart of <code>get_weights<\/code>. 
You can further confirm it is the case by checking their documentation online or via the <code>help()<\/code> function:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">&gt;&gt;&gt; keras_model.set_weights\r\n&lt;bound method Layer.set_weights of &lt;keras.engine.sequential.Sequential object at 0x159d93eb0&gt;&gt;\r\n&gt;&gt;&gt; torch_model.load_state_dict\r\n&lt;bound method Module.load_state_dict of Sequential(\r\n  (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))\r\n  (1): Tanh()\r\n  (2): AvgPool2d(kernel_size=2, stride=2, padding=0)\r\n  (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))\r\n  (4): Tanh()\r\n  (5): AvgPool2d(kernel_size=2, stride=2, padding=0)\r\n  (6): Conv2d(16, 120, kernel_size=(5, 5), stride=(1, 1))\r\n  (7): Tanh()\r\n  (8): Flatten(start_dim=1, end_dim=-1)\r\n  (9): Linear(in_features=120, out_features=84, bias=True)\r\n  (10): Tanh()\r\n  (11): Linear(in_features=84, out_features=10, bias=True)\r\n  (12): Softmax(dim=1)\r\n)&gt;\r\n&gt;&gt;&gt; help(torch_model.load_state_dict)\r\n\r\n&gt;&gt;&gt; help(keras_model.set_weights)<\/pre>\n<p>You confirmed that these are both functions, and their documentation explained they are what you believed them to be. 
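Besides <code>help()<\/code>, the <code>inspect<\/code> module can report a callable's parameters programmatically, which is handy for the same kind of confirmation — a stdlib-only sketch using a made-up function in place of the model methods:

```python
import inspect

def set_weights(weights, strict=True):
    """A made-up function standing in for a model method."""
    return list(weights)

# signature() reports the parameters a callable expects
sig = inspect.signature(set_weights)
print(sig)                          # (weights, strict=True)
print(list(sig.parameters))         # ['weights', 'strict']
```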
From the documentation, you further learned that the <code>load_state_dict()<\/code> function of the PyTorch model expects the argument to be the same format as that returned from the <code>state_dict()<\/code> function; the <code>set_weights()<\/code> function of the Keras model expects the same format as returned from the <code>get_weights()<\/code> function.<\/p>\n<p>Now you have finished your adventure with the Python REPL (you can enter <code>quit()<\/code> to leave).<\/p>\n<p>By researching a bit on how to <strong>reshape<\/strong> the weights and <strong>cast<\/strong> from one data type to another, you come up with the following program:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import torch\r\nimport tensorflow as tf\r\n\r\n# Load the models\r\ntorch_model = torch.load(\"lenet5.pt\")\r\nkeras_model = tf.keras.models.load_model(\"lenet5.h5\")\r\n\r\n# Extract weights from Keras model\r\nkeras_weights = keras_model.get_weights()\r\n\r\n# Transform shape from Keras to PyTorch\r\nfor idx in [0, 2, 4]:\r\n    # conv layers: (out, in, height, width)\r\n    keras_weights[idx] = keras_weights[idx].transpose([3, 2, 0, 1])\r\nfor idx in [6, 8]:\r\n    # dense layers: (out, in)\r\n    keras_weights[idx] = keras_weights[idx].transpose()\r\n\r\n# Set weights\r\ntorch_states = torch_model.state_dict()\r\nfor key, weight in zip(torch_states.keys(), keras_weights):\r\n    torch_states[key] = torch.tensor(weight)\r\ntorch_model.load_state_dict(torch_states)\r\n\r\n# Save new model\r\ntorch.save(torch_model, \"lenet5-keras.pt\")<\/pre>\n<p>And the other way around, copying weights from the PyTorch model to the Keras model can be done similarly,<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import torch\r\nimport tensorflow as tf\r\n\r\n# Load the models\r\ntorch_model = torch.load(\"lenet5.pt\")\r\nkeras_model = tf.keras.models.load_model(\"lenet5.h5\")\r\n\r\n# Extract weights from PyTorch model\r\ntorch_states = 
torch_model.state_dict()\r\nweights = list(torch_states.values())\r\n\r\n# Transform tensor to numpy array\r\nweights = [w.numpy() for w in weights]\r\n\r\n# Transform shape from PyTorch to Keras\r\nfor idx in [0, 2, 4]:\r\n    # conv layers: (height, width, in, out)\r\n    weights[idx] = weights[idx].transpose([2, 3, 1, 0])\r\nfor idx in [6, 8]:\r\n    # dense layers: (in, out)\r\n    weights[idx] = weights[idx].transpose()\r\n\r\n# Set weights\r\nkeras_model.set_weights(weights)\r\n\r\n# Save new model\r\nkeras_model.save(\"lenet5-torch.h5\")<\/pre>\n<p>Then, you can verify that they work the same by passing a random array as input, for which you expect the outputs to agree closely:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">import numpy as np\r\nimport torch\r\nimport tensorflow as tf\r\n\r\n# Load the models\r\ntorch_orig_model = torch.load(\"lenet5.pt\")\r\nkeras_orig_model = tf.keras.models.load_model(\"lenet5.h5\")\r\ntorch_converted_model = torch.load(\"lenet5-keras.pt\")\r\nkeras_converted_model = tf.keras.models.load_model(\"lenet5-torch.h5\")\r\n\r\n# Create a random input\r\nsample = np.random.random((28,28))\r\n\r\n# Convert sample to torch input shape\r\ntorch_sample = torch.Tensor(sample.reshape(1,1,28,28))\r\n\r\n# Convert sample to keras input shape\r\nkeras_sample = sample.reshape(1,28,28,1)\r\n\r\n# Check output\r\nkeras_converted_output = keras_converted_model.predict(keras_sample, verbose=0)\r\nkeras_orig_output = keras_orig_model.predict(keras_sample, verbose=0)\r\ntorch_converted_output = torch_converted_model(torch_sample).detach().numpy()\r\ntorch_orig_output = torch_orig_model(torch_sample).detach().numpy()\r\n\r\nnp.set_printoptions(precision=4)\r\nprint(keras_orig_output)\r\nprint(torch_converted_output)\r\nprint()\r\nprint(torch_orig_output)\r\nprint(keras_converted_output)<\/pre>\n<p>In our case, the output is:<\/p>\n<pre class=\"urvanov-syntax-highlighter-plain-tag\">[[9.8908e-06 2.4246e-07 3.1996e-04 8.2742e-01 1.6853e-10 
1.7212e-01\r\n  3.6018e-10 1.5521e-06 1.3128e-04 2.2083e-06]]\r\n[[9.8908e-06 2.4245e-07 3.1996e-04 8.2742e-01 1.6853e-10 1.7212e-01\r\n  3.6018e-10 1.5521e-06 1.3128e-04 2.2083e-06]]\r\n\r\n[[4.1505e-10 1.9959e-17 1.7399e-08 4.0302e-11 9.5790e-14 3.7395e-12\r\n  1.0634e-10 1.7682e-16 1.0000e+00 8.8126e-10]]\r\n[[4.1506e-10 1.9959e-17 1.7399e-08 4.0302e-11 9.5791e-14 3.7395e-12\r\n  1.0634e-10 1.7682e-16 1.0000e+00 8.8127e-10]]<\/pre>\n<p>The outputs agree with each other to sufficient precision. Note that your result may not be exactly the same due to the random nature of training. Also, due to the nature of floating point calculation, the PyTorch and TensorFlow\/Keras models would not produce exactly the same output even if the weights were the same.<\/p>\n<p>However, the objective here is to show you how you can make use of Python\u2019s inspection tools to understand something you didn\u2019t know and develop a solution.<\/p>\n<h2>Further Readings<\/h2>\n<p>This section provides more resources on the topic if you are looking to go deeper.<\/p>\n<h4>Articles<\/h4>\n<ul>\n<li>\n<a href=\"https:\/\/docs.python.org\/3\/library\/inspect.html\">inspect<\/a> module in Python Standard Libraries<\/li>\n<li>\n<a href=\"https:\/\/docs.python.org\/3\/library\/functions.html#dir\">dir<\/a> built-in function<\/li>\n<li><a href=\"https:\/\/pytorch.org\/tutorials\/recipes\/recipes\/what_is_state_dict.html\">What is a <code>state_dict<\/code> in PyTorch<\/a><\/li>\n<li><a href=\"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/keras\/layers\/Layer#get_weights\">TensorFlow <code>get_weights<\/code> method<\/a><\/li>\n<\/ul>\n<h2>Summary<\/h2>\n<p>In this tutorial, you learned how to work in the Python REPL and use the inspection functions to develop a solution. 
Specifically,<\/p>\n<ul>\n<li>You learned how to use the inspection functions in REPL to learn the internal members of an object<\/li>\n<li>You learned how to use REPL to experiment with Python code<\/li>\n<li>As a result, you developed a program converting between a PyTorch and a Keras model<\/li>\n<\/ul>\n<p>The post <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/developing-a-python-program-using-inspection-tools\/\">Developing a Python Program Using Inspection Tools<\/a> appeared first on <a rel=\"nofollow\" href=\"https:\/\/machinelearningmastery.com\/\">Machine Learning Mastery<\/a>.<\/p>\n<\/div>\n<p><a href=\"https:\/\/machinelearningmastery.com\/developing-a-python-program-using-inspection-tools\/\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Adrian Tam Python is an interpreting language. It means there is an interpreter to run our program, rather than compiling the code and running [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2022\/05\/31\/developing-a-python-program-using-inspection-tools\/\">Read 
More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":5661,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[24],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5660"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=5660"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/5660\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/5661"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=5660"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=5660"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=5660"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}