{"id":4541,"date":"2021-04-04T06:35:28","date_gmt":"2021-04-04T06:35:28","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2021\/04\/04\/distributed-artificial-intelligence-with-intersystems-iris\/"},"modified":"2021-04-04T06:35:28","modified_gmt":"2021-04-04T06:35:28","slug":"distributed-artificial-intelligence-with-intersystems-iris","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2021\/04\/04\/distributed-artificial-intelligence-with-intersystems-iris\/","title":{"rendered":"Distributed Artificial Intelligence with InterSystems IRIS"},"content":{"rendered":"<p>Author: Sergey Lukyanchikov<\/p>\n<div>\n<p><strong><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8753372271?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8753372271?profile=RESIZE_710x\" width=\"720\" class=\"align-full\"><\/a><\/strong><\/p>\n<p><strong>What is Distributed Artificial Intelligence (DAI)?<\/strong><\/p>\n<p>Attempts to find a \u201cbullet-proof\u201d definition have not produced a result: it seems the term is slightly \u201cahead of its time\u201d. Still, we can analyze the term itself semantically and derive that distributed artificial intelligence is the same AI (see <a href=\"https:\/\/habr.com\/en\/company\/intersystems\/blog\/478822\/\">our effort<\/a> to suggest an \u201capplied\u201d definition), only partitioned across several computers that are not clustered together (neither data-wise, nor via applications, nor by providing access to particular computers in principle). That is, ideally, distributed artificial intelligence should be arranged in such a way that none of the computers participating in that \u201cdistribution\u201d has direct access to the data or applications of another computer: the only alternative becomes transmission of data samples and executable scripts via \u201ctransparent\u201d messaging. 
Any deviation from that ideal gives rise to \u201cpartially distributed artificial intelligence\u201d \u2013 an example being distributed data with a central application server, or its inverse. One way or the other, we obtain as a result a set of \u201cfederated\u201d models (i.e., models each trained on their own data sources, or each trained by their own algorithms, or \u201cboth at once\u201d).<\/p>\n<p><strong>Distributed AI scenarios \u201cfor the masses\u201d<\/strong><\/p>\n<p>We will not be discussing edge computations, confidential data operators, scattered mobile searches, or similar scenarios that are fascinating yet, at this moment, not the most deliberately or widely applied. We will be much \u201ccloser to life\u201d if, for instance, we consider the following scenario (its detailed demo can and should be <a href=\"https:\/\/youtu.be\/rRJ8_O4Y3gs?t=906\" target=\"_blank\" rel=\"noopener\">watched here<\/a>): a company runs a production-level AI\/ML solution, and the quality of its functioning is systematically checked by an external data scientist (i.e., an expert who is not an employee of the company). For a number of reasons, the company cannot grant the data scientist access to the solution, but it can send him a sample of records from a required table following a schedule or a particular event (for example, termination of a training session for one or several models by the solution). We also assume that the data scientist owns some version of the AI\/ML mechanisms already integrated in the production-level solution that the company is running \u2013 and it is likely that they are being developed, improved, and adapted to the concrete use cases of that company by the data scientist himself. 
Deployment of those mechanisms into the running solution, monitoring of their functioning, and other lifecycle aspects are handled by a data engineer (a company employee).<\/p>\n<p>An example of deployment of a production-level AI\/ML solution on the InterSystems IRIS platform that works autonomously with a flow of data coming from equipment was provided by us in <a href=\"https:\/\/habr.com\/ru\/company\/intersystems\/blog\/516344\/\" target=\"_blank\" rel=\"noopener\">this article<\/a>. The same solution runs in the demo under the link provided in the above paragraph. You can build your own solution prototype on InterSystems IRIS using the content (free with no time limit) in our repo <a href=\"https:\/\/github.com\/intersystems-community\/Convergent-Analytics\">Convergent Analytics<\/a> (visit sections <a href=\"https:\/\/github.com\/intersystems-community\/Convergent-Analytics#links-to-required-downloads\">Links to Required Downloads<\/a> and <a href=\"https:\/\/github.com\/intersystems-community\/Convergent-Analytics#root-resources\">Root Resources<\/a>).<\/p>\n<p>Which \u201cdegree of distribution\u201d of AI do we get via such a scenario? In our opinion, in this scenario we are rather close to the ideal because the data scientist is \u201ccut off from\u201d both the data (just a limited sample is transmitted \u2013 although crucial as of a point in time) and the algorithms of the company (the data scientist\u2019s own \u201cspecimens\u201d are never in 100% sync with the \u201clive\u201d mechanisms deployed and running as part of the real-time production-level solution), and he has no access at all to the company IT infrastructure. 
Therefore, the data scientist\u2019s role comes down to partially replaying, on his local computational resources, an episode of the company\u2019s production-level AI\/ML solution functioning, obtaining an estimate of the quality of that functioning at an acceptable confidence level, and returning feedback to the company (formulated, in our concrete scenario, as \u201caudit\u201d results plus, maybe, an improved version of this or that AI\/ML mechanism involved in the company solution).<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752114479?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752114479?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 1 Distributed AI scenario formulation<\/em><\/p>\n<p>We know that feedback exchanged along with AI artifacts does not necessarily need to be formulated and transmitted by humans: this follows from publications about modern instruments and from already existing experience with implementations of distributed AI. 
However, the strength of the InterSystems IRIS platform is that it allows both \u201chybrid\u201d (a tandem of a human and a machine) and fully automated AI use cases to be developed and launched equally efficiently \u2013 so we will continue our analysis based on the above \u201chybrid\u201d example, leaving it to the reader to elaborate on its full automation on their own.<\/p>\n<p><strong>How a concrete distributed AI scenario runs on InterSystems IRIS platform<\/strong><\/p>\n<p><a href=\"https:\/\/youtu.be\/rRJ8_O4Y3gs?t=111\" target=\"_blank\" rel=\"noopener\">The intro<\/a> to our video with the scenario demo that is mentioned in the above section of this article gives a general overview of InterSystems IRIS as a real-time AI\/ML platform and explains its support of DevOps macromechanisms. In the demo, the \u201ccompany-side\u201d business process that handles regular transmission of training datasets to the external data scientist is not covered explicitly \u2013 so we will start with a short coverage of that business process and its steps.<\/p>\n<p>A major \u201cengine\u201d of the sender business process is the while-loop (implemented using the InterSystems IRIS visual business process composer, which is based on the <a href=\"https:\/\/docs.intersystems.com\/irisforhealthlatest\/csp\/docbook\/DocBook.UI.Page.cls?KEY=EBPL\">BPL<\/a> notation interpreted by the platform), responsible for systematically sending training datasets to the external data scientist. 
The following actions are executed inside that \u201cengine\u201d (see the diagram, skip data consistency actions):<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752126489?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752126489?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 2 Main part of the \u201csender\u201d business process<\/em><\/p>\n<p>(a) Load Analyzer \u2013 loads the current set of records from the training dataset table into the business process and forms a dataframe in the Python session based on it. The call-action triggers an SQL query to InterSystems IRIS DBMS and a call to Python interface to transfer the SQL result to it so that the dataframe is formed;<\/p>\n<p>(b) Analyzer 2 Azure \u2013 another call-action, triggers a call to Python interface to transfer it a set of Azure ML SDK for Python instructions to build required infrastructure in Azure and to deploy over that infrastructure the dataframe data formed in the previous action;<\/p>\n<p>As a result of the above business process actions executed, we obtain a stored object (a .csv file) in Azure containing an export of the recent dataset used for model training by the production-level solution at the company:<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752130894?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752130894?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 3 \u201cArrival\u201d of the training dataset to Azure ML<\/em><\/p>\n<p>With that, the main part of the sender business process is over, but we need to execute one more action keeping in mind that any computation resources that we create in Azure ML are billable (see the diagram, skip data consistency actions):<\/p>\n<p><a 
href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752140875?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752140875?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 4 Final part of the \u201csender\u201d business process<\/em><\/p>\n<p>(c) Resource Cleanup \u2013 triggers a call to Python interface to transfer it a set of Azure ML SDK for Python instructions to remove from Azure the computational infrastructure built in the previous action.<\/p>\n<p>The data required by the data scientist has been transmitted (the dataset is now in Azure), so we can proceed with launching the \u201cexternal\u201d business process that accesses the dataset, runs at least one alternative model training (algorithmically, an alternative model is distinct from the model running as part of the production-level solution), and returns to the data scientist the resulting model quality metrics plus visualizations that allow him to formulate \u201caudit findings\u201d about the functioning efficiency of the company\u2019s production-level solution.<\/p>\n<p>Let us now take a look at the receiver business process: unlike its sender counterpart (which runs among the other business processes comprising the autonomous AI\/ML solution at the company), it does not require a while-loop; instead, it contains a sequence of actions related to training alternative models in Azure ML and in IntegratedML (the accelerator for use of auto-ML frameworks from within InterSystems IRIS) and to extracting the training results into InterSystems IRIS (the platform is assumed to be installed locally on the data scientist\u2019s side as well):<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752141088?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752141088?profile=RESIZE_710x\" 
class=\"align-full\"><\/a><\/p>\n<p><em>Figure 5 \u201cReceiver\u201d business process<\/em><\/p>\n<p>(a) Import Python Modules \u2013 triggers a call to Python interface to transfer it a set of instructions to import Python modules that are required for further actions;<\/p>\n<p>(b) Set AUDITOR Parameters \u2013 triggers a call to Python interface to transfer it a set of instructions to assign default values to the variables required for further actions;<\/p>\n<p>(c) Audit with Azure ML \u2013 (we will be skipping any further reference to Python interface triggering) hands \u201caudit assignment\u201d to Azure ML;<\/p>\n<p>(d) Interpret Azure ML \u2013 gets the data transmitted to Azure ML by the sender business process, into the local Python session together with the \u201caudit\u201d results by Azure ML (also, creates a visualization of the \u201caudit\u201d results in the Python session);<\/p>\n<p>(e) Stream to IRIS \u2013 extracts the data transmitted to Azure ML by the sender business process, together with the \u201caudit\u201d results by Azure ML, from the local Python session into a business process variable in IRIS;<\/p>\n<p>(f) Populate IRIS \u2013 writes the data transmitted to Azure ML by the sender business process, together with the \u201caudit\u201d results by Azure ML, from the business process variable in IRIS to a table in IRIS;<\/p>\n<p>(g) Audit with IntegratedML \u2013 \u201caudits\u201d the data received from Azure ML, together with the \u201caudit\u201d results by Azure ML, written into IRIS in the previous action, using IntegratedML accelerator (in this particular case it handles H2O auto-ML framework);<\/p>\n<p>(h) Query to Python \u2013 transfers the data and the \u201caudit\u201d results by IntegratedML into the Python session;<\/p>\n<p>(i) Interpret IntegratedML \u2013 in the Python session, creates a visualization of the \u201caudit\u201d results by IntegratedML;<\/p>\n<p>(j) Resource Cleanup \u2013 deletes from Azure the computational 
infrastructure created in the previous actions.<\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752141894?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752141894?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 6 Visualization of Azure ML \u201caudit\u201d results<\/em><\/p>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752146061?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752146061?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 7 Visualization of IntegratedML \u201caudit\u201d results<\/em><\/p>\n<p><strong>How distributed AI is implemented in general on InterSystems IRIS platform<\/strong><\/p>\n<p>InterSystems IRIS platform distinguishes among three fundamental approaches to distributed AI implementation:<\/p>\n<ul>\n<li>Direct exchange of AI artifacts with their local and central handling based on the rules and algorithms defined by the user<\/li>\n<li>AI artifact handling delegated to specialized frameworks (for example: TensorFlow, PyTorch) with exchange orchestration and various preparatory steps configured on local and the central instances of InterSystems IRIS by the user<\/li>\n<li>Both AI artifact exchange and their handling done via cloud providers (Azure, AWS, GCP) with local and the central instances just sending input data to a cloud provider and receiving back the end result from it<\/li>\n<\/ul>\n<p><a href=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752146095?profile=original\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/storage.ning.com\/topology\/rest\/1.0\/file\/get\/8752146095?profile=RESIZE_710x\" class=\"align-full\"><\/a><\/p>\n<p><em>Figure 8 Fundamental approaches to distributed AI 
implementation on InterSystems IRIS platform<\/em><\/p>\n<p>These fundamental approaches can be modified or combined: in particular, in the concrete scenario described in the previous section of this article (\u201caudit\u201d), the third, \u201ccloud-centric\u201d, approach is used with a split of the \u201cauditor\u201d part into a cloud portion and a local portion executed on the data scientist side (acting as a \u201ccentral instance\u201d).<\/p>\n<p>The theoretical and applied elements that currently add up to the \u201cdistributed artificial intelligence\u201d discipline have not yet taken a \u201ccanonical form\u201d, which creates huge potential for implementation innovations. Our team of experts closely follows the evolution of distributed AI as a discipline and builds accelerators for its implementation on the InterSystems IRIS platform. We would be glad to share our content and help everyone who finds the domain discussed here useful to start prototyping distributed AI mechanisms.<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:1046079\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Sergey Lukyanchikov What is Distributed Artificial Intelligence (DAI)? 
Attempts to find a \u201cbullet-proof\u201d definition have not produced result: it seems like the term is [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2021\/04\/04\/distributed-artificial-intelligence-with-intersystems-iris\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":457,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4541"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=4541"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/4541\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/457"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=4541"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=4541"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=4541"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}