{"id":2471,"date":"2019-08-16T06:43:35","date_gmt":"2019-08-16T06:43:35","guid":{"rendered":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/16\/different-kinds-of-loops-in-r\/"},"modified":"2019-08-16T06:43:35","modified_gmt":"2019-08-16T06:43:35","slug":"different-kinds-of-loops-in-r","status":"publish","type":"post","link":"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/16\/different-kinds-of-loops-in-r\/","title":{"rendered":"Different kinds of loops in R."},"content":{"rendered":"<p>Author: Frank Raulf<\/p>\n<div>\n<p>Normally, it is better to avoid loops in R. But for highly individual tasks, vectorization is not always possible. Hence, a loop is needed \u2013 provided the problem is decomposable.<\/p>\n<p>Which kinds of loops exist in R, and which one should be used in which situation?<\/p>\n<p>Every programming language has for- and while-loops (sometimes also until-loops). In R, these loops are sequential and not particularly fast.<\/p>\n<p><em>for(i in x)<\/em><\/p>\n<p><em>{task}<\/em><\/p>\n<p>\u00a0<\/p>\n<p><em>i <- y<\/em><\/p>\n<p><em>while(i <= x)<\/em><\/p>\n<p><em>{task<\/em><\/p>\n<p><em>i <- i + 1}<\/em><\/p>\n<p>\u00a0<\/p>\n<p>Even for prototyping they are sometimes too slow.<\/p>\n<p>But how can you improve speed?<\/p>\n<p>There are three options in R:<\/p>\n<ol>\n<li>apply functions<\/li>\n<li>parallelization<\/li>\n<li>RCPP<\/li>\n<\/ol>\n<p><b>apply functions:<\/b><\/p>\n<p>Normally, you use apply to calculate standard statistics over the columns, the rows, or both. But with a small trick you can turn an apply call into a general loop. 
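A minimal sketch of this index trick, with a made-up task and hypothetical vectors `v` and `w` (all names here are illustrative, not from the post):

```r
# Hypothetical data: two vectors to be combined element-wise
v <- c(2, 4, 6)
w <- c(10, 20, 30)

# The task is written as a function of the index i, not of the data itself
f <- function(i, x, y) {
  x[i] * y[i]
}

# Loop over the indices 1..length(v) by applying f row-wise
# to a one-column data frame of indices; extra arguments are
# forwarded to f via apply's ... mechanism
res <- apply(as.data.frame(1:length(v)), MARGIN = 1, FUN = f, x = v, y = w)
res  # element-wise products: 20 80 180
```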
The syntax is:<\/p>\n<p><em>F <- function(i, x, y, z, \u2026)<\/em><\/p>\n<p><em>{task}<\/em><\/p>\n<p><em>apply(as.data.frame(1:length(vector)), MARGIN = 1, FUN = F)<\/em><\/p>\n<p>In this case the vector is not used for the calculation directly but as an index \u201ci\u201d instead.<\/p>\n<p>An sapply call is even faster:<\/p>\n<p><em>F <- function(i, x, y, z, \u2026)<\/em><\/p>\n<p><em>{task}<\/em><\/p>\n<p><em>sapply(1:length(vector), FUN = F)<\/em><\/p>\n<p><strong>Parallelization:<\/strong><\/p>\n<p>You can also run loops and apply calls in parallel. You need:<\/p>\n<p><em>library(&quot;doParallel&quot;)<\/em><\/p>\n<p><em>library(&quot;parallel&quot;)<\/em><\/p>\n<p><em>library(&quot;foreach&quot;)<\/em><\/p>\n<p>\u00a0<\/p>\n<p>First, define the number of cores to use, leaving at least one free:<\/p>\n<p>\u00a0<\/p>\n<p><em>NumOfCores <- detectCores() - 1<\/em><\/p>\n<p><em>registerDoParallel(NumOfCores)<\/em><\/p>\n<p>\u00a0<\/p>\n<p>Either use a foreach loop:<\/p>\n<p>\u00a0<\/p>\n<p><em>foreach::foreach(x = 1:length(vector), .combine = rbind, .inorder = T, .multicombine = F) %dopar%<\/em><\/p>\n<p><em>{task}<\/em><\/p>\n<p>\u00a0<\/p>\n<p>This loop creates a vector of results.<\/p>\n<p>If the order of the results does not matter, you can increase performance with <em>.inorder = F<\/em>. 
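A runnable sketch of such a foreach loop, assuming the doParallel and foreach packages are installed (the vector and the squaring task are made up for illustration):

```r
library(doParallel)
library(foreach)

registerDoParallel(2)  # two workers are enough for this toy example

# Square each element of a hypothetical vector in parallel;
# foreach automatically exports v to the workers, and
# .inorder = TRUE keeps the results in iteration order
v <- c(1, 2, 3, 4)
res <- foreach(x = 1:length(v), .combine = c, .inorder = TRUE) %dopar% {
  v[x]^2
}

stopImplicitCluster()  # release the implicitly created workers
res  # 1 4 9 16
```

With `.inorder = FALSE`, the results may arrive in any order.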
This means that a free worker takes over the next iteration regardless of that iteration's position in the sequence.<\/p>\n<p>\u00a0<\/p>\n<p>Or use a parSapply call:<\/p>\n<p>\u00a0<\/p>\n<p><em>clusters <- makeCluster(NumOfCores)<\/em><\/p>\n<p><em>parSapply(cl = clusters, X = 1:length(vector), FUN = F, x = x, y = y, z = z, \u2026)<\/em><\/p>\n<p><em>stopCluster(clusters)<\/em><\/p>\n<p>In this case it is important to pass all required data as arguments within the call \u2013 unlike an ordinary sapply call, the workers cannot access the global workspace directly.<\/p>\n<p>\u00a0<\/p>\n<p><strong>RCPP:<\/strong><\/p>\n<p>\u00a0<\/p>\n<p>First, you need to install Rtools.<\/p>\n<p>\u00a0<\/p>\n<p><em>library(&quot;Rcpp&quot;)<\/em><\/p>\n<p>\u00a0<\/p>\n<p>Define a function in C++; Rcpp creates a shared library and compiles the code:<\/p>\n<p>\u00a0<\/p>\n<p><em>#include &lt;Rcpp.h&gt;<\/em><\/p>\n<p><em>using namespace Rcpp;<\/em><\/p>\n<p><em>\/\/ [[Rcpp::export]]<\/em><\/p>\n<p><em>double NameOfFunction(int i, NumericVector y)<\/em><\/p>\n<p><em>{task}<\/em><\/p>\n<p>\u00a0<\/p>\n<p>Then you can call it in R:<\/p>\n<p>\u00a0<\/p>\n<p><em>sapply(X = 1:length(vector), FUN = NameOfFunction, y = vector)<\/em><\/p>\n<p>\u00a0<\/p>\n<p>But when should you use which kind of loop?<\/p>\n<p>\u00a0<\/p>\n<p>Based on my experience, I recommend making the decision dependent on the number of iterations and the cost of each iteration.<\/p>\n<p>\u00a0<\/p>\n<table>\n<tbody>\n<tr>\n<td width=\"201\">\n<p><em>\u00a0<\/em><\/p>\n<\/td>\n<td width=\"201\">\n<p><strong><em>Not costly<\/em><\/strong><\/p>\n<\/td>\n<td width=\"201\">\n<p><strong><em>Costly<\/em><\/strong><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td width=\"201\">\n<p><strong><em>Low number of iterations<\/em><\/strong><\/p>\n<\/td>\n<td 
width=\"201\">\n<p>for-loop, while-loop<\/p>\n<\/td>\n<td width=\"201\">\n<p>RCPP, foreach<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td width=\"201\">\n<p><strong><em>Large number of iterations<\/em><\/strong><\/p>\n<\/td>\n<td width=\"201\">\n<p>RCPP, sapply, apply, lapply<\/p>\n<\/td>\n<td width=\"201\">\n<p>parSapply, RCPP<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\u00a0<\/p>\n<p>\u00a0<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.datasciencecentral.com\/xn\/detail\/6448529:BlogPost:867530\">Go to Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Author: Frank Raulf Normally, it is better to avoid loops in R. But for highly individual tasks a vectorization is not always possible. Hence, a [&hellip;] <span class=\"read-more-link\"><a class=\"read-more\" href=\"https:\/\/www.aiproblog.com\/index.php\/2019\/08\/16\/different-kinds-of-loops-in-r\/\">Read More<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":469,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[26],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2471"}],"collection":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/comments?post=2471"}],"version-history":[{"count":0,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/posts\/2471\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"ht
tps:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media\/456"}],"wp:attachment":[{"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/media?parent=2471"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/categories?post=2471"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aiproblog.com\/index.php\/wp-json\/wp\/v2\/tags?post=2471"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
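The RCPP approach described in the post can be sketched as a runnable example. This assumes the Rcpp package and a working C++ toolchain (e.g. Rtools on Windows); `SumOfSquares` is a made-up example function, not from the post:

```r
library(Rcpp)

# cppFunction compiles the C++ source, builds the shared library,
# and loads the resulting function into the R session
cppFunction("
double SumOfSquares(NumericVector v) {
  double total = 0;
  for (int i = 0; i < v.size(); i++) {
    total += v[i] * v[i];
  }
  return total;
}")

SumOfSquares(c(1, 2, 3))  # 14
```

For larger projects, the same code can live in a `.cpp` file with the `// [[Rcpp::export]]` attribute and be compiled with `sourceCpp()`.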