Counting loop in R

#COUNTING LOOP IN R HOW TO#

When parallelizing nested for loops, there is always a question of which loop to parallelize. The standard advice is to parallelize the outer loop. This results in larger individual tasks, and larger tasks can often be performed more efficiently than smaller tasks. However, if the outer loop doesn’t have many iterations and the tasks are already large, parallelizing the outer loop results in a small number of huge tasks, which may not allow you to use all of your processors, and can also result in load-balancing problems. You could parallelize an inner loop instead, but that could be inefficient because you’re repeatedly waiting for all the results to be returned every time through the outer loop.

So far, this has assumed that all the taxa are part of a single count sum, as will typically be the case for diatoms, chironomids, or planktic foraminifera. With pollen, however, there can be complexities. For example, we might want trees, shrubs and upland herbs to be in the terrestrial pollen sum (T + S + H), and aquatic macrophytes to be part of the total pollen sum (T + S + H + A). The BCI tree data obviously doesn’t have any aquatic macrophytes, but I’m going to treat all species beginning with ‘A’ as aquatic, ‘T’ as tree, ‘S’ as shrub and other taxa as herbs. First we need a dictionary, which you would normally have already.

One solution is to separate the different groups into separate data.frames, calculate percent, and then bind the data.frames back together. The tidyverse solution is complicated because the count sums overlap. It uses rowid_to_column(var = "siteID") to add the identifier we need, gather(key = taxon, value = count, -siteID) to make the thin format, relevant_count_sum = if_else(sum == "Aquatic", total_count_sum, count_sum) to pick the sum that applies to each taxon, and percent = count / relevant_count_sum * 100. From this object, the percent, taxon and siteID can be selected and then spread; this could be appended directly onto the previous code.
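The overlapping count sums can be sketched in base R. This is not the tidyverse pipeline from the post, only a hedged illustration of the same arithmetic. The counts, the taxon names and the dictionary below are all made up, and the object names count_sum and total_count_sum are borrowed from the text.

```r
# Made-up pollen counts: 2 samples (rows) x 4 taxa (columns); illustrative only.
counts <- matrix(
  c(50, 30, 20, 25,
    10, 10, 20, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("s1", "s2"), c("Pinus", "Salix", "Poaceae", "Alisma"))
)

# Hypothetical dictionary assigning each taxon to a group.
dictionary <- data.frame(
  taxon = c("Pinus", "Salix", "Poaceae", "Alisma"),
  sum   = c("Tree", "Shrub", "Herb", "Aquatic")
)

# Look up the group of each column and flag the aquatics.
is_aquatic <- dictionary$sum[match(colnames(counts), dictionary$taxon)] == "Aquatic"

# Terrestrial sum (T + S + H) and total sum (T + S + H + A) for each sample.
count_sum       <- rowSums(counts[, !is_aquatic, drop = FALSE])
total_count_sum <- rowSums(counts)

# Terrestrial taxa are divided by the terrestrial sum,
# aquatic taxa by the total sum.
percent <- counts
percent[, !is_aquatic] <- counts[, !is_aquatic, drop = FALSE] / count_sum * 100
percent[, is_aquatic]  <- counts[, is_aquatic, drop = FALSE] / total_count_sum * 100
```

With these sums the terrestrial percentages add up to 100 in every sample, while the aquatic percentages sit on top of that, so a row total above 100 is expected here rather than a sign of an error.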
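Returning to the nested-loop advice above, parallelizing the outer loop might look like the following minimal sketch. It assumes the parallel package that ships with R and uses mclapply(); the cell_value() function and the loop sizes are hypothetical stand-ins for real work.

```r
library(parallel)  # distributed with base R

# Hypothetical unit of work for outer index i and inner index j.
cell_value <- function(i, j) i * 10 + j

n_outer <- 4
n_inner <- 3

# mclapply() forks worker processes; forking is not available on Windows,
# so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Parallelize the OUTER loop: each task runs one complete inner loop serially.
rows <- mclapply(seq_len(n_outer), function(i) {
  vapply(seq_len(n_inner), function(j) cell_value(i, j), numeric(1))
}, mc.cores = cores)

# Combine the per-task results into an n_outer x n_inner matrix.
result <- do.call(rbind, rows)
```

Each task here is a whole row, so results are collected once at the end rather than once per pass through the outer loop.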
In the thin format, the calculations are done by siteID: group_by(siteID) %>% mutate(percent = count / sum(count) * 100). To show some data, use head(BCI_percent2 %>% filter(count > 0)). We can convert this thin object back to a fat format with spread (the opposite of gather).
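The thin-format steps can be sketched in base R, for readers who want to check the logic without the tidyverse. This is an analogue, not the code from the post: the data are made up, and the comments note which tidyverse verb each step stands in for.

```r
# Made-up counts in fat format: one row per site, one column per taxon.
fat <- data.frame(
  siteID = 1:3,
  taxonA = c(10, 5, 1),
  taxonB = c(20, 5, 2),
  taxonC = c(30, 0, 3),
  taxonD = c(40, 10, 4)
)

# Thin format: one row per site x taxon combination (the role gather plays).
taxa <- setdiff(names(fat), "siteID")
thin <- data.frame(
  siteID = rep(fat$siteID, times = length(taxa)),
  taxon  = rep(taxa, each = nrow(fat)),
  count  = unlist(fat[taxa], use.names = FALSE)
)

# Percent within each site (the role of group_by + mutate).
thin$percent <- thin$count / ave(thin$count, thin$siteID, FUN = sum) * 100

# Back to fat format (the role spread plays): sites in rows, taxa in columns.
fat_percent <- xtabs(percent ~ siteID + taxon, data = thin)
```

A quick check such as rowSums(fat_percent) should return 100 for every site when all taxa share a single count sum.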

One solution is to select the columns with the count data, and process these in the same way as a counts-only data.frame. This is usually fine because we need the pure species x sites data.frame for ordinations etc. Otherwise, the data can be made into thin format with gather(key = taxon, value = count, -siteID).

The counts, and only the counts, are in a species x sites (columns x rows) matrix or data.frame. Here I use the BCI count data from the vegan package. To calculate percent, we need to divide the counts by the count sums for each sample, and then multiply by 100. There are a few different solutions to this. The tidyverse approach first adds a site identifier with rowid_to_column(var = "siteID").
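For the counts-only case, the division by sample sums can be sketched in base R as follows. The small matrix is made up so that the example runs without vegan installed; it is not the BCI data.

```r
# Made-up counts: 3 samples (rows) x 4 taxa (columns); not the BCI data.
counts <- matrix(
  c(10, 20, 30, 40,
     5,  5,  0, 10,
     1,  2,  3,  4),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("site", 1:3), paste0("taxon", LETTERS[1:4]))
)

# Divide each count by its sample (row) total, then multiply by 100.
# rowSums() returns one value per row, and R recycles it down the columns,
# so every row is divided by its own total.
percent <- counts / rowSums(counts) * 100

# Sanity check: every sample should sum to 100.
rowSums(percent)
```

If vegan is available, decostand(counts, method = "total") returns proportions by sample, so multiplying its result by 100 should give the same values.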

Micropaleontologists and others often want to calculate percent from count data. From looking at archived data, I realise that what should be an easy process goes wrong far more often than it should (which is of course never). Yesterday, I found that a recently published paper had archived percent data that were impossible in several ways. Rather than discussing that paper, which has certain interesting aspects, today I am going to show how to calculate percent from count data in R.