Hello,
I have a piece of code, shown below. It is a nested for loop where the inner loops depend on the outer loop. The outermost loop ("i") runs over every chromosome, the "j" loop runs over every cytoband in a chromosome, and the "k" loop runs over every sample.
for (i in 1:NumberOfChromosomes) {
  for (j in 1:NumberOfCytobandsInEachChromosome) {
    for (k in 1:TotalNumberOfSamples) {
      z_1 = NULL  # reset this value before the calculation for every k
      # do something with z_1 to get z_2
      # do something with z_2 to get z_3
      x[k, j] = z_3  # store the output value into a matrix
    } # end of k loop
    y[[i]] = x
  } # end of j loop
} # end of i loop
I would like to make this code faster and more efficient. Could anyone suggest a way to use one of the apply functions on this? I have used the apply family (lapply and mapply) before, but never on nested for loops, so I am not sure how to do this.
Any help would be great. Thank you.
One thing that the apply functions do for you is to pre-allocate space for the result; if the code had set y = list(), then y[[i]] = x would run in quadratic time. This can be seen even in a simple example.
Also, sometimes it seems that writing something as an apply() makes it more obvious how it should be vectorized (a single function call, rather than iteration), and vectorization is where real speed benefits can occur.
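To make the pre-allocation point concrete, here is a minimal sketch comparing a list grown from y = list(), a list allocated up front with vector("list", n), and lapply(). The function names grow, prealloc and applied are made up for illustration, and the squared value just stands in for real work. How large the timing gap is depends on the R version and the machine, so no expected timings are shown.

```r
# Growing: start from an empty list and extend it by one element per iteration.
grow <- function(n) {
  y <- list()
  for (i in seq_len(n)) y[[i]] <- i^2
  y
}

# Pre-allocating: create the full-length list once, then fill it in place.
prealloc <- function(n) {
  y <- vector("list", n)
  for (i in seq_len(n)) y[[i]] <- i^2
  y
}

# lapply() allocates the result list for you.
applied <- function(n) lapply(seq_len(n), function(i) i^2)

# All three approaches produce the same result.
stopifnot(identical(grow(100), prealloc(100)),
          identical(prealloc(100), applied(100)))

# Compare timings on your own setup; results vary by R version.
n <- 2e4
print(system.time(grow(n)))
print(system.time(prealloc(n)))
print(system.time(applied(n)))
```

In this toy case the fully vectorized form would simply be as.list(seq_len(n)^2), which is the kind of rewrite the apply version tends to make easier to spot.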
Good point. Plus, it also makes it easier to switch to parallelized versions like bplapply.
That said, trying to cram a complicated piece of code into a function to use in apply doesn't seem ideal for readability.
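One way to keep the apply version readable might be to give the loop body a name instead of writing it inline. The sketch below is only an illustration of that idea for the chromosome / cytoband / sample loops in the question: compute_z3, one_chromosome, cytobands_per_chr and n_samples are hypothetical names, and the arithmetic in compute_z3 is a placeholder for the real z_1 to z_3 steps.

```r
# Hypothetical inputs: 2 chromosomes with 3 and 2 cytobands, 4 samples.
cytobands_per_chr <- c(3L, 2L)
n_samples <- 4L

# Placeholder for the per-sample calculation (z_1 -> z_2 -> z_3).
compute_z3 <- function(i, j, k) i * 100 + j * 10 + k

# All work for one chromosome: returns a samples-by-cytobands matrix.
one_chromosome <- function(i) {
  bands <- seq_len(cytobands_per_chr[i])
  vapply(bands, function(j) {
    # One column: the value for every sample k in cytoband j.
    vapply(seq_len(n_samples), function(k) compute_z3(i, j, k), numeric(1))
  }, numeric(n_samples))
}

# One list element per chromosome, allocated by lapply().
y <- lapply(seq_along(cytobands_per_chr), one_chromosome)

# With BiocParallel installed, the same call shape could be used as
# BiocParallel::bplapply(seq_along(cytobands_per_chr), one_chromosome)
```

Because each chromosome is handled by a single named function, the outer lapply() stays a one-liner, and the per-chromosome function can be tested on its own before trying a parallel version.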