Hi,
I am working with microarray data. I'm using the "oligo" R package for background correction and normalisation of the expression values. After normalisation I want to calculate Z-scores to generate a heatmap.
As there are around 25,000 genes with expression values in the matrix, I want to create the heatmap with only the top 10% most variable genes.
I am looking for the best statistical way to select the top 10% most variable genes for the heatmap.
After some Google searching, I found the following:
"normdata" is a matrix with 25,000 genes after background correction and normalisation.
x <- apply(normdata, 1, IQR) # IQR of each gene (row) across samples
y <- normdata[x > quantile(x, 0.9), ] # keep genes whose IQR is above the 90th percentile, i.e. the top 10% most variable
Do you think the above code is the right way to select the top 10% most variable genes?
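To make the question concrete, here is a minimal self-contained sketch of the same selection run on simulated data, followed by the row-wise Z-scores I would pass to the heatmap. The simulated matrix and the names gene_iqr, top_genes and zscores are only placeholders for illustration, not my real data, and the Z-score step is my assumption about how the heatmap input would be prepared.

```r
# Simulated stand-in for the normalised matrix (hypothetical data: 1000 genes x 6 samples)
set.seed(1)
normdata <- matrix(rnorm(1000 * 6), nrow = 1000,
                   dimnames = list(paste0("gene", 1:1000),
                                   paste0("sample", 1:6)))

# Per-gene interquartile range across samples
gene_iqr <- apply(normdata, 1, IQR)

# Keep the genes whose IQR is strictly above the 90th percentile of all IQRs
top_genes <- normdata[gene_iqr > quantile(gene_iqr, 0.9), ]

# Row-wise Z-scores: scale() centres and scales columns, so transpose before and after
zscores <- t(scale(t(top_genes)))

dim(top_genes)
```

The resulting zscores matrix has one row per selected gene, each with mean 0 and standard deviation 1 across samples, which is the form usually handed to a heatmap function.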
Thank you
Dear James,
Thanks for the reply. I'm not asking which approach is faster. I'm asking whether the code given above can be used to select the top 10% most variable genes.
One more question: should I select the top 10% most variable genes before normalisation or after normalisation?
Thank you