**30**wrote:

Hi all,

I'm using voom and limma to analyse RNA-seq data. In general, I have followed the basic instructions regarding RNA-seq analysis presented in the limma manual.

The matrix of read counts is filtered so that I keep genes where cpm>=1 in at least 3 samples. This is done using the following code:

keep=rowSums(cpm(matrix)>=1)>=3 matrix=rnaseqMatrix_parents[keep,]

Next, I apply the TMM normalization and use the results as input for voom

DGE=DGEList(matrix) DGE=calcNormFactors(DGE,method =c("TMM")) v=voom(DGE,design,plot=T)

However, with one of my datasets, the plot produced by voom looks generally ok, but with a gap midway

Looking back at the manual for limma, I noticed the following note:

*If the data are very noisy, one can apply the same between-array normalization methods as would be used for microarrays, for example: v <- voom(counts,design,plot=TRUE,normalize="quantile")*

Although I am not certain that my dataset qualifies as "very noisy" (happy to read what people have to say about it or point me to examples) I decided to give it a whirl.

v2=voom(matrix,design,plot=T, normalize="quantile")

The plot looks better than before:

I should finish by saying that I went on and perform the differential expression analysis using the two outputs from voom. Things change marginally (1570 vs 1758 DEG out of 15729 genes at 5% FDR when using the "TMM" vs "quantile" strategy). Not surprisingly, the genes differentially expressed in one configuration only are associated with Q-values close to the 0.05 cut-off.

I am a bit uncertain about using the quantile normalization route (did not find a reference where people had used it), so I am eager to have inputs/guidelines about when to decide using TMM vs quantile norm.

Input on this most welcome!

Thanks!

Jean-Marc

**37k**• written 3.5 years ago by lassancejm •

**30**