Must sample sizes be equal for TMM?
@humberto_munoz-10903

I want to compute TMM normalization factors for two RNA-seq samples from two different conditions. The number of DE genes is different in each sample. Does TMM normalization require that the samples list the same genes in the same order? If so, how can I arrange the data that way?

Thanks so much.

TMM edgeR

Gordon,

Thanks for your answer. The issue is that genes with zero read counts have been trimmed from my two samples: out of a total of 2883 genes, the reference sample has 2814 and the other has 2809. In any case, I tried the readDGE function on my two samples, which are comma-separated (CSV) files on the Desktop, and got this error message.

> files <- dir(pattern="*\\.csv$")

> RG <- readDGE(files, path=NULL, columns=c(1,2), group=NULL, labels=NULL)
Error in `[.data.frame`(d[[i]], , columns[2]) :
undefined columns selected

Thanks for any help.
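
A likely cause of this error, assuming the files really are comma-separated: readDGE reads each file with read.delim, whose default separator is a tab. Each CSV row is then parsed as a single column, so the second column requested by columns=c(1,2) does not exist. The base R sketch below reproduces that behaviour on a hypothetical toy file (the file contents and the names csv_file, as_tab and as_csv are made up for illustration).

```r
# Hypothetical two-column count file (gene ID, count), written as CSV.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("gene,count", "geneA,10", "geneB,25", "geneC,7"), csv_file)

# read.delim() defaults to sep = "\t", so each comma-separated row is
# read as one column and a second column cannot be selected.
as_tab <- read.delim(csv_file, stringsAsFactors = FALSE)
ncol(as_tab)

# Declaring the comma separator recovers both columns.
as_csv <- read.delim(csv_file, sep = ",", stringsAsFactors = FALSE)
ncol(as_csv)

# With edgeR installed, the same argument can be forwarded by readDGE(),
# which passes extra arguments on to read.delim():
# RG <- readDGE(files, columns = c(1, 2), sep = ",")
```

If the separator is the problem, forwarding sep="," as in the final comment should be enough. This has not been checked against the actual files in this thread.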

@gordon-smyth
WEHI, Melbourne, Australia

No, there is no requirement for equal sample sizes. If you can run edgeR for differential expression, then you can also call calcNormFactors() to do TMM normalization. There is no additional requirement.

To be honest, I can't really figure out what your concern is. It isn't meaningful to say that the number of DE genes is different in each sample -- that is actually impossible, because the number of DE genes is a characteristic of the whole experiment, not something that varies from sample to sample.

Perhaps you are worried that the number of DE genes must be somehow balanced in each direction, equal up and down. That also is not necessary.
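
As an illustration of the "arrangement" asked about in the question, here is a minimal sketch in base R. It assumes that a gene missing from one sample was dropped only because its count was zero, so it can be restored as a zero on the union of gene IDs. The toy data frames ref and other and their gene names are hypothetical. The edgeR step at the end runs only if the package is installed.

```r
# Hypothetical toy counts; each sample has had its zero-count genes trimmed.
ref   <- data.frame(gene = c("g1", "g2", "g3", "g5"), count = c(120, 30, 45, 8))
other <- data.frame(gene = c("g1", "g3", "g4", "g5"), count = c(200, 60, 15, 12))

# Put both samples on the union of gene IDs, in one shared order.
genes  <- sort(union(ref$gene, other$gene))
counts <- cbind(
  ref   = ref$count[match(genes, ref$gene)],
  other = other$count[match(genes, other$gene)]
)
rownames(counts) <- genes

# A gene absent from a sample is assumed to have been trimmed for zero reads.
counts[is.na(counts)] <- 0

# The two library sizes differ, which TMM does not forbid.
colSums(counts)

# Optional: TMM normalization factors, only if edgeR is installed.
if (requireNamespace("edgeR", quietly = TRUE)) {
  y <- edgeR::DGEList(counts = counts)
  y <- edgeR::calcNormFactors(y, method = "TMM")
  print(y$samples)
}
```

The resulting matrix has one row per gene and one column per sample, which is the layout DGEList() expects. Note that readDGE() performs the same union-and-zero-fill step itself when the input files list different genes.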