High variation among biological replicates. Options for correction during DE analysis
hac141 • 0
@hac141-11750

Hello Bioconductor community,

I have six bacterial RNA-seq libraries: three biological replicates each for control and treatment. A PCA plot shows that one of the control libraries clusters away from the other two controls (in fact, closer to the treatment libraries). When I run the DE analysis in edgeR with this "outlier" library included, I get no significantly DE genes. Removing it, on the other hand, gives me a decent number of genes. I am not sure whether removing the library from the analysis purely on the basis of the PCA plot is correct, so I wonder if there is a way to correct for biological replication during the DE analysis?

 

Thank you

rnaseq • 2.2k views
Aaron Lun ★ 28k
@alun

I don't know what you mean by "correct for biological replication". Variability between replicates is inherent to your biological or experimental system. The aim of edgeR and similar packages is to model the variability, rather than "correcting" it in any sense. (If we could do that, we wouldn't need replicates.) In larger experiments, you might include known or empirical blocking factors, which would "correct" for these factors of variation among replicates. However, this is unlikely to be helpful here, given you only have one suspect library.
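To make "modelling the variability" concrete, here is a minimal sketch of how a single discordant replicate inflates the estimated dispersion for a gene, which in turn makes it harder to call that gene DE. It uses a simple method-of-moments estimate under a negative binomial model (variance = mean + dispersion × mean²). This is only an illustration: it is not edgeR's estimator, the counts are made up, and library sizes are assumed equal.

```python
from statistics import mean, variance


def moment_dispersion(counts):
    """Method-of-moments NB dispersion: (s^2 - m) / m^2, floored at zero.

    Illustration only; edgeR's own dispersion estimation is more sophisticated.
    """
    m = mean(counts)
    s2 = variance(counts)  # sample variance (n - 1 denominator)
    if m == 0:
        return 0.0
    return max((s2 - m) / (m * m), 0.0)


# Hypothetical counts for one gene in three control libraries
# (library sizes assumed equal, so no normalization is applied).
tight_controls = [100, 110, 90]
controls_with_outlier = [100, 110, 200]

print("dispersion, concordant replicates:", moment_dispersion(tight_controls))
print("dispersion, one discordant replicate:", moment_dispersion(controls_with_outlier))
```

The larger the dispersion, the larger the difference between conditions has to be before it stands out from replicate-to-replicate noise.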

It is generally dangerous to remove samples just because they don't look nice on a PCA plot. This is especially true when you only have a small number of replicates to begin with, as you may end up manufacturing clean-looking differences between conditions where there are actually none. I would encourage you to investigate why this library seems to be misbehaving. For example, did sequencing fail for this library (e.g., low library sizes)? What are the genes that are driving the separation (e.g., stress response/heat shock proteins)? There may be an obvious experimental cause for this separation, allowing you to discard the (low-quality) library in a principled manner. If not... well, then your system is just that variable, and the variance of your replicates is appropriately large.
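The two checks suggested above (library sizes, and which genes drive the separation) can be sketched roughly as follows. This is a plain-Python illustration on made-up counts, not edgeR code: the function names, the toy gene names (including `hsp_like`) and the simplified log-CPM formula are all assumptions for the example. In practice you would run the equivalent steps on your real count matrix.

```python
import math


def library_sizes(counts):
    """Total count per library. counts maps gene -> list of counts, one per library."""
    n_libs = len(next(iter(counts.values())))
    return [sum(row[j] for row in counts.values()) for j in range(n_libs)]


def log_cpm(counts, prior=0.5):
    """Simplified log2 counts-per-million with a small prior count to avoid log(0)."""
    sizes = library_sizes(counts)
    return {
        gene: [math.log2((c + prior) / (s + 1.0) * 1e6) for c, s in zip(row, sizes)]
        for gene, row in counts.items()
    }


def genes_driving_separation(counts, suspect, peers, top=3):
    """Rank genes by |log-CPM in the suspect library - mean log-CPM in its peers|."""
    lc = log_cpm(counts)
    diffs = {
        gene: vals[suspect] - sum(vals[j] for j in peers) / len(peers)
        for gene, vals in lc.items()
    }
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]


# Hypothetical toy counts; columns are control 1, control 2, control 3 (the suspect).
toy = {
    "geneA": [500, 520, 480],
    "geneB": [300, 310, 290],
    "hsp_like": [20, 25, 400],
    "geneD": [1000, 950, 1020],
}

print("library sizes:", library_sizes(toy))
print("top gene:", genes_driving_separation(toy, suspect=2, peers=[0, 1], top=1))
```

If the top-ranked genes share an obvious theme (e.g., stress response), that points to an experimental cause for the separation. If the suspect library is simply much smaller than the others, that points to a sequencing problem.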

Also see A: edgeR - MDS Plot for Count Data.

P.S. Add the edgeR tag, otherwise the maintainers don't get notified.
