Question

Are two biological replicates enough for a transcriptome analysis using DESeq2?

0

Jonas B. ▴ 40

@jonas-b-14652

Last seen 5.1 years ago

Belgium, Antwerp, University of Antwerp

Hi all,

We are performing a transcriptome analysis with DESeq2.

We have three factors: treatment (control and drought), climate (CO2 and ambient) and zone in the leaf (zone1, zone2, zone3).

For each possible combination, we have 3 biological replicates (no technical replicates). The PCA showed us that we have 2 outliers. They are not in the same group, so this leaves us with 2 biological replicates for two of our sample groups.
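The design arithmetic above can be checked with a minimal Python sketch. It is bookkeeping only, not DESeq2 code, and the two samples marked as outliers are hypothetical placeholders: the post only states that they belong to different groups.

```python
from collections import Counter
from itertools import product

# Factor levels as described in the post.
treatments = ["control", "drought"]
climates = ["CO2", "ambient"]
zones = ["zone1", "zone2", "zone3"]
replicates = 3

# One entry per sample: (treatment, climate, zone, replicate index).
samples = [
    (t, c, z, r)
    for t, c, z in product(treatments, climates, zones)
    for r in range(1, replicates + 1)
]

# ASSUMPTION: which samples are outliers is not given in the post;
# these two are placeholders chosen from different groups.
outliers = {
    ("drought", "CO2", "zone1", 2),
    ("control", "ambient", "zone3", 1),
}

kept = [s for s in samples if s not in outliers]

# Number of remaining replicates in each treatment x climate x zone group.
per_group = Counter(s[:3] for s in kept)

print(f"total samples: {len(samples)}, kept: {len(kept)}")
for group, n in sorted(per_group.items()):
    print(group, n)
```

With 2 x 2 x 3 = 12 groups of 3 replicates, the design has 36 samples; dropping two from different groups leaves 34, with ten groups at 3 replicates and two groups at 2.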

I've been looking on the internet for information on how many biological replicates I need at minimum using DESeq2. I did find some information ( https://assets.geneious.com/manual/10.2/GeneiousManualsu100.html ), but I want to be sure before continuing. Running DESeq2 does not prompt an error message. Removing the two outliers results in more significant genes.

Thank you for your advice and time.

Kind regards,
Jonas

Antwerp University - Belgium

deseq2 • 5.3k views

ADD COMMENT • link updated 6.8 years ago by Sean Davis 21k • written 6.8 years ago by Jonas B. ▴ 40

score 2 · Accepted Answer · 2018-01-29

2

Sean Davis 21k

@sean-davis-490

Last seen 3 months ago

United States

Two replicates is fine, operationally. Whether or not two replicates is "enough" for your experiment is something that is hard to comment on without more information.

As an aside, I would note that using PCA to remove "outliers" when there are only three samples per group runs the risk of removing "real" biological variation.

ADD COMMENT • link 6.8 years ago Sean Davis 21k

0

Dear Sean,

Thank you for your quick response and for noting the possibility of removing biological variation by removing outliers based on a PCA.

To respond to your suggestion about the outliers: The effect of sampling zone in the leaf is quite dominant, and these two samples did not cluster with their zone. In two previous transcriptome studies (with a similar setup) and in the current one (the remaining 34 samples), all the samples clustered really nicely together by zone.

Concerning the number of replicates: An article ( https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4878611/ ) mentioned that at least 6 replicates should be used in RNAseq analysis to find significantly differentially expressed (SDE) genes. We're sometimes limited in the number of replicates we can run during an experiment, as is the case here...

As we are quite sure that we are talking about outliers here, we would like to leave out these two samples. This also resulted in more SDE genes. I was afraid that the increase in SDE genes could not be trusted, since we only had 2 biological replicates in two of our sample groups...

Feel free to respond if you have any more suggestions, doubts or ideas.

Thanks again for your quick response and advice!

ADD REPLY • link 6.8 years ago Jonas B. ▴ 40

2

A small comment re: "at least six replicates should be used". It's a bit more subtle. Take a look at their Figure 1b, for edgeR. You'll find a similar curve for most RNA-seq tools. On the x axis is the number of replicates, and the top curves show sensitivity. The 4 curves are for those genes with |LFC| > 0, 0.3, 1, and 2. The far left represents two replicates, where you see ~75% sensitivity for genes with |LFC| > 1. It's also about 50% sensitivity for |LFC| > 0.3. So the point is not that 2 replicates (with low biological variability) are sufficient for all purposes, but rather that investigators should know they will only recover the genes with the largest effect sizes.
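The general shape of such sensitivity curves can be reproduced with a rough Python sketch. It uses a normal approximation for a two-group comparison on the log2 scale, which is an assumption made for illustration; it is not the negative binomial model used by edgeR or DESeq2, it ignores multiple-testing correction, and the per-gene standard deviation `sd` is a hypothetical value, so the numbers will not match the figure in the paper.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(lfc, n_per_group, sd=0.5, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    lfc         -- true log2 fold change between the two groups
    n_per_group -- number of biological replicates in each group
    sd          -- ASSUMED per-gene standard deviation on the log2 scale
    alpha       -- per-gene significance level (no multiplicity correction)
    """
    z = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)   # standard error of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = abs(lfc) / se               # standardized effect size
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


for n in (2, 3, 6):
    row = ", ".join(
        f"|LFC|={lfc}: {approx_power(lfc, n):.2f}" for lfc in (0.3, 1, 2)
    )
    print(f"n={n} per group -> {row}")
```

Under these assumptions, power rises with both the number of replicates and the effect size, so at two replicates per group mostly the large fold changes are detectable.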

ADD REPLY • link 6.8 years ago Michael Love 43k

2

Regarding the sample size paper, see my replies to Nicholas Schurch and Conrad Burden here: https://f1000research.com/articles/5-1438

I agree with Michael that the true situation is more nuanced than the authors of that paper make out and the main issue is power rather than validity. In most of my work, insisting on 6 replicates per group would be an irresponsible waste of taxpayers' money.

ADD REPLY • link 6.8 years ago Gordon Smyth 51k