Question: Are two biological replicates enough for a transcriptome analysis using DESeq2?
Asked 20 months ago by Jonas B. (Belgium, Antwerp, University of Antwerp):

Hi all,

We are performing a transcriptome analysis with DESeq2.

We have three factors: treatment (control and drought), climate (CO2 and ambient) and zone in the leaf (zone1, zone2, zone3).

For each possible combination, we have 3 biological replicates (no technical replicates). The PCA showed us that we have 2 outliers. They are not in the same group, so this leaves us with 2 biological replicates for two of our sample groups. 
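
As a quick sanity check on the design described above, the sample counts can be sketched in a few lines of Python. This is only bookkeeping, not part of any DESeq2 workflow. The two outlier identities are hypothetical placeholders, since the post only states that the outliers fall in different groups.

```python
from collections import Counter
from itertools import product

# Factor levels as named in the post.
treatments = ["control", "drought"]
climates = ["CO2", "ambient"]
zones = ["zone1", "zone2", "zone3"]
REPLICATES = 3  # biological replicates per combination

# One entry per sample: (treatment, climate, zone, replicate number).
samples = [
    (t, c, z, rep)
    for t, c, z in product(treatments, climates, zones)
    for rep in range(1, REPLICATES + 1)
]

# HYPOTHETICAL outliers: which samples were dropped is not given in the post,
# only that they belong to two different groups.
outliers = {
    ("control", "CO2", "zone1", 2),
    ("drought", "ambient", "zone3", 1),
}

kept = [s for s in samples if s not in outliers]
per_group = Counter(s[:3] for s in kept)
groups_with_two = [g for g, n in per_group.items() if n == 2]

print(f"groups: {len(per_group)}")
print(f"samples before / after removal: {len(samples)} / {len(kept)}")
print(f"groups left with 2 replicates: {len(groups_with_two)}")
```

Under these assumptions the design has 12 groups and 36 samples, 34 of which remain after removing the two outliers, with 10 groups keeping three replicates and 2 groups dropping to two.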

 

I've been looking online for information on the minimum number of biological replicates needed for DESeq2. I did find some information (https://assets.geneious.com/manual/10.2/GeneiousManualsu100.html), but I want to be sure before continuing. Running DESeq2 does not produce an error message. Removing the two outliers results in more significant genes.

Thank you for your advice and time.

Kind regards,
Jonas

Antwerp University - Belgium 

Answer: Are two biological replicates enough for a transcriptome analysis using DESeq2?
Answered 20 months ago by Sean Davis (United States):

Two replicates is fine, operationally. Whether or not two replicates is "enough" for your experiment is something that is hard to comment on without more information.

As an aside, I would note that using PCA to remove "outliers" when there are only three samples per group runs the risk of removing "real" biological variation.
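
One way to see the risk is a toy simulation, sketched here in Python under strong simplifying assumptions: a single gene, three replicates drawn from the same normal distribution (so no true outlier exists), and removal of whichever value lies furthest from the group mean. This is not how PCA-based outlier removal or DESeq2's dispersion estimation works. It only illustrates that discarding the most extreme of three values shrinks the apparent variability, which can make more genes look significant. The function name `sd_full_and_trimmed` is made up for this illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible


def sd_full_and_trimmed(values):
    """Return (sd of all values, sd after dropping the most extreme value)."""
    center = statistics.mean(values)
    furthest = max(values, key=lambda v: abs(v - center))
    trimmed = list(values)
    trimmed.remove(furthest)
    return statistics.stdev(values), statistics.stdev(trimmed)


N_GENES = 10_000  # number of simulated genes (arbitrary choice)
full_sds, trimmed_sds = [], []
for _ in range(N_GENES):
    # Three replicates from the SAME distribution: all variation is "real".
    triple = [random.gauss(0.0, 1.0) for _ in range(3)]
    full, trimmed = sd_full_and_trimmed(triple)
    full_sds.append(full)
    trimmed_sds.append(trimmed)

mean_full = statistics.mean(full_sds)
mean_trimmed = statistics.mean(trimmed_sds)
print(f"mean sd, 3 replicates:      {mean_full:.2f}")
print(f"mean sd, after 'trimming':  {mean_trimmed:.2f}")
```

Because the toy removes the most extreme value gene by gene, it exaggerates the effect compared with dropping whole samples based on a PCA across all genes.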


Dear Sean,

Thank you for your quick response and for noting the possibility of removing biological variation by removing outliers based on a PCA.

To respond to your suggestion about the outliers: the effect of sampling zone in the leaf is quite dominant, and these two samples did not cluster with their zone. In two previous transcriptome studies (with a similar setup) and in the current one (the remaining 34 samples), all samples clustered neatly by zone.

Concerning the number of replicates: an article (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4878611/) mentioned that at least 6 replicates should be used in RNA-seq analysis to find significantly differentially expressed (SDE) genes. We're sometimes limited in the number of replicates we can run during an experiment, as is the case here.

As we are quite sure that we are talking about outliers here, we would like to leave out these two samples. This also resulted in more SDE genes. I was afraid that the increase in SDE genes could not be trusted, since we only had 2 biological replicates in two of our sample groups... 

Feel free to respond if you have any more suggestions, doubts or ideas.

Thanks again for your quick response and advice!

Reply written 20 months ago by Jonas B. (modified 20 months ago)

A small comment re: "at least six replicates should be used". It's a bit more subtle. Take a look at their Figure 1b, for edgeR; you'll find a similar curve for most RNA-seq tools. The x axis is the number of replicates, and the top curves show sensitivity. The four curves are for genes with |LFC| > 0, 0.3, 1, and 2. The far left represents two replicates, where you see ~75% sensitivity for genes with |LFC| > 1 and about 50% sensitivity for |LFC| > 0.3. So this is not to say that 2 replicates (with low biological variability) are sufficient for all purposes, but rather that investigators should know they will only recover genes with the largest effect sizes.
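
The general shape of such curves can be sketched with a textbook power calculation in Python. This is a rough illustration only: it uses a two-sided two-sample z-test on log2 expression with a known standard deviation, not the negative binomial models of edgeR or DESeq2, and it ignores multiple-testing correction. The function name `approx_power` and the per-gene standard deviation `sd=0.5` are assumptions made for this sketch, so the numbers it prints are not the values from the paper's figure.

```python
from statistics import NormalDist


def approx_power(lfc, n_per_group, sd=0.5, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    lfc          true log2 fold change between the two groups
    n_per_group  number of replicates in each group
    sd           ASSUMED per-group standard deviation of log2 expression
    alpha        significance level (no multiple-testing correction)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two group means.
    se = sd * (2 / n_per_group) ** 0.5
    shift = abs(lfc) / se
    # Probability of landing in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


for n in (2, 3, 6):
    row = ", ".join(
        f"|LFC|={lfc}: {approx_power(lfc, n):.2f}" for lfc in (0.3, 1, 2)
    )
    print(f"n={n} per group -> {row}")
```

Under these assumptions, power rises with both the number of replicates and the effect size, which is the qualitative point: with few replicates, mostly the large fold changes are recovered.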

Reply written 20 months ago by Michael Love

Regarding the sample size paper, see my replies to Nicholas Schurch and Conrad Burden here: https://f1000research.com/articles/5-1438

I agree with Michael that the true situation is more nuanced than the authors of that paper make out and the main issue is power rather than validity. In most of my work, insisting on 6 replicates per group would be an irresponsible waste of taxpayers' money.

Reply written 20 months ago by Gordon Smyth (modified 20 months ago)