Question: Fwd: Outliers in RNA seq analysis using DESeq2
Emma Quinn wrote 6.2 years ago:
Hi,

I've conducted a two-condition RNA-seq experiment comparing "disease" versus "control" cells. I have 16 biological replicates in my disease group and 11 in my control group. I'm using DESeq2 v1.0.9 for the analysis.

From the heatmap and PCA plots (attached), it's clear that there's some variability among the biological replicates within my groups, which I'd expect, but 6 of my disease samples also seem to cluster closely with the controls. All of the samples in each group were prepared in the same way and sequenced together, and I can't identify any obvious batch effect that could be contributing to this.

I don't have much experience analysing this kind of data, and my statistics knowledge is unfortunately somewhat lacking, but I'm wondering whether anyone has experience of how well biological replicates from RNA-seq data usually cluster together. I'm not sure whether it's more appropriate to drop these 6 samples and continue the analysis with 10 vs 11, or to leave them in, as perhaps this is more representative of the variability of the disease biology.

I'd appreciate any advice anyone has!

Thanks in advance,
Emma
deseq2 • 678 views
modified 6.2 years ago by Michael Love • written 6.2 years ago by Emma Quinn
Answer: Fwd: Outliers in RNA seq analysis using DESeq2
Michael Love (United States) wrote 6.2 years ago:
Hi Emma,

On Wed, May 8, 2013 at 5:24 PM, Emma Quinn <emmamquinn@googlemail.com> wrote:

> Hi
>
> I've conducted a 2 condition RNA seq experiment using "disease" versus
> "control" cells. I have 16 biological replicates in my disease group and 11
> in my control. I'm using DESeq2 v 1.0.9 for the analysis.
>
> From the heatmap and PCA plots (attached) it's clear that there's some
> variability amongst the biological replicates in my groups, which I'd expect,
> but also 6 of my disease samples seem to cluster closely with the controls.

I would try to follow up with more sample preparation information to help explain these 6 samples. Are the size factors and/or the total number of mapped reads different for these? You might also want to run some QA packages, such as qa() from the ShortRead package.

> All of the samples in each group were prepared in the same way and
> sequenced together and I can't identify any obvious batch effect that could
> be contributing to this.

Were all samples sequenced at the same time, or in different runs? Were the groups balanced across the runs?

> I don't have much experience analysing this kind of data and my statistics
> knowledge is also unfortunately somewhat lacking but I'm wondering if
> anyone has any experience with regards how well biological replicates from
> RNA seq data usually cluster together? I'm not sure if it's more
> appropriate to drop these 6 samples and continue the analysis with 10 V 11
> in each group or leave them in as perhaps this is more
> representative of variability of the disease biology.

It's not appropriate to drop some of the disease samples after seeing that they cluster with the controls. As you can imagine, selecting samples this way could lead to every experiment with enough samples generating significant differences. But I would try to follow up and see what preparation steps might have been different for these samples.
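The point about dropping samples after inspecting the clustering can be illustrated with a small simulation. This is only a sketch under stated assumptions, not anything taken from DESeq2: it treats a single score per sample (think of one gene, or a PC1 coordinate) as normally distributed with no true difference between groups, uses a plain Welch t statistic with an approximate 5% cut-off, and defines "control-like" as lying on the control side of the observed group difference. The function names and the cut-off constant are hypothetical choices made for illustration.

```python
import random
import statistics

# Assumption: approximate two-sided 5% critical value for ~20 degrees of freedom.
T_CRIT = 2.1


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def false_positive_rates(n_experiments=2000, n_disease=16, n_control=11,
                         n_drop=6, seed=0):
    """Return (rate with all samples, rate after dropping control-like samples)."""
    rng = random.Random(seed)
    hits_all = 0
    hits_trimmed = 0
    for _ in range(n_experiments):
        # Null scenario: both groups are drawn from the same distribution,
        # so every "significant" result is a false positive.
        disease = [rng.gauss(0.0, 1.0) for _ in range(n_disease)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_control)]
        hits_all += abs(welch_t(disease, control)) > T_CRIT

        # Drop the n_drop disease samples that look most like controls,
        # i.e. those on the control side of the observed difference.
        ordered = sorted(disease)
        if statistics.mean(disease) > statistics.mean(control):
            kept = ordered[n_drop:]
        else:
            kept = ordered[:-n_drop]
        hits_trimmed += abs(welch_t(kept, control)) > T_CRIT
    return hits_all / n_experiments, hits_trimmed / n_experiments


if __name__ == "__main__":
    rate_all, rate_trimmed = false_positive_rates()
    print(f"false positive rate, all 16 vs 11 samples: {rate_all:.3f}")
    print(f"false positive rate, 10 vs 11 after dropping: {rate_trimmed:.3f}")
```

With all samples kept, the false positive rate should sit near the nominal 5%. After the data-driven removal it should be several times higher, even though the two groups were generated identically.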
It might then be possible to deal with batch effects by including these variables (for example, date of run) as terms in the model, or by first running a normalization package such as cqn or EDASeq and then passing this information as a normalization factor, as described in the Appendix of the vignette.

Mike
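To make the idea of including a batch variable as a term in the model concrete, here is a minimal sketch. It is not DESeq2's model, which fits a negative binomial GLM to counts. It assumes an ordinary linear model on a log-scale expression value for one gene, `value ~ batch + condition`, and obtains the condition coefficient by centring within each batch (the Frisch-Waugh-Lovell result). The toy numbers, the run labels, and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def naive_effect(values, condition):
    """Difference in group means (condition 1 minus condition 0), ignoring batch."""
    treated = [v for v, c in zip(values, condition) if c == 1]
    untreated = [v for v, c in zip(values, condition) if c == 0]
    return mean(treated) - mean(untreated)


def batch_adjusted_effect(values, condition, batch):
    """Condition coefficient of the linear model value ~ batch + condition.

    Centring both the values and the condition indicator within each batch
    and regressing one on the other gives the same coefficient as fitting
    the full model with batch as a factor.
    """
    groups = defaultdict(list)
    for i, b in enumerate(batch):
        groups[b].append(i)

    x_res = [0.0] * len(values)
    y_res = [0.0] * len(values)
    for idx in groups.values():
        y_bar = mean(values[i] for i in idx)
        x_bar = mean(condition[i] for i in idx)
        for i in idx:
            y_res[i] = values[i] - y_bar
            x_res[i] = condition[i] - x_bar

    # If condition is constant within every batch (fully confounded design),
    # the denominator is zero and the effect cannot be estimated.
    return sum(x * y for x, y in zip(x_res, y_res)) / sum(x * x for x in x_res)


# Toy data: run B shifts expression by +2, the true disease effect is +1,
# and the groups are unbalanced across the two runs.
values = [10.0, 10.0, 10.0, 11.0, 12.0, 13.0, 13.0, 13.0]
condition = [0, 0, 0, 1, 0, 1, 1, 1]  # 0 = control, 1 = disease
batch = ["runA", "runA", "runA", "runA", "runB", "runB", "runB", "runB"]

if __name__ == "__main__":
    print("naive difference:", naive_effect(values, condition))
    print("batch-adjusted difference:", batch_adjusted_effect(values, condition, batch))
```

In this toy design the naive comparison absorbs part of the run shift, while the batch-adjusted estimate recovers the effect that was built into the data. The zero-denominator case in the comment corresponds to groups that are not spread across runs at all, which is why the balance of groups across runs matters.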