Question: Analyzing technical replicates with DESeq2
6.0 years ago by
Michael Muratet wrote:
Hi Simon,

I've been using DESeq2 to analyze RNA-seq data I was given for a multi-factor experiment: three factors with two, three and three levels, with three 'replicates' in each cell. I recently learned that the replicates are actually technical, not biological, and I'm looking for the best way to set up the design matrix to take the correlation into account.

I don't see anything in the manual or on the list, so I'd appreciate your input. Should I use a weighting scheme via normalization factors?

Thanks

Mike

Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
normalization deseq2 • 2.1k views
Answer: Analyzing technical replicates with DESeq2
6.0 years ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson wrote:
I believe that the simplest way to deal with technical replicates is to add their counts together, so that you have one column for each biological replicate.
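A minimal sketch of the summing step in base R, for anyone who wants to see it spelled out. The count matrix, the run names and the `sample_of_run` mapping are all made up for illustration; in a real analysis they would come from your own count table and sample sheet.

```r
# Toy count matrix: 4 genes x 6 sequencing runs (all values are made up).
counts <- matrix(
  c( 10,  12,  11,  30,  28,  33,
      0,   1,   0,   2,   3,   1,
    200, 210, 190, 400, 420, 390,
      5,   4,   6,   5,   7,   6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("run", 1:6))
)

# Hypothetical mapping from each run to the biological sample it came from.
sample_of_run <- c("sampleA", "sampleA", "sampleA",
                   "sampleB", "sampleB", "sampleB")

# rowsum() sums rows by group, so transpose to put runs in rows,
# sum within each biological sample, then transpose back to genes x samples.
collapsed <- t(rowsum(t(counts), group = sample_of_run))
print(collapsed)
```

The result has one column per biological sample and can be passed on as the count matrix. DESeq2 also provides `collapseReplicates()`, which performs the same summation directly on a `DESeqDataSet` given a grouping factor, if your installed version includes it.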
Dear Michael,

I second Ryan's advice. Rationale: unless there is a catastrophic data quality problem, the variability between technical replicates reflects only the (more or less Poissonian) counting noise, and adding the counts takes care of it appropriately. The counting noise is small compared to the biological variability for all but the genes with the lowest counts, say below ~200.

On the other hand, if there is a catastrophic data quality problem, it is better to just drop the affected sample / lane / library.

Best wishes
Wolfgang
written 6.0 years ago by Wolfgang Huber
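The rationale above can be written out as a short worked calculation. This is a sketch under stated assumptions: technical replicates are modelled as independent Poisson draws from the same library, the symbols $s_j$ (sequencing depth of run $j$), $q$ (the gene's abundance in that library) and $\alpha$ (dispersion) are introduced here for illustration, and the value $\alpha = 0.05$ is an arbitrary example, not a number from this thread.

```latex
% Technical replicates j = 1..m of one library:
K_j \sim \mathrm{Poisson}(s_j\, q), \qquad
\sum_{j=1}^{m} K_j \sim \mathrm{Poisson}\!\Big(q \sum_{j=1}^{m} s_j\Big)

% Across biological replicates (negative binomial, mean \mu, dispersion \alpha):
\mathrm{Var}(K) = \mu + \alpha\,\mu^2
\quad\Longrightarrow\quad
\mathrm{CV}^2 = \frac{1}{\mu} + \alpha

% Example with \mu = 200 and an assumed \alpha = 0.05:
\frac{1}{\mu} = 0.005 \;(\text{counting-noise CV} \approx 7\%), \qquad
\alpha = 0.05 \;(\text{biological CV} \approx 22\%)
```

So the summed count is again a Poisson count for the same library, and once the mean reaches a few hundred the $1/\mu$ term contributes little next to the dispersion term.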
Ryan is correct: adding up the technical replicates is the way to go. As your design is two-way, you will (hopefully) still have enough degrees of freedom left to estimate the dispersion.

Simon
written 6.0 years ago by Simon Anders
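To check how many degrees of freedom remain after collapsing, one can count them from the design matrix. The sketch below uses base R only and assumes the layout described in the question (three factors with two, three and three levels, one summed sample per cell); the factor names `A`, `B`, `C` and their level labels are placeholders.

```r
# All 2 x 3 x 3 = 18 cells of the design; after summing technical
# replicates there is one column (sample) per cell.
cells <- expand.grid(
  A = factor(c("a1", "a2")),
  B = factor(c("b1", "b2", "b3")),
  C = factor(c("c1", "c2", "c3"))
)

# Design matrices for a main-effects model and a fully interacting model.
main_effects <- model.matrix(~ A + B + C, data = cells)
full_model   <- model.matrix(~ A * B * C, data = cells)

# Residual degrees of freedom = number of samples - rank of the design matrix.
df_main <- nrow(cells) - qr(main_effects)$rank
df_full <- nrow(cells) - qr(full_model)$rank
cat("main effects only:", df_main, "\n")
cat("all interactions: ", df_full, "\n")
```

With one sample per cell, a model containing every interaction uses up all the samples, so the dispersion can only be estimated if some interaction terms are left out of the design formula.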