Question

Differential expression testing for groups with unequal variances/dispersions?

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

Hi Ryan,

edgeR can't.

voom can, but you have to put it together partly yourself. Just fit voom to each timepoint separately, then cbind the voom output objects back together.

Or else just proceed in edgeR as if the dispersions are equal across timepoints. This will be conservative but won't give false positive results.

Best wishes
Gordon

> Date: Fri, 24 May 2013 12:10:09 -0700
> From: "Ryan C. Thompson" <rct at thompsonclan.org>
> To: bioconductor <bioconductor at r-project.org>
> Subject: [BioC] Differential expression testing for groups with unequal variances/dispersions?
>
> Hi all,
>
> I am studying a ChIP-Seq dataset (looking at gene promoter regions in human) where it appears that different experimental groups have widely different dispersions/variances using edgeR/limma. I have 4 timepoints, and if I use edgeR to compute the dispersion for each timepoint separately, I get:
>
> 0 hours: 0.407
> 24 hours: 0.505
> 120 hours: 0.115
> 2 weeks: 0.0531
>
> So the dispersion seems to range from 0.05 to 0.5. I am looking to test for "differential modification" between these timepoints, as well as between cell types at each timepoint, etc., and I was wondering if there is any differential expression test (or dispersion estimation method?) that can handle groups with different dispersions/variances.
>
> For reference, here is my experimental design as an Excel spreadsheet:
> https://www.dropbox.com/s/3vnk4mai3dh39yv/chipseq-samples.xlsx
>
> And here is the result of plotBCV on each group (look at the last 4 pages for the time point groups):
> https://www.dropbox.com/s/s4caq1p0h3e4zhm/groupdisps.pdf (Warning: big PDF with lots of points which may bring your PDF reader to its knees.)
>
> -Ryan Thompson
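To put the four quoted dispersions on a more intuitive scale, here is a minimal Python sketch (standard library only, not part of the original exchange). It relies on two standard facts about the negative binomial model used by edgeR: the biological coefficient of variation (BCV) is the square root of the dispersion, and the variance is mean + dispersion * mean^2. The mean count of 100 is a hypothetical value chosen purely for illustration; it does not come from the dataset.

```python
import math

# Per-timepoint dispersions as quoted in the question above.
dispersions = {
    "0 hours": 0.407,
    "24 hours": 0.505,
    "120 hours": 0.115,
    "2 weeks": 0.0531,
}

# Hypothetical mean read count for one promoter (assumption, illustration only).
MEAN_COUNT = 100.0


def bcv(dispersion):
    """Biological coefficient of variation: square root of the NB dispersion."""
    return math.sqrt(dispersion)


def nb_variance(mean, dispersion):
    """Negative binomial variance: mean + dispersion * mean^2."""
    return mean + dispersion * mean ** 2


# Map each timepoint to (BCV, variance at the hypothetical mean count).
summary = {
    label: (bcv(phi), nb_variance(MEAN_COUNT, phi))
    for label, phi in dispersions.items()
}

for label, (cv, var) in summary.items():
    print(f"{label:>9}: BCV = {cv:.3f}, variance at mean {MEAN_COUNT:.0f} = {var:.0f}")
```

At this hypothetical mean, the implied count variance at 24 hours is roughly eight times the variance at 2 weeks, which gives a sense of how far the groups are from sharing one dispersion.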

edgeR • 1.4k views

updated 10.9 years ago by Ryan C. Thompson ★ 7.9k • written 10.9 years ago by Gordon Smyth 50k

score 0 · Answer 1 · 2013-05-25

Hi Gordon,

Thanks for the tips. You say that edgeR should be conservative when the equal dispersion assumption is violated, but this is not my experience. (I probably wouldn't have asked here on the list unless I was worried about false positives.) What I've seen is that with all 4 groups included in a single analysis, the low-dispersion time points drag down the overall dispersion estimate, and this results in (apparently) anticonservative results when testing for differential modification between the two high-dispersion time points. Obviously, I don't have a gold standard to compare against to conclude that the test is anticonservative, but I can compare the results to previous analyses that I did before the final low-dispersion time point had come off the sequencer, and as expected, including the low-dispersion timepoint inflated the significance of most P-values in all contrasts.

So, to get around this, would you recommend testing between time points by first subsetting the DGEList to just the two time points being compared and then re-estimating the dispersions, then finally conducting the test? That way, each individual test would be "self-contained" and not affected by groups that are not being tested. I could imagine that under these conditions, edgeR might be conservative, as you say.

-Ryan Thompson

On Sat May 25 04:28:39 2013, Gordon K Smyth wrote:
> Hi Ryan,
>
> edgeR can't.
>
> voom can, but you have to put it together partly yourself. Just fit voom to each timepoint separately, then cbind the voom output objects back together.
>
> Or else just proceed in edgeR as if the dispersions are equal across timepoints. This will be conservative but won't give false positive results.
>
> Best wishes
> Gordon
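The direction of the effect described above can be checked with a rough back-of-the-envelope calculation. The Python sketch below (standard library only, not from the original thread) rests on several loudly stated assumptions: the pooled dispersion is approximated by the simple average of the four per-timepoint values, which is not how edgeR actually estimates a common dispersion; the mean count of 100 and the 3 replicates per group are hypothetical; and the standard error of a natural-log fold change uses the delta-method approximation Var(log mean) ≈ (1/mean + dispersion) / n.

```python
import math

# Per-timepoint dispersions quoted in the original question.
per_timepoint = {
    "0 hours": 0.407,
    "24 hours": 0.505,
    "120 hours": 0.115,
    "2 weeks": 0.0531,
}

MEAN_COUNT = 100.0  # hypothetical mean count per promoter (assumption)
N_REPLICATES = 3    # hypothetical replicates per group (assumption)


def log_mean_variance(mean, dispersion, n):
    """Delta-method variance of the log of a group mean of NB counts."""
    return (1.0 / mean + dispersion) / n


def log_fc_se(phi_a, phi_b, mean=MEAN_COUNT, n=N_REPLICATES):
    """Approximate standard error of the log fold change between two groups."""
    return math.sqrt(
        log_mean_variance(mean, phi_a, n) + log_mean_variance(mean, phi_b, n)
    )


# Crude stand-in for a single dispersion shared by all four timepoints.
pooled = sum(per_timepoint.values()) / len(per_timepoint)

# Contrast between the two high-dispersion timepoints.
se_separate = log_fc_se(per_timepoint["0 hours"], per_timepoint["24 hours"])
se_pooled = log_fc_se(pooled, pooled)
ratio = se_pooled / se_separate

print(f"pooled dispersion (simple average): {pooled:.3f}")
print(f"SE with per-timepoint dispersions:  {se_separate:.3f}")
print(f"SE with pooled dispersion:          {se_pooled:.3f}")
print(f"ratio pooled / separate:            {ratio:.3f}")
```

Under these assumptions the pooled standard error for the 0 hours vs 24 hours contrast comes out smaller than the one based on the groups' own dispersions, so test statistics for that contrast would be inflated. This only illustrates the direction of the concern; it says nothing about the size of the effect in the real data.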