DESeq2 dispersion estimate gets stuck
@carl-herrmann-6485
Hi,

I am analysing a patient dataset for differential expression with DESeq2 (v1.2.10).

We have several patients, and for each patient we have several samples. I am running this analysis on different combinations of samples.

My design has two factors: patient (multilevel) and a second factor (proliferation, true/false), so: design ~ patient + proliferation

I am interested in differential expression depending on proliferation only (the second factor).

On some combinations, the dispersion estimation step (estimateDispersions) gets stuck at the gene-wise dispersion estimate. I let it run for several hours and eventually killed the job.

Even when reducing the number of iterations to 10, it gets stuck. Same problem with DESeq2 v1.4.0.

Strangely, this happens when including samples that, when included in other combinations, run just fine. So this does not seem to be a problem of the dataset itself.

I am not sure whether this is enough to get a hint on where the problem might come from, so please tell me whether I should provide additional information.

Thanks for your help!

Carl

--
Carl Herrmann
Institut für Pharmazie und Molekulare Biotechnologie, Universität Heidelberg
DKFZ Heidelberg - Department of Theoretical Bioinformatics
Im Neuenheimer Feld 580, D-69120 Heidelberg
tel.: +49 (0) 6221 42-3612
email: c.herrmann at dkfz.de
web: http://biologie.univ-mrs.fr/carlherrmann
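For readers unfamiliar with this kind of paired design, here is a minimal base-R sketch of what `~ patient + proliferation` expands to. The sample table is hypothetical (three patients labelled a, b, c, two samples each) and is not taken from the dataset discussed here. The last column of the design matrix is the proliferation effect, estimated while accounting for patient.

```r
# Hypothetical sample table: 3 patients, each with one non-proliferating
# and one proliferating sample (illustration only, not the real data).
coldata <- data.frame(
  patient = factor(rep(c("a", "b", "c"), each = 2)),
  proliferation = factor(rep(c("false", "true"), times = 3),
                         levels = c("false", "true"))
)

# Expand the formula into the design matrix that the model is fitted against.
design_matrix <- model.matrix(~ patient + proliferation, data = coldata)
print(colnames(design_matrix))

# A quick sanity check for any given combination of samples:
# the design matrix should have full column rank.
full_rank <- qr(design_matrix)$rank == ncol(design_matrix)
print(full_rank)
```

Running the same check on the sample table of each combination is a cheap way to rule out a degenerate design, although nothing in this thread indicates that this was the cause of the hang.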
@mikelove
hi Carl,

Thanks for reporting this. I had one user report such a hang for version 1.2, and had implemented a fix for this in version 1.4, so I'm not sure about this one. I could look into this if you could email me off list a small example. Is it possible to find a small subset of rows which produce the hang?

    ddssub <- dds[1:100,]
    ddssub <- estimateDispersions(ddssub)

And then to remove any identifying information,

    colData(ddssub) <- colData(ddssub)[, c("patient","proliferation")]
    levels(ddssub$patient) <- letters[1:nlevels(ddssub$patient)]

Mike
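One way to look for such a subset is to walk through the rows in blocks and note the last block announced before the hang. The sketch below is only an illustration: `chunk_rows` is a hypothetical helper, not part of DESeq2, and the commented loop assumes a `dds` object like the one in this thread, with size factors already estimated.

```r
# Hypothetical helper: split row indices 1..n_rows into consecutive blocks.
chunk_rows <- function(n_rows, chunk_size = 100) {
  starts <- seq(1, n_rows, by = chunk_size)
  lapply(starts, function(s) s:min(s + chunk_size - 1, n_rows))
}

# Small demonstration on 250 rows: blocks 1-100, 101-200, 201-250.
chunks <- chunk_rows(250, chunk_size = 100)
print(vapply(chunks, length, integer(1)))

# Assumed usage with DESeq2 (left commented out because it needs the real data):
# library(DESeq2)
# for (idx in chunk_rows(nrow(dds))) {
#   message("rows ", min(idx), "-", max(idx))  # last message seen marks the block that hangs
#   ddssub <- estimateDispersions(dds[idx, ])
# }
```

If a block hangs, it can be halved repeatedly to narrow the example down before sending it off list.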
Just for users who end up on this thread through search: updating to DESeq2 v1.4.5 clears up this hang.

Mike
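To check whether an installation predates that release, the version numbers can be compared with base R. In this sketch `needs_update` is a hypothetical helper, and the threshold 1.4.5 comes from the message above. `package_version` compares component by component, so 1.2.10 is correctly treated as newer than 1.2.9, which a plain string comparison would get wrong.

```r
# Hypothetical helper: TRUE if the installed version is older than the fixed one.
needs_update <- function(installed, fixed = "1.4.5") {
  package_version(installed) < package_version(fixed)
}

# The two versions mentioned in this thread, plus the fixed release itself.
print(needs_update("1.2.10"))
print(needs_update("1.4.0"))
print(needs_update("1.4.5"))

# Assumed usage on a machine with DESeq2 installed:
# needs_update(as.character(packageVersion("DESeq2")))
```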