Incorrect result in edgeR.calculateCommonDispersion
@jacob-silterra-6587
Last seen 11.4 years ago
Hello all,

I've encountered an issue with edgeR when it calculates dispersion and there aren't any samples for a given group. I believe it happens with both tagwise and common dispersion; same idea. Basically, splitIntoGroups returns an empty matrix for that group, which messes up the dispersion calculation. I think it would be better to ignore groups that have no data associated with them. Example attached.

This might seem unnecessary, but I have a situation where I read in a matrix with samples of different classes and then remove some groups entirely.

Thanks,
--
Jacob Silterra
Associate Computational Biologist
Broad Institute
edgeR edgeR • 863 views
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia
In reply to Jacob Silterra (jacob at broadinstitute.org), Wed Jun 4 19:45:50 CEST 2014:

Dear Jacob,

There is no function called edgeR.calculateCommonDispersion in the edgeR package.

There also wasn't any attachment with your posting.

If you subset a DGEList in such a way that a group is removed entirely, you can prevent any problems by resetting the levels of the group factor:

    dge$samples$group <- factor(dge$samples$group)

Best wishes
Gordon
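To illustrate why resetting the levels matters, here is a minimal base R sketch. It does not use edgeR at all: the `samples` data frame is a hypothetical stand-in for `dge$samples`, and the `lib.size` values are made up for the example. It only shows that row subsetting keeps the original factor levels until `factor()` is called again.

```r
# Hypothetical stand-in for dge$samples (assumption: not a real DGEList).
samples <- data.frame(
  group = factor(c("A", "A", "B", "B", "C", "C")),
  lib.size = c(1000, 1100, 900, 950, 1050, 1020)  # made-up values
)

# Remove both "B" samples by row subsetting.
kept <- samples[samples$group != "B", ]

# The unused level "B" survives the subsetting.
levels_before <- levels(kept$group)
print(levels_before)
print(table(kept$group))

# Resetting the factor rebuilds the levels from the values actually present.
kept$group <- factor(kept$group)
levels_after <- levels(kept$group)
print(levels_after)
```

The same one-line reset applied to `dge$samples$group` is what the reply above recommends.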
In reply to Gordon K Smyth, Wed, Jun 4, 2014 at 8:21 PM:

Hi Gordon,

Thanks for the info. My apologies for being unclear; I meant the functions estimateCommonDisp and estimateTagwiseDisp in the edgeR package. I guess the attachment didn't go through, so I've pasted it below.

-Jacob

R script:

    library(edgeR)

    groups <- factor(c("A", "A", "B", "B", "C", "C"))
    rows <- 10
    cols <- 6
    counts <- matrix(rnorm(rows * cols, mean = 100, sd = 20), nrow = rows, ncol = cols)
    counts <- round(counts)

    # Everything runs smoothly
    y <- DGEList(counts = counts, group = groups)
    y <- calcNormFactors(y)
    y <- estimateCommonDisp(y)
    print(y$common.disp)
    # [1] 0.0310142

    # Take out samples from group "B"; estimating the dispersion fails
    sel_cols <- c(1, 2, 5, 6)
    counts <- counts[, sel_cols]
    groups <- groups[sel_cols]
    y <- DGEList(counts = counts, group = groups)
    y <- calcNormFactors(y)
    y <- estimateCommonDisp(y)
    print(y$common.disp)
    # [1] 99.99477
    print(warnings())
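A quick way to see the empty group in the script's second half is to inspect the subsetted factor directly. The sketch below uses only base R; `split()` here is a stand-in chosen for illustration and is not edgeR's splitIntoGroups, so it only approximates what happens inside the package.

```r
groups <- factor(c("A", "A", "B", "B", "C", "C"))
sel_cols <- c(1, 2, 5, 6)
sub_groups <- groups[sel_cols]

# Per-level sample counts: "B" is still a level, with zero samples.
group_sizes <- table(sub_groups)
print(group_sizes)

# Splitting the column indices by level gives an empty piece for "B".
# (Base R split() is an illustrative stand-in, not edgeR's splitIntoGroups.)
cols_by_group <- split(seq_along(sub_groups), sub_groups)
empty_levels <- names(cols_by_group)[lengths(cols_by_group) == 0]
print(empty_levels)
```

Running `table()` on the group factor before estimating dispersions is a cheap check for this situation.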
In reply to Jacob Silterra, Wed, 4 Jun 2014:

Hi Jacob,

Yes, I see the issue. The edgeR routines assume that y$samples$group doesn't have superfluous factor levels. The culprit is:

    groups <- groups[sel_cols]

If you change this to

    groups <- factor(groups[sel_cols])

all will be well.

Best wishes
Gordon
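For completeness, base R offers a few ways to drop the superfluous level, all of which should give the same levels as the `factor()` call suggested above. The sketch below also includes `assert_no_empty_groups`, a hypothetical helper written for this example (it is not part of edgeR) that stops early when a level has no samples.

```r
groups <- factor(c("A", "A", "B", "B", "C", "C"))
sel_cols <- c(1, 2, 5, 6)

fix_a <- factor(groups[sel_cols])       # the change suggested in the reply
fix_b <- droplevels(groups[sel_cols])   # base R function for the same purpose
fix_c <- groups[sel_cols, drop = TRUE]  # drop unused levels while subsetting

# Hypothetical guard (not an edgeR function): fail if any level is empty.
assert_no_empty_groups <- function(g) {
  tab <- table(g)
  empty <- names(tab)[tab == 0]
  if (length(empty) > 0) {
    stop("group factor has levels with no samples: ",
         paste(empty, collapse = ", "))
  }
  invisible(g)
}

# Passes silently, because "B" is no longer a level.
assert_no_empty_groups(fix_a)
print(levels(fix_a))
```

Calling such a guard on the group factor just before building the DGEList would flag the problem at its source rather than as an odd dispersion estimate later.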