Hello all,
I've encountered an issue with edgeR when it calculates dispersion and
there aren't any samples for a given group. I believe it happens with
both tagwise and common dispersion; same idea. Basically, splitIntoGroups
will return an empty matrix for that group, which messes up the
dispersion calculation. I think it would be better to ignore groups that
have no data associated with them. Example attached. This might seem
unnecessary, but I have a situation where I read in a matrix with samples
of different classes and then remove some groups entirely.
Thanks,
--
Jacob Silterra
Associate Computational Biologist
Broad Institute
Dear Jacob,
There is no function called edgeR.calculateCommonDispersion in the edgeR
package.
There also wasn't any attachment with your posting.
If you subset a DGEList in such a way that a group is removed entirely,
you can prevent any problems by resetting the levels of the group factor:
    dge$samples$group <- factor(dge$samples$group)
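
To see why resetting the levels helps, here is a minimal base-R sketch. It does not use edgeR at all: the data frame `samples` is only a hypothetical stand-in for `dge$samples`, and the names `levels_before` and `levels_after` are made up for illustration. It shows that subsetting a factor keeps levels that no longer occur, and that calling factor() again discards them.

```r
# Hypothetical stand-in for dge$samples (not a real DGEList).
samples <- data.frame(
  group = factor(c("A", "A", "B", "B", "C", "C"))
)

# Remove every sample from group "B".
samples <- samples[c(1, 2, 5, 6), , drop = FALSE]

# Subsetting keeps the now-unused level "B".
levels_before <- levels(samples$group)
print(levels_before)        # "A" "B" "C"
print(table(samples$group)) # B is still listed, with a count of 0

# Re-creating the factor drops levels that no longer occur.
samples$group <- factor(samples$group)
levels_after <- levels(samples$group)
print(levels_after)         # "A" "C"
```

In base R, droplevels(samples$group) should give the same result as factor(samples$group) here.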
Best wishes
Gordon
----------- original message ------------
Jacob Silterra jacob at broadinstitute.org
Wed Jun 4 19:45:50 CEST 2014
[...]
Hi Gordon,
Thanks for the info. My apologies for being unclear; I meant the
functions estimateCommonDisp (and estimateTagwiseDisp) in the edgeR
package. I guess the attachment didn't go through, so I've pasted it
below.
-Jacob
R script:
library(edgeR)
groups <- factor(c("A", "A", "B", "B", "C", "C"))
rows <- 10
cols <- 6
counts <- matrix(rnorm(rows * cols, mean = 100, sd = 20),
                 nrow = rows, ncol = cols)
counts <- round(counts)
# Everything runs smoothly
y <- DGEList(counts = counts, group = groups)
y <- calcNormFactors(y)
y <- estimateCommonDisp(y)
print(y$common.disp)
# [1] 0.0310142
# Take out samples from group "B"; estimating the dispersion fails
sel_cols <- c(1, 2, 5, 6)
counts <- counts[, sel_cols]
groups <- groups[sel_cols]
y <- DGEList(counts = counts, group = groups)
y <- calcNormFactors(y)
y <- estimateCommonDisp(y)
print(y$common.disp)
# [1] 99.99477
print(warnings())
On Wed, Jun 4, 2014 at 8:21 PM, Gordon K Smyth <smyth@wehi.edu.au>
wrote:
> [...]
Hi Jacob,
Yes, I see the issue. The edgeR routines assume that y$samples$group
doesn't have superfluous factor levels. The culprit is:
groups <- groups[sel_cols]
If you change this to
groups <- factor(groups[sel_cols])
all will be well.
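
The effect of a superfluous level can be sketched in base R without edgeR. The helper `split_columns` below is hypothetical and is not edgeR's internal code; it only illustrates how splitting the columns of a count matrix by a factor that still carries the level "B" produces a zero-column matrix for that level, and how re-creating the factor avoids it. The count values are arbitrary placeholders.

```r
groups <- factor(c("A", "A", "B", "B", "C", "C"))
counts <- matrix(1:60, nrow = 10, ncol = 6)  # placeholder counts

sel_cols   <- c(1, 2, 5, 6)
counts_sub <- counts[, sel_cols]
groups_sub <- groups[sel_cols]   # still has levels A, B, C

# Hypothetical helper: one sub-matrix of columns per factor level.
split_columns <- function(m, g) {
  lapply(split(seq_len(ncol(m)), g),
         function(idx) m[, idx, drop = FALSE])
}

# With the stale factor, level "B" yields a 10 x 0 matrix.
stale <- split_columns(counts_sub, groups_sub)
print(sapply(stale, ncol))   # A = 2, B = 0, C = 2

# With the levels reset, only the groups that have samples remain.
clean <- split_columns(counts_sub, factor(groups_sub))
print(sapply(clean, ncol))   # A = 2, C = 2
```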
Best wishes
Gordon
On Wed, 4 Jun 2014, Jacob Silterra wrote:
> [...]