DEXSeq - Error in estimateSizeFactorsForMatrix(featureCounts(object)
@fahmi-nazarie-8790

Hello

I have a problem with DEXSeq. I am using it to analyse exon usage across 27 human tissues; most tissues have 3 replicates and some have 2. The error appears when I call estimateSizeFactors:

> dxd = estimateSizeFactors(dxd)
Error in estimateSizeFactorsForMatrix(featureCounts(object), locfunc,  :
  every gene contains at least one zero, cannot compute log geometric means

> head(dxd)
class: DEXSeqDataSet
dim: 1 154
exptData(0):
assays(1): counts
rownames(1): ENSG00000000003:E001
rowRanges metadata column names(5): featureID groupID exonBaseMean
  exonBaseVar transcripts
colnames: NULL
colData names(4): sample condition libType exon

 

I tried to work around the zeros using:

cts <- counts(dxd)
geoMeans <- apply(cts, 1, function(row) if (all(row == 0)) 0 else exp(mean(log(row[row != 0]))))
idx <- which.max(colSums(counts(dxd) == 0))

dxd2 <- dxd[ , -idx]
dxd2 = estimateSizeFactors(dxd2, geoMeans=geoMeans)

Then I got this error:

Error in `[[<-`(`*tmp*`, name, value = c(0.041799357270166, 0.0721042335357389,  :
  152 elements in value to replace 153 elements

> head(dxd2)
class: DEXSeqDataSet
dim: 1 153
exptData(0):
assays(1): counts
rownames(1): ENSG00000000003:E001
rowRanges metadata column names(5): featureID groupID exonBaseMean
  exonBaseVar transcripts
colnames: NULL
colData names(4): sample condition libType exon

It seems the number of columns has been reduced from 154 to 153. How can I solve this problem? Thank you.

 

@mikelove

You might have to come up with an alternative size factor correction. It's hard to know what will work without knowing more about the kind of data you have. You've got a mix of solutions here: coming up with alternative geoMeans as well as removing samples which are all zero (do you know why these samples have all zeros?). Can you show the following descriptive summaries:

dim(cts)

summary(colSums(cts))
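
For context, a base-R sketch of the median-of-ratios idea behind the default size factor estimate may help show why the error appears. The toy matrix and the helper name `log_geo_means` are made up for illustration, and this is not the DESeq2 source. The point it illustrates: a row's log geometric mean is only finite when the row contains no zero, so a single sample with zero counts everywhere leaves no usable row at all.

```r
# Toy count matrix: 4 features x 3 samples; sample s3 is all zero (assumed data).
cts_toy <- matrix(c(10, 20,  0,
                     5,  8,  0,
                    30, 25,  0,
                     7,  0,  0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("f", 1:4), c("s1", "s2", "s3")))

# Hypothetical helper: per-row mean of log counts (the log geometric mean).
# log(0) is -Inf, so any row containing a zero gets -Inf.
log_geo_means <- function(m) rowMeans(log(m))

lgm <- log_geo_means(cts_toy)
usable <- is.finite(lgm)
sum(usable)   # number of rows usable while s3 is still included

# Drop samples whose column sum is zero, then recompute.
keep <- colSums(cts_toy) > 0
lgm_kept <- log_geo_means(cts_toy[, keep, drop = FALSE])
sum(is.finite(lgm_kept))   # rows usable once the empty sample is gone
```

This is why `summary(colSums(cts))` is a useful first check: a minimum of 0 points to at least one column with no counts.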

@fahmi-nazarie-8790

I don't think these samples are entirely zero; perhaps only part of them is. What do you recommend for estimating size factors here? Below are the descriptive summaries.

 

> dim(cts)
[1] 579556    154
> summary(colSums(cts))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
0.000e+00 3.410e+07 1.029e+08 7.025e+08 1.281e+09 4.061e+09

 

@wolfgang-huber-3550

Fahmi, have a look at the descriptive statistics: they indicate that at least one sample does indeed have all zeros (the minimum column sum is 0). Remove it from cts and try again.
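
A sketch of that removal, under one assumption that is worth checking against the DEXSeq vignette: a DEXSeqDataSet appears to store each sample in two columns (the colData column `exon` being `this` or `others`), which would account for 154 columns and for the length mismatch after a single column was dropped. The code below only mimics that layout in base R with made-up sample names and counts; it does not touch a real DEXSeqDataSet.

```r
# Mimic the column annotation for 3 samples (made-up names): each sample
# appears once with exon == "this" and once with exon == "others".
col_info <- data.frame(
  sample = rep(c("A1", "A2", "B1"), times = 2),
  exon   = rep(c("this", "others"), each = 3),
  stringsAsFactors = FALSE
)

# Toy counts: 2 features x 6 columns; sample A2 has zero counts throughout.
cts_toy <- matrix(c(12, 0, 9, 100, 0, 80,
                     4, 0, 6, 108, 0, 83),
                  nrow = 2, byrow = TRUE)

# Total counts per sample, summed over its "this" and "others" columns.
per_sample <- tapply(colSums(cts_toy), col_info$sample, sum)
bad <- names(per_sample)[per_sample == 0]

# Drop every column that belongs to a bad sample, so the pairing stays intact.
keep <- !(col_info$sample %in% bad)
cts_kept  <- cts_toy[, keep, drop = FALSE]
info_kept <- col_info[keep, ]

ncol(cts_kept)          # column count after removing both columns of A2
table(info_kept$exon)   # "this" and "others" should still be balanced
```

With a real object, the more robust route is probably to leave the failed sample out of the count files and sample table and rebuild the DEXSeqDataSet, then compute geoMeans (if still needed) from the rebuilt object.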

@fahmi-nazarie-8790

Gotcha! Thanks Wolfgang and Michael. I resolved the issue. As you mentioned, the descriptive statistics show that at least one sample has all zeros, so something definitely went wrong with the read counting for that sample. I checked, and the BAM file I used for read counting has a problem. I will have to re-run STAR and count again.

@fahmi-nazarie-8790

Just one question about the computational time of DEXSeq. How long does the estimateDispersions step take to finish? I set BPPARAM = MulticoreParam(workers=24). I wonder whether it will take a long time to finish the job. Does the DEU step take a long time too?
