DEXSeq - Error in estimateSizeFactorsForMatrix(featureCounts(object)
@fahmi-nazarie-8790

Hello

I have a problem. I am using DEXSeq to analyse exon usage across 27 different human tissues. Most tissues have 3 replicates, and some have 2. The error appears when I call estimateSizeFactors:

> dxd = estimateSizeFactors(dxd)
Error in estimateSizeFactorsForMatrix(featureCounts(object), locfunc,  :
every gene contains at least one zero, cannot compute log geometric means

class: DEXSeqDataSet
dim: 1 154
exptData(0):
assays(1): counts
rownames(1): ENSG00000000003:E001
rowRanges metadata column names(5): featureID groupID exonBaseMean
exonBaseVar transcripts
colnames: NULL
colData names(4): sample condition libType exon

I tried to work around the zeros using:

cts <- counts(dxd)
geoMeans <- apply(cts, 1, function(row) if (all(row == 0)) 0 else exp(mean(log(row[row != 0]))))
idx <- which.max(colSums(counts(dxd) == 0))

dxd2 <- dxd[ , -idx]
dxd2 = estimateSizeFactors(dxd2, geoMeans=geoMeans)


Then I got this error:

Error in `[[<-`(`*tmp*`, name, value = c(0.041799357270166, 0.0721042335357389,  :
152 elements in value to replace 153 elements

class: DEXSeqDataSet
dim: 1 153
exptData(0):
assays(1): counts
rownames(1): ENSG00000000003:E001
rowRanges metadata column names(5): featureID groupID exonBaseMean
exonBaseVar transcripts
colnames: NULL
colData names(4): sample condition libType exon

It seems the number of columns has been reduced from 154 to 153. How can I solve this problem? Thank you.

DEXSeq DESeq2 alternative splicing
@mikelove

You might have to come up with an alternative size factor correction. It's hard to know what will work without knowing more about the kind of data you have. You've got a mix of solutions: supplying alternative geoMeans as well as removing samples that are all zero (do you know why these samples are all zero?). Can you show the following descriptive summaries:

dim(cts)

summary(colSums(cts))
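
Alongside those summaries, it may help to pinpoint which columns have a total of zero. The following is only a minimal sketch on a made-up matrix: `cts_toy` and its sample names are hypothetical stand-ins for the real `cts`, which the thread does not show.

```r
# Toy count matrix (hypothetical data): rows are counting bins, columns are samples.
# Column "s3" mimics a sample whose counting step failed, so it is all zeros.
cts_toy <- matrix(
  c(10, 12, 0,  8,
     0,  5, 0,  7,
    30, 28, 0, 25),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("bin1", "bin2", "bin3"), c("s1", "s2", "s3", "s4"))
)

# Total count per sample; a total of zero flags an empty column.
totals <- colSums(cts_toy)
empty_cols <- which(totals == 0)

print(totals)
print(empty_cols)
```

On the real data the same two lines would be applied to `cts`. The object printed above shows `colnames: NULL`, so the result would probably be plain column indices rather than sample names.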

@fahmi-nazarie-8790

I don't think these samples are all zero; maybe only part of them is. What would you recommend for estimating the size factors here? The descriptive summaries are below.

> dim(cts)
[1] 579556    154
> summary(colSums(cts))
Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
0.000e+00 3.410e+07 1.029e+08 7.025e+08 1.281e+09 4.061e+09

@wolfgang-huber-3550

Fahmi, have a look at the descriptive statistics: the minimum column sum is 0, which indicates that at least one sample is indeed all zeros. Remove it from cts and try again.
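
To illustrate why a single all-zero sample is enough to trigger the error, here is a rough sketch of the median-of-ratios idea in base R. It is a simplified stand-in, not the DESeq2 or DEXSeq source; `size_factors_sketch` and `cts_toy` are hypothetical names introduced only for this example.

```r
# Simplified median-of-ratios size factors (a sketch, not the package code).
size_factors_sketch <- function(counts) {
  # log(0) is -Inf, so any row containing a zero gets a non-finite mean.
  log_geo_means <- rowMeans(log(counts))
  usable <- is.finite(log_geo_means)
  if (!any(usable)) {
    stop("every row contains at least one zero")
  }
  # Per sample: median log-ratio to the row geometric means, over usable rows.
  apply(counts, 2, function(col) {
    exp(median(log(col[usable]) - log_geo_means[usable]))
  })
}

# Hypothetical toy data: the third column is an all-zero sample.
cts_toy <- matrix(
  c(10, 20, 0, 12,
    50, 90, 0, 55,
    30, 66, 0, 33),
  nrow = 3, byrow = TRUE
)

# With the empty sample present, every row contains a zero and the call fails.
failed <- tryCatch(size_factors_sketch(cts_toy),
                   error = function(e) conditionMessage(e))

# Dropping the empty sample leaves rows without zeros, so size factors exist.
keep <- colSums(cts_toy) > 0
sf <- size_factors_sketch(cts_toy[, keep, drop = FALSE])

print(failed)
print(sf)
```

One caveat for the DEXSeqDataSet itself: its colData has an `exon` column, and each sample appears to occupy two columns of the object. Dropping a single column with `dxd[ , -idx]` therefore likely leaves the object inconsistent, which would fit the "152 elements in value to replace 153 elements" error. Rebuilding the object without the failed sample, or removing both of that sample's columns, is probably safer.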

@fahmi-nazarie-8790

Gotcha! Thanks Wolfgang and Michael. I resolved the issue. As you mentioned, the descriptive statistics show that at least one sample is all zeros, so something definitely went wrong with the read count file for that sample. I checked, and the BAM file I used for read counting is faulty. I will have to re-run STAR and do the counting again.

@fahmi-nazarie-8790

Just one question on the computational time of DEXSeq. How long does the estimateDispersions step take to finish? I set BPPARAM = MulticoreParam(workers=24). I wonder whether the job will take a long time. Does the DEU testing step take a long time too?