Search
Question: DESeq2 - Error in estimateSizeFactorsForMatrix(counts(object)
0
gravatar for Fahmi Nazarie
2.2 years ago by
United Kingdom
Fahmi Nazarie0 wrote:

Dear Michael Love

I got a problem running DESeq2. Briefly, I have 27 different human tissue and each of them has 3 replicates and some of them only has 2 replicates with a total 76 samples. I want to study the differential expression across the human tissue. I wanna get which gene has differential alternative splicing across tissues. 

 

sampleFiles <- c ( "brain_3b" , "brain_3c" , "brain_a" , ..."Skin"), 

samplesCondition <- c ( "Brain" , "Brain" , "Brain" , ..."Skin"),

sampleNames <-c ( "ERR315477_count.txt" , "ERR315455_count.txt" , "ERR315432_count.txt", ..."ERR315339_count.txt")

 

sampleTable<-data.frame(sampleName=sampleFiles, fileName=sampleNames, condition=sampleCondition)

ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design=~condition)
ddsHTSeq

 

> dds <- DESeq(ddsHTSeq)

estimating size factors

Error in estimateSizeFactorsForMatrix(counts(object), locfunc = locfunc,  :

  every gene contains at least one zero, cannot compute log geometric means

 

So, I checked the '0' value and produced below;

 

> rs <- rowSums( counts(ddsHTSeq) == 0 )

> table(rs)

rs

     1      2      3      4      5      6      7      8      9     10     11

163108  27467  16926  12820   9801   8075   6833   6311   5607   5181   4867

    12     13     14     15     16     17     18     19     20     21     22

  4638   4417   4325   4102   3824   3784   3687   3560   3389   3398   3325

    23     24     25     26     27     28     29     30     31     32     33

  3311   3216   3259   3099   3196   3038   3128   3102   3139   3121   3027

    34     35     36     37     38     39     40     41     42     43     44

  3143   3118   3036   3105   3234   3213   3261   3251   3171   3245   3233

    45     46     47     48     49     50     51     52     53     54     55

  3363   3387   3523   3388   3553   3486   3547   3810   3730   3832   4027

    56     57     58     59     60     61     62     63     64     65     66

  4044   4163   4305   4424   4526   4761   4929   4941   5446   5519   5810

    67     68     69     70     71     72     73     74     75     76

  6139   6567   7147   7480   8735   9786  12177  18243  10859  84717

 

Additional information about sampleTable

 

> head(sampleTable)

  sampleName            fileName condition

1   brain_3b ERR315477_count.txt     Brain

2   brain_3c ERR315455_count.txt     Brain

3    brain_a ERR315432_count.txt     Brain

4 thyroid_5b ERR315412_count.txt   Thyroid

5 thyroid_5c ERR315363_count.txt   Thyroid

6 thyroid_5d ERR315397_count.txt   Thyroid

 

> sessionInfo()

R version 3.2.0 (2015-04-16)

Platform: x86_64-unknown-linux-gnu (64-bit)

Running under: Scientific Linux release 6.5 (Carbon)

 

locale:

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C

 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8

 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8

 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C

 [9] LC_ADDRESS=C               LC_TELEPHONE=C

[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 

attached base packages:

[1] parallel  stats4    stats     graphics  grDevices utils     datasets

[8] methods   base

 

other attached packages:

[1] BiocParallel_1.2.20       DESeq2_1.8.1

[3] RcppArmadillo_0.5.400.2.0 Rcpp_0.12.0

[5] GenomicRanges_1.20.5      GenomeInfoDb_1.4.2

[7] IRanges_2.2.4             S4Vectors_0.6.5

[9] BiocGenerics_0.14.0

 

loaded via a namespace (and not attached):

 [1] RColorBrewer_1.1-2   futile.logger_1.4.1  plyr_1.8.3

 [4] XVector_0.8.0        futile.options_1.0.0 tools_3.2.0

 [7] rpart_4.1-10         digest_0.6.8         RSQLite_1.0.0

[10] annotate_1.46.1      gtable_0.1.2         lattice_0.20-33

[13] DBI_0.3.1            proto_0.3-10         gridExtra_2.0.0

[16] genefilter_1.50.0    stringr_1.0.0        cluster_2.0.3

[19] locfit_1.5-9.1       nnet_7.3-11          grid_3.2.0

[22] Biobase_2.28.0       AnnotationDbi_1.30.1 XML_3.98-1.2

[25] survival_2.38-3      foreign_0.8-66       latticeExtra_0.6-26

[28] Formula_1.2-1        geneplotter_1.46.0   ggplot2_1.0.1

[31] reshape2_1.4.1       lambda.r_1.1.7       magrittr_1.5

[34] scales_0.3.0         Hmisc_3.16-0         MASS_7.3-44

[37] splines_3.2.0        xtable_1.7-4         colorspace_1.2-6

[40] stringi_0.5-5        acepack_1.3-3.3      munsell_0.4.2

 

Is there any way to solve this problem? Thanks in advance.

ADD COMMENTlink modified 2.1 years ago • written 2.2 years ago by Fahmi Nazarie0
0
gravatar for Michael Love
2.2 years ago by
Michael Love15k
United States
Michael Love15k wrote:

Take a look at this previous answer:

A: DESeq2 Error in varianceStabilizingTransformation

ADD COMMENTlink written 2.2 years ago by Michael Love15k
0
gravatar for Fahmi Nazarie
2.2 years ago by
United Kingdom
Fahmi Nazarie0 wrote:

Thank Michael for replying my message. I have followed your advice, but I have got an error. I removed all the gene with 0 value by using 

dds = ddsHTSeq[ rowSums(counts(ddsHTSeq)) > 5, ]

Then 

 

cts = counts(dds)

 geoMeans = apply(cts, 1, function(row) if (all(row == 0)) 0 else exp(mean(log(row[row != 0]))))

dds = estimateSizeFactors(dds, geoMeans=geoMeans)

But the error came out

Error in if (any(value <= 0)) { : missing value where TRUE/FALSE needed

Even if I add TRUE/FALSE function it still doesn't work

geoMeans = apply(cts, 1, function(row) if (all(row == 0 , na.rm = TRUE, zneg.rm = FALSE)) 0 else exp(mean(log(row[row != 0]), na.rm = TRUE, zneg.rm = FALSE)))​

Are there any missing things I missed here? Thanks

 

 

ADD COMMENTlink written 2.2 years ago by Fahmi Nazarie0

When exactly did the error occur?

After the error occurs can you run

traceback()

Can you show:

summary(geoMeans)
ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by Michael Love15k

Oops, I meant traceback().

Hmm I can't reproduce this error on my end. I have a new question though:

Are the zeros coming from a single sample? Can you do: 

table(colSums(counts(dds) == 0))

And then see if you have a single sample with many zeros? You could remove this one if it is an outlier.

idx <- which.max(colSums(counts(dds) == 0))

dds2 <- dds[ , -idx]
ADD REPLYlink written 2.2 years ago by Michael Love15k
0
gravatar for Fahmi Nazarie
2.2 years ago by
United Kingdom
Fahmi Nazarie0 wrote:

The error occured when I ran this code, which after I filtered out '0' value in my dataset.

dds = estimateSizeFactors(dds, geoMeans=geoMeans)

 

I am not sure if there is a package called trackback().

> trackback()

Error: could not find function "trackback"

 

Here is the summary for geoMeans.

> summary(geoMeans)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
     1.0      3.1      8.0     44.8     30.5 330200.0
ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by Fahmi Nazarie0
0
gravatar for Fahmi Nazarie
2.1 years ago by
United Kingdom
Fahmi Nazarie0 wrote:

Hi Michael

I solved the problem. How do I get list of all genes (row) vs tissues (27 different tissues - column) with all the p-value or p-adjsuted value. I just could not find the command line in any questions or manual. Thanks

ADD COMMENTlink written 2.1 years ago by Fahmi Nazarie0
1

Fahim, I faced a similar problem. Could you please tell me how did you solve the issue?

ADD REPLYlink written 3 months ago by ghanbari.msc10
You can make a results table with the results() function. This table has p values and adjusted p values for all genes. Look at rownames(res). You can index the results table using square brackets: res[idx, ]
ADD REPLYlink written 2.1 years ago by Michael Love15k

What I do is like this as says in vignette

res<- results(dds2, contrast=c("condition" , "Brain" , "Thyroid" ))

But I have 27 different tissues, if I make results one by one it will take an ages to finish. How can I extract all these in one command and place it in one file of 'res'?

 

ADD REPLYlink written 2.1 years ago by Fahmi Nazarie0
We don't have a function for this. You can write a simple for loop in an R script (there are many free online resources for learning R) or you could ask someone who is familiar with R at your institution to help.
ADD REPLYlink written 2.1 years ago by Michael Love15k

Alright, thanks Michael for your help.

ADD REPLYlink written 2.1 years ago by Fahmi Nazarie0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 237 users visited in the last hour