question about variance stabilizing transformation and size factors in DESeq2
hanna.sinkko ▴ 10
@hannasinkko-8553
Last seen 8.7 years ago
Finland

I would be very grateful if anybody has time to help!

Is the estimation of size factors included in the VST normalization? From the DESeq2 manual I assumed that it is, but I noticed that after the VST normalization the size factor was the same for all libraries, even though the library sizes vary. I then tried the VST normalization in three different ways and finally got the size factors to differ.

Why did the VST normalization not estimate the size factors correctly, or did I do something wrong? Which result should I trust? Here is my script:

First I tried this:

# transform the phyloseq object to a DESeqDataSet
OTU2_deseq <- phyloseq_to_deseq2(OTU2, ~Treatment)

# VST transformation
vsd <- varianceStabilizingTransformation(OTU2_deseq, blind = TRUE)
vsd$sizeFactor
      C1A       C1B       C2B       C3B       C4B       D1A       D2A       D3B       D4A
0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363
      D4B       M1A       M1B       M2B       M3A       M4B       V1A       V2A       V3A
0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363
      V4A       V4B
0.9659363 0.9659363

The size factor was the same for all libraries, although the library sizes differ:

   C1A    C1B    C2B    C3B    C4B    D1A    D2A    D3B    D4A    D4B    M1A    M1B    M2B    M3A
 97884 119267 115434 167045 100795  58687  88702  85306 109512 150311  55833  95364  79377  98857
   M4B    V1A    V2A    V3A    V4A    V4B
 99659 120503 113216 118007 102453 146158

 

Since the size factors looked wrong, I tried estimating them explicitly before the VST transformation:

# transform the phyloseq object to a DESeqDataSet
OTU2_deseq <- phyloseq_to_deseq2(OTU2, ~Treatment)

# estimate size factors on the DESeqDataSet
OTU2_deseq <- estimateSizeFactors(OTU2_deseq)
OTU2_deseq$sizeFactor
      C1A       C1B       C2B       C3B       C4B       D1A       D2A       D3B       D4A
0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363
      D4B       M1A       M1B       M2B       M3A       M4B       V1A       V2A       V3A
0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363 0.9659363
      V4A       V4B
0.9659363 0.9659363

 

So this didn't work either: the size factors are still identical. Then I tried this:

 

sizefactors_otu2 <- estimateSizeFactorsForMatrix(otu2)
sizefactors_otu2
      C1A       C1B       C2B       C3B       C4B       D1A       D2A       D3B       D4A
0.9483795 1.2267456 1.2178742 1.7675233 1.1336218 0.6019426 0.9482678 0.8143618 1.2240495
      D4B       M1A       M1B       M2B       M3A       M4B       V1A       V2A       V3A
1.5083716 0.5034122 0.8343547 0.7268112 0.8225068 0.9190774 1.1940096 1.2189066 1.2699689
      V4A       V4B
1.0398748 1.5156871

 

Now it seemed to work. I then put these size factors into the DESeqDataSet and ran the VST normalization again:

 

sizeFactors(OTU2_deseq) <- sizefactors_otu2
sizeFactors(OTU2_deseq)
      C1A       C1B       C2B       C3B       C4B       D1A       D2A       D3B       D4A
0.9483795 1.2267456 1.2178742 1.7675233 1.1336218 0.6019426 0.9482678 0.8143618 1.2240495
      D4B       M1A       M1B       M2B       M3A       M4B       V1A       V2A       V3A
1.5083716 0.5034122 0.8343547 0.7268112 0.8225068 0.9190774 1.1940096 1.2189066 1.2699689
      V4A       V4B
1.0398748 1.5156871

vsd <- varianceStabilizingTransformation(OTU2_deseq, blind = TRUE)

 

# After the normalization I checked again, and the size factors were still the ones I had set:
vsd$sizeFactor
      C1A       C1B       C2B       C3B       C4B       D1A       D2A       D3B       D4A
0.9483795 1.2267456 1.2178742 1.7675233 1.1336218 0.6019426 0.9482678 0.8143618 1.2240495
      D4B       M1A       M1B       M2B       M3A       M4B       V1A       V2A       V3A
1.5083716 0.5034122 0.8343547 0.7268112 0.8225068 0.9190774 1.1940096 1.2189066 1.2699689
      V4A       V4B
1.0398748 1.5156871
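
For reference, the median-of-ratios idea behind size factor estimation can be written out in a few lines of base R. This is a simplified sketch under stated assumptions, not the DESeq2 source: the function name median_of_ratios and the toy matrix are made up for illustration. It shows why size factors are expected to track sequencing depth when the counts really differ between samples.

```r
# Simplified median-of-ratios size factors (hypothetical helper, base R only).
median_of_ratios <- function(counts) {
  log_counts <- log(counts)
  # Per-row log geometric mean; it is -Inf if any sample has a zero count
  log_geomeans <- rowMeans(log_counts)
  usable <- is.finite(log_geomeans)
  if (!any(usable)) {
    stop("every row contains at least one zero")
  }
  # Per-sample median of the log ratios to the geometric mean, back-transformed
  apply(log_counts, 2, function(col) {
    exp(median((col - log_geomeans)[usable]))
  })
}

# Toy matrix (made-up numbers): s2 is sequenced twice as deep as s1, s3 half as deep
base <- c(10, 20, 40, 80, 160)
toy <- cbind(s1 = base, s2 = 2 * base, s3 = 0.5 * base)
sf <- median_of_ratios(toy)
print(sf)
```

Only rows without any zero count contribute to the estimate in this sketch, which is worth keeping in mind for sparse OTU tables.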

 

Regards, 

Hanna Sinkko, PhD

University of Helsinki

hanna.sinkko ▴ 10
@hannasinkko-8553
Last seen 8.7 years ago
Finland

And here is the sessionInfo()

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] phyloseq_1.8.2            DESeq2_1.4.5              RcppArmadillo_0.5.100.1.0
[4] Rcpp_0.11.6               GenomicRanges_1.16.4      GenomeInfoDb_1.0.2       
[7] IRanges_1.22.10           BiocGenerics_0.10.0      

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.26.1 Biobase_2.24.0       Biostrings_2.32.1    DBI_0.3.1           
 [5] MASS_7.3-40          Matrix_1.2-0         RColorBrewer_1.1-2   RJSONIO_1.3-0       
 [9] RSQLite_1.0.0        XML_3.98-1.1         XVector_0.4.0        ade4_1.7-2          
[13] annotate_1.42.1      ape_3.2              biom_0.3.12          chron_2.3-45        
[17] cluster_2.0.1        codetools_0.2-11     colorspace_1.2-6     data.table_1.9.4    
[21] digest_0.6.8         foreach_1.4.2        genefilter_1.46.1    geneplotter_1.42.0  
[25] ggplot2_1.0.1        grid_3.1.2           gtable_0.1.2         igraph_0.7.1        
[29] iterators_1.0.7      lattice_0.20-31      locfit_1.5-9.1       magrittr_1.5        
[33] mgcv_1.8-6           multtest_2.20.0      munsell_0.4.2        nlme_3.1-120        
[37] permute_0.8-4        plyr_1.8.2           proto_0.3-10         reshape2_1.4.1      
[41] scales_0.2.4         splines_3.1.2        stats4_3.1.2         stringi_0.4-1       
[45] stringr_1.0.0        survival_2.38-1      tools_3.1.2          vegan_2.2-1         
[49] xtable_1.7-4         zlibbioc_1.10.0     

@mikelove
Last seen 12 hours ago
United States

Hi Hanna,

Thanks for posting. Your version of DESeq2 is two releases old. Would it be possible for you to reproduce this with the current release version (1.8)?

 

@peter-langfelder-4469
Last seen 1 day ago
United States

Interesting. I don't get the error in either DESeq2 1.4.5 or the current 1.8.1. I don't have a reproducible example ready, but my code consists of calling DESeqDataSetFromMatrix and then calling varianceStabilizingTransformation on the result.
 

hanna.sinkko ▴ 10
@hannasinkko-8553
Last seen 8.7 years ago
Finland

Hi

I upgraded Bioconductor and DESeq2, but only version 1.6.3 was installed; I think I need to update R to the current version as well. I tried the normalization with DESeq2 1.6.3 anyway and still did not get correct size factors. Interestingly, DESeq2 has produced correct size factors for my other data sets. I used the same scripts and the same DESeq2 version for all of them, so the problem seems to lie in this particular data set rather than in DESeq2. I don't know what it is, since all my data sets were prepared in the same way and look similar.

I will still update R and try the newest version of DESeq2, but my feeling is that the problem is somewhere other than DESeq2.

 

 

 


After this line:

OTU2_deseq<-phyloseq_to_deseq2(OTU2, ~Treatment)

Can you look at colData(OTU2_deseq)? I wonder if there are size factors already there somehow.
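
One way to run that check, and to compare the estimated size factors with the library sizes, is sketched below. It uses DESeq2's simulated example data (makeExampleDESeqDataSet) as a stand-in for the object returned by phyloseq_to_deseq2, so the object name dds and the simulated size factors are assumptions for illustration, not the data from this thread.

```r
library(DESeq2)

# Stand-in for the object returned by phyloseq_to_deseq2() (simulated data)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 6,
                               sizeFactors = c(0.5, 0.8, 1, 1.2, 1.5, 2))

# 1. Is a sizeFactor column already present in the sample table?
has_sf_before <- "sizeFactor" %in% colnames(colData(dds))
print(has_sf_before)

# 2. Estimate size factors and put them next to the library sizes
dds <- estimateSizeFactors(dds)
sf <- sizeFactors(dds)
lib <- colSums(counts(dds))
print(round(cbind(libSize = lib, sizeFactor = sf), 3))

# 3. The VST should carry the same size factors along in its colData
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)
print(vsd$sizeFactor)
```

If the first check prints TRUE on the real object, the size factors were set before estimateSizeFactors was called, which would be one place to look for the identical values.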
