Question

DESeq2: Using a normalised read count matrix from EDASeq in DESeq2

Guest User ★ 13k

@guest-user-4897

Last seen 11.4 years ago

Hi,

I want to use a normalised read count matrix from EDASeq downstream in a DESeq2 analysis, but I am not clear from the vignette on how to do so. These are the steps I followed:

## EDASeq - normalising the count matrix by GC content
> dataWithin <- withinLaneNormalization(data, "pct_gc", which = "full")
> dataNorm <- betweenLaneNormalization(dataWithin, which = "full")
## I normalised the counts themselves instead of generating the offsets as mentioned in the EDASeq vignette

### DESeq2
> ??
> dds <- estimateDispersions(dds)
> dds <- nbinomWaldTest(dds)
> res <- results(dds)

I don't know how to create a normalization factor matrix. The DESeq2 vignette, on the other hand, mentions that normalization factors should be on the scale of the counts, like size factors, and unlike offsets, which are typically on the scale of the predictors (i.e. the logarithmic scale for the negative binomial GLM).

So in that case, should I generate the offset values from EDASeq, i.e.

> dataWithin <- withinLaneNormalization(data, "pct_gc", which = "full", offset = TRUE)
> dataNorm <- betweenLaneNormalization(dataWithin, which = "full", offset = TRUE)
> EDASeqNormFactors <- exp(-1 * offst(dataNorm))
> normalizationFactors(dds) <- EDASeqNormFactors
> dds <- estimateDispersions(dds)
> dds <- nbinomWaldTest(dds)
> res <- results(dds)

-- output of sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] DESeq2_1.4.5            RcppArmadillo_0.4.300.0 Rcpp_0.11.1
 [4] EDASeq_1.10.0           aroma.light_2.0.0       matrixStats_0.8.14
 [7] ShortRead_1.22.0        GenomicAlignments_1.0.1 BSgenome_1.32.0
[10] Rsamtools_1.16.0        GenomicRanges_1.16.3    GenomeInfoDb_1.0.2
[13] Biostrings_2.32.0       XVector_0.4.0           IRanges_1.22.7
[16] BiocParallel_0.6.1      Biobase_2.24.0          BiocGenerics_0.10.0

loaded via a namespace (and not attached):
 [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2
 [4] BBmisc_1.6           bitops_1.0-6         brew_1.0-6
 [7] codetools_0.2-8      DBI_0.2-7            DESeq_1.16.0
[10] digest_0.6.4         fail_1.2             foreach_1.4.2
[13] genefilter_1.46.1    geneplotter_1.42.0   grid_3.1.0
[16] hwriter_1.3          iterators_1.0.7      lattice_0.20-29
[19] latticeExtra_0.6-26  locfit_1.5-9.1       plyr_1.8.1
[22] RColorBrewer_1.0-5   R.methodsS3_1.6.1    R.oo_1.18.0
[25] RSQLite_0.11.4       sendmailR_1.1-2      splines_3.1.0
[28] stats4_3.1.0         stringr_0.6.2        survival_2.37-7
[31] tools_3.1.0          XML_3.98-1.1         xtable_1.7-3
[34] zlibbioc_1.10.0

--
Sent via the guest posting facility at bioconductor.org.
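The scale question raised in this post (log-scale offsets versus count-scale normalization factors) can be illustrated with a small base-R sketch. The offset values below are invented for illustration and do not come from EDASeq; the sign convention (log normalized count is roughly log raw count plus offset) is an assumption about what offst() returns, and the optional row-wise rescaling to a geometric mean of 1 is a suggestion rather than part of the code in the post.

```r
# Toy log-scale offsets (genes x samples). Hypothetical values, standing in
# for what offst(dataNorm) might return; assumed convention:
#   log(normalized count) ~ log(raw count) + offset
offsets <- matrix(
  c( 0.10, -0.20, 0.05,
    -0.30,  0.15, 0.00),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

# Normalization factors live on the count scale and counts are divided by
# them, so flip the sign and exponentiate.
norm_factors <- exp(-1 * offsets)

# Optional: rescale each row so that its geometric mean is 1, which keeps
# normalized counts on roughly the same scale as the raw counts.
row_geo_means <- exp(rowMeans(log(norm_factors)))
norm_factors_centered <- norm_factors / row_geo_means

print(round(norm_factors_centered, 3))
```

Every entry of the resulting matrix is positive, which is what makes it usable as a per-gene, per-sample analogue of size factors.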

Normalization EDASeq DESeq2 • 2.3k views

updated 11.6 years ago by Michael Love 43k • written 11.6 years ago by Guest User ★ 13k

score 1 · Answer 1 · 2014-06-26

Hi Aditi,

Yes, the code in the DESeq2 vignette under 'Sample-/gene-dependent normalization factors' for the EDASeq handoff is still correct (the same as the offset-based code you have above). Note that you can use the wrapper function DESeq(), and it will use your normalization factors rather than estimating size factors.

Mike
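As a rough illustration of the DESeq() route mentioned above, the sketch below attaches a normalization factor matrix to a DESeqDataSet and runs the wrapper. It assumes DESeq2 is installed and uses simulated data from makeExampleDESeqDataSet(); the matrix nf is a hypothetical stand-in for exp(-1 * offst(dataNorm)), not real EDASeq output.

```r
library(DESeq2)

set.seed(1)

# Simulated data set shipped with DESeq2: 200 genes, 6 samples,
# with a two-level factor "condition" in the design.
dds <- makeExampleDESeqDataSet(n = 200, m = 6)

# Hypothetical stand-in for exp(-1 * offst(dataNorm)): a positive
# genes x samples matrix on the count scale.
nf <- matrix(
  exp(rnorm(nrow(dds) * ncol(dds), sd = 0.1)),
  nrow = nrow(dds), ncol = ncol(dds),
  dimnames = dimnames(counts(dds))
)

# Rescale each row to a geometric mean of 1 (optional, see lead-in).
nf <- nf / exp(rowMeans(log(nf)))

# Attach the matrix; DESeq() should then use it instead of estimating
# size factors.
normalizationFactors(dds) <- nf

dds <- DESeq(dds)
res <- results(dds)
head(res)
```

With real data, nf would be replaced by the matrix derived from the EDASeq offsets, with rows and columns in the same order as the count matrix in dds.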