Search
Question: multiple level factors contrast in nbinomLRT, DESeq2.
1
gravatar for shao
3.2 years ago by
shao80
Germany
shao80 wrote:
Dear list, I am not sure the correct way to interpret the nbinomLRT results in multiple level factors condition. Here is a toy example. I would to find DE genes after controlling batch effect in the experiments, in which there are multiple levels. ## make data library(DESeq2)dds <- makeExampleDESeqDataSet(m = 18)colData <- data.frame(row.names = rownames(colData(dds)), sample = colData(dds)$sample, condition = rep(LETTERS[1:6], each = 3), batch = factor(rep(c(1, 1, 2), 6)))dds <- DESeqDataSetFromMatrix(counts(dds), colData = colData, design = ~ batch + condition)dds <- estimateSizeFactors(dds)dds <- estimateDispersions(dds)## LRTestdds <- nbinomLRT(dds, reduced = formula(~ batch ))resultsNames(dds)[1] "Intercept" "batch_2_vs_1" "condition_B_vs_A" "condition_C_vs_A" "condition_D_vs_A" "condition_E_vs_A" "condition_F_vs_A"res.1 <- results(dds, name = "condition_B_vs_A")res.2 <- results(dds, contrast = list("condition_B_vs_A", "batch_2_vs_1") Questions:1.) I would like to get the DE genes between B and A, while controlling for the batch effect. Should I take "res.1" or "res.2", or both are wrong? and what is the correct way to do it? 2.) Why "res.3 <- results(dds, contrast = c("condition", "A", "b"))" gave error: "Error in normalizeSingleBracketSubscript(j, x) : subscript contains invalid names" 3.) In order to get DE genes between conditions not directly listed in the "resultsNames", is the following codes correct? should "batch_2_vs_1" be included int the contrast?##for example, DE between C and F, controlling for batch effect;res.4 <- results(dds, contrast = list("condition_C_vs_A", "condition_F_vs_A") Best, Chunxuan [[alternative HTML version deleted]]
ADD COMMENTlink modified 3.2 years ago by Michael Love14k • written 3.2 years ago by shao80
0
gravatar for shao
3.2 years ago by
shao80
Germany
shao80 wrote:
Dear list, I am not sure the correct way to interpret the nbinomLRT results in multiple level factors condition. Here is a toy example. I would to find DE genes after controlling batch effect in the experiments, in which there are multiple levels. ## make data library(DESeq2) dds <- makeExampleDESeqDataSet(m = 18) colData <- data.frame(row.names = rownames(colData(dds)), sample = colData(dds)$sample, condition = rep(LETTERS[1:6], each = 3), batch = factor(rep(c(1, 1, 2), 6))) dds <- DESeqDataSetFromMatrix(counts(dds), colData = colData, design = ~ batch + condition) dds <- estimateSizeFactors(dds) dds <- estimateDispersions(dds) ## LRTest dds <- nbinomLRT(dds, reduced = formula(~ batch )) resultsNames(dds) [1] "Intercept" "batch_2_vs_1" "condition_B_vs_A" "condition_C_vs_A" "condition_D_vs_A" "condition_E_vs_A" "condition_F_vs_A" res.1 <- results(dds, name = "condition_B_vs_A") res.2 <- results(dds, contrast = list("condition_B_vs_A", "batch_2_vs_1") Questions: 1.) I would like to get the DE genes between B and A, while controlling for the batch effect. Should I take "res.1" or "res.2", or both are wrong? and what is the correct way to do it? 2.) Why "res.3 <- results(dds, contrast = c("condition", "A", "b"))" gave error: "Error in normalizeSingleBracketSubscript(j, x) : subscript contains invalid names" 3.) In order to get DE genes between conditions not directly listed in the "resultsNames", is the following codes correct? should "batch_2_vs_1" be included int the contrast?##for example, DE between C and F, controlling for batch effect;res.4 <- results(dds, contrast = list("condition_C_vs_A", "condition_F_vs_A") > sessionInfo()R version 3.1.0 (2014-04-10) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel graphics grDevices utils datasets stats grid methods base other attached packages: [1] DESeq2_1.4.5 RcppArmadillo_0.4.320.0 Rcpp_0.11.2 GenomicRanges_1.16.3 GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0 [8] magrittr_1.0.1 dplyr_0.2 RColorBrewer_1.0-5 reshape2_1.4 ggplot2_1.0.0 loaded via a namespace (and not attached): [1] annotate_1.42.1 AnnotationDbi_1.26.0 assertthat_0.1 Biobase_2.24.0 colorspace_1.2-4 DBI_0.2-7 digest_0.6.4 genefilter_1.46.1 [9] geneplotter_1.42.0 gtable_0.1.2 lattice_0.20-29 locfit_1.5-9.1 MASS_7.3-33 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 [17] R.methodsS3_1.6.1 R.oo_1.18.0 RSQLite_0.11.4 scales_0.2.4 splines_3.1.0 stats4_3.1.0 stringr_0.6.2 survival_2.37-7 [25] tools_3.1.0 XML_3.98-1.1 xtable_1.7-3 XVector_0.4.0 Best, Chunxuan PS, sorry for the previous one, I forgot the session, and have no idea why the format is a mess. [[alternative HTML version deleted]]
ADD COMMENTlink written 3.2 years ago by shao80
Hi On 20/08/14 12:43, sh. chunxuan wrote: > I am not sure the correct way to interpret the nbinomLRT results in > multiple level factors condition. Is there a reason why you decided to use 'nbinomLRT' rather than 'nbinomWaldTest', which is suggested in the vignette for standard use cases? > Here is a toy example. I would to find DE genes after controlling > batch effect in the experiments, in which there are multiple levels. In your case, the LRT allows you to find genes which are affected by the experimental condition in _some_ way, i.e., for which you can reject the null hypothesis "This gene is expressed at the same level in _all_ six conditions; i.e., it is not affected by the experimental condition at all." However, you want to perform specific contrasts: > 1.) I would like to get the DE genes between B and A, while > controlling for the batch effect. Should I take "res.1" or "res.2", or > both are wrong? and what is the correct way to do it? This is much easier with a Wald test. You just run 'nbinomWaldTest' and then ask for results( dds, contrast = c( "condition", "B", "A" ) ) Simon
ADD REPLYlink written 3.2 years ago by Simon Anders3.4k
0
gravatar for Michael Love
3.2 years ago by
Michael Love14k
United States
Michael Love14k wrote:
Hi Chunxuan, On Aug 20, 2014 12:38 PM, "sh. chunxuan" <hibergo at="" outlook.com=""> wrote: > > > > > Dear list, > I am not sure the correct way to interpret the nbinomLRT results in multiple level factors condition. > Here is a toy example. I would to find DE genes after controlling batch effect in the experiments, in which there are multiple levels. > ## make data library(DESeq2)dds <- makeExampleDESeqDataSet(m = 18)colData <- data.frame(row.names = rownames(colData(dds)), sample = colData(dds)$sample, condition = rep(LETTERS[1:6], each = 3), batch = factor(rep(c(1, 1, 2), 6)))dds <- DESeqDataSetFromMatrix(counts(dds), colData = colData, design = ~ batch + condition)dds <- estimateSizeFactors(dds)dds <- estimateDispersions(dds)## LRTestdds <- nbinomLRT(dds, reduced = formula(~ batch ))resultsNames(dds)[1] "Intercept" "batch_2_vs_1" "condition_B_vs_A" "condition_C_vs_A" "condition_D_vs_A" "condition_E_vs_A" "condition_F_vs_A"res.1 <- results(dds, name = "condition_B_vs_A")res.2 <- results(dds, contrast = list("condition_B_vs_A", "batch_2_vs_1") > Questions:1.) I would like to get the DE genes between B and A, while controlling for the batch effect. You should use the default Wald test for such a comparison, not the LRT. See the vignette for an explanation of the difference between the two tests. So simply use DESeq() and then use: results(dds, contrast = c("condition", "B", "A")) > Should I take "res.1" or "res.2", or both are wrong? and what is the correct way to do it? > 2.) Why "res.3 <- results(dds, contrast = c("condition", "A", "b"))" gave error: "Error in normalizeSingleBracketSubscript(j, x) : subscript contains invalid names" This code gives an error because lowercase "b" is not a level. (It should be giving a more understandable error message, not sure why it is not doing so here.) > 3.) In order to get DE genes between conditions not directly listed in the "resultsNames", is the following codes correct? should "batch_2_vs_1" be included int the contrast?##for example, DE between C and F, controlling for batch effect;res.4 <- results(dds, contrast = list("condition_C_vs_A", "condition_F_vs_A") Check the help for ?results and the examples there. You should simply use: contrast = c("condition", "C", "F") Mike > Best, Chunxuan > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD COMMENTlink written 3.2 years ago by Michael Love14k
Hi Mike, Simon, Thanks very much for the quick answers. I moved from DESeq, in which like hood ratio test is used for model comparison, so the the idea just sticked to my mind. Later I saw on the other vignette there is a description of the similar situation specifying "which means that we want to test for the effect of treatment (the last factor), controlling for the effect of patients (the first factor)". Thanks again and best, Chunxuan Date: Wed, 20 Aug 2014 14:43:18 +0200 Subject: Re: [BioC] multiple level factors contrast in nbinomLRT, DESeq2. From: michaelisaiahlove@gmail.com To: hibergo at outlook.com CC: bioconductor at r-project.org Hi Chunxuan, On Aug 20, 2014 12:38 PM, "sh. chunxuan" <hibergo at="" outlook.com=""> wrote: > > > > > Dear list, > I am not sure the correct way to interpret the nbinomLRT results in multiple level factors condition. > Here is a toy example. I would to find DE genes after controlling batch effect in the experiments, in which there are multiple levels. > ## make data library(DESeq2)dds <- makeExampleDESeqDataSet(m = 18)colData <- data.frame(row.names = rownames(colData(dds)), sample = colData(dds)$sample, condition = rep(LETTERS[1:6], each = 3), batch = factor(rep(c(1, 1, 2), 6)))dds <- DESeqDataSetFromMatrix(counts(dds), colData = colData, design = ~ batch + condition)dds <- estimateSizeFactors(dds)dds <- estimateDispersions(dds)## LRTestdds <- nbinomLRT(dds, reduced = formula(~ batch ))resultsNames(dds)[1] "Intercept" "batch_2_vs_1" "condition_B_vs_A" "condition_C_vs_A" "condition_D_vs_A" "condition_E_vs_A" "condition_F_vs_A"res.1 <- results(dds, name = "condition_B_vs_A")res.2 <- results(dds, contrast = list("condition_B_vs_A", "batch_2_vs_1") > Questions:1.) I would like to get the DE genes between B and A, while controlling for the batch effect. You should use the default Wald test for such a comparison, not the LRT. See the vignette for an explanation of the difference between the two tests. So simply use DESeq() and then use: results(dds, contrast = c("condition", "B", "A")) > Should I take "res.1" or "res.2", or both are wrong? and what is the correct way to do it? > 2.) Why "res.3 <- results(dds, contrast = c("condition", "A", "b"))" gave error: "Error in normalizeSingleBracketSubscript(j, x) : subscript contains invalid names" This code gives an error because lowercase "b" is not a level. (It should be giving a more understandable error message, not sure why it is not doing so here.) > 3.) In order to get DE genes between conditions not directly listed in the "resultsNames", is the following codes correct? should "batch_2_vs_1" be included int the contrast?##for example, DE between C and F, controlling for batch effect;res.4 <- results(dds, contrast = list("condition_C_vs_A", "condition_F_vs_A") Check the help for ?results and the examples there. You should simply use: contrast = c("condition", "C", "F") Mike > Best, Chunxuan > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor [[alternative HTML version deleted]]
ADD REPLYlink written 3.2 years ago by shao80
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 166 users visited in the last hour