Question: DESeq2 extracting results in version >= 1.3 (was: "using DESeq2 with multi factor data")
1
5.3 years ago by
Michael Love24k
United States
Michael Love24k wrote:
hi Olga and Karen, Thanks for your interest in DESeq2 and for using the development branch, which allows us to clarify changes before they enter the release branch. This concerns a modeling change from v1.2 to v1.3. Changes like these are documented in the NEWS file with pointers to the functions/arguments that changed: (e.g. http://www.bioconductor.org/packages/2.14/bioc/news/DESeq2/NEWS) I've been working on including more documentation in the package, but haven't gotten to adding the parts about interaction terms yet, so this gives me a chance to explain a bit. The new resultsNames(dds) are due to "expanded model matrices", which include a column in the design matrix for each level of a factor, as opposed to "standard model matrices" produced by model.matrix, which have a column in the deisgn matrix for the levels other than the base level. Expanded model matrices are described a bit in the man pages and in the vignette, but not yet using an example with interaction effects. This makes the DE analysis independent of the choice to base level, whereas previously the log fold change (LFC) shrinkage was not independent of base level choices. In addition, it simplifies certain comparisons. You now have the columns "metasmet_no" and "metasmet_yes", whereas before you had "metas_met_yes_vs_met_no". Previously you would use the 'name' argument to extract this comparison, whereas with version >= 1.3 we want to encourage users to use the 'contrast' argument: results(dds, contrast=c("metas","met_yes","met_no")) If the analysis includes interaction terms, previously the above contrast would give the metastasis effect only for the base level of the other factors in the design, but for version >= 1.3, this gives the overall metastasis effect over all levels of the time variable. If you want the individual time-period-specific effects, this is the overall effect plus the interaction effects for that time period.You have the following columns of the model matrix: > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" So the metastasis effect specific to time period 2 would be, the contrast of "metasmet_yes" vs "metasmet_no" and additionally, the contrast of "timetime_2.metasmet_yes" vs "timetime_2.metasmet_no". We can pull these out using the numeric contrast vector, starting with all 0's and then filing in the -1's and 1's: ctrst <- numeric(21) rn <- resultsNames(dds) ctrst[ rn == "metasmet_no" ] <- -1 ctrst[ rn == "metasmet_yes" ] <- 1 ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 ctrst[ rn == "timetime_2.metasmet_yes" ] <- 1 results(dds, contrast=ctrst) Mike On Mon, Mar 17, 2014 at 5:07 AM, solgakar at bi.technion.ac.il <solgakar at="" bi.technion.ac.il=""> wrote: > > > > From: Karen Chait [mailto:kchait at tx.technion.ac.il] > Sent: Monday, March 17, 2014 10:57 AM > To: Olga Karinsky > Subject: RE: using DESeq2 with multi factor data > > Hello all, > I am trying to use the DESeq2 package to perform RNA-Seq analysis on a data containing several factors. > I have been closely following the emails between Ming Yi and Michael Love, because I think that my problem is very similar to what they have discussed. But even though I received a lot of useful information from their discussion, I still have several questions regarding my specific data. > Just as an overall information regarding my data, I have 96 samples and the two factors I am interested in exploring are "time" and "metastasis". > In order to build my data set I used the following commands: > > > countData = read.table("merged_counts.txt", header=TRUE, row.names=1) > > > metasVector=c("met_no","met_no","met_no","met_no","met_no","met_n o","met_no","met_no","met_no","met_no","met_no","met_no","met_no","met _no","met_no","met_no","met_no","met_no","met_no","met_no","met_no","m et_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no", "met_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no ","met_no","met_no","met_no","met_no","met_no","met_no","met_no","met_ no","met_no","met_no","met_no","met_no","met_no","met_no","met_no","me t_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," met_no","met_no","met_no","met_no","met_no","met_no","met_yes","met_ye s","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ye s","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ye s","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ye s","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ye s"( > > > timePointsVector=c("6","4","6","6","3","6","3","5","6","6","1","5 ","3","4","3","6","6","6","2","6","1","2","4","6","5","5","5","3","6", "5","6","2","6","6","1","5","5","6","6","6","6","6","6","4","2","6","3 ","1","2","5","6","1","1","3","6","3","6","4","4","5","6","6","3","5", "4","6","1","4","3","1","1","1","4","2","1","1","3","6","1","1","2","1 ","6","3","3","2","5","3","2","3","1","4","1","1","6","1") > > > colData=data.frame(row.names=colnames(countData),metas=metasVecto r,gender=gendarVector) > > > colData$metas=factor(colData$metas, levels=c("met_no","met_yes")) > > > colData$time = factor(colData$time, levels = c("1", "2", "3", "4", "5", "6")) > > > dds=DESeqDataSetFromMatrix(countData=tmpcountData, colData=colData, design=~time + metas + metas:time) > > > dds=DESeq(dds) > I have several questions: > > - first of all I have tried running those commands on DESeq2 version 1.2.10 (R version 3.0.2) and DESeq2 version 1.3.47 (R version 3.0.2) and what I have received from the resultsNames() function I both cases is very different. Using the 1.2.10 version I have received: > > > resultsNames(dds) > > [1] "Intercept" "time_2_vs_1" "time_3_vs_1" "time_4_vs_1" "time_5_vs_1" "time_6_vs_1" "metas_met_yes_vs_met_no" "time2.metasmet_yes" > > [9] "time3.metasmet_yes" "time4.metasmet_yes" "time5.metasmet_yes" "time6.metasmet_yes" > > > sessionInfo() > > R version 3.0.2 (2013-09-25) > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > locale: > > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 > > > > attached base packages: > > [1] parallel stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] DESeq2_1.2.10 RcppArmadillo_0.4.100.2.1 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0 > > > > loaded via a namespace (and not attached): > > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 grid_3.0.2 lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 > > [11] splines_3.0.2 stats4_3.0.2 survival_2.37-7 tools_3.0.2 XML_3.98-1.1 xtable_1.7-3 > > > > Using the 1.3.47 version I have received: > > > resultsNames(dds) > > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" > > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" > > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" > > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" > > > sessionInfo() > > R version 3.0.2 (2013-09-25) > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > locale: > > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 > > > > attached base packages: > > [1] parallel stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] DESeq2_1.3.47 RcppArmadillo_0.4.100.0 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 > > [7] BiocGenerics_0.8.0 > > > > loaded via a namespace (and not attached): > > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2 > > [8] lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2 survival_2.37-7 > > [15] XML_3.98-1.1 xtable_1.7-3 > > (I have ran the 1.3.47 version the same way besides a difference in the names of the time levels, but I do not believe that this is the reason for the differences) > I don't fully understand the results I receive using the 1.3.47 version and even more the difference between the versions. > > - From my understanding, the results I received using the 1.2.10 version are the more reasonable and they fit my settings of base levels in the data. Now after receiving these results I would love to understand how do I receive different contrast testing? For each time period metas_yes vs. metas_no (for example timetime_2.metasmet_yes vs. timetime_2.metasmet_no) > > > > Thank you in advance, > > Olga and Karen > > > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
deseq2 • 951 views
modified 5.3 years ago • written 5.3 years ago by Michael Love24k
Answer: DESeq2 extracting results in version >= 1.3 (was: "using DESeq2 with multi facto
0
5.3 years ago by
Michael Love24k
United States
Michael Love24k wrote:
hi Olga and Karen, I updated the results() function in DESeq2 version 1.3 to simplify extracting the kind of contrast we were discussing, combining main effects and interactions. While I emailed the following code to build a numeric contrast, > ctrst <- numeric(21) > rn <- resultsNames(dds) > ctrst[ rn == "metasmet_no" ] <- -1 > ctrst[ rn == "metasmet_yes" ] <- 1 > ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 > ctrst[ rn == "timetime_2.metasmet_yes" ] <- 1 > results(dds, contrast=ctrst) this can now be accomplished by giving a list to the 'contrast' argument. The first element of the list should be the effects for the numerator of the log fold change, and the second element should be the effects for the denominator: results( dds, contrast = list( c("metasmet_yes", "timetime_2.metasmet_yes"), c("metasmet_no", "timetime_2.metasmet_no") ) ) There are more details and examples in the man page for ?results for the development branch. Mike On Mon, Mar 17, 2014 at 9:33 AM, Michael Love <michaelisaiahlove at="" gmail.com=""> wrote: > > hi Olga and Karen, > > Thanks for your interest in DESeq2 and for using the development > branch, which allows us to clarify changes before they enter the > release branch. > > This concerns a modeling change from v1.2 to v1.3. Changes like these > are documented in the NEWS file with pointers to the > functions/arguments that changed: > (e.g. http://www.bioconductor.org/packages/2.14/bioc/news/DESeq2/NEWS) > I've been working on including more documentation in the package, but > haven't gotten to adding the parts about interaction terms yet, so > this gives me a chance to explain a bit. > > The new resultsNames(dds) are due to "expanded model matrices", which > include a column in the design matrix for each level of a factor, as > opposed to "standard model matrices" produced by model.matrix, which > have a column in the deisgn matrix for the levels other than the base > level. Expanded model matrices are described a bit in the man pages > and in the vignette, but not yet using an example with interaction > effects. This makes the DE analysis independent of the choice to base > level, whereas previously the log fold change (LFC) shrinkage was not > independent of base level choices. In addition, it simplifies certain > comparisons. > > You now have the columns "metasmet_no" and "metasmet_yes", whereas > before you had "metas_met_yes_vs_met_no". Previously you would use > the 'name' argument to extract this comparison, whereas with version > >= 1.3 we want to encourage users to use the 'contrast' argument: > > results(dds, contrast=c("metas","met_yes","met_no")) > > If the analysis includes interaction terms, previously the above > contrast would give the metastasis effect only for the base level of > the other factors in the design, but for version >= 1.3, this gives > the overall metastasis effect over all levels of the time variable. > > If you want the individual time-period-specific effects, this is the > overall effect plus the interaction effects for that time period.You > have the following columns of the model matrix: > > > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" > > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" > > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" > > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" > > So the metastasis effect specific to time period 2 would be, the > contrast of "metasmet_yes" vs "metasmet_no" and additionally, the > contrast of "timetime_2.metasmet_yes" vs "timetime_2.metasmet_no". We > can pull these out using the numeric contrast vector, starting with > all 0's and then filing in the -1's and 1's: > > ctrst <- numeric(21) > rn <- resultsNames(dds) > ctrst[ rn == "metasmet_no" ] <- -1 > ctrst[ rn == "metasmet_yes" ] <- 1 > ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 > ctrst[ rn == "timetime_2.metasmet_yes" ] <- 1 > results(dds, contrast=ctrst) > > > Mike > > > > On Mon, Mar 17, 2014 at 5:07 AM, solgakar at bi.technion.ac.il > <solgakar at="" bi.technion.ac.il=""> wrote: > > > > > > > > From: Karen Chait [mailto:kchait at tx.technion.ac.il] > > Sent: Monday, March 17, 2014 10:57 AM > > To: Olga Karinsky > > Subject: RE: using DESeq2 with multi factor data > > > > Hello all, > > I am trying to use the DESeq2 package to perform RNA-Seq analysis on a data containing several factors. > > I have been closely following the emails between Ming Yi and Michael Love, because I think that my problem is very similar to what they have discussed. But even though I received a lot of useful information from their discussion, I still have several questions regarding my specific data. > > Just as an overall information regarding my data, I have 96 samples and the two factors I am interested in exploring are "time" and "metastasis". > > In order to build my data set I used the following commands: > > > > > countData = read.table("merged_counts.txt", header=TRUE, row.names=1) > > > > > metasVector=c("met_no","met_no","met_no","met_no","met_no","met _no","met_no","met_no","met_no","met_no","met_no","met_no","met_no","m et_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no", "met_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no ","met_no","met_no","met_no","met_no","met_no","met_no","met_no","met_ no","met_no","met_no","met_no","met_no","met_no","met_no","met_no","me t_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," met_no","met_no","met_no","met_no","met_no","met_no","met_no","met_no" ,"met_no","met_no","met_no","met_no","met_no","met_no","met_yes","met_ yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_ yes"( > > > > > timePointsVector=c("6","4","6","6","3","6","3","5","6","6","1", "5","3","4","3","6","6","6","2","6","1","2","4","6","5","5","5","3","6 ","5","6","2","6","6","1","5","5","6","6","6","6","6","6","4","2","6", "3","1","2","5","6","1","1","3","6","3","6","4","4","5","6","6","3","5 ","4","6","1","4","3","1","1","1","4","2","1","1","3","6","1","1","2", "1","6","3","3","2","5","3","2","3","1","4","1","1","6","1") > > > > > colData=data.frame(row.names=colnames(countData),metas=metasVec tor,gender=gendarVector) > > > > > colData$metas=factor(colData$metas, levels=c("met_no","met_yes")) > > > > > colData$time = factor(colData$time, levels = c("1", "2", "3", "4", "5", "6")) > > > > > dds=DESeqDataSetFromMatrix(countData=tmpcountData, colData=colData, design=~time + metas + metas:time) > > > > > dds=DESeq(dds) > > I have several questions: > > > > - first of all I have tried running those commands on DESeq2 version 1.2.10 (R version 3.0.2) and DESeq2 version 1.3.47 (R version 3.0.2) and what I have received from the resultsNames() function I both cases is very different. Using the 1.2.10 version I have received: > > > > > resultsNames(dds) > > > > [1] "Intercept" "time_2_vs_1" "time_3_vs_1" "time_4_vs_1" "time_5_vs_1" "time_6_vs_1" "metas_met_yes_vs_met_no" "time2.metasmet_yes" > > > > [9] "time3.metasmet_yes" "time4.metasmet_yes" "time5.metasmet_yes" "time6.metasmet_yes" > > > > > sessionInfo() > > > > R version 3.0.2 (2013-09-25) > > > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > > > > > locale: > > > > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 > > > > > > > > attached base packages: > > > > [1] parallel stats graphics grDevices utils datasets methods base > > > > > > > > other attached packages: > > > > [1] DESeq2_1.2.10 RcppArmadillo_0.4.100.2.1 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0 > > > > > > > > loaded via a namespace (and not attached): > > > > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 grid_3.0.2 lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 > > > > [11] splines_3.0.2 stats4_3.0.2 survival_2.37-7 tools_3.0.2 XML_3.98-1.1 xtable_1.7-3 > > > > > > > > Using the 1.3.47 version I have received: > > > > > resultsNames(dds) > > > > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" > > > > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" > > > > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" > > > > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" > > > > > sessionInfo() > > > > R version 3.0.2 (2013-09-25) > > > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > > > > > locale: > > > > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 > > > > > > > > attached base packages: > > > > [1] parallel stats graphics grDevices utils datasets methods base > > > > > > > > other attached packages: > > > > [1] DESeq2_1.3.47 RcppArmadillo_0.4.100.0 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 > > > > [7] BiocGenerics_0.8.0 > > > > > > > > loaded via a namespace (and not attached): > > > > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2 > > > > [8] lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2 survival_2.37-7 > > > > [15] XML_3.98-1.1 xtable_1.7-3 > > > > (I have ran the 1.3.47 version the same way besides a difference in the names of the time levels, but I do not believe that this is the reason for the differences) > > I don't fully understand the results I receive using the 1.3.47 version and even more the difference between the versions. > > > > - From my understanding, the results I received using the 1.2.10 version are the more reasonable and they fit my settings of base levels in the data. Now after receiving these results I would love to understand how do I receive different contrast testing? For each time period metas_yes vs. metas_no (for example timetime_2.metasmet_yes vs. timetime_2.metasmet_no) > > > > > > > > Thank you in advance, > > > > Olga and Karen > > > > > > > > [[alternative HTML version deleted]] > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor at r-project.org > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
Hi Michael, Thank you for your great help and your fast replies. When I define 2 factors when each has only 2 levels I understand I can't use the expanded matrix. I want to perform the following analysis- > colData=data.frame(row.names=colnames(tmpcountData),metas=metasVecto r,time=timeVector) > colData$metas=factor(colData$metas,levels=c("met_no","met_yes")) > colData$time=factor(colData$time,levels=c("1","2")) > dds2=DESeqDataSetFromMatrix(countData=tmpcountData,colData=colData,d esign=~time+metas+metas:time) > dds2=DESeq(dds2) > resultsNames(dds2) [1] "Intercept" "time_2_vs_1" "metas_met_yes_vs_met_no" "time2.metasmet_yes" The comparisons I want to run are : 1. all samples met_yes vs. met_no 2. Samples with time=1 met_yes vs. met_no 3. Samples with time=2 met_yes vs. met_no I get the same results for comparisons 1 and 2 > res1=results(dds,contrast=c("metas","met_yes","met_no"),alpha=0.05) > res2=results(dds,contrast=c(0,0,1,0),alpha=0.05) > res3=results(dds,contrast=c(0,0,1,1),alpha=0.05) I understand that for the first comparison I can define the design without the time factor. For the second comparison, does DESeq calculate the differential expression only between the samples with time=1? If not, can you explain how DESeq2 calculates the DE for these comparisons. > sessionInfo() R version 3.0.3 (2014-03-06) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] parallel stats graphics grDevices utils datasets methods [8] base other attached packages: [1] DESeq2_1.3.60 RcppArmadillo_0.4.100.2.1 [3] Rcpp_0.11.1 GenomicRanges_1.14.4 [5] XVector_0.2.0 IRanges_1.20.7 [7] BiocGenerics_0.8.0 loaded via a namespace (and not attached): [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 [4] DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 [7] grid_3.0.3 lattice_0.20-27 locfit_1.5-9.1 [10] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.3 [13] stats4_3.0.3 survival_2.37-7 tools_3.0.3 [16] XML_3.98-1.1 xtable_1.7-3 Thank you so much, Karen and Olga -----Original Message----- From: Michael Love [mailto:michaelisaiahlove@gmail.com] Sent: Monday, March 24, 2014 5:24 PM To: solgakar at bi.technion.ac.il; karen chait (kchait at tx.technion.ac.il) Cc: bioconductor at r-project.org Subject: Re: DESeq2 extracting results in version >= 1.3 (was: "using DESeq2 with multi factor data") hi Olga and Karen, I updated the results() function in DESeq2 version 1.3 to simplify extracting the kind of contrast we were discussing, combining main effects and interactions. While I emailed the following code to build a numeric contrast, > ctrst <- numeric(21) > rn <- resultsNames(dds) > ctrst[ rn == "metasmet_no" ] <- -1 > ctrst[ rn == "metasmet_yes" ] <- 1 > ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 ctrst[ rn == > "timetime_2.metasmet_yes" ] <- 1 results(dds, contrast=ctrst) this can now be accomplished by giving a list to the 'contrast' argument. The first element of the list should be the effects for the numerator of the log fold change, and the second element should be the effects for the denominator: results( dds, contrast = list( c("metasmet_yes", "timetime_2.metasmet_yes"), c("metasmet_no", "timetime_2.metasmet_no") ) ) There are more details and examples in the man page for ?results for the development branch. Mike On Mon, Mar 17, 2014 at 9:33 AM, Michael Love <michaelisaiahlove at="" gmail.com=""> wrote: > > hi Olga and Karen, > > Thanks for your interest in DESeq2 and for using the development > branch, which allows us to clarify changes before they enter the > release branch. > > This concerns a modeling change from v1.2 to v1.3. Changes like these > are documented in the NEWS file with pointers to the > functions/arguments that changed: > (e.g. http://www.bioconductor.org/packages/2.14/bioc/news/DESeq2/NEWS) > I've been working on including more documentation in the package, but > haven't gotten to adding the parts about interaction terms yet, so > this gives me a chance to explain a bit. > > The new resultsNames(dds) are due to "expanded model matrices", which > include a column in the design matrix for each level of a factor, as > opposed to "standard model matrices" produced by model.matrix, which > have a column in the deisgn matrix for the levels other than the base > level. Expanded model matrices are described a bit in the man pages > and in the vignette, but not yet using an example with interaction > effects. This makes the DE analysis independent of the choice to base > level, whereas previously the log fold change (LFC) shrinkage was not > independent of base level choices. In addition, it simplifies certain > comparisons. > > You now have the columns "metasmet_no" and "metasmet_yes", whereas > before you had "metas_met_yes_vs_met_no". Previously you would use > the 'name' argument to extract this comparison, whereas with version > >= 1.3 we want to encourage users to use the 'contrast' argument: > > results(dds, contrast=c("metas","met_yes","met_no")) > > If the analysis includes interaction terms, previously the above > contrast would give the metastasis effect only for the base level of > the other factors in the design, but for version >= 1.3, this gives > the overall metastasis effect over all levels of the time variable. > > If you want the individual time-period-specific effects, this is the > overall effect plus the interaction effects for that time period.You > have the following columns of the model matrix: > > > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" > > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" > > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" > > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" > > So the metastasis effect specific to time period 2 would be, the > contrast of "metasmet_yes" vs "metasmet_no" and additionally, the > contrast of "timetime_2.metasmet_yes" vs "timetime_2.metasmet_no". We > can pull these out using the numeric contrast vector, starting with > all 0's and then filing in the -1's and 1's: > > ctrst <- numeric(21) > rn <- resultsNames(dds) > ctrst[ rn == "metasmet_no" ] <- -1 > ctrst[ rn == "metasmet_yes" ] <- 1 > ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 ctrst[ rn == > "timetime_2.metasmet_yes" ] <- 1 results(dds, contrast=ctrst) > > > Mike > > > > On Mon, Mar 17, 2014 at 5:07 AM, solgakar at bi.technion.ac.il > <solgakar at="" bi.technion.ac.il=""> wrote: > > > > > > > > From: Karen Chait [mailto:kchait at tx.technion.ac.il] > > Sent: Monday, March 17, 2014 10:57 AM > > To: Olga Karinsky > > Subject: RE: using DESeq2 with multi factor data > > > > Hello all, > > I am trying to use the DESeq2 package to perform RNA-Seq analysis on a data containing several factors. > > I have been closely following the emails between Ming Yi and Michael Love, because I think that my problem is very similar to what they have discussed. But even though I received a lot of useful information from their discussion, I still have several questions regarding my specific data. > > Just as an overall information regarding my data, I have 96 samples and the two factors I am interested in exploring are "time" and "metastasis". > > In order to build my data set I used the following commands: > > > > > countData = read.table("merged_counts.txt", header=TRUE, > > > row.names=1) > > > > > > > > metasVector=c("met_no","met_no","met_no","met_no","met_no","met_no > > > ","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," > > > met_no","met_no","met_no","met_no","met_no","met_no","met_no","met > > > _no","met_no","met_no","met_no","met_no","met_no","met_no","met_no > > > ","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," > > > met_no","met_no","met_no","met_no","met_no","met_no","met_no","met > > > _no","met_no","met_no","met_no","met_no","met_no","met_no","met_no > > > ","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," > > > met_no","met_no","met_no","met_no","met_no","met_no","met_no","met > > > _no","met_no","met_yes","met_yes","met_yes","met_yes","met_yes","m > > > et_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes > > > ","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met > > > _yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes", > > > "met_yes","met_yes","met_yes","met_yes","met_yes"( > > > > > > > > timePointsVector=c("6","4","6","6","3","6","3","5","6","6","1","5" > > > ,"3","4","3","6","6","6","2","6","1","2","4","6","5","5","5","3"," > > > 6","5","6","2","6","6","1","5","5","6","6","6","6","6","6","4","2" > > > ,"6","3","1","2","5","6","1","1","3","6","3","6","4","4","5","6"," > > > 6","3","5","4","6","1","4","3","1","1","1","4","2","1","1","3","6" > > > ,"1","1","2","1","6","3","3","2","5","3","2","3","1","4","1","1"," > > > 6","1") > > > > > > > > colData=data.frame(row.names=colnames(countData),metas=metasVector > > > ,gender=gendarVector) > > > > > colData$metas=factor(colData$metas, levels=c("met_no","met_yes")) > > > > > colData$time = factor(colData$time, levels = c("1", "2", "3", > > > "4", "5", "6")) > > > > > dds=DESeqDataSetFromMatrix(countData=tmpcountData, > > > colData=colData, design=~time + metas + metas:time) > > > > > dds=DESeq(dds) > > I have several questions: > > > > - first of all I have tried running those commands on DESeq2 version 1.2.10 (R version 3.0.2) and DESeq2 version 1.3.47 (R version 3.0.2) and what I have received from the resultsNames() function I both cases is very different. Using the 1.2.10 version I have received: > > > > > resultsNames(dds) > > > > [1] "Intercept" "time_2_vs_1" "time_3_vs_1" "time_4_vs_1" "time_5_vs_1" "time_6_vs_1" "metas_met_yes_vs_met_no" "time2.metasmet_yes" > > > > [9] "time3.metasmet_yes" "time4.metasmet_yes" "time5.metasmet_yes" "time6.metasmet_yes" > > > > > sessionInfo() > > > > R version 3.0.2 (2013-09-25) > > > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > > > > > locale: > > > > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 > > > > > > > > attached base packages: > > > > [1] parallel stats graphics grDevices utils datasets methods base > > > > > > > > other attached packages: > > > > [1] DESeq2_1.2.10 RcppArmadillo_0.4.100.2.1 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0 > > > > > > > > loaded via a namespace (and not attached): > > > > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 grid_3.0.2 lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 > > > > [11] splines_3.0.2 stats4_3.0.2 survival_2.37-7 tools_3.0.2 XML_3.98-1.1 xtable_1.7-3 > > > > > > > > Using the 1.3.47 version I have received: > > > > > resultsNames(dds) > > > > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" > > > > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" > > > > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" > > > > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" > > > > > sessionInfo() > > > > R version 3.0.2 (2013-09-25) > > > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > > > > > > > locale: > > > > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 > > > > > > > > attached base packages: > > > > [1] parallel stats graphics grDevices utils datasets methods base > > > > > > > > other attached packages: > > > > [1] DESeq2_1.3.47 RcppArmadillo_0.4.100.0 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 > > > > [7] BiocGenerics_0.8.0 > > > > > > > > loaded via a namespace (and not attached): > > > > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2 > > > > [8] lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2 survival_2.37-7 > > > > [15] XML_3.98-1.1 xtable_1.7-3 > > > > (I have ran the 1.3.47 version the same way besides a difference in the names of the time levels, but I do not believe that this is the reason for the differences) > > I don't fully understand the results I receive using the 1.3.47 version and even more the difference between the versions. > > > > - From my understanding, the results I received using the 1.2.10 version are the more reasonable and they fit my settings of base levels in the data. Now after receiving these results I would love to understand how do I receive different contrast testing? For each time period metas_yes vs. metas_no (for example timetime_2.metasmet_yes vs. timetime_2.metasmet_no) > > > > > > > > Thank you in advance, > > > > Olga and Karen > > > > > > > > [[alternative HTML version deleted]] > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor at r-project.org > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > > http://news.gmane.org/gmane.science.biology.informatics.conductor
Answer: DESeq2 extracting results in version >= 1.3 (was: "using DESeq2 with multi facto
0
5.3 years ago by
Michael Love24k
United States
Michael Love24k wrote:
hi Karen and Olga, On Mon, Mar 31, 2014 at 2:42 AM, Karen Chait <kchait at="" tx.technion.ac.il=""> wrote: > Hi Michael, > Thank you for your great help and your fast replies. > When I define 2 factors when each has only 2 levels I understand I can't use the expanded matrix. I want to perform the following analysis- >> colData=data.frame(row.names=colnames(tmpcountData),metas=metasVect or,time=timeVector) >> colData$metas=factor(colData$metas,levels=c("met_no","met_yes")) >> colData$time=factor(colData$time,levels=c("1","2")) >> dds2=DESeqDataSetFromMatrix(countData=tmpcountData,colData=colData, design=~time+metas+metas:time) >> dds2=DESeq(dds2) >> resultsNames(dds2) > [1] "Intercept" "time_2_vs_1" "metas_met_yes_vs_met_no" "time2.metasmet_yes" > > The comparisons I want to run are : > 1. all samples met_yes vs. met_no > 2. Samples with time=1 met_yes vs. met_no > 3. Samples with time=2 met_yes vs. met_no > We use standard design matrices for the 2x2 case because it simplifies the extraction of results. To test for interaction effects (is the effect of metastasis different at time 2 than time 1) we can simply call the single interaction term using 'name'. Concerning the comparisons you define above: The second comparison is: results(dds, contrast=c("metas","met_yes","met_no")) # equivalent to results(dds, contrast=c(0,0,1,0)) The third comparison is the main effect of metas_met_yes_vs_met_no plus the interaction effect at time=2: results(dds, contrast=list(c("metas_met_yes_vs_met_no", "time2.metasmet_yes"), character())) # equivalent to results(dds, contrast=c(0,0,1,1)) The first comparison is the main effect of metas_met_yes_vs_met_no plus half the interaction effect at time=2. This is somewhat intuitive given the two comparisons we just described. results(dds, contrast=c(0,0,1,1/2)) > I get the same results for comparisons 1 and 2 >> res1=results(dds,contrast=c("metas","met_yes","met_no"),alpha=0.05) >> res2=results(dds,contrast=c(0,0,1,0),alpha=0.05) >> res3=results(dds,contrast=c(0,0,1,1),alpha=0.05) > > I understand that for the first comparison I can define the design without the time factor. You could do this, or use the contrast I defined above. > For the second comparison, does DESeq calculate the differential expression only between the samples with time=1? Yes. -Mike > If not, can you explain how DESeq2 calculates the DE for these comparisons. > >> sessionInfo() > R version 3.0.3 (2014-03-06) > Platform: x86_64-unknown-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods > [8] base > > other attached packages: > [1] DESeq2_1.3.60 RcppArmadillo_0.4.100.2.1 > [3] Rcpp_0.11.1 GenomicRanges_1.14.4 > [5] XVector_0.2.0 IRanges_1.20.7 > [7] BiocGenerics_0.8.0 > > loaded via a namespace (and not attached): > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 > [4] DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 > [7] grid_3.0.3 lattice_0.20-27 locfit_1.5-9.1 > [10] RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.3 > [13] stats4_3.0.3 survival_2.37-7 tools_3.0.3 > [16] XML_3.98-1.1 xtable_1.7-3 > > > Thank you so much, > Karen and Olga > > > -----Original Message----- > From: Michael Love [mailto:michaelisaiahlove at gmail.com] > Sent: Monday, March 24, 2014 5:24 PM > To: solgakar at bi.technion.ac.il; karen chait (kchait at tx.technion.ac.il) > Cc: bioconductor at r-project.org > Subject: Re: DESeq2 extracting results in version >= 1.3 (was: "using DESeq2 with multi factor data") > > hi Olga and Karen, > > I updated the results() function in DESeq2 version 1.3 to simplify extracting the kind of contrast we were discussing, combining main effects and interactions. While I emailed the following code to build a numeric contrast, > >> ctrst <- numeric(21) >> rn <- resultsNames(dds) >> ctrst[ rn == "metasmet_no" ] <- -1 >> ctrst[ rn == "metasmet_yes" ] <- 1 >> ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 ctrst[ rn == >> "timetime_2.metasmet_yes" ] <- 1 results(dds, contrast=ctrst) > > this can now be accomplished by giving a list to the 'contrast' > argument. The first element of the list should be the effects for the numerator of the log fold change, and the second element should be the effects for the denominator: > > results( dds, contrast = list( > c("metasmet_yes", "timetime_2.metasmet_yes"), > c("metasmet_no", "timetime_2.metasmet_no") ) ) > > There are more details and examples in the man page for ?results for the development branch. > > Mike > > > On Mon, Mar 17, 2014 at 9:33 AM, Michael Love <michaelisaiahlove at="" gmail.com=""> wrote: >> >> hi Olga and Karen, >> >> Thanks for your interest in DESeq2 and for using the development >> branch, which allows us to clarify changes before they enter the >> release branch. >> >> This concerns a modeling change from v1.2 to v1.3. Changes like these >> are documented in the NEWS file with pointers to the >> functions/arguments that changed: >> (e.g. http://www.bioconductor.org/packages/2.14/bioc/news/DESeq2/NEWS) >> I've been working on including more documentation in the package, but >> haven't gotten to adding the parts about interaction terms yet, so >> this gives me a chance to explain a bit. >> >> The new resultsNames(dds) are due to "expanded model matrices", which >> include a column in the design matrix for each level of a factor, as >> opposed to "standard model matrices" produced by model.matrix, which >> have a column in the deisgn matrix for the levels other than the base >> level. Expanded model matrices are described a bit in the man pages >> and in the vignette, but not yet using an example with interaction >> effects. This makes the DE analysis independent of the choice to base >> level, whereas previously the log fold change (LFC) shrinkage was not >> independent of base level choices. In addition, it simplifies certain >> comparisons. >> >> You now have the columns "metasmet_no" and "metasmet_yes", whereas >> before you had "metas_met_yes_vs_met_no". Previously you would use >> the 'name' argument to extract this comparison, whereas with version >> >= 1.3 we want to encourage users to use the 'contrast' argument: >> >> results(dds, contrast=c("metas","met_yes","met_no")) >> >> If the analysis includes interaction terms, previously the above >> contrast would give the metastasis effect only for the base level of >> the other factors in the design, but for version >= 1.3, this gives >> the overall metastasis effect over all levels of the time variable. >> >> If you want the individual time-period-specific effects, this is the >> overall effect plus the interaction effects for that time period.You >> have the following columns of the model matrix: >> >> > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" >> > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" >> > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" >> > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" >> >> So the metastasis effect specific to time period 2 would be, the >> contrast of "metasmet_yes" vs "metasmet_no" and additionally, the >> contrast of "timetime_2.metasmet_yes" vs "timetime_2.metasmet_no". We >> can pull these out using the numeric contrast vector, starting with >> all 0's and then filing in the -1's and 1's: >> >> ctrst <- numeric(21) >> rn <- resultsNames(dds) >> ctrst[ rn == "metasmet_no" ] <- -1 >> ctrst[ rn == "metasmet_yes" ] <- 1 >> ctrst[ rn == "timetime_2.metasmet_no" ] <- -1 ctrst[ rn == >> "timetime_2.metasmet_yes" ] <- 1 results(dds, contrast=ctrst) >> >> >> Mike >> >> >> >> On Mon, Mar 17, 2014 at 5:07 AM, solgakar at bi.technion.ac.il >> <solgakar at="" bi.technion.ac.il=""> wrote: >> > >> > >> > >> > From: Karen Chait [mailto:kchait at tx.technion.ac.il] >> > Sent: Monday, March 17, 2014 10:57 AM >> > To: Olga Karinsky >> > Subject: RE: using DESeq2 with multi factor data >> > >> > Hello all, >> > I am trying to use the DESeq2 package to perform RNA-Seq analysis on a data containing several factors. >> > I have been closely following the emails between Ming Yi and Michael Love, because I think that my problem is very similar to what they have discussed. But even though I received a lot of useful information from their discussion, I still have several questions regarding my specific data. >> > Just as an overall information regarding my data, I have 96 samples and the two factors I am interested in exploring are "time" and "metastasis". >> > In order to build my data set I used the following commands: >> > >> > > countData = read.table("merged_counts.txt", header=TRUE, >> > > row.names=1) >> > >> > > >> > > metasVector=c("met_no","met_no","met_no","met_no","met_no","met_no >> > > ","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," >> > > met_no","met_no","met_no","met_no","met_no","met_no","met_no","met >> > > _no","met_no","met_no","met_no","met_no","met_no","met_no","met_no >> > > ","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," >> > > met_no","met_no","met_no","met_no","met_no","met_no","met_no","met >> > > _no","met_no","met_no","met_no","met_no","met_no","met_no","met_no >> > > ","met_no","met_no","met_no","met_no","met_no","met_no","met_no"," >> > > met_no","met_no","met_no","met_no","met_no","met_no","met_no","met >> > > _no","met_no","met_yes","met_yes","met_yes","met_yes","met_yes","m >> > > et_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes >> > > ","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes","met >> > > _yes","met_yes","met_yes","met_yes","met_yes","met_yes","met_yes", >> > > "met_yes","met_yes","met_yes","met_yes","met_yes"( >> > >> > > >> > > timePointsVector=c("6","4","6","6","3","6","3","5","6","6","1","5" >> > > ,"3","4","3","6","6","6","2","6","1","2","4","6","5","5","5","3"," >> > > 6","5","6","2","6","6","1","5","5","6","6","6","6","6","6","4","2" >> > > ,"6","3","1","2","5","6","1","1","3","6","3","6","4","4","5","6"," >> > > 6","3","5","4","6","1","4","3","1","1","1","4","2","1","1","3","6" >> > > ,"1","1","2","1","6","3","3","2","5","3","2","3","1","4","1","1"," >> > > 6","1") >> > >> > > >> > > colData=data.frame(row.names=colnames(countData),metas=metasVector >> > > ,gender=gendarVector) >> > >> > > colData$metas=factor(colData$metas, levels=c("met_no","met_yes")) >> > >> > > colData$time = factor(colData$time, levels = c("1", "2", "3", >> > > "4", "5", "6")) >> > >> > > dds=DESeqDataSetFromMatrix(countData=tmpcountData, >> > > colData=colData, design=~time + metas + metas:time) >> > >> > > dds=DESeq(dds) >> > I have several questions: >> > >> > - first of all I have tried running those commands on DESeq2 version 1.2.10 (R version 3.0.2) and DESeq2 version 1.3.47 (R version 3.0.2) and what I have received from the resultsNames() function I both cases is very different. Using the 1.2.10 version I have received: >> > >> > > resultsNames(dds) >> > >> > [1] "Intercept" "time_2_vs_1" "time_3_vs_1" "time_4_vs_1" "time_5_vs_1" "time_6_vs_1" "metas_met_yes_vs_met_no" "time2.metasmet_yes" >> > >> > [9] "time3.metasmet_yes" "time4.metasmet_yes" "time5.metasmet_yes" "time6.metasmet_yes" >> > >> > > sessionInfo() >> > >> > R version 3.0.2 (2013-09-25) >> > >> > Platform: x86_64-w64-mingw32/x64 (64-bit) >> > >> > >> > >> > locale: >> > >> > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 >> > >> > >> > >> > attached base packages: >> > >> > [1] parallel stats graphics grDevices utils datasets methods base >> > >> > >> > >> > other attached packages: >> > >> > [1] DESeq2_1.2.10 RcppArmadillo_0.4.100.2.1 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 BiocGenerics_0.8.0 >> > >> > >> > >> > loaded via a namespace (and not attached): >> > >> > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 grid_3.0.2 lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 >> > >> > [11] splines_3.0.2 stats4_3.0.2 survival_2.37-7 tools_3.0.2 XML_3.98-1.1 xtable_1.7-3 >> > >> > >> > >> > Using the 1.3.47 version I have received: >> > >> > > resultsNames(dds) >> > >> > [1] "Intercept" "timetime_1" "timetime_2" "timetime_3" "timetime_4" "timetime_5" >> > >> > [7] "timetime_6" "metasmet_no" "metasmet_yes" "timetime_1.metasmet_no" "timetime_2.metasmet_no" "timetime_3.metasmet_no" >> > >> > [13] "timetime_4.metasmet_no" "timetime_5.metasmet_no" "timetime_6.metasmet_no" "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes" >> > >> > [19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes" >> > >> > > sessionInfo() >> > >> > R version 3.0.2 (2013-09-25) >> > >> > Platform: x86_64-w64-mingw32/x64 (64-bit) >> > >> > >> > >> > locale: >> > >> > [1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255 LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C LC_TIME=Hebrew_Israel.1255 >> > >> > >> > >> > attached base packages: >> > >> > [1] parallel stats graphics grDevices utils datasets methods base >> > >> > >> > >> > other attached packages: >> > >> > [1] DESeq2_1.3.47 RcppArmadillo_0.4.100.0 Rcpp_0.11.0 GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7 >> > >> > [7] BiocGenerics_0.8.0 >> > >> > >> > >> > loaded via a namespace (and not attached): >> > >> > [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2 >> > >> > [8] lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2 survival_2.37-7 >> > >> > [15] XML_3.98-1.1 xtable_1.7-3 >> > >> > (I have ran the 1.3.47 version the same way besides a difference in the names of the time levels, but I do not believe that this is the reason for the differences) >> > I don't fully understand the results I receive using the 1.3.47 version and even more the difference between the versions. >> > >> > - From my understanding, the results I received using the 1.2.10 version are the more reasonable and they fit my settings of base levels in the data. Now after receiving these results I would love to understand how do I receive different contrast testing? For each time period metas_yes vs. metas_no (for example timetime_2.metasmet_yes vs. timetime_2.metasmet_no) >> > >> > >> > >> > Thank you in advance, >> > >> > Olga and Karen >> > >> > >> > >> > [[alternative HTML version deleted]] >> > >> > _______________________________________________ >> > Bioconductor mailing list >> > Bioconductor at r-project.org >> > https://stat.ethz.ch/mailman/listinfo/bioconductor >> > Search the archives: >> > http://news.gmane.org/gmane.science.biology.informatics.conductor >