Search
Question: DESeq2
0
4.4 years ago by
AROA SUÁREZ VEGA20 wrote:
Hi Michael, Now that the new version is available I have some doubts about how to do the contrast. I run the following commands: ##Loading DESeq2 source("http://bioconductor.org/biocLite.R") biocLite("DESeq2") library("DESeq2") setwd("~/DESeq2") milk<- read.csv("global_count_genes_milk.csv", header= TRUE, row.names=1, sep=" ") milk<-subset(milk,select=c("A1_D10","A2_D10","A3_D10","A4_D10","C1_D10 ","C2_D10","C3_D10","C4_D10","A1_D50","A2_D50","A3_D50","A4_D50","C1_D 50","C2_D50","C3_D50","C4_D50","A2_D120","A3_D120","A4_D120","C1_D120" ,"C3_D120","C4_D120","A1_D150","A2_D150","A3_D150","A4_D150","C1_D150" ,"C2_D150","C3_D150","C4_D150")) milk_design=data.frame(row.names = colnames(milk), breed = c("A", "A", "A","A","C","C","C","C","A","A","A","A","C","C","C", "C","A","A","A","C","C","C","A","A","A","A","C","C","C","C"), day = c("D10","D10","D10","D10","D10","D10","D10","D10","D50","D50","D50","D 50","D50","D50","D50","D50","D120","D120","D120","D120","D120","D120", "D150","D150","D150","D150","D150","D150","D150","D150")) ##Creating DESeqDataSetFromMatrix dds<- DESeqDataSetFromMatrix(countData= milk, colData= milk_design, design= ~ breed + day) dds$breed<- factor(dds$breed, levels=c("A","C")) dds$day<- factor(dds$day, levels=c("D10","D120","D150","D50")) ##Normalization procedure dds<-estimateSizeFactors(dds) dds<-estimateDispersions(dds) dds<-nbinomWaldTest(dds) ##Setting model and performing analysis ddsX<-dds design (ddsX)<-~breed+day+breed:day ddsX<-DESeq(ddsX) Running the model with interactions I obtain the following results names: resultsNames(ddsX) [1] "Intercept" "breedA" "breedC" [4] "dayD10" "dayD120" "dayD150" [7] "dayD50" "breedA.dayD10" "breedC.dayD10" [10] "breedA.dayD120" "breedC.dayD120" "breedA.dayD150" [13] "breedC.dayD150" "breedA.dayD50" "breedC.dayD50" I know that if I run the following contrast: res_CvsA<-results(ddsX, contrast=list("breedC","breedA")) I obtain the differentially expressed genes between C and A for all the points. And if I run: resD10_CvsA<-results(ddsX, contrast=list("breedC.dayD10","breedA.dayD10")) I obtain the interaction effect in addition to the main effect of breed C over breed A. But, if I want to obtain only the differentially expressed genes at that point without taken into account the baseline. In my experiment I want to know the differentially expressed genes between the two breeds but it would be also interesting to obtain the differentially expressed genes between the two breeds at each time point. How do I have to proceed? We also want to obtain the genes differentially expressed for each breed comparing the different time points. Thank you very much in advance. Aroa Aroa SuÃ¡rez Vega PhD student Dpto. ProducciÃ³n Animal Facultad de Veterinaria Campus de Vegazana s/n 24071 Leon Telef. 987291000 Ext. 5296 2014-04-04 18:14 GMT+02:00 Michael Love <michaelisaiahlove@gmail.com>: > âhi Aroa, > > With the design "~ breed + day", you assume that the breed effect is the > same for each day, and the day effect is the same for each breed. So there > is no way to get different breed effects for each time point, or vice versa > with this design.â If you investigated and found that the interaction > terms were small/not significant, then it means that there are not specific > breed effects for each day, but that the main effect is sufficient. > > Mike > > > On Fri, Apr 4, 2014 at 12:10 PM, AROA SUÃREZ VEGA <asuav@unileon.es>wrote: > >> Thank you very much for your answer, so if I understand well, even I >> don't have interaction I need to use the model with the interaction >> term, isn't it? >> >> Aroa SuÃ¡rez Vega >> PhD student >> Dpto. ProducciÃ³n Animal >> Facultad de Veterinaria >> Campus de Vegazana s/n >> 24071 Leon >> Telef. 987291000 Ext. 5296 >> >> >> 2014-04-04 17:59 GMT+02:00 Michael Love <michaelisaiahlove@gmail.com>: >> >>> >>> On Fri, Apr 4, 2014 at 11:30 AM, aroa [guest] <guest@bioconductor.org>wrote: >>> >>>> >>>> We have and experiment to measure the differences in the milk >>>> transcriptome for two breeds. We have RNA-seq samples for these two breeds >>>> in 4 different time points, for each breed and time point we have three >>>> replicates. >>>> >>>> Day1 Day2 Day3 Day4 >>>> Breed1 3 3 3 3 >>>> Breed2 3 3 3 3 >>>> >>>> >>>> We want to use DESeq2 to extract the differential expressed (DE) genes >>>> for each time point between the two breeds and the DE genes for each breed >>>> comparing the different time points. >>>> We have tested for the interaction (~breed+day+breed:day) and we cannot >>>> find interaction between the breeds. >>>> Now we are running the model like this: >>>> design=data.frame(row.names = colnames(milk), breed = c("breed1", >>>> "breed1", >>>> "breed1","breed2","breed2","breed2","breed1","breed1","breed1","b reed2","breed2","breed2","breed1","breed1","breed1","breed2","breed2", "breed2","breed1","breed1","breed1","breed2","breed2","breed2"), >>>> day = >>>> c("D1","D1","D1","D1","D1","D1","D2","D2","D2","D2","D2","D2","D3 ","D3","D3","D3","D3","D3","D4","D4","D4","D4","D4","D4")) >>>> >>> >>> >>> side note: i would recommend putting the phenotypic data in a CSV file >>> and reading it in with read.csv. >>> >>> >>> >>> >>>> dds<- DESeqDataSetFromMatrix(countData= milk, colData= design, design= >>>> ~ breed + day) >>>> >>>> dds$breed<- factor(dds$breed, levels=c("breed1","breed2")) >>>> dds$day<- factor(dds$day, levels=c("D1","D2","D3","D4")) >>>> dds<-DESeq(dds, betaPrior=FALSE) >>>> resultsNames(dds) >>>> [1] "Intercept" "breed_breed2_vs_breed1" "day_D2_vs_D1" >>>> [4] "day_D3_vs_D1" "day_D4_vs_D1" >>>> >>>> We would like to know what is the meaning of the resultsNames?, we >>>> understand them like this: >>>> Intercept: breed1D1 >>>> breed_breed2_vs_breed1: breed2-breed1 >>>> day_D2_vs_D1: (D2-D1)breed1 >>>> day_D3_vs_D1: (D3-D1)breed1 >>>> day_D4_vs_D1: (D4-D1)breed1 >>>> >>>> >>> If you fit a model without interactions, then the last three terms here >>> are, e.g. D2 - D1 across both breeds, not specifically for breed 1. >>> >>> >>> >>>> And how we can make the contrast to obtain the results that we want, >>>> that it is comparing the different breeds in each time point and the >>>> different time points in each breed? >>>> >>> >>> To compare different breeds at each time point, you need to use the >>> model with the interaction term. Interaction terms are how you encode the >>> hypothesis, say, that the "breed 2 vs 1 on day 2" effect is not simply the >>> sum (in log2FC space) of the main "breed 2 vs 1" effect and the main "day 2 >>> vs day 1" effect. >>> >>> In version >= 1.3, we have simplified the extraction of results for >>> contrasts like this. This version will be released in 10 days, so I'd >>> prefer to recommend you upgrade to this version. If you need to stick with >>> v1.2 just email me and I can try to give details. For version 1.3, you >>> would fit a model with an interaction term. Then the breed 2 vs 1 effect on >>> day 2 can be done with a list() argument to contrasts (see ?results in >>> DESeq2 version >= 1.3 for more details): >>> >>> results(dds, contrast=list(c("breed_breed2","breedbreed2.dayD2"), >>> c("breed_breed1","breedbreed1.dayD2")) >>> >>> Mike >>> >>> >>> >>> Thank you in advance. >>>> >>>> >>>> -- output of sessionInfo(): >>>> >>>> R version 3.0.3 (2014-03-06) >>>> Platform: i386-w64-mingw32/i386 (32-bit) >>>> >>>> locale: >>>> [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 >>>> [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C >>>> [5] LC_TIME=French_France.1252 >>>> >>>> attached base packages: >>>> [1] parallel stats graphics grDevices utils datasets methods >>>> base >>>> >>>> other attached packages: >>>> [1] gplots_2.12.1 RColorBrewer_1.0-5 DESeq2_1.2.10 >>>> [4] RcppArmadillo_0.4.100.2.1 Rcpp_0.11.1 >>>> GenomicRanges_1.14.4 >>>> [7] XVector_0.2.0 IRanges_1.20.7 >>>> BiocGenerics_0.8.0 >>>> [10] BiocInstaller_1.12.0 >>>> >>>> loaded via a namespace (and not attached): >>>> [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0 >>>> bitops_1.0-6 >>>> [5] caTools_1.16 DBI_0.2-7 gdata_2.13.2 >>>> genefilter_1.44.0 >>>> [9] grid_3.0.3 gtools_3.3.1 KernSmooth_2.23-10 >>>> lattice_0.20-27 >>>> [13] locfit_1.5-9.1 RSQLite_0.11.4 splines_3.0.3 >>>> stats4_3.0.3 >>>> [17] survival_2.37-7 tools_3.0.3 XML_3.98-1.1 >>>> xtable_1.7-3 >>>> >>>> -- >>>> Sent via the guest posting facility at bioconductor.org. >>>> >>>> _______________________________________________ >>>> Bioconductor mailing list >>>> Bioconductor@r-project.org >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>>> Search the archives: >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>>> >>> >>> >> > [[alternative HTML version deleted]]
modified 4.4 years ago by Michael Love19k • written 4.4 years ago by AROA SUÁREZ VEGA20
0
4.4 years ago by
Michael Love19k
United States
Michael Love19k wrote:
hi Aroa, On Mon, May 12, 2014 at 5:22 AM, AROA SU?REZ VEGA <asuav at="" unileon.es=""> wrote: > Hi Michael, > > > > Now that the new version is available I have some doubts about how to do the > contrast. > > > I run the following commands: > > ##Loading DESeq2 > > source("http://bioconductor.org/biocLite.R") > > biocLite("DESeq2") > > library("DESeq2") > > > > setwd("~/DESeq2") > > milk<- read.csv("global_count_genes_milk.csv", header= TRUE, row.names=1, > sep=" ") > > milk<-subset(milk,select=c("A1_D10","A2_D10","A3_D10","A4_D10","C1_D 10","C2_D10","C3_D10","C4_D10","A1_D50","A2_D50","A3_D50","A4_D50","C1 _D50","C2_D50","C3_D50","C4_D50","A2_D120","A3_D120","A4_D120","C1_D12 0","C3_D120","C4_D120","A1_D150","A2_D150","A3_D150","A4_D150","C1_D15 0","C2_D150","C3_D150","C4_D150")) > > milk_design=data.frame(row.names = colnames(milk), breed = c("A", "A", > "A","A","C","C","C","C","A","A","A","A","C","C","C", > "C","A","A","A","C","C","C","A","A","A","A","C","C","C","C"), day = > c("D10","D10","D10","D10","D10","D10","D10","D10","D50","D50","D50", "D50","D50","D50","D50","D50","D120","D120","D120","D120","D120","D120 ","D150","D150","D150","D150","D150","D150","D150","D150")) > > > > ##Creating DESeqDataSetFromMatrix > > dds<- DESeqDataSetFromMatrix(countData= milk, colData= milk_design, design= > ~ breed + day) > > dds$breed<- factor(dds$breed, levels=c("A","C")) > > dds$day<- factor(dds$day, levels=c("D10","D120","D150","D50")) > > The following code is unnecessary, from here... > > ##Normalization procedure > > dds<-estimateSizeFactors(dds) > > dds<-estimateDispersions(dds) > > dds<-nbinomWaldTest(dds) > > ...to here. These lines perform the same operation as the DESeq() call below, except using the wrong design formula. Furthermore, the results from these calls above are erased by calling DESeq(), except for the estimateSizeFactors(dds) call, which produces the same result as below because it doesn't depend on the design. > > ##Setting model and performing analysis > > ddsX<-dds > > design (ddsX)<-~breed+day+breed:day > > ddsX<-DESeq(ddsX) > > > > Running the model with interactions I obtain the following results names: > > resultsNames(ddsX) > > [1] "Intercept" "breedA" "breedC" > > [4] "dayD10" "dayD120" "dayD150" > > [7] "dayD50" "breedA.dayD10" "breedC.dayD10" > > [10] "breedA.dayD120" "breedC.dayD120" "breedA.dayD150" > > [13] "breedC.dayD150" "breedA.dayD50" "breedC.dayD50" > > > > I know that if I run the following contrast: > > > res_CvsA<-results(ddsX, contrast=list("breedC","breedA")) > > > I obtain the differentially expressed genes between C and A for all the > points. > Yes. And this is equivalent to using contrast=c("breed","C","A"). > > And if I run: > > > resD10_CvsA<-results(ddsX, contrast=list("breedC.dayD10","breedA.dayD10")) > > > I obtain the interaction effect in addition to the main effect of breed C > over breed A. Yes. > But, if I want to obtain only the differentially expressed > genes at that point without taken into account the baseline. This is then the main effect and interaction effect added together: results(ddsX, contrast=list(c("breedC","breedC.dayD10"),c("breedA","br eedA.dayD10"))) > In my > experiment I want to know the differentially expressed genes between the two > breeds but it would be also interesting to obtain the differentially > expressed genes between the two breeds at each time point. How do I have to > proceed? > > > We also want to obtain the genes differentially expressed for each breed > comparing the different time points. > For example, the day 120 vs 10 effect for breed A is: results(ddsX, contrast=list(c("dayD120","breedA.dayD120"),c("dayD10"," breedA.dayD10"))) Mike > > Thank you very much in advance. > > > Aroa > > > Aroa Su?rez Vega > PhD student > Dpto. Producci?n Animal > Facultad de Veterinaria > Campus de Vegazana s/n > 24071 Leon > Telef. 987291000 Ext. 5296 > >