Question: DESeq2

0

AROA SUÁREZ VEGA •

**20**wrote:Hi Michael,
Now that the new version is available I have some doubts about how to
do
the contrast.
I run the following commands:
##Loading DESeq2
source("http://bioconductor.org/biocLite.R")
biocLite("DESeq2")
library("DESeq2")
setwd("~/DESeq2")
milk<- read.csv("global_count_genes_milk.csv", header= TRUE,
row.names=1,
sep=" ")
milk<-subset(milk,select=c("A1_D10","A2_D10","A3_D10","A4_D10","C1_D10
","C2_D10","C3_D10","C4_D10","A1_D50","A2_D50","A3_D50","A4_D50","C1_D
50","C2_D50","C3_D50","C4_D50","A2_D120","A3_D120","A4_D120","C1_D120"
,"C3_D120","C4_D120","A1_D150","A2_D150","A3_D150","A4_D150","C1_D150"
,"C2_D150","C3_D150","C4_D150"))
milk_design=data.frame(row.names = colnames(milk), breed = c("A", "A",
"A","A","C","C","C","C","A","A","A","A","C","C","C",
"C","A","A","A","C","C","C","A","A","A","A","C","C","C","C"), day =
c("D10","D10","D10","D10","D10","D10","D10","D10","D50","D50","D50","D
50","D50","D50","D50","D50","D120","D120","D120","D120","D120","D120",
"D150","D150","D150","D150","D150","D150","D150","D150"))
##Creating DESeqDataSetFromMatrix
dds<- DESeqDataSetFromMatrix(countData= milk, colData= milk_design,
design=
~ breed + day)
dds$breed<- factor(dds$breed, levels=c("A","C"))
dds$day<- factor(dds$day, levels=c("D10","D120","D150","D50"))
##Normalization procedure
dds<-estimateSizeFactors(dds)
dds<-estimateDispersions(dds)
dds<-nbinomWaldTest(dds)
##Setting model and performing analysis
ddsX<-dds
design (ddsX)<-~breed+day+breed:day
ddsX<-DESeq(ddsX)
Running the model with interactions I obtain the following results
names:
resultsNames(ddsX)
[1] "Intercept" "breedA" "breedC"
[4] "dayD10" "dayD120" "dayD150"
[7] "dayD50" "breedA.dayD10" "breedC.dayD10"
[10] "breedA.dayD120" "breedC.dayD120" "breedA.dayD150"
[13] "breedC.dayD150" "breedA.dayD50" "breedC.dayD50"
I know that if I run the following contrast:
res_CvsA<-results(ddsX, contrast=list("breedC","breedA"))
I obtain the differentially expressed genes between C and A for all
the
points.
And if I run:
resD10_CvsA<-results(ddsX,
contrast=list("breedC.dayD10","breedA.dayD10"))
I obtain the interaction effect in addition to the main effect of
breed C
over breed A. But, if I want to obtain only the differentially
expressed
genes at that point without taken into account the baseline. In my
experiment I want to know the differentially expressed genes between
the
two breeds but it would be also interesting to obtain the
differentially
expressed genes between the two breeds at each time point. How do I
have to
proceed?
We also want to obtain the genes differentially expressed for each
breed
comparing the different time points.
Thank you very much in advance.
Aroa
Aroa SuÃ¡rez Vega
PhD student
Dpto. ProducciÃ³n Animal
Facultad de Veterinaria
Campus de Vegazana s/n
24071 Leon
Telef. 987291000 Ext. 5296
2014-04-04 18:14 GMT+02:00 Michael Love <michaelisaiahlove@gmail.com>:
> âhi Aroa,
>
> With the design "~ breed + day", you assume that the breed effect is
the
> same for each day, and the day effect is the same for each breed. So
there
> is no way to get different breed effects for each time point, or
vice versa
> with this design.â If you investigated and found that the
interaction
> terms were small/not significant, then it means that there are not
specific
> breed effects for each day, but that the main effect is sufficient.
>
> Mike
>
>
> On Fri, Apr 4, 2014 at 12:10 PM, AROA SUÃREZ VEGA
<asuav@unileon.es>wrote:
>
>> Thank you very much for your answer, so if I understand well, even
I
>> don't have interaction I need to use the model with the interaction
>> term, isn't it?
>>
>> Aroa SuÃ¡rez Vega
>> PhD student
>> Dpto. ProducciÃ³n Animal
>> Facultad de Veterinaria
>> Campus de Vegazana s/n
>> 24071 Leon
>> Telef. 987291000 Ext. 5296
>>
>>
>> 2014-04-04 17:59 GMT+02:00 Michael Love
<michaelisaiahlove@gmail.com>:
>>
>>>
>>> On Fri, Apr 4, 2014 at 11:30 AM, aroa [guest]
<guest@bioconductor.org>wrote:
>>>
>>>>
>>>> We have and experiment to measure the differences in the milk
>>>> transcriptome for two breeds. We have RNA-seq samples for these
two breeds
>>>> in 4 different time points, for each breed and time point we have
three
>>>> replicates.
>>>>
>>>> Day1 Day2 Day3 Day4
>>>> Breed1 3 3 3 3
>>>> Breed2 3 3 3 3
>>>>
>>>>
>>>> We want to use DESeq2 to extract the differential expressed (DE)
genes
>>>> for each time point between the two breeds and the DE genes for
each breed
>>>> comparing the different time points.
>>>> We have tested for the interaction (~breed+day+breed:day) and we
cannot
>>>> find interaction between the breeds.
>>>> Now we are running the model like this:
>>>> design=data.frame(row.names = colnames(milk), breed = c("breed1",
>>>> "breed1",
>>>> "breed1","breed2","breed2","breed2","breed1","breed1","breed1","b
reed2","breed2","breed2","breed1","breed1","breed1","breed2","breed2",
"breed2","breed1","breed1","breed1","breed2","breed2","breed2"),
>>>> day =
>>>> c("D1","D1","D1","D1","D1","D1","D2","D2","D2","D2","D2","D2","D3
","D3","D3","D3","D3","D3","D4","D4","D4","D4","D4","D4"))
>>>>
>>>
>>>
>>> side note: i would recommend putting the phenotypic data in a CSV
file
>>> and reading it in with read.csv.
>>>
>>>
>>>
>>>
>>>> dds<- DESeqDataSetFromMatrix(countData= milk, colData= design,
design=
>>>> ~ breed + day)
>>>>
>>>> dds$breed<- factor(dds$breed, levels=c("breed1","breed2"))
>>>> dds$day<- factor(dds$day, levels=c("D1","D2","D3","D4"))
>>>> dds<-DESeq(dds, betaPrior=FALSE)
>>>> resultsNames(dds)
>>>> [1] "Intercept" "breed_breed2_vs_breed1"
"day_D2_vs_D1"
>>>> [4] "day_D3_vs_D1" "day_D4_vs_D1"
>>>>
>>>> We would like to know what is the meaning of the resultsNames?,
we
>>>> understand them like this:
>>>> Intercept: breed1D1
>>>> breed_breed2_vs_breed1: breed2-breed1
>>>> day_D2_vs_D1: (D2-D1)breed1
>>>> day_D3_vs_D1: (D3-D1)breed1
>>>> day_D4_vs_D1: (D4-D1)breed1
>>>>
>>>>
>>> If you fit a model without interactions, then the last three terms
here
>>> are, e.g. D2 - D1 across both breeds, not specifically for breed
1.
>>>
>>>
>>>
>>>> And how we can make the contrast to obtain the results that we
want,
>>>> that it is comparing the different breeds in each time point and
the
>>>> different time points in each breed?
>>>>
>>>
>>> To compare different breeds at each time point, you need to use
the
>>> model with the interaction term. Interaction terms are how you
encode the
>>> hypothesis, say, that the "breed 2 vs 1 on day 2" effect is not
simply the
>>> sum (in log2FC space) of the main "breed 2 vs 1" effect and the
main "day 2
>>> vs day 1" effect.
>>>
>>> In version >= 1.3, we have simplified the extraction of results
for
>>> contrasts like this. This version will be released in 10 days, so
I'd
>>> prefer to recommend you upgrade to this version. If you need to
stick with
>>> v1.2 just email me and I can try to give details. For version 1.3,
you
>>> would fit a model with an interaction term. Then the breed 2 vs 1
effect on
>>> day 2 can be done with a list() argument to contrasts (see
?results in
>>> DESeq2 version >= 1.3 for more details):
>>>
>>> results(dds, contrast=list(c("breed_breed2","breedbreed2.dayD2"),
>>> c("breed_breed1","breedbreed1.dayD2"))
>>>
>>> Mike
>>>
>>>
>>>
>>> Thank you in advance.
>>>>
>>>>
>>>> -- output of sessionInfo():
>>>>
>>>> R version 3.0.3 (2014-03-06)
>>>> Platform: i386-w64-mingw32/i386 (32-bit)
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252
>>>> [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C
>>>> [5] LC_TIME=French_France.1252
>>>>
>>>> attached base packages:
>>>> [1] parallel stats graphics grDevices utils datasets
methods
>>>> base
>>>>
>>>> other attached packages:
>>>> [1] gplots_2.12.1 RColorBrewer_1.0-5
DESeq2_1.2.10
>>>> [4] RcppArmadillo_0.4.100.2.1 Rcpp_0.11.1
>>>> GenomicRanges_1.14.4
>>>> [7] XVector_0.2.0 IRanges_1.20.7
>>>> BiocGenerics_0.8.0
>>>> [10] BiocInstaller_1.12.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0
>>>> bitops_1.0-6
>>>> [5] caTools_1.16 DBI_0.2-7 gdata_2.13.2
>>>> genefilter_1.44.0
>>>> [9] grid_3.0.3 gtools_3.3.1 KernSmooth_2.23-10
>>>> lattice_0.20-27
>>>> [13] locfit_1.5-9.1 RSQLite_0.11.4 splines_3.0.3
>>>> stats4_3.0.3
>>>> [17] survival_2.37-7 tools_3.0.3 XML_3.98-1.1
>>>> xtable_1.7-3
>>>>
>>>> --
>>>> Sent via the guest posting facility at bioconductor.org.
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor@r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>
>>>
>>
>
[[alternative HTML version deleted]]

ADD COMMENT
• link
•
modified 4.6 years ago
by
Michael Love ♦

**20k**• written 4.6 years ago by AROA SUÁREZ VEGA •**20**