solgakar@bi.technion.ac.il
From: Karen Chait [mailto:kchait@tx.technion.ac.il]
Sent: Monday, March 17, 2014 10:57 AM
To: Olga Karinsky
Subject: RE: using DESeq2 with multi factor data
Hello all,
I am trying to use the DESeq2 package to perform RNA-Seq analysis on a
data set containing several factors.
I have been closely following the emails between Ming Yi and Michael
Love, because I think that my problem is very similar to what they
have discussed. But even though I received a lot of useful
information from their discussion, I still have several questions
regarding my specific data.
As background, I have 96 samples, and the two factors I am interested
in exploring are "time" and "metastasis".
In order to build my data set I used the following commands:
> library(DESeq2)
> countData = read.table("merged_counts.txt", header=TRUE, row.names=1)
> # 66 "met_no" samples followed by 30 "met_yes" samples (96 in total)
> metasVector=c(rep("met_no", 66), rep("met_yes", 30))
> timePointsVector=c("6","4","6","6","3","6","3","5","6","6","1","5",
    "3","4","3","6","6","6","2","6","1","2","4","6","5","5","5","3","6","5",
    "6","2","6","6","1","5","5","6","6","6","6","6","6","4","2","6","3",
    "1","2","5","6","1","1","3","6","3","6","4","4","5","6","6","3","5","4",
    "6","1","4","3","1","1","1","4","2","1","1","3","6","1","1","2","1",
    "6","3","3","2","5","3","2","3","1","4","1","1","6","1")
> colData=data.frame(row.names=colnames(countData), metas=metasVector,
    time=timePointsVector)
> colData$metas=factor(colData$metas, levels=c("met_no","met_yes"))
> colData$time = factor(colData$time, levels = c("1", "2", "3", "4", "5", "6"))
> dds=DESeqDataSetFromMatrix(countData=countData, colData=colData,
    design=~time + metas + metas:time)
> dds=DESeq(dds)
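As a rough check on what this design asks for, the formula can be expanded with base R's `model.matrix` on a small stand-in table. This is only a sketch: the `toy` data frame is hypothetical (one row per time and metastasis combination, not the real 96 samples), and whether a given DESeq2 version uses reference-level coding or one column per level is an assumption, not something established in this post. It shows that the same formula gives 12 columns under R's default treatment coding and 21 columns when every level gets its own indicator column.

```r
# Toy stand-in for colData (hypothetical): one row per time x metas combination
toy <- expand.grid(time  = factor(1:6),
                   metas = factor(c("met_no", "met_yes")))

# Default treatment (reference-level) coding of the design used above
mm_ref <- model.matrix(~ time + metas + metas:time, data = toy)

# Full indicator coding: one column per level, no reference level dropped
full <- list(time  = contrasts(toy$time,  contrasts = FALSE),
             metas = contrasts(toy$metas, contrasts = FALSE))
mm_full <- model.matrix(~ time + metas + metas:time, data = toy,
                        contrasts.arg = full)

print(colnames(mm_ref))  # intercept, 5 time terms, 1 metas term, 5 interactions
print(ncol(mm_ref))      # 12
print(ncol(mm_full))     # 21 = 1 + 6 + 2 + 12
```

Note that R orders the interaction columns as `time2:metasmet_yes` and so on, following the order in which the variables first appear in the formula, even though the formula writes the term as `metas:time`.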
I have several questions:
- First, I ran these commands with DESeq2 version 1.2.10 (R version
3.0.2) and with DESeq2 version 1.3.47 (R version 3.0.2), and the
output of the resultsNames() function is very different in the two
cases. With version 1.2.10 I get:
> resultsNames(dds)
 [1] "Intercept"               "time_2_vs_1"             "time_3_vs_1"
 [4] "time_4_vs_1"             "time_5_vs_1"             "time_6_vs_1"
 [7] "metas_met_yes_vs_met_no" "time2.metasmet_yes"      "time3.metasmet_yes"
[10] "time4.metasmet_yes"      "time5.metasmet_yes"      "time6.metasmet_yes"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255
LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C
LC_TIME=Hebrew_Israel.1255
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] DESeq2_1.2.10 RcppArmadillo_0.4.100.2.1 Rcpp_0.11.0
GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0
DBI_0.2-7 genefilter_1.44.0 grid_3.0.2
lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5
RSQLite_0.11.4
[11] splines_3.0.2 stats4_3.0.2 survival_2.37-7
tools_3.0.2 XML_3.98-1.1 xtable_1.7-3
Using the 1.3.47 version I have received:
> resultsNames(dds)
 [1] "Intercept"               "timetime_1"              "timetime_2"
 [4] "timetime_3"              "timetime_4"              "timetime_5"
 [7] "timetime_6"              "metasmet_no"             "metasmet_yes"
[10] "timetime_1.metasmet_no"  "timetime_2.metasmet_no"  "timetime_3.metasmet_no"
[13] "timetime_4.metasmet_no"  "timetime_5.metasmet_no"  "timetime_6.metasmet_no"
[16] "timetime_1.metasmet_yes" "timetime_2.metasmet_yes" "timetime_3.metasmet_yes"
[19] "timetime_4.metasmet_yes" "timetime_5.metasmet_yes" "timetime_6.metasmet_yes"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Hebrew_Israel.1255 LC_CTYPE=Hebrew_Israel.1255
LC_MONETARY=Hebrew_Israel.1255 LC_NUMERIC=C
LC_TIME=Hebrew_Israel.1255
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] DESeq2_1.3.47 RcppArmadillo_0.4.100.0 Rcpp_0.11.0
GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.7
[7] BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] annotate_1.40.1 AnnotationDbi_1.24.0 Biobase_2.22.0
DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0
grid_3.0.2
[8] lattice_0.20-27 locfit_1.5-9.1 RColorBrewer_1.0-5
RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2
survival_2.37-7
[15] XML_3.98-1.1 xtable_1.7-3
(I ran the 1.3.47 version the same way, apart from different names
for the time levels, but I do not believe that this is the reason for
the differences.)
I don't fully understand the results from version 1.3.47, and even
less the difference between the two versions.
- As I understand it, the results from version 1.2.10 are the more
reasonable ones, and they match the base levels I set in the data.
Given these results, how do I test other contrasts? Specifically, I
would like to compare met_yes vs. met_no within each time point (for
example, timetime_2.metasmet_yes vs. timetime_2.metasmet_no).
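To make the requested comparison concrete: under reference-level coding, the met_yes vs. met_no difference at a non-baseline time point is the metas main effect plus the matching interaction term. The sketch below illustrates that arithmetic with an ordinary linear model in base R on simulated data. The `toy` data frame and its `y` column are hypothetical, and `lm` stands in for the count model, so this shows only how the coefficients combine, not a DESeq2 analysis. In DESeq2 releases that accept a list-valued `contrast` in `results()`, the analogous step would sum the two corresponding coefficients, but that has not been checked against the versions used in this post.

```r
# Hypothetical toy data: 2 replicates per time x metas cell, simulated response
set.seed(1)
toy <- expand.grid(rep   = 1:2,
                   time  = factor(1:6),
                   metas = factor(c("met_no", "met_yes")))
toy$y <- rnorm(nrow(toy))

# Same design as in the post, fitted as a plain linear model
fit <- lm(y ~ time + metas + metas:time, data = toy)
b <- coef(fit)

# met_yes vs. met_no at time 2 = main effect + time-2 interaction term
eff_t2 <- b[["metasmet_yes"]] + b[["time2:metasmet_yes"]]

# Cross-check against the difference of the observed cell means at time 2
cell <- with(toy, tapply(y, list(time, metas), mean))
diff_t2 <- cell["2", "met_yes"] - cell["2", "met_no"]

print(all.equal(eff_t2, diff_t2))  # TRUE
```

At the baseline time point (time 1) the interaction term is absent, so the comparison there is the main effect `metasmet_yes` alone.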
Thank you in advance,
Olga and Karen