Question: Identifying cell line-independent DEG in a time series (DESeq2)
0
gravatar for jhnwphillips
10 weeks ago by
jhnwphillips0 wrote:

Hello there.

I would like to identify the cell line-independent differentially expressed genes in a time series experiment, i.e. those genes that respond similarly to the stimulus regardless of cell line background. This is the opposite scenario from the one described in the DESeq2 documentation for the fission dataset and I am unsure how to proceed. Please help!

I have three cell lines (X9342, X10742, X10743), a five point time course (0h, 4h, 8h, 16h, 24h), and three biological replicates (T1, T2, T4) for each (total 45 samples). The code is as follows:

#Display coldata
> coldata
                 cell_line time_point
X9342_0h_T1          X9342         t0
X10743_0h_T2        X10743         t0
X10743_0h_T4        X10743         t0
...
X10742_24h_T1       X10742        t24
X10742_24h_T2       X10742        t24
X10742_24h_T4       X10742        t24


#Run DESeq with the interaction term
> dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~cell_line + time_point + cell_line:time_point)
dds <- DESeq(dds)

#Define a contrast and write an output
> res <- results(dds, contrast=c("time_point", "t0", "t24"))
> res <- res[order(res$padj),]
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=T)), by="row.names", sort=F)
write.table(resdata, file="filename.txt", sep="\t", quote=F, row.names=F)

By inspection of the DEG output, this seems a bit off as some of the genes given low p-adj with large LFC don't show similar behavior when I look at the listed values in the accompanying matrix.

I suspect I have to do something similar as the fission example just with different terms. Please advise.

#fission example from DESeq2 workflow
ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + minute)
resTC <- results(ddsTC)
resTC$symbol <- mcols(ddsTC)$symbol
head(resTC[order(resTC$padj),], 4)
#sessionInfo just in case
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DESeq2_1.14.1              SummarizedExperiment_1.4.0 Biobase_2.34.0             GenomicRanges_1.26.4      
[5] GenomeInfoDb_1.10.3        IRanges_2.8.2              S4Vectors_0.12.2           BiocGenerics_0.20.0       

loaded via a namespace (and not attached):
 [1] genefilter_1.56.0    locfit_1.5-9.1       splines_3.3.3        lattice_0.20-38      colorspace_1.4-1    
 [6] htmltools_0.3.6      base64enc_0.1-3      survival_2.44-1.1    XML_3.98-1.20        foreign_0.8-71      
[11] DBI_1.0.0            BiocParallel_1.8.1   RColorBrewer_1.1-2   plyr_1.8.4           stringr_1.4.0       
[16] zlibbioc_1.20.0      munsell_0.5.0        gtable_0.3.0         htmlwidgets_1.3      memoise_1.1.0       
[21] latticeExtra_0.6-28  knitr_1.23           geneplotter_1.52.0   AnnotationDbi_1.36.2 htmlTable_1.13.1    
[26] Rcpp_1.0.2           acepack_1.4.1        xtable_1.8-4         scales_1.0.0         backports_1.1.4     
[31] checkmate_1.9.4      Hmisc_4.1-1          annotate_1.52.1      XVector_0.14.1       gridExtra_2.3       
[36] ggplot2_3.2.0        digest_0.6.20        stringi_1.4.3        grid_3.3.3           tools_3.3.3         
[41] bitops_1.0-6         magrittr_1.5         lazyeval_0.2.2       RCurl_1.95-4.12      tibble_2.1.3        
[46] RSQLite_2.1.2        Formula_1.2-3        cluster_2.0.6        Matrix_1.2-12        data.table_1.12.2   
[51] assertthat_0.2.1     rpart_4.1-15         nnet_7.3-12         
deseq2 • 149 views
ADD COMMENTlink modified 10 weeks ago by Michael Love25k • written 10 weeks ago by jhnwphillips0
Answer: Identifying cell line-independent DEG in a time series (DESeq2)
1
gravatar for Michael Love
10 weeks ago by
Michael Love25k
United States
Michael Love25k wrote:

It seems like you could use a design of ~cell + time and perform and LRT with a reduced design of ~cell giving you the genes where there is a consistent time profile across the cell lines.

ADD COMMENTlink written 10 weeks ago by Michael Love25k

This appears to have worked! Thanks so much. Code follows (please double check) in case anyone else has this issue.

#Corrected design
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~cell_line + time_point)
dds <- DESeq(dds)

#Perform LRT to identify genes with consistent profile
ddsTC <- DESeq(dds, test="LRT", reduced = ~cell_line)
resTC <- results(ddsTC, contrast = c("time_point", "t0", "t24"))
resTC$symbol <- mcols(ddsTC)$symbol
head(resTC[order(resTC$padj),], 4)

#Output

#log2 fold change (MLE): time_point t0 vs t24 
#LRT p-value: '~ cell_line + time_point' vs '~ cell_line' 
#DataFrame with 4 rows and 6 columns
#       baseMean log2FoldChange     lfcSE      stat        pvalue          padj
#      <numeric>      <numeric> <numeric> <numeric>     <numeric>     <numeric>
#YPEL3 1282.4245      -3.710903 0.1829484  718.9134 2.797581e-154 4.802328e-150
#PDK2   716.8922      -1.900368 0.1099170  684.5640 7.664157e-147 6.578146e-143
#WDR45 1154.2460      -1.985449 0.1052600  558.1900 1.729407e-119 9.895665e-116
#CAPS   302.4570      -2.839824 0.1523674  526.9082 1.013159e-112 4.347973e-109
ADD REPLYlink written 10 weeks ago by jhnwphillips0
1

That’s fine, except the first DESeq() is not necessary, should be removed.

ADD REPLYlink written 10 weeks ago by Michael Love25k

Thanks again, Michael. Posting correction here for posterity:

#Corrected design
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~cell_line + time_point)

#Perform LRT to identify genes with consistent profile
ddsTC <- DESeq(dds, test="LRT", reduced = ~cell_line)
resTC <- results(ddsTC, contrast = c("time_point", "t0", "t24"))
resTC$symbol <- mcols(ddsTC)$symbol
ADD REPLYlink written 10 weeks ago by jhnwphillips0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 190 users visited in the last hour