DESeq2 design with different cell lines and disease status
lu.ne (@lune-19644)

Hello,

I've been trying to analyse a set of 18 samples using DESeq2 and I am unsure about the design. Here is the sample information:

       cell.line     disease
L1_A   L1            Yes
L1_B   L1            Yes
L1_C   L1            Yes
L1_D   L1            No
L1_E   L1            No
L1_F   L1            No
L2_A   L2            Yes
L2_B   L2            Yes
L2_C   L2            Yes
L2_D   L2            No
L2_E   L2            No
L2_F   L2            No
L3_A   L3            Yes
L3_B   L3            Yes
L3_C   L3            Yes
L3_D   L3            No
L3_E   L3            No
L3_F   L3            No

My aim is to identify differentially expressed genes between healthy and disease samples within each cell line. As I am not interested here in the effect of disease or cell line across all samples, I assumed my design should include an interaction term such as cell.line:disease. However, I am unsure whether I should also add cell.line + disease to the design so that I can extract the comparisons I am interested in. I have also considered splitting the analysis into separate DESeq objects, one for each cell line, but I was worried I might run into multiple comparison issues. I have been through many of the related posts but found it difficult to determine whether the same strategies would apply here. Pointers or advice would be greatly appreciated.

Thank you.

sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS  10.14.2

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DESeq2_1.20.0               SummarizedExperiment_1.10.1 DelayedArray_0.6.6          BiocParallel_1.14.2        
 [5] matrixStats_0.54.0          Biobase_2.40.0              GenomicRanges_1.32.7        GenomeInfoDb_1.16.0        
 [9] IRanges_2.14.12             S4Vectors_0.18.3            BiocGenerics_0.26.0        

loaded via a namespace (and not attached):
 [1] bit64_0.9-7            splines_3.5.0          Formula_1.2-3          assertthat_0.2.0       latticeExtra_0.6-28   
 [6] blob_1.1.1             GenomeInfoDbData_1.1.0 yaml_2.2.0             pillar_1.3.0           RSQLite_2.1.1         
[11] backports_1.1.2        lattice_0.20-38        glue_1.3.0             digest_0.6.18          RColorBrewer_1.1-2    
[16] XVector_0.20.0         checkmate_1.8.5        colorspace_1.3-2       htmltools_0.3.6        Matrix_1.2-15         
[21] plyr_1.8.4             XML_3.98-1.16          pkgconfig_2.0.2        genefilter_1.62.0      zlibbioc_1.26.0       
[26] purrr_0.2.5            xtable_1.8-3           scales_1.0.0           htmlTable_1.12         tibble_1.4.2          
[31] annotate_1.58.0        ggplot2_3.1.0          nnet_7.3-12            lazyeval_0.2.1         survival_2.43-3       
[36] magrittr_1.5           crayon_1.3.4           memoise_1.1.0          foreign_0.8-71         tools_3.5.0           
[41] data.table_1.11.8      stringr_1.3.1          locfit_1.5-9.1         munsell_0.5.0          cluster_2.0.7-1       
[46] AnnotationDbi_1.42.1   bindrcpp_0.2.2         compiler_3.5.0         rlang_0.3.0.1          grid_3.5.0            
[51] RCurl_1.95-4.11        rstudioapi_0.8         htmlwidgets_1.3        bitops_1.0-6           base64enc_0.1-3       
[56] gtable_0.2.0           DBI_1.0.0              R6_2.3.0               gridExtra_2.3          knitr_1.20            
[61] dplyr_0.7.8            bit_1.1-14             bindr_0.1.1            Hmisc_4.1-1            stringi_1.2.4         
[66] Rcpp_1.0.0             geneplotter_1.58.0     rpart_4.1-13           acepack_1.4.1          tidyselect_0.2.5
@mikelove

If the goal is "to identify differentially expressed genes between healthy and disease samples for each one of the cell lines", take a look at the first paragraph of the Interactions section of the DESeq2 vignette.
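
For readers without the vignette to hand, here is a minimal sketch of the approach that section describes: combine the two factors into a single factor and use that alone in the design, then pull out one contrast per cell line. The count matrix is simulated with makeExampleDESeqDataSet purely as a placeholder, and the names coldata, cts, group and res are illustrative assumptions rather than anything from the original post.

```r
library(DESeq2)

# Sample table mirroring the question: 3 cell lines x 2 disease states x 3 replicates
cell.line <- rep(c("L1", "L2", "L3"), each = 6)
disease   <- rep(rep(c("Yes", "No"), each = 3), times = 3)
coldata <- data.frame(
  cell.line = factor(cell.line),
  disease   = factor(disease, levels = c("No", "Yes")),
  row.names = paste0(cell.line, "_", LETTERS[1:6])
)

# Placeholder counts (assumption): simulated data standing in for the real matrix
set.seed(1)
cts <- counts(makeExampleDESeqDataSet(n = 1000, m = 18))
colnames(cts) <- rownames(coldata)

# Combine the two factors into one factor with all six combinations
coldata$group <- factor(paste(coldata$cell.line, coldata$disease, sep = "_"))

dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ group)
dds <- DESeq(dds)

# Disease (Yes) versus healthy (No) within each cell line
res <- lapply(levels(coldata$cell.line), function(cl) {
  results(dds, contrast = c("group", paste0(cl, "_Yes"), paste0(cl, "_No")))
})
names(res) <- levels(coldata$cell.line)
```

Each element of res is a results table for one cell line, with the healthy samples of that line as the denominator of the log2 fold change.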

Ah, I had somehow assumed that this approach would result in a loss of sensitivity because everything was analysed together, and that I would need to drop one of the terms in the design to prevent that. I now see how combining the factors could solve that. Thank you for pointing out the parallel; it seems obvious now.
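
One way to check the parallel is to compare the design matrices directly. The sketch below uses only base R and the sample layout from the question; the object names X_group and X_inter are illustrative. It suggests that a single combined factor and the full cell.line + disease + cell.line:disease design are two parameterisations of the same model, so no term has to be dropped.

```r
# Factors laid out as in the sample table from the question
cell.line <- factor(rep(c("L1", "L2", "L3"), each = 6))
disease   <- factor(rep(rep(c("Yes", "No"), each = 3), times = 3),
                    levels = c("No", "Yes"))
group     <- factor(paste(cell.line, disease, sep = "_"))

# Design with one combined factor versus main effects plus interaction
X_group <- model.matrix(~ group)
X_inter <- model.matrix(~ cell.line + disease + cell.line:disease)

# Both designs have six coefficients
ncol(X_group)
ncol(X_inter)

# Stacking the two matrices side by side does not raise the rank above six,
# so they span the same column space and give the same fitted values
qr(cbind(X_group, X_inter))$rank
```

The difference lies only in how the coefficients are labelled, which is why the combined factor makes the per-cell-line contrasts easier to request.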
