DESeq2 with continuous variable
1
0
Entering edit mode
@jasminelee0603-23364
Last seen 4.2 years ago

Hello Mike, please kindly advise on the following question regarding analysis with deseq2. This might be a very general inquiry about running deseq2 with a continuous variable, but the characteristic of this variable is confusing me out.

My dataset is from a human bulk RNA-sequencing, consisting of samples age ranging between 16-76. The age isn't completely continuous, regarding that there are missing ages and some ages have multiple samples of that age. My goal is to get a list of differentially expressed genes across this entire age range. To do so I'm considering my variable as continuous, but I'm not sure if I'm conducting this correctly within DESeq2.

Below is part of my coldata:

> coldata
       Age Sex
N1864   16   M
N1854   18   M
N1724   19   M
N1744   20   F
N1216   22   F
N1505   24   M
N1311   25   M
N1749   25   M

and so on.

I ran the below codes to set up the data and run the analysis using LRT.

Data Setup

```{r} cts<-as.matrix(read.csv("counts.protein.ordered.reduced.csv",fileEncoding="UTF-8-BOM", row.names=1)) colnames(cts)<-NULL coldata<-read.csv("sample.info.reduced.csv",fileEncoding="UTF-8-BOM", row.names=1) coldata<-coldata[,c("Age","Sex")] coldata$Age<-as.factor(coldata$Age) dds <- DESeqDataSetFromMatrix(countData=cts, colData=coldata, design=~Sex+Age)

# Run Analysis
```{r}
dds <- DESeq(dds, test="LRT", reduced=~Sex)
resultsNames(dds)
res <- results(dds)
res

When I do so, I get the below:

> res
log2 fold change (MLE): Age 76 vs 16 
LRT p-value: '~ Sex + Age' vs '~ Sex' 
DataFrame with 18015 rows and 6 columns

This made me think that the reported log2FC is simply comparing the oldest vs. youngest sample, and not the change per unit change of age. (Is that true?) So I ran this again without factorizing age, which seemed like it worked. However, then the plotcounts function would not work since intgroup=age is not a factor.

Is this the correct way to approach this kind of variable in DESeq2? Please let me know your opinion on this. Thanks! Jasmine

Session info attached:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggrepel_0.8.1               pcaExplorer_2.10.1          RColorBrewer_1.1-2         
 [4] vsn_3.52.0                  pheatmap_1.0.12             IHW_1.12.0                 
 [7] apeglm_1.6.0                ggplot2_3.2.1               DESeq2_1.24.0              
[10] SummarizedExperiment_1.14.1 DelayedArray_0.10.0         BiocParallel_1.18.1        
[13] matrixStats_0.55.0          Biobase_2.44.0              GenomicRanges_1.36.1       
[16] GenomeInfoDb_1.20.0         IRanges_2.18.3              S4Vectors_0.22.1           
[19] BiocGenerics_0.30.0        

loaded via a namespace (and not attached):
  [1] GOstats_2.50.0         backports_1.1.5        Hmisc_4.2-0            NMF_0.21.0            
  [5] plyr_1.8.4             igraph_1.2.4.1         lazyeval_0.2.2         GSEABase_1.46.0       
  [9] shinydashboard_0.7.1   splines_3.6.1          crosstalk_1.0.0        gridBase_0.4-7        
 [13] lpsymphony_1.12.0      digest_0.6.21          foreach_1.4.7          htmltools_0.4.0       
 [17] GO.db_3.8.2            magrittr_1.5           checkmate_1.9.4        memoise_1.1.0         
 [21] cluster_2.1.0          doParallel_1.0.15      limma_3.40.6           annotate_1.62.0       
 [25] prettyunits_1.0.2      colorspace_1.4-1       blob_1.2.0             xfun_0.11             
 [29] dplyr_0.8.3            crayon_1.3.4           RCurl_1.95-4.12        jsonlite_1.6          
 [33] graph_1.62.0           genefilter_1.66.0      zeallot_0.1.0          survival_2.44-1.1     
 [37] iterators_1.0.12       glue_1.3.1             registry_0.5-1         gtable_0.3.0          
 [41] zlibbioc_1.30.0        XVector_0.24.0         Rgraphviz_2.28.0       SparseM_1.77          
 [45] scales_1.0.0           DBI_1.0.0              rngtools_1.4           bibtex_0.4.2          
 [49] Rcpp_1.0.2             xtable_1.8-4           progress_1.2.2         emdbook_1.3.11        
 [53] htmlTable_1.13.2       foreign_0.8-72         bit_1.1-14             preprocessCore_1.46.0 
 [57] Formula_1.2-3          DT_0.10                AnnotationForge_1.26.0 htmlwidgets_1.5.1     
 [61] httr_1.4.1             threejs_0.3.1          shinyAce_0.4.1         acepack_1.4.1         
 [65] pkgconfig_2.0.3        XML_3.98-1.20          nnet_7.3-12            locfit_1.5-9.1        
 [69] tidyselect_0.2.5       rlang_0.4.0            reshape2_1.4.4         later_1.0.0           
 [73] AnnotationDbi_1.46.1   munsell_0.5.0          tools_3.6.1            RSQLite_2.1.2         
 [77] fdrtool_1.2.15         evaluate_0.14          shinyBS_0.61           stringr_1.4.0         
 [81] fastmap_1.0.1          knitr_1.26             bit64_0.9-7            purrr_0.3.3           
 [85] RBGL_1.60.0            mime_0.7               slam_0.1-45            biomaRt_2.40.5        
 [89] compiler_3.6.1         rstudioapi_0.10        png_0.1-7              affyio_1.54.0         
 [93] tibble_2.1.3           geneplotter_1.62.0     stringi_1.4.3          lattice_0.20-38       
 [97] Matrix_1.2-17          vctrs_0.2.0            pillar_1.4.2           lifecycle_0.1.0       
[101] BiocManager_1.30.9     d3heatmap_0.6.1.2      data.table_1.12.4      bitops_1.0-6          
[105] httpuv_1.5.2           R6_2.4.1               latticeExtra_0.6-28    affy_1.62.0           
[109] promises_1.1.0         topGO_2.36.0           gridExtra_2.3          codetools_0.2-16      
[113] MASS_7.3-51.4          assertthat_0.2.1       Category_2.50.0        pkgmaker_0.27         
[117] withr_2.1.2            GenomeInfoDbData_1.2.1 hms_0.5.2              grid_3.6.1            
[121] rpart_4.1-15           tidyr_1.0.0            coda_0.19-3            rmarkdown_1.17        
[125] bbmle_1.0.20           numDeriv_2016.8-1.1    shiny_1.4.0            base64enc_0.1-3   
deseq2 • 3.2k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 11 days ago
United States

Yes, you likely either want age as a continuous variable (I’d recommend centering and scaling), or binned into meaningful groups.

But the statistical analysis plan is up to you. That’s beyond the scope of support I can provide.

ADD COMMENT

Login before adding your answer.

Traffic: 708 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6