Different results if "treated/untreated" is switched to "untreated/treated" with apeglm
1
0
Entering edit mode
serdarT • 0
@serdart
Last seen 3.4 years ago
USA, Seattle, Institute for Systems Bio…

Hello, I am trying to get differentially expressed genes after running Kallisto and importing abundances to DESeq2 by using tximport. I run DESEq2 after I relevel the condition factors for the specific comparison and then collect the specific results by using lfcShrink with apeglm.

When I switch the reference condition from untreated to treated, I get slightly different results. While most genes have the same log2FoldChange with only a sign change, some are different. For example, treated/untreated gives me 1608 significant differences, reverse comparison gives me 1618.

This only happens if I use lfcShrink with "apeglm". It doesn't happen with "normal" or "ashr" or if I just use "results".

Attached is an example of my code and also my ColData for dds1. Due to sequencing issues, I have some missing data that I do not include in the comparison, and I don't know if this is actually the problem.

I really appreciate it if you have any idea why this might be happening. Thank you

## Import kallisto abundances
  txi.kallisto <- tximport(files, type = "kallisto", txOut = TRUE)

  ## load data into DESEQ2
  dds <- DESeqDataSetFromTximport(txi.kallisto, sampleTable, ~condition)

## First comparison
  condition1 = "condition_36_ICN_vs_30_ICN"
## Reverse comparison
  condition2 = "condition_30_ICN_vs_36_ICN"

# Run DEseq2
dds1 <- DESeq(dds)

# 1st comparison
# relevel to set cond2 as our reference sample
dds1$condition <- relevel(dds1$condition, ref = "30_ICN")

# Re-run DEseq2
dds1 <- DESeq(dds1)

# get results and use lfc shrinkage
resLFC1 <- lfcShrink(dds = dds1, coef = condition1, type = "apeglm", lfcThreshold = lfc)

## Order output on svalue
res1.ordered <- resLFC1[order(resLFC1$svalue),]

# 2nd comparison
dds2 <- DESeq(dds)

# relevel to set cond1 as our reference sample
dds2$condition <- relevel(dds2$condition, ref = "36_ICN")

# Re-run DEseq2
dds2 <- DESeq(dds2)

# get results and use lfc shrinkage
resLFC2 <- lfcShrink(dds = dds2, coef = condition2, type = "apeglm", lfcThreshold = lfc)

## Order output on svalue
res2.ordered <- resLFC2[order(resLFC2$svalue),]

sum(res1.ordered$svalue < 0.0005, na.rm = TRUE)
1608
sum(res2.ordered$svalue < 0.0005, na.rm = TRUE)
1618

Here is my colData for dds1

sampleName  fileName    sample  Temp    Site.Name   condition   replaceable
ES1-30A abundance.h5    ES1_30A 30  ICN 30_ICN  FALSE
ES1-33A abundance.h5    ES1_33A 33  ICN 33_ICN  FALSE
ES2-30A abundance.h5    ES2_30A 30  ICN 30_ICN  FALSE
ES2-33A abundance.h5    ES2_33A 33  ICN 33_ICN  FALSE
ES2-36A abundance.h5    ES2_36A 36  ICN 36_ICN  FALSE
ES3-30A abundance.h5    ES3_30A 30  ICN 30_ICN  FALSE
ES3-36A abundance.h5    ES3_36A 36  ICN 36_ICN  FALSE
ES4-30A abundance.h5    ES4_30A 30  ICN 30_ICN  FALSE
ES4-33A abundance.h5    ES4_33A 33  ICN 33_ICN  FALSE
ES4-36A abundance.h5    ES4_36A 36  ICN 36_ICN  FALSE
ES5-30A abundance.h5    ES5_30A 30  ICN 30_ICN  FALSE
ES5-33A abundance.h5    ES5_33A 33  ICN 33_ICN  FALSE
ES5-36A abundance.h5    ES5_36A 36  ICN 36_ICN  FALSE
ES6-33A abundance.h5    ES6_33A 33  ICN 33_ICN  FALSE
ES6-36A abundance.h5    ES6_36A 36  ICN 36_ICN  FALSE
ES7-30A abundance.h5    ES7_30A 30  ICN 30_ICN  FALSE
ES7-33A abundance.h5    ES7_33A 33  ICN 33_ICN  FALSE
ES7-36A abundance.h5    ES7_36A 36  ICN 36_ICN  FALSE
KS1-30A abundance.h5    KS1_30A 30  AF  30_AF   TRUE
KS1-33A abundance.h5    KS1_33A 33  AF  33_AF   FALSE
KS1-36A abundance.h5    KS1_36A 36  AF  36_AF   TRUE
KS2-30A abundance.h5    KS2_30A 30  AF  30_AF   TRUE
KS2-36A abundance.h5    KS2_36A 36  AF  36_AF   TRUE
KS3-30A abundance.h5    KS3_30A 30  AF  30_AF   TRUE
KS3-33A abundance.h5    KS3_33A 33  AF  33_AF   FALSE
KS3-36A abundance.h5    KS3_36A 36  AF  36_AF   TRUE
KS4-30A abundance.h5    KS4_30A 30  AF  30_AF   TRUE
KS4-33A abundance.h5    KS4_33A 33  AF  33_AF   FALSE
KS4-36A abundance.h5    KS4_36A 36  AF  36_AF   TRUE
KS5-30A abundance.h5    KS5_30A 30  AF  30_AF   TRUE
KS5-33A abundance.h5    KS5_33A 33  AF  33_AF   FALSE
KS5-36A abundance.h5    KS5_36A 36  AF  36_AF   TRUE
KS6-30A abundance.h5    KS6_30A 30  AF  30_AF   TRUE
KS6-33A abundance.h5    KS6_33A 33  AF  33_AF   FALSE
KS6-36A abundance.h5    KS6_36A 36  AF  36_AF   TRUE
KS7-30A abundance.h5    KS7_30A 30  AF  30_AF   TRUE
KS7-33A abundance.h5    KS7_33A 33  AF  33_AF   FALSE
KS7-36A abundance.h5    KS7_36A 36  AF  36_AF   TRUE
S1-E-ST-30B abundance.h5    S1_E_ST_30B 30  ExT 30_ExT  TRUE
S1-E-ST-33B abundance.h5    S1_E_ST_33B 33  ExT 33_ExT  TRUE
S1-E-ST-36B abundance.h5    S1_E_ST_36B 36  ExT 36_ExT  TRUE
S1-P-ST-30B abundance.h5    S1_P_ST_30B 30  PrT 30_PrT  TRUE
S1-P-ST-33B abundance.h5    S1_P_ST_33B 33  PrT 33_PrT  TRUE
S1-P-ST-36B abundance.h5    S1_P_ST_36B 36  PrT 36_PrT  FALSE
S2-E-ST-30B abundance.h5    S2_E_ST_30B 30  ExT 30_ExT  TRUE
S2-E-ST-33B abundance.h5    S2_E_ST_33B 33  ExT 33_ExT  TRUE
S2-E-ST-36B abundance.h5    S2_E_ST_36B 36  ExT 36_ExT  TRUE
S2-P-ST-30B abundance.h5    S2_P_ST_30B 30  PrT 30_PrT  TRUE
S2-P-ST-33B abundance.h5    S2_P_ST_33B 33  PrT 33_PrT  TRUE
S2-P-ST-36B abundance.h5    S2_P_ST_36B 36  PrT 36_PrT  FALSE
S3-E-ST-30B abundance.h5    S3_E_ST_30B 30  ExT 30_ExT  TRUE
S3-E-ST-33B abundance.h5    S3_E_ST_33B 33  ExT 33_ExT  TRUE
S3-E-ST-36B abundance.h5    S3_E_ST_36B 36  ExT 36_ExT  TRUE
S3-P-ST-30B abundance.h5    S3_P_ST_30B 30  PrT 30_PrT  TRUE
S3-P-ST-33B abundance.h5    S3_P_ST_33B 33  PrT 33_PrT  TRUE
S3-P-ST-36B abundance.h5    S3_P_ST_36B 36  PrT 36_PrT  FALSE
S4-E-ST-30B abundance.h5    S4_E_ST_30B 30  ExT 30_ExT  TRUE
S4-E-ST-33B abundance.h5    S4_E_ST_33B 33  ExT 33_ExT  TRUE
S4-E-ST-36B abundance.h5    S4_E_ST_36B 36  ExT 36_ExT  TRUE
S4-P-ST-30B abundance.h5    S4_P_ST_30B 30  PrT 30_PrT  TRUE
S4-P-ST-33B abundance.h5    S4_P_ST_33B 33  PrT 33_PrT  TRUE
S4-P-ST-36B abundance.h5    S4_P_ST_36B 36  PrT 36_PrT  FALSE
S5-E-ST-30B abundance.h5    S5_E_ST_30B 30  ExT 30_ExT  TRUE
S5-E-ST-33B abundance.h5    S5_E_ST_33B 33  ExT 33_ExT  TRUE
S5-E-ST-36B abundance.h5    S5_E_ST_36B 36  ExT 36_ExT  TRUE
S5-P-ST-30B abundance.h5    S5_P_ST_30B 30  PrT 30_PrT  TRUE
S5-P-ST-33B abundance.h5    S5_P_ST_33B 33  PrT 33_PrT  TRUE
S5-P-ST-36B abundance.h5    S5_P_ST_36B 36  PrT 36_PrT  FALSE
S6-E-ST-30B abundance.h5    S6_E_ST_30B 30  ExT 30_ExT  TRUE
S6-E-ST-33B abundance.h5    S6_E_ST_33B 33  ExT 33_ExT  TRUE
S6-E-ST-36B abundance.h5    S6_E_ST_36B 36  ExT 36_ExT  TRUE
S6-P-ST-30B abundance.h5    S6_P_ST_30B 30  PrT 30_PrT  TRUE
S6-P-ST-33B abundance.h5    S6_P_ST_33B 33  PrT 33_PrT  TRUE
S7-E-ST-30B abundance.h5    S7_E_ST_30B 30  ExT 30_ExT  TRUE
S7-E-ST-33B abundance.h5    S7_E_ST_33B 33  ExT 33_ExT  TRUE
S7-E-ST-36B abundance.h5    S7_E_ST_36B 36  ExT 36_ExT  TRUE
S7-P-ST-30B abundance.h5    S7_P_ST_30B 30  PrT 30_PrT  TRUE
S7-P-ST-33B abundance.h5    S7_P_ST_33B 33  PrT 33_PrT  TRUE
S7-P-ST-36B abundance.h5    S7_P_ST_36B 36  PrT 36_PrT  FALSE
sessionInfo( )
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /local/R-3.6.2/lib/libRblas.so
LAPACK: /local/R-3.6.2/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.0               stringr_1.4.0               dplyr_1.0.2                
 [4] purrr_0.3.4                 readr_1.4.0                 tidyr_1.1.2                
 [7] tibble_3.0.3                ggplot2_3.3.2               tidyverse_1.3.0            
[10] DT_0.15                     DESeq2_1.26.0               SummarizedExperiment_1.16.1
[13] DelayedArray_0.12.3         BiocParallel_1.20.1         matrixStats_0.57.0         
[16] Biobase_2.46.0              GenomicRanges_1.38.0        GenomeInfoDb_1.22.1        
[19] IRanges_2.20.2              S4Vectors_0.24.4            BiocGenerics_0.32.0        
[22] rhdf5_2.30.1                tximport_1.14.2             workflowr_1.6.2            

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1       ellipsis_0.3.1         htmlTable_2.1.0        XVector_0.26.0        
  [5] base64enc_0.1-3        fs_1.5.0               rstudioapi_0.11        bit64_4.0.5           
  [9] mvtnorm_1.1-1          apeglm_1.8.0           AnnotationDbi_1.48.0   fansi_0.4.1           
 [13] lubridate_1.7.9        xml2_1.3.2             splines_3.6.2          geneplotter_1.64.0    
 [17] knitr_1.30             Formula_1.2-3          jsonlite_1.7.1         broom_0.7.0           
 [21] annotate_1.64.0        ashr_2.2-47            cluster_2.1.0          dbplyr_1.4.4          
 [25] png_0.1-7              compiler_3.6.2         httr_1.4.2             backports_1.1.10      
 [29] assertthat_0.2.1       Matrix_1.2-18          cli_2.0.2              later_1.1.0.1         
 [33] htmltools_0.5.0        tools_3.6.2            coda_0.19-4            gtable_0.3.0          
 [37] glue_1.4.2             GenomeInfoDbData_1.2.2 Rcpp_1.0.5             bbmle_1.0.23.1        
 [41] cellranger_1.1.0       vctrs_0.3.4            xfun_0.18              rvest_0.3.6           
 [45] irlba_2.3.3            lifecycle_0.2.0        XML_3.99-0.3           MASS_7.3-53           
 [49] zlibbioc_1.32.0        scales_1.1.1           hms_0.5.3              promises_1.1.1        
 [53] RColorBrewer_1.1-2     memoise_1.1.0          gridExtra_2.3          emdbook_1.3.12        
 [57] bdsmatrix_1.3-4        rpart_4.1-15           SQUAREM_2020.5         latticeExtra_0.6-29   
 [61] stringi_1.5.3          RSQLite_2.2.1          genefilter_1.68.0      checkmate_2.0.0       
 [65] truncnorm_1.0-8        rlang_0.4.8            pkgconfig_2.0.3        bitops_1.0-6          
 [69] invgamma_1.1           evaluate_0.14          lattice_0.20-41        Rhdf5lib_1.8.0        
 [73] htmlwidgets_1.5.2      bit_4.0.4              tidyselect_1.1.0       plyr_1.8.6            
 [77] magrittr_1.5           R6_2.4.1               generics_0.0.2         Hmisc_4.4-1           
 [81] DBI_1.1.0              pillar_1.4.6           haven_2.3.1            foreign_0.8-76        
 [85] withr_2.3.0            mixsqp_0.3-43          survival_3.2-7         RCurl_1.98-1.2        
 [89] nnet_7.3-14            modelr_0.1.8           crayon_1.3.4           rmarkdown_2.4         
 [93] jpeg_0.1-8.1           locfit_1.5-9.4         grid_3.6.2             readxl_1.3.1          
 [97] data.table_1.13.0      blob_1.2.1             reprex_0.3.0           digest_0.6.25         
[101] xtable_1.8-4           numDeriv_2016.8-1.1    httpuv_1.5.4           munsell_0.5.0
DESeq2 • 599 views
ADD COMMENT
2
Entering edit mode
@mikelove
Last seen 6 hours ago
United States

I think it is possible that the posterior may change upon recoding the design, at least, I am certain it will be the case for more complex designs.

So I would just use the natural coding for an experiment, and not change during the analysis.

ADD COMMENT
0
Entering edit mode

Hi Michael, Thanks a lot for your prompt reply. I will continue with the natural coding. I wanted to make sure this is not because of an issue with the way I build the comparisons.

ADD REPLY

Login before adding your answer.

Traffic: 846 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6