summarizeToGene in tximeta causing vastly different results
akrunic • 0
@5f47c959
Last seen 20 months ago
United States

Hello

I am trying to run a DESeq2 analysis on Salmon output. When I import my data using tximeta, I get completely different results depending on whether I call summarizeToGene. With the first set of commands below, which includes summarizeToGene, I get 172 DEGs, while without summarizeToGene I get 438. Why is there such a large difference, and which result should I trust for downstream analysis?


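library(tximeta)
library(DESeq2)

## Import Salmon quantifications (transcript level)
hippo12APP <- tximeta(coldata12APP)

## Filtering and Adjusting
hippo12APP <- addExons(hippo12APP)
hippo12APP <- summarizeToGene(hippo12APP)

## Build the DESeq2 object and fit the model
ddsTxi12 <- DESeqDataSet(hippo12APP, design = ~ diet)
dds12 <- DESeq(ddsTxi12)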

  row               baseMean        log2FoldChange        lfcSE             stat             pvalue         
 Length:172         Min.   :    5.96   Min.   :-6.2495   Min.   :0.1065   Min.   :-5.9385   Min.   :2.900e-09  
 Class :character   1st Qu.:   61.23   1st Qu.:-1.7460   1st Qu.:0.1518   1st Qu.:-3.8379   1st Qu.:2.666e-05  
 Mode  :character   Median :  281.04   Median :-0.5681   Median :0.2407   Median :-3.2730   Median :1.570e-04  
                    Mean   : 1445.57   Mean   :-0.5843   Mean   :0.3281   Mean   :-0.1817   Mean   :3.280e-04  
                    3rd Qu.: 1071.05   3rd Qu.: 0.5838   3rd Qu.:0.4575   3rd Qu.: 3.7496   3rd Qu.:5.804e-04  
                    Max.   :74531.16   Max.   : 2.8181   Max.   :1.6449   Max.   : 5.9377   Max.   :1.085e-03  
      padj              SYMBOL         
 Min.   :2.121e-05   Length:172        
 1st Qu.:9.582e-03   Class :character  
 Median :2.859e-02   Mode  :character  
 Mean   :4.042e-02                     
 3rd Qu.:7.098e-02                     
 Max.   :9.992e-02                     

## Running the code above without summarizeToGene
summary(resSigAPP12)
     row               baseMean        log2FoldChange         lfcSE              stat             pvalue         
 Length:438         Min.   :    5.60   Min.   :-22.9352   Min.   :0.03517   Min.   :-10.018   Min.   :0.000e+00  
 Class :character   1st Qu.:   20.20   1st Qu.: -5.3601   1st Qu.:0.14304   1st Qu.: -3.719   1st Qu.:9.065e-07  
 Mode  :character   Median :   82.43   Median :  0.3354   Median :0.65931   Median :  3.389   Median :9.234e-05  
                    Mean   : 1264.74   Mean   : -1.0568   Mean   :1.05338   Mean   :  0.574   Mean   :2.564e-04  
                    3rd Qu.:  920.82   3rd Qu.:  1.1407   3rd Qu.:1.66442   3rd Qu.:  4.073   3rd Qu.:4.601e-04  
                    Max.   :74163.86   Max.   : 20.5094   Max.   :3.91060   Max.   :  9.387   Max.   :1.078e-03  
      padj              SYMBOL         
 Min.   :0.0000000   Length:438        
 1st Qu.:0.0003319   Class :character  
 Median :0.0169791   Mode  :character  
 Mean   :0.0297324                     
 3rd Qu.:0.0564928                     
 Max.   :0.0993225                     

sessionInfo()
R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tximeta_1.16.1              org.Mm.eg.db_3.16.0         GO.db_3.16.0                AnnotationDbi_1.60.2       
 [5] lubridate_1.9.2             forcats_1.0.0               stringr_1.5.0               dplyr_1.1.1                
 [9] purrr_1.0.1                 tidyr_1.3.0                 tibble_3.2.1                ggplot2_3.4.2              
[13] tidyverse_2.0.0             readr_2.1.4                 DESeq2_1.38.3               SummarizedExperiment_1.28.0
[17] Biobase_2.58.0              MatrixGenerics_1.10.0       matrixStats_0.63.0          GenomicRanges_1.50.2       
[21] GenomeInfoDb_1.34.9         IRanges_2.32.0              S4Vectors_0.36.2            BiocGenerics_0.44.0        
[25] tximport_1.26.1            

loaded via a namespace (and not attached):
 [1] colorspace_2.1-0              rjson_0.2.21                  ellipsis_0.3.2               
 [4] XVector_0.38.0                rstudioapi_0.14               DT_0.27                      
 [7] bit64_4.0.5                   interactiveDisplayBase_1.36.0 fansi_1.0.4                  
[10] xml2_1.3.3                    codetools_0.2-19              cachem_1.0.7                 
[13] geneplotter_1.76.0            jsonlite_1.8.4                Rsamtools_2.14.0             
[16] annotate_1.76.0               dbplyr_2.3.2                  png_0.1-8                    
[19] shiny_1.7.4                   BiocManager_1.30.20           compiler_4.2.3               
[22] httr_1.4.5                    Matrix_1.5-3                  fastmap_1.1.1                
[25] lazyeval_0.2.2                cli_3.6.1                     later_1.3.0                  
[28] htmltools_0.5.5               prettyunits_1.1.1             tools_4.2.3                  
[31] gtable_0.3.3                  glue_1.6.2                    GenomeInfoDbData_1.2.9       
[34] rappdirs_0.3.3                Rcpp_1.0.10                   vctrs_0.6.1                  
[37] Biostrings_2.66.0             rtracklayer_1.58.0            timechange_0.2.0             
[40] mime_0.12                     lifecycle_1.0.3               restfulr_0.0.15              
[43] ensembldb_2.22.0              XML_3.99-0.14                 AnnotationHub_3.6.0          
[46] zlibbioc_1.44.0               scales_1.2.1                  vroom_1.6.1                  
[49] hms_1.1.3                     promises_1.2.0.1              ProtGenerics_1.30.0          
[52] parallel_4.2.3                AnnotationFilter_1.22.0       RColorBrewer_1.1-3           
[55] yaml_2.3.7                    curl_5.0.0                    memoise_2.0.1                
[58] biomaRt_2.54.1                stringi_1.7.12                RSQLite_2.3.1                
[61] BiocVersion_3.16.0            BiocIO_1.8.0                  GenomicFeatures_1.50.4       
[64] filelock_1.0.2                BiocParallel_1.32.6           rlang_1.1.0                  
[67] pkgconfig_2.0.3               bitops_1.0-7                  lattice_0.20-45              
[70] GenomicAlignments_1.34.1      htmlwidgets_1.6.2             bit_4.0.5                    
[73] tidyselect_1.2.0              magrittr_2.0.3                R6_2.5.1                     
[76] generics_0.1.3                DelayedArray_0.23.2           DBI_1.1.3                    
[79] pillar_1.9.0                  withr_2.5.0                   KEGGREST_1.38.0              
[82] RCurl_1.98-1.12               crayon_1.5.2                  utf8_1.2.3                   
[85] BiocFileCache_2.6.1           tzdb_0.3.0                    progress_1.2.2               
[88] locfit_1.5-9.7                grid_4.2.3                    blob_1.2.4                   
[91] digest_0.6.31                 xtable_1.8-4                  httpuv_1.6.9                 
[94] munsell_0.5.0
urwah ▴ 10
@5a24aca2
Last seen 10 months ago
Australia

When you import quantifications with tximeta, they come in at the transcript level. summarizeToGene() aggregates them from the transcript level to the gene level. You are most likely seeing different results because the first chunk of code is a differential gene expression (DGE) analysis, whereas the second (without summarizeToGene()) is a differential transcript expression (DTE) analysis. The two runs test different sets of features, so the number of significant hits is not expected to match.
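
To make the transcript-to-gene step concrete, here is a minimal base-R sketch with made-up counts. It is not how summarizeToGene() works internally (as far as I know, the real function uses the annotation that tximeta attached and also carries abundance and length information along). The matrix, the tx2gene map, and all names below are hypothetical. It only illustrates that summing transcripts per gene changes both the values and the number of features being tested.

```r
# Toy transcript-level counts: 5 transcripts x 4 samples (made-up numbers).
tx_counts <- matrix(
  c( 10,  12,  11,   9,
     50,  48,  52,  51,
      5,   7,   6,   4,
    100,  95, 105,  98,
     20,  22,  19,  21),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("tx", 1:5), paste0("sample", 1:4))
)

# Hypothetical transcript-to-gene map: tx1-tx2 belong to geneA, tx3-tx5 to geneB.
tx2gene <- c(tx1 = "geneA", tx2 = "geneA",
             tx3 = "geneB", tx4 = "geneB", tx5 = "geneB")

# Sum the transcript rows that share a gene.
gene_counts <- rowsum(tx_counts, group = tx2gene[rownames(tx_counts)])

dim(tx_counts)    # one row (and later one test) per transcript
dim(gene_counts)  # one row (and later one test) per gene
gene_counts
```

With real data the transcript-level object has many more rows than the gene-level one, so the two DESeq2 runs differ in the number of tests, the multiple-testing correction, and the per-feature counts.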


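Agreed. The biological meaning is different, and both are valid questions. You can have differential transcript expression (DTE) within a gene without any differential gene expression (DGE), which shows up as differential transcript usage (DTU), or the other way around. The DGE question concerns the total RNA output of the gene, i.e. its isoforms summed together.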
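
A small illustration of DTE without DGE, again in base R with invented numbers (the isoform names, sample names, and the simple log2 ratio of means are assumptions for the sketch, not DESeq2's estimator): two isoforms of one gene swap abundance between conditions, so each isoform changes strongly while the gene total stays the same.

```r
# Made-up counts for one gene with two isoforms, two samples per condition.
tx_counts <- matrix(
  c(100, 100,  20,  20,   # isoform1: high in control, low in treated
     20,  20, 100, 100),  # isoform2: low in control, high in treated
  nrow = 2, byrow = TRUE,
  dimnames = list(c("isoform1", "isoform2"),
                  c("ctrl1", "ctrl2", "trt1", "trt2"))
)
condition <- c("ctrl", "ctrl", "trt", "trt")

# Naive log2 fold change: mean treated count over mean control count, per row.
log2fc <- function(m) {
  log2(rowMeans(m[, condition == "trt", drop = FALSE]) /
       rowMeans(m[, condition == "ctrl", drop = FALSE]))
}

# Transcript level: both isoforms change, in opposite directions.
tx_lfc <- log2fc(tx_counts)

# Gene level: sum the isoforms per sample, then compute the same ratio.
gene_total <- matrix(colSums(tx_counts), nrow = 1,
                     dimnames = list("gene", colnames(tx_counts)))
gene_lfc <- log2fc(gene_total)

tx_lfc
gene_lfc
```

Here a transcript-level analysis would flag both isoforms, while a gene-level analysis would see no change at all, which is one way the two result lists can diverge.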
