Question

DEXSeq aggregated genes/exons result in NA values

osieman52 • 0

I am using DEXSeq to identify tissue-specific exons. I created a DEXSeqDataSet for my data by following the DEXSeq vignette. When preparing the annotation I used the "aggregate gene" option, which means that exons overlapping with exons from other genes are reported in my count files under a combined ID such as ENSG00000007062.11+ENSG00000137441.7, followed by the exon read counts. For some reason, all the aggregated genes in my results have NA in the control and experiment columns, even though they have a good number of reads. I understand that DEXSeq filters out exons with low read counts, but why are all the aggregated genes in my data set "filtered out", so to speak? Dispersion, test statistic and p-value are reported for them, but the exon usage coefficients are simply not calculated.

                                                                groupID       featureID exonBaseMean  dispersion         stat      pvalue      padj  control experiment
ENSG00000000003.14:E001                                     ENSG00000000003.14      E001   21.5934367 0.011905251 6.9262557835 0.008493933 0.5423502 1.210454   1.429919
ENSG00000000003.14:E002                                     ENSG00000000003.14      E002  216.5326504 0.110281274 1.0275864363 0.310726219 0.5423502 2.457322   2.144896
ENSG00000000003.14:E003                                     ENSG00000000003.14      E003   52.0953996 0.006237274 0.9808538869 0.321988086 0.5423502 1.736622   1.601070
ENSG00000000003.14:E004                                     ENSG00000000003.14      E004    0.0000000          NA           NA          NA        NA       NA         NA
ENSG00000000003.14:E005                                     ENSG00000000003.14      E005   30.8065941 0.010928089 0.0084259473 0.926862539 0.9377536 1.501295   1.447958
ENSG00000000003.14:E006                                     ENSG00000000003.14      E006   21.8680338 0.018217991 0.3621809429 0.547297484 0.6856254 1.334083   1.345448
ENSG00000000003.14:E007                                     ENSG00000000003.14      E007   20.9092883 0.039292282 1.7782336563 0.182366373 0.5423502 1.263682   1.394588
ENSG00000000003.14:E008                                     ENSG00000000003.14      E008   23.5370862 0.030771007 3.3871217252 0.065707558 0.5423502 1.272502   1.455404
ENSG00000000003.14:E009                                     ENSG00000000003.14      E009   23.6855564 0.110540408 0.8169928461 0.366060857 0.5564125 1.339067   1.450256
ENSG00000000003.14:E010                                     ENSG00000000003.14      E010   14.9047361 0.170168976 0.2318733494 0.630138234 0.7340537 1.199031   1.230302
ENSG00000007062.11+ENSG00000137441.7:E001 ENSG00000007062.11+ENSG00000137441.7      E001    1.8947354 0.371061249 3.7215845815 0.053713374 0.5423502       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E002 ENSG00000007062.11+ENSG00000137441.7      E002   41.0638397 0.043644187 5.6370408998 0.017584862 0.5423502       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E003 ENSG00000007062.11+ENSG00000137441.7      E003   22.1230452 0.009472472 2.7653982357 0.096322713 0.5423502       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E004 ENSG00000007062.11+ENSG00000137441.7      E004    4.4719901 0.009095405 0.3996968876 0.527245838 0.6729068       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E005 ENSG00000007062.11+ENSG00000137441.7      E005    2.4483538 0.127028255 0.2648086938 0.606835598 0.7282027       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E006 ENSG00000007062.11+ENSG00000137441.7      E006    2.6379103 0.038700079 0.0003874227 0.984296207 0.9842962       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E007 ENSG00000007062.11+ENSG00000137441.7      E007    7.3760144 0.420739076 2.0974714818 0.147542955 0.5423502       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E008 ENSG00000007062.11+ENSG00000137441.7      E008   57.7554843 0.593533431 1.0996967570 0.294332663 0.5423502       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E009 ENSG00000007062.11+ENSG00000137441.7      E009  151.4436753 0.501405393 1.4215176025 0.233153775 0.5423502       NA         NA
ENSG00000007062.11+ENSG00000137441.7:E010 ENSG00000007062.11+ENSG00000137441.7      E010 6564.2611474 0.504871434 1.3415073630 0.246768317 0.5423502       NA         NA
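
To make the pattern explicit, here is a minimal base-R sketch that flags rows where a p-value was computed but the per-condition coefficient is missing, and checks whether those rows coincide with the aggregated IDs. The data frame `res` is a hand-copied stand-in for four rows of the table above (an assumption for illustration, not the real results object); with the real results, a data frame derived from the DEXSeq results would take its place.

```r
# Stand-in for a few rows of the results table (values copied from the post).
res <- data.frame(
  groupID = c("ENSG00000000003.14", "ENSG00000000003.14",
              "ENSG00000007062.11+ENSG00000137441.7",
              "ENSG00000007062.11+ENSG00000137441.7"),
  featureID = c("E001", "E004", "E002", "E010"),
  exonBaseMean = c(21.5934367, 0, 41.0638397, 6564.2611474),
  pvalue = c(0.008493933, NA, 0.017584862, 0.246768317),
  control = c(1.210454, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Aggregated genes are the ones whose ID joins several gene IDs with "+".
# fixed = TRUE makes grepl() treat "+" literally instead of as a regex operator.
res$aggregated <- grepl("+", res$groupID, fixed = TRUE)

# Rows that were tested (p-value present) but have no fitted coefficient.
res$missing_coef <- !is.na(res$pvalue) & is.na(res$control)

# Cross-tabulate: do the missing coefficients coincide with aggregated genes?
print(table(aggregated = res$aggregated, missing_coef = res$missing_coef))
```

The zero-count exon E004 is not flagged, because it has no p-value either; only the aggregated rows show the combination of a valid test result and a missing coefficient.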

Exon Read Counts:

                                              retina1 retina2 brain1 brain2
ENSG00000000003.14:E001                        36      34     10     19
ENSG00000000003.14:E002                       183     191    208    173
ENSG00000000003.14:E003                        45      60     40     59
ENSG00000000003.14:E004                         0       0      0      0
ENSG00000000003.14:E005                        32      41     19     39
ENSG00000000003.14:E006                        21      36     12     27
ENSG00000000003.14:E007                        22      42      9     24
ENSG00000000003.14:E008                        24      50     11     22
ENSG00000000003.14:E009                        26      47      8     34
ENSG00000000003.14:E010                        17      26      4     28
ENSG00000007062.11+ENSG00000137441.7:E001       0       0      3      1
ENSG00000007062.11+ENSG00000137441.7:E002      36      19     37     58
ENSG00000007062.11+ENSG00000137441.7:E003      27      17     11     46
ENSG00000007062.11+ENSG00000137441.7:E004       8       2      2      9
ENSG00000007062.11+ENSG00000137441.7:E005       8       1      0      5
ENSG00000007062.11+ENSG00000137441.7:E006       5       3      0      7
ENSG00000007062.11+ENSG00000137441.7:E007       1       2     11      3
ENSG00000007062.11+ENSG00000137441.7:E008      15      11     88     15
ENSG00000007062.11+ENSG00000137441.7:E009      54      25    224     51
ENSG00000007062.11+ENSG00000137441.7:E010    6709    2939   6553   6290
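
As a quick sanity check that the aggregated exons are not low-count, the sketch below computes the mean raw count per exon for four rows copied from the table above. The matrix `counts` is a hand-typed stand-in (an assumption for illustration); with the real data, the count matrix of the data set would take its place. These are raw means, so they are not expected to match the `exonBaseMean` column exactly.

```r
# Stand-in for a few rows of the exon count table (values copied from the post).
counts <- matrix(
  c(  36,   34,   10,   19,
       0,    0,    0,    0,
      36,   19,   37,   58,
    6709, 2939, 6553, 6290),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("ENSG00000000003.14:E001", "ENSG00000000003.14:E004",
      "ENSG00000007062.11+ENSG00000137441.7:E002",
      "ENSG00000007062.11+ENSG00000137441.7:E010"),
    c("retina1", "retina2", "brain1", "brain2"))
)

# Mean raw count per exon across the four samples.
raw_mean <- rowMeans(counts)

# Mark rows that belong to an aggregated gene ("+" matched literally).
is_aggregated <- grepl("+", rownames(counts), fixed = TRUE)

print(data.frame(raw_mean, is_aggregated))
```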

As you can see, the aggregated genes that do have counts still get NA values. Is anyone familiar with this issue, or does anyone know what might be causing it?

Thanks in advance, Osman

sessionInfo()

```
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BatchJobs_1.9               BBmisc_1.12                 DEXSeq_1.40.0               RColorBrewer_1.1-3          AnnotationDbi_1.56.2        DESeq2_1.34.0              
 [7] SummarizedExperiment_1.24.0 GenomicRanges_1.46.1        GenomeInfoDb_1.30.1         IRanges_2.28.0              S4Vectors_0.32.4            MatrixGenerics_1.6.0       
[13] matrixStats_0.62.0          Biobase_2.54.0              BiocGenerics_0.40.0         BiocParallel_1.28.3        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7           bit64_4.0.5            filelock_1.0.2         progress_1.2.2         httr_1.4.3             tools_4.1.2            backports_1.4.1       
 [8] utf8_1.2.2             R6_2.5.1               DBI_1.1.3              colorspace_2.0-3       tidyselect_1.1.2       prettyunits_1.1.1      bit_4.0.4             
[15] curl_4.3.2             compiler_4.1.2         sendmailR_1.2-1.1      cli_3.3.0              xml2_1.3.3             DelayedArray_0.20.0    scales_1.2.0          
[22] checkmate_2.1.0        genefilter_1.76.0      rappdirs_0.3.3         stringr_1.4.0          digest_0.6.29          Rsamtools_2.10.0       rmarkdown_2.14        
[29] XVector_0.34.0         base64enc_0.1-3        pkgconfig_2.0.3        htmltools_0.5.3        dbplyr_2.2.1           fastmap_1.1.0          rlang_1.0.4           
[36] rstudioapi_0.13        RSQLite_2.2.15         generics_0.1.3         hwriter_1.3.2.1        dplyr_1.0.9            RCurl_1.98-1.8         magrittr_2.0.3        
[43] GenomeInfoDbData_1.2.7 Matrix_1.4-1           Rcpp_1.0.9             munsell_0.5.0          fansi_1.0.3            lifecycle_1.0.1        stringi_1.7.8         
[50] yaml_2.3.5             zlibbioc_1.40.0        BiocFileCache_2.2.1    grid_4.1.2             blob_1.2.3             parallel_4.1.2         crayon_1.5.1          
[57] lattice_0.20-45        Biostrings_2.62.0      splines_4.1.2          annotate_1.72.0        hms_1.1.1              KEGGREST_1.34.0        locfit_1.5-9.6        
[64] knitr_1.39             pillar_1.8.0           geneplotter_1.72.0     biomaRt_2.50.3         XML_3.99-0.10          glue_1.6.2             evaluate_0.15         
[71] data.table_1.14.2      BiocManager_1.30.18    png_0.1-7              vctrs_0.4.1            gtable_0.3.0           purrr_0.3.4            assertthat_0.2.1      
[78] cachem_1.0.6           ggplot2_3.3.6          xfun_0.31              xtable_1.8-4           survival_3.3-1         tibble_3.1.8           memoise_2.0.1         
[85] statmod_1.4.36         brew_1.0-7             ellipsis_0.3.2

```

RNASeq R DEXSeq • 868 views

Posted 3.5 years ago by osieman52