I am using DEXSeq to identify tissue-specific exons. I created a DEXSeqDataSet for my data by following the DEXSeq vignette. When preparing the annotation I used the "aggregate gene" option, which means that exons overlapping exons from a different gene are reported under a combined gene ID in my count files, for example ENSG00000007062.11+ENSG00000137441.7, followed by the exon read counts.
For some reason, the aggregated genes in my results contain NA values in the exon usage coefficient columns (control and experiment), even though they have a good number of reads. I understand that DEXSeq filters out exons with low read counts, but why are all the aggregated genes in my dataset "filtered out", so to say? Their exon usage coefficients are simply not calculated.
groupID featureID exonBaseMean dispersion stat pvalue padj control experiment
ENSG00000000003.14:E001 ENSG00000000003.14 E001 21.5934367 0.011905251 6.9262557835 0.008493933 0.5423502 1.210454 1.429919
ENSG00000000003.14:E002 ENSG00000000003.14 E002 216.5326504 0.110281274 1.0275864363 0.310726219 0.5423502 2.457322 2.144896
ENSG00000000003.14:E003 ENSG00000000003.14 E003 52.0953996 0.006237274 0.9808538869 0.321988086 0.5423502 1.736622 1.601070
ENSG00000000003.14:E004 ENSG00000000003.14 E004 0.0000000 NA NA NA NA NA NA
ENSG00000000003.14:E005 ENSG00000000003.14 E005 30.8065941 0.010928089 0.0084259473 0.926862539 0.9377536 1.501295 1.447958
ENSG00000000003.14:E006 ENSG00000000003.14 E006 21.8680338 0.018217991 0.3621809429 0.547297484 0.6856254 1.334083 1.345448
ENSG00000000003.14:E007 ENSG00000000003.14 E007 20.9092883 0.039292282 1.7782336563 0.182366373 0.5423502 1.263682 1.394588
ENSG00000000003.14:E008 ENSG00000000003.14 E008 23.5370862 0.030771007 3.3871217252 0.065707558 0.5423502 1.272502 1.455404
ENSG00000000003.14:E009 ENSG00000000003.14 E009 23.6855564 0.110540408 0.8169928461 0.366060857 0.5564125 1.339067 1.450256
ENSG00000000003.14:E010 ENSG00000000003.14 E010 14.9047361 0.170168976 0.2318733494 0.630138234 0.7340537 1.199031 1.230302
ENSG00000007062.11+ENSG00000137441.7:E001 ENSG00000007062.11+ENSG00000137441.7 E001 1.8947354 0.371061249 3.7215845815 0.053713374 0.5423502 NA NA
ENSG00000007062.11+ENSG00000137441.7:E002 ENSG00000007062.11+ENSG00000137441.7 E002 41.0638397 0.043644187 5.6370408998 0.017584862 0.5423502 NA NA
ENSG00000007062.11+ENSG00000137441.7:E003 ENSG00000007062.11+ENSG00000137441.7 E003 22.1230452 0.009472472 2.7653982357 0.096322713 0.5423502 NA NA
ENSG00000007062.11+ENSG00000137441.7:E004 ENSG00000007062.11+ENSG00000137441.7 E004 4.4719901 0.009095405 0.3996968876 0.527245838 0.6729068 NA NA
ENSG00000007062.11+ENSG00000137441.7:E005 ENSG00000007062.11+ENSG00000137441.7 E005 2.4483538 0.127028255 0.2648086938 0.606835598 0.7282027 NA NA
ENSG00000007062.11+ENSG00000137441.7:E006 ENSG00000007062.11+ENSG00000137441.7 E006 2.6379103 0.038700079 0.0003874227 0.984296207 0.9842962 NA NA
ENSG00000007062.11+ENSG00000137441.7:E007 ENSG00000007062.11+ENSG00000137441.7 E007 7.3760144 0.420739076 2.0974714818 0.147542955 0.5423502 NA NA
ENSG00000007062.11+ENSG00000137441.7:E008 ENSG00000007062.11+ENSG00000137441.7 E008 57.7554843 0.593533431 1.0996967570 0.294332663 0.5423502 NA NA
ENSG00000007062.11+ENSG00000137441.7:E009 ENSG00000007062.11+ENSG00000137441.7 E009 151.4436753 0.501405393 1.4215176025 0.233153775 0.5423502 NA NA
ENSG00000007062.11+ENSG00000137441.7:E010 ENSG00000007062.11+ENSG00000137441.7 E010 6564.2611474 0.504871434 1.3415073630 0.246768317 0.5423502 NA NA
Exon Read Counts:
retina1 retina2 brain1 brain2
ENSG00000000003.14:E001 36 34 10 19
ENSG00000000003.14:E002 183 191 208 173
ENSG00000000003.14:E003 45 60 40 59
ENSG00000000003.14:E004 0 0 0 0
ENSG00000000003.14:E005 32 41 19 39
ENSG00000000003.14:E006 21 36 12 27
ENSG00000000003.14:E007 22 42 9 24
ENSG00000000003.14:E008 24 50 11 22
ENSG00000000003.14:E009 26 47 8 34
ENSG00000000003.14:E010 17 26 4 28
ENSG00000007062.11+ENSG00000137441.7:E001 0 0 3 1
ENSG00000007062.11+ENSG00000137441.7:E002 36 19 37 58
ENSG00000007062.11+ENSG00000137441.7:E003 27 17 11 46
ENSG00000007062.11+ENSG00000137441.7:E004 8 2 2 9
ENSG00000007062.11+ENSG00000137441.7:E005 8 1 0 5
ENSG00000007062.11+ENSG00000137441.7:E006 5 3 0 7
ENSG00000007062.11+ENSG00000137441.7:E007 1 2 11 3
ENSG00000007062.11+ENSG00000137441.7:E008 15 11 88 15
ENSG00000007062.11+ENSG00000137441.7:E009 54 25 224 51
ENSG00000007062.11+ENSG00000137441.7:E010 6709 2939 6553 6290
As you can see, the aggregated genes get NA values for the exon usage coefficients even when they have read counts. Is anyone familiar with this issue, or does anyone know what might be causing it?
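To make the pattern concrete, here is a minimal base R sketch that flags rows whose coefficients are missing despite non-zero counts and cross-tabulates them against whether the gene ID is an aggregate. The data frame `res` is a hand-copied subset of four rows from the results table above, not the real results object; on an actual analysis I would assume something like `as.data.frame(DEXSeqResults(dxd))` in its place, where `dxd` is a hypothetical name for the fitted DEXSeqDataSet.

```r
# Four rows copied by hand from the results table above (illustrative subset only).
res <- data.frame(
  groupID = c("ENSG00000000003.14",
              "ENSG00000000003.14",
              "ENSG00000007062.11+ENSG00000137441.7",
              "ENSG00000007062.11+ENSG00000137441.7"),
  featureID    = c("E001", "E004", "E002", "E010"),
  exonBaseMean = c(21.5934367, 0, 41.0638397, 6564.2611474),
  control      = c(1.210454, NA, NA, NA),
  experiment   = c(1.429919, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Aggregate gene IDs are joined with "+", so match that character literally.
res$aggregated <- grepl("+", res$groupID, fixed = TRUE)

# Rows where the usage coefficient is missing although reads were counted.
# Exons with zero counts (like E004 of the first gene) are expected to be NA
# and are deliberately not flagged here.
res$missing_coef <- is.na(res$control) & res$exonBaseMean > 0

# Cross-tabulate: are the unexpected NAs confined to aggregated genes?
tab <- table(aggregated = res$aggregated, missing_coef = res$missing_coef)
print(tab)
```

If every flagged row falls in the `aggregated = TRUE` row of the table, the missing coefficients are tied to the aggregate gene IDs rather than to low counts.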
Thanks in advance, Osman
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] BatchJobs_1.9 BBmisc_1.12 DEXSeq_1.40.0 RColorBrewer_1.1-3 AnnotationDbi_1.56.2 DESeq2_1.34.0
[7] SummarizedExperiment_1.24.0 GenomicRanges_1.46.1 GenomeInfoDb_1.30.1 IRanges_2.28.0 S4Vectors_0.32.4 MatrixGenerics_1.6.0
[13] matrixStats_0.62.0 Biobase_2.54.0 BiocGenerics_0.40.0 BiocParallel_1.28.3
loaded via a namespace (and not attached):
[1] bitops_1.0-7 bit64_4.0.5 filelock_1.0.2 progress_1.2.2 httr_1.4.3 tools_4.1.2 backports_1.4.1
[8] utf8_1.2.2 R6_2.5.1 DBI_1.1.3 colorspace_2.0-3 tidyselect_1.1.2 prettyunits_1.1.1 bit_4.0.4
[15] curl_4.3.2 compiler_4.1.2 sendmailR_1.2-1.1 cli_3.3.0 xml2_1.3.3 DelayedArray_0.20.0 scales_1.2.0
[22] checkmate_2.1.0 genefilter_1.76.0 rappdirs_0.3.3 stringr_1.4.0 digest_0.6.29 Rsamtools_2.10.0 rmarkdown_2.14
[29] XVector_0.34.0 base64enc_0.1-3 pkgconfig_2.0.3 htmltools_0.5.3 dbplyr_2.2.1 fastmap_1.1.0 rlang_1.0.4
[36] rstudioapi_0.13 RSQLite_2.2.15 generics_0.1.3 hwriter_1.3.2.1 dplyr_1.0.9 RCurl_1.98-1.8 magrittr_2.0.3
[43] GenomeInfoDbData_1.2.7 Matrix_1.4-1 Rcpp_1.0.9 munsell_0.5.0 fansi_1.0.3 lifecycle_1.0.1 stringi_1.7.8
[50] yaml_2.3.5 zlibbioc_1.40.0 BiocFileCache_2.2.1 grid_4.1.2 blob_1.2.3 parallel_4.1.2 crayon_1.5.1
[57] lattice_0.20-45 Biostrings_2.62.0 splines_4.1.2 annotate_1.72.0 hms_1.1.1 KEGGREST_1.34.0 locfit_1.5-9.6
[64] knitr_1.39 pillar_1.8.0 geneplotter_1.72.0 biomaRt_2.50.3 XML_3.99-0.10 glue_1.6.2 evaluate_0.15
[71] data.table_1.14.2 BiocManager_1.30.18 png_0.1-7 vctrs_0.4.1 gtable_0.3.0 purrr_0.3.4 assertthat_0.2.1
[78] cachem_1.0.6 ggplot2_3.3.6 xfun_0.31 xtable_1.8-4 survival_3.3-1 tibble_3.1.8 memoise_2.0.1
[85] statmod_1.4.36 brew_1.0-7 ellipsis_0.3.2
```