Entering edit mode
Hi, I'm running a DEXSEQ differential exon usage analysis on a 152 sample dataset. I've tested a variety of simple and more complex models as such:
full_model <- ~sample + exon + batch:exon + day:exon + product2:exon
reduced_model <- ~sample + exon + batch:exon + day:exon
In a sense, the model doesn't seem to matter since I have the same error messages being returned. I am using the standard step-wise analysis from the vignette but with a separate load function for feature counts inputs taken from https://github.com/vivekbhr/Subread_to_DEXSeq
dxd.fc <- DEXSeqDataSetFromFeatureCounts("exon_count.out",
flattenedfile = "flat.gtf",
sampleData = coldata_all,
design = full_model)
dxd_dtu <- dxd.fc
rm(dxd.fc)
dxd_dtu1 = estimateSizeFactors(dxd_dtu)
print("size factors done")
dxd_dtu2 = estimateDispersions(dxd_dtu1, formula = full_model)
print("dispersions done")
dxd_dtu3 = testForDEU(dxd_dtu2, fullModel = design(dxd_dtu2), reducedModel = reduced_model)
print("DEU done")
dxd_dtu3 = estimateExonFoldChanges( dxd_dtu3, fitExpToVar="product2")
print("fold changes done")
dxr = DEXSeqResults(dxd_dtu3, independentFiltering = TRUE)
However, halfway through estimating size dispersions I am encountering this error:
Reading and adding Exon IDs for DEXSeq
converting counts to integer mode
Warning message:
0 aggregate geneIDs were found truncated in featureCounts output
Error in `$<-.data.frame`(`*tmp*`, "dispersion", value = NA) :
replacement has 1 row, data has 0
Calls: estimateDispersions ... colData<- -> makeBigModelFrame -> $<- -> $<-.data.frame
Execution halted
Can anyone help me solve this? Is this a bug or package clash? Here is my session information:
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8 LC_PAPER=en_GB.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] DEXSeq_1.32.0 RColorBrewer_1.1-3 AnnotationDbi_1.48.0 DESeq2_1.26.0 SummarizedExperiment_1.16.1 DelayedArray_0.12.3 matrixStats_0.62.0
[8] GenomicRanges_1.38.0 GenomeInfoDb_1.22.1 IRanges_2.20.2 S4Vectors_0.24.4 Biobase_2.46.0 BiocGenerics_0.32.0 BiocParallel_1.20.1
loaded via a namespace (and not attached):
[1] bitops_1.0-7 bit64_4.0.5 httr_1.4.4 progress_1.2.2 tools_3.6.1 backports_1.4.1 utf8_1.2.2 R6_2.5.1 rpart_4.1.16
[10] Hmisc_4.7-0 DBI_1.1.3 colorspace_2.0-3 nnet_7.3-17 tidyselect_1.1.2 gridExtra_2.3 prettyunits_1.1.1 curl_4.3.2 bit_4.0.5
[19] compiler_3.6.1 cli_3.4.1 htmlTable_2.4.0 scales_1.2.1 checkmate_2.1.0 genefilter_1.68.0 askpass_1.1 rappdirs_0.3.3 Rsamtools_2.2.3
[28] stringr_1.4.1 digest_0.6.29 foreign_0.8-76 XVector_0.26.0 base64enc_0.1-3 jpeg_0.1-9 pkgconfig_2.0.3 htmltools_0.5.2 dbplyr_2.1.1
[37] fastmap_1.1.0 htmlwidgets_1.5.4 rlang_1.0.2 rstudioapi_0.14 RSQLite_2.2.13 generics_0.1.3 hwriter_1.3.2.1 dplyr_1.0.9 RCurl_1.98-1.6
[46] magrittr_2.0.3 GenomeInfoDbData_1.2.2 Formula_1.2-4 interp_1.1-3 Matrix_1.4-1 Rcpp_1.0.8.3 munsell_0.5.0 fansi_1.0.3 lifecycle_1.0.1
[55] yaml_2.3.5 stringi_1.7.6 zlibbioc_1.32.0 BiocFileCache_1.10.2 grid_3.6.1 blob_1.2.3 crayon_1.5.2 deldir_1.0-6 lattice_0.20-45
[64] Biostrings_2.54.0 splines_3.6.1 annotate_1.64.0 hms_1.1.1 locfit_1.5-9.4 knitr_1.39 pillar_1.8.0 geneplotter_1.64.0 biomaRt_2.42.1
[73] XML_3.99-0.3 glue_1.6.2 latticeExtra_0.6-30 data.table_1.14.2 png_0.1-8 vctrs_0.4.1 openssl_2.0.0 gtable_0.3.1 purrr_0.3.5
[82] assertthat_0.2.1 cachem_1.0.6 ggplot2_3.3.6 xfun_0.30 xtable_1.8-4 survival_3.3-1 tibble_3.1.7 memoise_2.0.1 cluster_2.1.3
[91] statmod_1.4.36 ellipsis_0.3.2
Hi Ben,
your first step should be to upgrade to the current versions of R and Bioconductor. There is really little point in trouble-shooting obsolete software.
Also, just a hunch, the error that you get looks like you have an empty (0 rows) data table, which could, among other reasons, arise if your gene names are not matching each other in different input data/annotation files(?).
If you still have problems, please post a minimal reproducible example that others can run on their machines, i.e. with the input files and the code needed to arrive at the error message.
HTH Wolfgang
Many thanks for the advice - I will try this and get back to you if I have further issues.
BW
Ben