I'm suddenly unable to load Cufflinks data into cummeRbund. I get this output from the readCufflinks() function:
Creating database I:/EpiCenter_ISI/Hart/coga/better/Hs/cuffHs/cuffData.db
Reading Run Info File I:/EpiCenter_ISI/Hart/coga/better/Hs/cuffHs/run.info
Writing runInfo Table
Reading Read Group Info I:/EpiCenter_ISI/Hart/coga/better/Hs/cuffHs/read_groups.info
Writing replicates Table
Reading Var Model Info I:/EpiCenter_ISI/Hart/coga/better/Hs/cuffHs/var_model.info
Writing varModel Table
Reading I:/EpiCenter_ISI/Hart/coga/better/Hs/cuffHs/genes.fpkm_tracking
Checking samples table...
Populating samples table...
Error: Column name mismatch.
In addition: There were 50 or more warnings (use warnings() to see the first 50)
And then the warnings are all like this:
Warning messages:
1: In rsqlite_fetch(res@ptr, n = n) :
  Don't need to call dbFetch() for statements, only for queries
2: In rsqlite_fetch(res@ptr, n = n) :
  Don't need to call dbFetch() for statements, only for queries
Here is the traceback:
> traceback()
8: stop("Column name mismatch.", call. = FALSE)
7: match_col(value, col_names)
6: .local(conn, name, value, ...)
5: dbWriteTable(dbConn, "samples", samples, row.names = F, append = T)
4: dbWriteTable(dbConn, "samples", samples, row.names = F, append = T)
3: populateSampleTable(samples, dbConn)
2: loadGenes(geneFPKM, geneDiff, promoterFile, countFile = geneCount, replicateFile = geneRep, dbConn)
1: readCufflinks(rebuild = T)
And my session info (after getting this error, I just used biocLite to upgrade cummeRbund):
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] cummeRbund_2.18.0 Gviz_1.20.0 rtracklayer_1.36.4 GenomicRanges_1.28.4 GenomeInfoDb_1.12.2 IRanges_2.10.2 S4Vectors_0.14.3
[8] fastcluster_1.1.22 reshape2_1.4.2 ggplot2_2.2.1 RSQLite_2.0 BiocGenerics_0.22.0

loaded via a namespace (and not attached):
[1] httr_1.3.0 Biobase_2.36.2 AnnotationHub_2.8.2 bit64_0.9-7 splines_3.4.1
[6] shiny_1.0.4 Formula_1.2-2 interactiveDisplayBase_1.14.0 latticeExtra_0.6-28 blob_1.1.0
[11] BSgenome_1.44.0 GenomeInfoDbData_0.99.0 Rsamtools_1.28.0 yaml_2.1.14 backports_1.1.0
[16] lattice_0.20-35 biovizBase_1.24.0 digest_0.6.12 RColorBrewer_1.1-2 XVector_0.16.0
[21] checkmate_1.8.3 colorspace_1.3-2 httpuv_1.3.5 htmltools_0.3.6 Matrix_1.2-11
[26] plyr_1.8.4 pkgconfig_2.0.1 XML_3.98-1.9 biomaRt_2.32.1 zlibbioc_1.22.0
[31] xtable_1.8-2 scales_0.4.1 BiocParallel_1.10.1 htmlTable_1.9 tibble_1.3.3
[36] AnnotationFilter_1.0.0 SummarizedExperiment_1.6.3 GenomicFeatures_1.28.4 nnet_7.3-12 lazyeval_0.2.0
[41] mime_0.5 survival_2.41-3 magrittr_1.5 memoise_1.1.0 foreign_0.8-69
[46] BiocInstaller_1.26.0 tools_3.4.1 data.table_1.10.4 matrixStats_0.52.2 stringr_1.2.0
[51] munsell_0.4.3 cluster_2.0.6 DelayedArray_0.2.7 AnnotationDbi_1.38.2 ensembldb_2.0.4
[56] Biostrings_2.44.2 compiler_3.4.1 rlang_0.1.2 RCurl_1.95-4.8 dichromat_2.0-0
[61] VariantAnnotation_1.22.3 htmlwidgets_0.9 bitops_1.0-6 base64enc_0.1-3 gtable_0.2.0
[66] curl_2.8.1 DBI_0.7 R6_2.2.2 GenomicAlignments_1.12.1 gridExtra_2.2.1
[71] knitr_1.17 bit_1.1-12 Hmisc_4.0-3 ProtGenerics_1.8.0 stringi_1.1.5
[76] Rcpp_0.12.12 rpart_4.1-11 acepack_1.4.1
Also, here's the header and first line of genes.fpkm_tracking -- the file that seemed to generate the error:
tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage Ctrl0_FPKM Ctrl0_conf_lo Ctrl0_conf_hi Ctrl0_status Case0_FPKM Case0_conf_lo Case0_conf_hi Case0_status Ctrl24_FPKM Ctrl24_conf_lo Ctrl24_conf_hi Ctrl24_status Case24_FPKM Case24_conf_lo Case24_conf_hi Case24_status
A1BG - - A1BG A1BG TSS7852 chr19:58346805-58362848 - - 1.08063 0 2.83236 OK 1.28488 0 3.27684 OK 2.83622 0 6.7891 OK 3.59967 0.396499 6.80284 OK
I've made progress with diagnostics. The error is in the loadGenes() function in the database-setup.R source file. First, in the "Handle Samples Names" section, line 152 calls make.db.names(), which is deprecated and has been replaced by dbQuoteIdentifier(). If I substitute dbQuoteIdentifier() and proceed to the populateSampleTable(samples, dbConn) step on line 162, I get the same error as in readCufflinks(). It seems that the samples object is not in the format that populateSampleTable() (and the dbWriteTable() call inside it) expects.
This seems to indicate that RSQLite has been updated (I have RSQLite 2.0 installed) and that the existing cummeRbund code was written against an older version.
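In case it helps anyone check the same thing, here is a minimal sketch of how the column names that dbWriteTable() compares could be inspected. The helper diff_columns() is hypothetical (it is not part of cummeRbund or RSQLite), and the "samples" schema and the candidate data frame in the demonstration are made up for illustration; the real schema in cuffData.db may differ. Against the real database, the first argument would come from dbListFields(dbConn, "samples") and the second would be the samples object passed to populateSampleTable().

```r
# Report how a data frame's column names differ from a table's field names.
# Hypothetical helper for diagnosis; not part of cummeRbund or RSQLite.
diff_columns <- function(table_fields, df) {
  list(
    # Fields the table has but the data frame lacks
    missing_in_df    = setdiff(table_fields, names(df)),
    # Columns the data frame has but the table lacks
    unexpected_in_df = setdiff(names(df), table_fields),
    # TRUE when the two sets of names differ only by letter case
    case_only        = setequal(tolower(table_fields), tolower(names(df))) &&
                       !setequal(table_fields, names(df))
  )
}

# Demonstration on an in-memory database (runs only if DBI and RSQLite are installed).
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  # Illustrative schema only; an assumption, not the actual cummeRbund table definition.
  DBI::dbExecute(con, "CREATE TABLE samples (sample_index INTEGER, sample_name TEXT)")
  # Illustrative data frame whose first column name does not match the table.
  candidate <- data.frame(index = 1:2,
                          sample_name = c("Ctrl0", "Case0"),
                          stringsAsFactors = FALSE)
  print(diff_columns(DBI::dbListFields(con, "samples"), candidate))
  DBI::dbDisconnect(con)
}
```

If the two sets of names differ, that would be consistent with the match_col() failure in the traceback, though I have not confirmed that this is the cause here.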