Dear all,
I am trying to test SGSeq with my own data, but I have not gotten very far:
I am simply trying to get the package working on a test set of 8 paired-end BAM files merged into 4 and subset to a single chromosome.
The annotation is subset accordingly. I get an obscure "non-numeric argument to binary operator" error, which is not very helpful.
I tried different BAM files generated with a different aligner, but I still get the same error. A different chromosome produced the same result. I noticed that the fragment length column contained NAs and tried replacing them with values, but this had no effect.
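To make the NA replacement concrete, here is a minimal sketch of that kind of check and substitution. It assumes a plain data.frame standing in for the getBamInfo() output; the sample names, file names and the fragment length of 200 are hypothetical placeholders, not values from my data.

```r
# Hypothetical stand-in for the sample table; names and the fragment
# length of 200 are placeholders, not measured values.
si <- data.frame(
  sample_name = c("S1", "S2"),
  file_bam    = c("S1.bam", "S2.bam"),
  paired_end  = c(FALSE, FALSE),
  read_length = c(100L, 100L),
  frag_length = c(NA, NA),
  lib_size    = c(914422L, 877404L),
  stringsAsFactors = FALSE
)

# Inspect the column types: a column holding only NA is logical, not numeric.
col_classes <- sapply(si, class)
print(col_classes)

# Replace the missing fragment lengths with an assumed numeric value.
si$frag_length[is.na(si$frag_length)] <- 200

# Confirm the column is numeric after the replacement.
stopifnot(is.numeric(si$frag_length))
```

This only shows how the column types can be inspected and the NAs filled in; it does not establish which column triggers the error.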
I would be grateful if someone could give me some pointers to move forward with the analysis.
Thanks,
Paolo Guarnieri, M.D.
Department of Systems Biology,
Herbert Irving Comprehensive Cancer Center,
Columbia University,
1130 St. Nicholas Ave, New York, NY 10032 (USA)
> myChr <- "chr6"
> txdb <- TxDb.Mmusculus.UCSC.mm10.knownGene
> txdb <- keepSeqlevels(txdb, myChr)
> txf_annotated <- convertToTxFeatures(txdb)
> threads <- 8
> mySamplesComplete <- getBamInfo(mySamplesMerged_si,BPPARAM = MulticoreParam(threads))
> mySamplesComplete
  sample_name                                                            file_bam paired_end read_length frag_length lib_size
1     SRSF1_1 /home/pg2296/Desktop/data_X/Datasets/CMF/rawdata/SRSF1_1.merged.bam      FALSE         100          NA   914422
2     SRSF1_2 /home/pg2296/Desktop/data_X/Datasets/CMF/rawdata/SRSF1_2.merged.bam      FALSE         100          NA   877404
3     SRSF1_3 /home/pg2296/Desktop/data_X/Datasets/CMF/rawdata/SRSF1_3.merged.bam      FALSE         100          NA   972719
4     SRSF1_4 /home/pg2296/Desktop/data_X/Datasets/CMF/rawdata/SRSF1_4.merged.bam      FALSE         100          NA   750876
> sgfc_dmd <- analyzeFeatures(sample_info = mySamplesComplete,
+                             features = txf_annotated,
+                             cores_per_sample = 2,
+                             BPPARAM = MulticoreParam(threads))
Process features...
Obtain counts...
Error in X/(E * 0.001) : non-numeric argument to binary operator
>
> txf_annotated
TxFeatures object with 186 ranges and 0 metadata columns:
        seqnames               ranges strand     type                            txName        geneName
           <Rle>            <IRanges>  <Rle> <factor>                   <CharacterList> <CharacterList>
    [1]     chrX [82814910, 82815024]      +        F                        uc009trg.1           13405
    [2]     chrX [82815024, 83136628]      +        J                        uc009trg.1           13405
    [3]     chrX [82948870, 82949147]      +        F             uc009trh.2,uc009tri.2           13405
    [4]     chrX [82949147, 83136628]      +        J             uc009trh.2,uc009tri.2           13405
    [5]     chrX [83043196, 83043478]      +        F                        uc009trj.1           13405
    ...      ...                  ...    ...      ...                               ...             ...
  [182]     chrX [85202471, 85205050]      +        L  uc009tri.2,uc009trp.2,uc009trr.2           13405
  [183]     chrX [85249677, 85249744]      +        F                        uc009trs.2           71996
  [184]     chrX [85249744, 85269520]      +        J                        uc009trs.2           71996
  [185]     chrX [85269520, 85270291]      +        L                        uc009trs.2           71996
  [186]     chrX [85269613, 85270161]      +        U                        uc033jqz.1
  -------
  seqinfo: 1 sequence from mm10 genome
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] XVector_0.6.0                            TxDb.Mmusculus.UCSC.mm10.knownGene_3.0.0
 [3] GenomicFeatures_1.18.2                   AnnotationDbi_1.28.1
 [5] Biobase_2.26.0                           SGSeq_1.0.6
 [7] GenomicRanges_1.18.1                     GenomeInfoDb_1.2.2
 [9] IRanges_2.0.0                            S4Vectors_0.4.0
[11] BiocGenerics_0.12.0                      BiocParallel_1.0.0

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2         BatchJobs_1.5           BBmisc_1.8              biomaRt_2.22.0
 [5] Biostrings_2.34.0       bitops_1.0-6            brew_1.0-6              checkmate_1.5.0
 [9] codetools_0.2-9         DBI_0.3.1               digest_0.6.4            fail_1.2
[13] foreach_1.4.2           GenomicAlignments_1.2.1 igraph_0.7.1            iterators_1.0.7
[17] RCurl_1.96-0            Rsamtools_1.18.1        RSQLite_1.0.0           rtracklayer_1.26.1
[21] sendmailR_1.2-1         stringr_0.6.2           tools_3.1.2             XML_3.98-1.1
[25] zlibbioc_1.12.0