import bed file with track line fails
cbaribault • 0
@cbaribault-12277
Last seen 6.2 years ago

According to the doc for the trackLine parameter...

For import, an imported track line will be stored in a TrackLine object, as part of the returned UCSCData.

Separate from, but related to, rtracklayer BED import: in the code below, the resulting BED file, test.bed, has only a single track line preceding the tabular data. I can load this file successfully into the UCSC browser, but importing it via rtracklayer fails with an error message similar to that of said issue. The R output and session info are further below.

BED_TEXT <- "track name=\"My track\" description=\"My description\" visibility=2 itemRgb=\"On\"
chr1 567890 567899 DAT:0.1234 0 . 0 0 255,0,0"
BED_FILE <- "test.bed"
writeLines(text = BED_TEXT, con = BED_FILE)
tryCatch(
  import.bed(BED_FILE),
  error = function(e) e, finally = print("Hello 1")
)
tryCatch(
  import(con = BED_FILE, format = "bed", trackLine = TRUE, genome = "hg19", extraCols = "character"),
  error = function(e) e, finally = print("Hello 2")
)

And the R response for the above...

> BED_TEXT <- "track name=\"My track\" description=\"My description\" visibility=2 itemRgb=\"On\"
+ chr1 567890 567899 DAT:0.1234 0 . 0 0 255,0,0"

> BED_FILE <- "test.bed"

> writeLines(text = BED_TEXT, con = BED_FILE)

> tryCatch(
+   import.bed(BED_FILE), #fails
+   error = function(e) e, finally = print("Hello 1")
+ )
[1] "Hello 1"
<simpleError in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,     nmax = nrows, skip = 0, na.strings = na.strings, quiet = TRUE,     fill = fill, strip.white = strip.white, blank.lines.skip = blank.lines.skip,     multi.line = FALSE, comment.char = comment.char, allowEscapes = allowEscapes,     flush = flush, encoding = encoding, skipNul = skipNul): line 1 did not have 9 elements>

> tryCatch(
+   import(con = BED_FILE, format = "bed", trackLine = TRUE, genome = "hg19", extraCols = "character"),
+   error = function(e) e, finally .... [TRUNCATED] 
[1] "Hello 2"
<simpleError in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,     nmax = nrows, skip = 0, na.strings = na.strings, quiet = TRUE,     fill = fill, strip.white = strip.white, blank.lines.skip = blank.lines.skip,     multi.line = FALSE, comment.char = comment.char, allowEscapes = allowEscapes,     flush = flush, encoding = encoding, skipNul = skipNul): line 1 did not have 8 elements>

And here's my session info...

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] genomation_1.11.3    goldmine_1.0         rtracklayer_1.38.3   GenomicRanges_1.30.1 GenomeInfoDb_1.14.0 
 [6] IRanges_2.12.0       S4Vectors_0.16.0     BiocGenerics_0.24.0  stringi_1.1.6        stringr_1.2.0       
[11] gridExtra_2.3        ggplot2_2.2.1        Rcpp_0.12.15         data.table_1.10.4-3  devtools_1.13.4     
[16] BiocInstaller_1.28.0

loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.8.1 reshape2_1.4.3             lattice_0.20-35            seqPattern_1.10.0         
 [5] colorspace_1.3-2           yaml_2.1.18                utf8_1.1.3                 XML_3.98-1.10             
 [9] rlang_0.2.0                pillar_1.2.1               withr_2.1.2                BiocParallel_1.12.0       
[13] matrixStats_0.53.1         GenomeInfoDbData_1.0.0     plyr_1.8.4                 zlibbioc_1.24.0           
[17] Biostrings_2.46.0          munsell_0.4.3              gtable_0.2.0               memoise_1.1.0             
[21] Biobase_2.38.0             KernSmooth_2.23-15         readr_1.1.1                scales_0.5.0              
[25] BSgenome_1.46.0            plotrix_3.7                DelayedArray_0.4.1         XVector_0.18.0            
[29] Rsamtools_1.30.0           impute_1.52.0              hms_0.4.2                  digest_0.6.15             
[33] cli_1.0.0                  tools_3.4.4                bitops_1.0-6               magrittr_1.5              
[37] lazyeval_0.2.1             RCurl_1.95-4.10            tibble_1.4.2               crayon_1.3.4              
[41] pkgconfig_2.0.1            Matrix_1.2-12              gridBase_0.4-7             assertthat_0.2.0          
[45] rstudioapi_0.7             R6_2.2.2                   GenomicAlignments_1.14.1   compiler_3.4.4 

Please let me know if you need more info.

Best,

Carl Baribault

 

@michael-lawrence-3846
Last seen 2.4 years ago
United States

I don't think this has to do with the trackline. It's just that your file is not tab-separated, which is required by the BED spec. UCSC might be more forgiving, but we recently moved to require tab separation, because not abiding by the spec was causing problems.
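To illustrate the tab-separation point, here is a minimal sketch that rewrites the example from the question with explicit tabs between the data fields. The file name test_tab.bed and the variable names are made up for illustration, and the import call is guarded so the base-R part runs even without rtracklayer installed. Whether the import succeeds may depend on the rtracklayer version.

```r
# The track line keeps its usual space-separated key=value form.
track_line <- 'track name="My track" description="My description" visibility=2 itemRgb="On"'

# The data line must be tab-separated (9 BED fields, as in the question).
bed_fields <- c("chr1", "567890", "567899", "DAT:0.1234", "0", ".", "0", "0", "255,0,0")
data_line <- paste(bed_fields, collapse = "\t")

# Hypothetical file name, written to the session's temp directory.
tab_bed_file <- file.path(tempdir(), "test_tab.bed")
writeLines(c(track_line, data_line), con = tab_bed_file)

# Base-R sanity check: skip the track line and count tab-separated fields.
n_fields <- count.fields(tab_bed_file, sep = "\t", quote = "", skip = 1)
print(n_fields)

# Attempt the import only if rtracklayer is available.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  gr <- rtracklayer::import(tab_bed_file, format = "bed")
  print(gr)
}
```

The count.fields() check is a quick way to confirm that a file really is tab-separated before handing it to import().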

Also, btw, extraCols needs to be a named vector, which is why the second error message is slightly different (8 elements instead of 9). Not sure why you would use extraCols in this case though. I added some sanity checks to devel for this.
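As a rough sketch of what named extraCols looks like, the example below writes a tab-separated file with six standard BED columns plus one extra numeric column and describes that extra column with a named character vector. The column name signalValue, the file name test_extra.bed, and the data values are assumptions chosen for illustration, not something taken from the question's data.

```r
# Named vector: the name is the column name, the value is its class.
extra_cols <- c(signalValue = "numeric")

# Six standard BED fields followed by one extra field (hypothetical values).
extra_fields <- c("chr1", "567890", "567899", "peak1", "0", ".", "0.1234")

# Hypothetical file name, written to the session's temp directory.
extra_bed_file <- file.path(tempdir(), "test_extra.bed")
writeLines(paste(extra_fields, collapse = "\t"), con = extra_bed_file)

# Base-R sanity check: 6 standard fields + 1 extra field.
n_extra_fields <- count.fields(extra_bed_file, sep = "\t", quote = "")
print(n_extra_fields)

# Attempt the import only if rtracklayer is available.
if (requireNamespace("rtracklayer", quietly = TRUE)) {
  gr_extra <- rtracklayer::import(extra_bed_file, format = "bed",
                                  extraCols = extra_cols)
  print(gr_extra)
}
```

For the file in the question, which has only the nine standard BED columns, extraCols should not be needed at all.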
