Hi all,
I have a simple question regarding the export function of rtracklayer.
I have a GRanges object, and I would like to export a BedGraph file which can be loaded into the UCSC genome browser.
myTrackLine <- new("GraphTrackLine", type="bedGraph", name="mysample", description="mysample")
export(x, con = "out.bedgraph", trackLine=myTrackLine, format="bedGraph")
Now, looking at the bedGraph file ;
track name=mysample description="mysample"
chr1 3073252 3074322 . 0 +
chr1 3102015 3102125 . 0 +
chr1 3252756 3253236 . 0 +
However, as there is no "type=bedGraph", the file is loaded as a 'bed' file, and is not displayed as bar.
Did I miss something ? how can I add this values ?
Best
Nicolas
Looks like it is outputting BED instead of bedGraph. But I can't reproduce it. Maybe paste your
sessionInfo()
?sure. Thanks
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] DT_0.4 edgeR_3.20.6
[3] limma_3.34.9 ggrepel_0.7.0
[5] genefilter_1.60.0 RColorBrewer_1.1-2
[7] pheatmap_1.0.10 gridExtra_2.3
[9] DESeq2_1.18.1 SummarizedExperiment_1.8.1
[11] DelayedArray_0.4.1 matrixStats_0.53.1
[13] Biobase_2.38.0 reshape2_1.4.3
[15] ggplot2_2.2.1 rtracklayer_1.38.2
[17] GenomicRanges_1.30.1 GenomeInfoDb_1.14.0
[19] IRanges_2.12.0 S4Vectors_0.16.0
[21] BiocGenerics_0.24.0
I don't know if it can help, here ismy GRanges object to export. I double checked that it has a 'score' field for bedGraph export.
GRanges object with 3 ranges and 4 metadata columns:
seqnames ranges strand | gene_id gene_name
<Rle> <IRanges> <Rle> | <character> <character>
[1] chr1 [3073253, 3074322] + | ENSMUSG00000102693.1 4933401J01Rik
[2] chr1 [3102016, 3102125] + | ENSMUSG00000064842.1 Gm26206
[3] chr1 [3252757, 3253236] + | ENSMUSG00000102851.1 Gm18956
gene_type score
<character> <numeric>
[1] TEC 0
[2] snRNA 0
[3] processed_pseudogene 0
-------
seqinfo: 22 sequences from an unspecified genome; no seqlengths
Please update your R/Bioconductor and see if it helps.
I did, but it does not help.
N
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /bioinfo/local/build/Centos/R/R-3.5.0/lib64/R/lib/libRblas.so
LAPACK: /bioinfo/local/build/Centos/R/R-3.5.0/lib64/R/lib/libRlapack.so
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] DT_0.4 edgeR_3.22.3 limma_3.36.3 ggrepel_0.8.0
[5] genefilter_1.62.0 RColorBrewer_1.1-2 pheatmap_1.0.10 gridExtra_2.3
[9] DESeq2_1.20.0 SummarizedExperiment_1.10.1 DelayedArray_0.6.5 BiocParallel_1.14.2
[13] matrixStats_0.54.0 Biobase_2.40.0 reshape2_1.4.3 ggplot2_3.0.0
[17] rtracklayer_1.40.6 GenomicRanges_1.32.6 GenomeInfoDb_1.16.0 IRanges_2.14.11
[21] S4Vectors_0.18.3 BiocGenerics_0.26.0
Alright, please send a reproducible example, i.e., include the input data.
I've got it. It is linked to the Granges object.
x1<- GRanges(c("chr1","chr2"), IRanges(c(1,10), c(100,1000)), score=c(12,15))
x2 <- GRanges(c("chr1","chr1"), IRanges(c(1,10), c(15,20)), score=c(12,15))
myTrackLine <- new("GraphTrackLine", type="bedGraph", name="test", description="test")
export(x1, format="bedGraph", con="test.bedGraph", trackLine=myTrackLine) ## Works
export(x2, format="bedGraph", con="test.bedGraph", trackLine=myTrackLine) ## Failed
The difference is that the 'x2' has overlapping intervals ...
Thanks !
Do you think you can fix it ? or is there any reason that explain the results ?
Thanks
My overlapping intervals were caused by 0-to-1-based conversion.
For me, my ranges in R are 0-based exclusive, whereas the rtracklayer manual states they should be 1-based, and therefore assumes that it must subtract all chromStart values by 1. Therefore, rtracklayer::export.bedGraph creates several overlapping intervals in situations such as below:
My GRanges object in R:
Notice that the end point of the first range is the same as the start point of the following one: 5081896.
bedGraphToBigWig: After sorting, bedGraphToBigWig halts on this interval:
The offending line of the file (+/- one line):
The second interval now starts before the end of the first one 5081895 < 5081896. After correcting the file manually, the subsequent cases continue to occur, and they appear to be the same problem.
The following operation fixed the problem for me:
followed by the rest of the operations (export, bedSort, bedGraphToBigWig). There was no error after adjusting the start points.
Idk if the OP's ranges were overlapping in both coordinate spaces, or just contiguous like mine.
Please note that the GRanges (and IRanges) convention is that ranges are 1-based, inclusive. rtracklayer just follows that convention.