Question: rtracklayer export bedGraph cannot be loaded into UCSC
1
gravatar for Nicolas Servant
14 months ago by
France
Nicolas Servant260 wrote:

Hi all,

I have a simple question regarding the export function of rtracklayer.

I have a GRanges object, and I would like to export a BedGraph file which can be loaded into the UCSC genome browser.

  myTrackLine <- new("GraphTrackLine", type="bedGraph", name="mysample", description="mysample")
  export(x, con = "out.bedgraph", trackLine=myTrackLine, format="bedGraph")

Now, looking at the bedGraph file ;

track name=mysample description="mysample"
chr1    3073252    3074322    .    0    +
chr1    3102015    3102125    .    0    +
chr1    3252756    3253236    .    0    +

However, as there is no "type=bedGraph", the file is loaded as a 'bed' file, and is not displayed as bar.

Did I miss something ? how can I add this values ?

Best

Nicolas

rtracklayer • 403 views
ADD COMMENTlink modified 14 months ago by Michael Lawrence11k • written 14 months ago by Nicolas Servant260

Looks like it is outputting BED instead of bedGraph. But I can't reproduce it. Maybe paste your sessionInfo()?

ADD REPLYlink written 14 months ago by Michael Lawrence11k

sure. Thanks


R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

attached base packages:

[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] DT_0.4                     edgeR_3.20.6              
 [3] limma_3.34.9               ggrepel_0.7.0             
 [5] genefilter_1.60.0          RColorBrewer_1.1-2        
 [7] pheatmap_1.0.10            gridExtra_2.3             
 [9] DESeq2_1.18.1              SummarizedExperiment_1.8.1
[11] DelayedArray_0.4.1         matrixStats_0.53.1        
[13] Biobase_2.38.0             reshape2_1.4.3            
[15] ggplot2_2.2.1              rtracklayer_1.38.2        
[17] GenomicRanges_1.30.1       GenomeInfoDb_1.14.0       
[19] IRanges_2.12.0             S4Vectors_0.16.0          
[21] BiocGenerics_0.24.0      

I don't know if it can help, here ismy GRanges object to export. I double checked that it has a 'score' field for bedGraph export.

GRanges object with 3 ranges and 4 metadata columns:
      seqnames             ranges strand |              gene_id     gene_name
         <Rle>          <IRanges>  <Rle> |          <character>   <character>
  [1]     chr1 [3073253, 3074322]      + | ENSMUSG00000102693.1 4933401J01Rik
  [2]     chr1 [3102016, 3102125]      + | ENSMUSG00000064842.1       Gm26206
  [3]     chr1 [3252757, 3253236]      + | ENSMUSG00000102851.1       Gm18956
                 gene_type     score
               <character> <numeric>
  [1]                  TEC         0
  [2]                snRNA         0
  [3] processed_pseudogene         0
  -------
  seqinfo: 22 sequences from an unspecified genome; no seqlengths

 

 

ADD REPLYlink written 14 months ago by Nicolas Servant260

Please update your R/Bioconductor and see if it helps.

ADD REPLYlink written 14 months ago by Michael Lawrence11k

I did, but it does not help.

N

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /bioinfo/local/build/Centos/R/R-3.5.0/lib64/R/lib/libRblas.so
LAPACK: /bioinfo/local/build/Centos/R/R-3.5.0/lib64/R/lib/libRlapack.so

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DT_0.4                      edgeR_3.22.3                limma_3.36.3                ggrepel_0.8.0              
 [5] genefilter_1.62.0           RColorBrewer_1.1-2          pheatmap_1.0.10             gridExtra_2.3              
 [9] DESeq2_1.20.0               SummarizedExperiment_1.10.1 DelayedArray_0.6.5          BiocParallel_1.14.2        
[13] matrixStats_0.54.0          Biobase_2.40.0              reshape2_1.4.3              ggplot2_3.0.0              
[17] rtracklayer_1.40.6          GenomicRanges_1.32.6        GenomeInfoDb_1.16.0         IRanges_2.14.11            
[21] S4Vectors_0.18.3            BiocGenerics_0.26.0        

 

ADD REPLYlink written 14 months ago by Nicolas Servant260

Alright, please send a reproducible example, i.e., include the input data.

ADD REPLYlink written 14 months ago by Michael Lawrence11k

I've got it. It is linked to the Granges object.

 

x1<- GRanges(c("chr1","chr2"), IRanges(c(1,10), c(100,1000)), score=c(12,15))
x2 <- GRanges(c("chr1","chr1"), IRanges(c(1,10), c(15,20)), score=c(12,15))

myTrackLine <- new("GraphTrackLine", type="bedGraph", name="test", description="test")


export(x1, format="bedGraph", con="test.bedGraph", trackLine=myTrackLine) ## Works

export(x2, format="bedGraph", con="test.bedGraph", trackLine=myTrackLine) ## Failed

The difference is that the 'x2' has overlapping intervals ...

Thanks !

ADD REPLYlink modified 14 months ago • written 14 months ago by Nicolas Servant260

Do you think you can fix it ? or is there any reason that explain the results ?

Thanks

ADD REPLYlink written 14 months ago by Nicolas Servant260

My overlapping intervals were caused by 0-to-1-based conversion.

For me, my ranges in R are 0-based exclusive, whereas the rtracklayer manual states they should be 1-based, and therefore assumes that it must subtract all chromStart values by 1. Therefore, rtracklayer::export.bedGraph creates several overlapping intervals in situations such as below:

My GRanges object in R:

> FDR_LE_L1_05[c("4848","4849")]
GRanges object with 2 ranges and 1 metadata column:
       seqnames          ranges strand |             score
          <Rle>       <IRanges>  <Rle> |         <numeric>
  4848    chrII 5081136-5081896      * | -11.6373555433484
  4849    chrII 5081896-5082599      * | -5.01917162511854
  -------
  seqinfo: 6 sequences from an unspecified genome; no seqlengths
> rtracklayer::export.bedGraph(FDR_LE_L1_05, "FDR_LE_L1_05.bedGraph")

Notice that the end point of the first range is the same as the start point of the following one: 5081896.

bedGraphToBigWig: After sorting, bedGraphToBigWig halts on this interval:

$ bedSort FDR_LE_L1_05.bedGraph FDR_LE_L1_05.bedGraph
$ bedGraphToBigWig FDR_LE_L1_05.bedGraph chrom.sizes FDR_LE_L1_05.bigWig
Overlapping regions in bedGraph line 3412 of FDR_LE_L1_05.bedGraph

The offending line of the file (+/- one line):

$ sed '3411,3413!d' FDR_LE_L1_05.bedGraph
chrII   5081135 5081896 -11.6373555433484
chrII   5081895 5082599 -5.01917162511854
chrII   5082685 5083978 -25.8326171039818

The second interval now starts before the end of the first one 5081895 < 5081896. After correcting the file manually, the subsequent cases continue to occur, and they appear to be the same problem.

The following operation fixed the problem for me:

> start(FDR_LE_L1_05) = start(FDR_LE_L1_05) + 1

followed by the rest of the operations (export, bedSort, bedGraphToBigWig). There was no error after adjusting the start points.

Idk if the OP's ranges were overlapping in both coordinate spaces, or just contiguous like mine.

ADD REPLYlink written 4 months ago by David K0

Please note that the GRanges (and IRanges) convention is that ranges are 1-based, inclusive. rtracklayer just follows that convention.

ADD REPLYlink written 4 months ago by Michael Lawrence11k
Answer: rtracklayer export bedGraph cannot be loaded into UCSC
0
gravatar for Michael Lawrence
14 months ago by
United States
Michael Lawrence11k wrote:

bedGraph (at least as strictly defined by what UCSC supports) does not support overlapping intervals. The bug is that rtracklayer fails  silently to produce bedGraph and instead produces BED.

For your work, you will need to aggregate the data so that they can be displayed in UCSC correctly.

ADD COMMENTlink written 14 months ago by Michael Lawrence11k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 204 users visited in the last hour