Convert an R data.frame to a bed file in R
L • 0
@dd482574
Germany

Dear all,

I have a data.frame that looks like this.

bed <- data.frame(chrom=c(rep("Chr1",5)),
                        chromStart=c(18915152,24199229,73730,81430,89350),
                        chromEnd=c(18915034,24199347,74684,81550,89768), 
                         strand=c("-","+","+","+","+"))

write.table(bed, "test_xRNA.bed",row.names = F,col.names = F, sep="\t", quote=FALSE)

Created on 2022-07-29 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)

I want to convert it into a BED file. I tried the write.table function, but it fails: I get this error message when I run intersect on the resulting file:

Error: unable to open file or unable to determine types for file test_xRNA.bed

- Please ensure that your file is TAB delimited (e.g., cat -t FILE).
- Also ensure that your file has integer chromosome coordinates in the 
  expected columns (e.g., cols 2 and 3 for BED).

Any ideas of how I can properly convert a data.frame to a .bed file in R?

I have heard about the rtracklayer package. Does anyone have experience with it?

rtracklayer R bed • 194 views
@james-w-macdonald-5106
United States
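> library(GenomicRanges)  # provides GRanges(); also attaches IRanges
> library(rtracklayer)    # provides export() and import()
> bed <- data.frame(chrom=c(rep("Chr1",5)),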
                        chromStart=c(18915152,24199229,73730,81430,89350),
                        chromEnd=c(18915034,24199347,74684,81550,89768), 
                         strand=c("-","+","+","+","+"))

## fix mis-ordered rows: the minus-strand row has chromStart > chromEnd,
## but a range needs start <= end
> for(i in seq_len(nrow(bed))) if(bed[i,"strand"] == "-") bed[i,2:3] <- bed[i,3:2]
> bed
  chrom chromStart chromEnd strand
1  Chr1   18915034 18915152      -
2  Chr1   24199229 24199347      +
3  Chr1      73730    74684      +
4  Chr1      81430    81550      +
5  Chr1      89350    89768      +
> bedgr <- GRanges(bed[,1], IRanges(bed[,2], bed[,3]), bed$strand)
> bedgr
GRanges object with 5 ranges and 0 metadata columns:
      seqnames            ranges strand
         <Rle>         <IRanges>  <Rle>
  [1]     Chr1 18915034-18915152      -
  [2]     Chr1 24199229-24199347      +
  [3]     Chr1       73730-74684      +
  [4]     Chr1       81430-81550      +
  [5]     Chr1       89350-89768      +
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
> export(bedgr, "tmp.bed", "bed")
> import("tmp.bed")
GRanges object with 5 ranges and 2 metadata columns:
      seqnames            ranges strand |        name     score
         <Rle>         <IRanges>  <Rle> | <character> <numeric>
  [1]     Chr1 18915034-18915152      - |        <NA>         0
  [2]     Chr1 24199229-24199347      + |        <NA>         0
  [3]     Chr1       73730-74684      + |        <NA>         0
  [4]     Chr1       81430-81550      + |        <NA>         0
  [5]     Chr1       89350-89768      + |        <NA>         0
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths
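
The loop above swaps start and end only for minus-strand rows, which fits this data. As a hedged alternative, here is a minimal base-R sketch that reorders any row where chromStart exceeds chromEnd, whatever the strand. The names `reversed`, `lo` and `hi` are only illustrative, and the sketch assumes that a reversed pair is a simple start/end mix-up rather than a sign of some other problem in the data.

```r
# Same example data as in the question
bed <- data.frame(chrom = rep("Chr1", 5),
                  chromStart = c(18915152, 24199229, 73730, 81430, 89350),
                  chromEnd   = c(18915034, 24199347, 74684, 81550, 89768),
                  strand = c("-", "+", "+", "+", "+"))

# Flag rows whose start lies after their end, regardless of strand
reversed <- bed$chromStart > bed$chromEnd

# pmin()/pmax() work element-wise; compute both before overwriting the columns
lo <- pmin(bed$chromStart, bed$chromEnd)
hi <- pmax(bed$chromStart, bed$chromEnd)
bed$chromStart <- lo
bed$chromEnd   <- hi
```

Inspecting `which(reversed)` before overwriting shows which rows were affected, which may be worth checking against the source of the coordinates.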

But normally a BED file has a meaningful score, and yours contains just the positions.
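
If you would rather stay with write.table, the following is a rough sketch rather than a definitive recipe. In the BED format, strand is the sixth column, after name and score, so the sketch fills those two with placeholders ("." and 0). It also formats the coordinates without scientific notation, since numeric values such as 100000 can otherwise be written as 1e+05. The data, the `bed6` name and the temporary output file are made up for illustration, and the sketch assumes the coordinates already follow BED's 0-based start convention, so no offset is applied.

```r
# Illustrative data with start <= end already ensured
bed <- data.frame(chrom = rep("Chr1", 2),
                  chromStart = c(18915034, 100000),
                  chromEnd   = c(18915152, 200000),
                  strand = c("-", "+"))

# Build a six-column (BED6) layout: chrom, start, end, name, score, strand
bed6 <- data.frame(
  chrom      = bed$chrom,
  # format() with scientific = FALSE guarantees plain digits such as "100000"
  chromStart = format(as.integer(bed$chromStart), scientific = FALSE, trim = TRUE),
  chromEnd   = format(as.integer(bed$chromEnd), scientific = FALSE, trim = TRUE),
  name       = ".",   # placeholder name
  score      = 0L,    # placeholder score
  strand     = bed$strand
)

# Write tab-delimited, without header, row names or quotes
out_file <- tempfile(fileext = ".bed")
write.table(bed6, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

lines <- readLines(out_file)
```

Note that export() from rtracklayer treats GRanges coordinates as 1-based and shifts the start when writing BED, so the two routes only agree if you account for that convention yourself.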


Thank you James, that looks beautiful. Thanks a lot
