Hi,
I want to convert:
myFile.txt
======
MotifName strand targetName targetStart targetEnd
GATA1 * chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1 48847536,127527717,102379250,140094948,113485701,28285784,119187237,178110784,105353030,12065484 48848140,127528321,102379672,140095753,113486305,28286388,119187841,178111113,105353634,12065981
to a GRangesList in R:
library(GenomicRanges)
dat <- read.table("myFile.txt", header=TRUE, stringsAsFactors=FALSE)
grl2 <- with(dat,
    makeGRangesListFromFeatureFragments(seqnames=Rle(factor(targetName)),
                                        fragmentStarts=targetStart,
                                        fragmentEnds=targetEnd,
                                        strand=strand,
                                        sep=","
    ))
names(grl2) <- dat$MotifName
But it does not work. I get:
> grl2
GRangesList object of length 1:
$GATA1
GRanges object with 10 ranges and 0 metadata columns:
seqnames
<Rle>
[1] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[2] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[3] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[4] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[5] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[6] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[7] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[8] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[9] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
[10] chr15,chr7,chrX,chr9,chr2,chr18,chr3,chr5,chr12,chr1
ranges strand
<IRanges> <Rle>
[1] [ 48847536, 48848140] *
[2] [127527717, 127528321] *
[3] [102379250, 102379672] *
[4] [140094948, 140095753] *
[5] [113485701, 113486305] *
[6] [ 28285784, 28286388] *
[7] [119187237, 119187841] *
[8] [178110784, 178111113] *
[9] [105353030, 105353634] *
[10] [ 12065484, 12065981] *
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
If I do not collapse the chromosomes in the original file:
myFile2.txt:
MotifName strand targetName targetStart targetEnd
GATA1 * chr4 159756456,6910762 159757078,6911248
GATA1 * chr1 87230642 87231449
GATA1 * chr12 34175051 34175655
GATA1 * chr8 100135438 100135841
GATA1 * chr12 31478801 31479405
GATA1 * chr20 45989193 45989872
GATA1 * chr13 45289893,37573340 45290497,37573944
GATA1 * chr9 127150856 127151460
> grl2
GRangesList object of length 8:
$GATA1
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr4 [159756456, 159757078] *
[2] chr4 [ 6910762, 6911248] *
$GATA1
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
[1] chr1 [87230642, 87231449] *
$GATA1
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
[1] chr12 [34175051, 34175655] *
...
<5 more elements>
-------
seqinfo: 7 sequences from an unspecified genome; no seqlengths
I get multiple GATA1 elements, which I do not want. I would like a single $GATA1 element that contains all the chromosomes, starts and ends.
Thanks for helping me solve this problem.
But it would make sense for makeGRangesListFromFeatureFragments() to support collapsed seqnames.
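Until that is supported, one possible workaround is to split the comma-separated fields yourself, build a single GRanges, and then split it by motif name. This is only a minimal sketch: motifsToGRangesList() is a hypothetical helper name, not a GenomicRanges function, and the inline tables are shortened copies of rows from myFile.txt and myFile2.txt. It assumes each row has either one chromosome per range or a single chromosome shared by all ranges in that row.

```r
library(GenomicRanges)

# Hypothetical helper (not part of GenomicRanges):
# returns a GRangesList with one element per distinct MotifName.
motifsToGRangesList <- function(dat) {
  # Split every comma-separated field into one entry per range
  chr    <- strsplit(as.character(dat$targetName),  ",", fixed = TRUE)
  starts <- strsplit(as.character(dat$targetStart), ",", fixed = TRUE)
  ends   <- strsplit(as.character(dat$targetEnd),   ",", fixed = TRUE)

  n <- lengths(starts)  # number of ranges in each row
  stopifnot(all(lengths(ends) == n),
            all(lengths(chr) == n | lengths(chr) == 1L))

  # A single chromosome is repeated for every range in its row
  # (myFile2.txt layout); a collapsed list is kept as is (myFile.txt layout)
  chr <- Map(rep_len, chr, n)

  gr <- GRanges(seqnames = unlist(chr, use.names = FALSE),
                ranges   = IRanges(start = as.integer(unlist(starts)),
                                   end   = as.integer(unlist(ends))),
                strand   = rep(dat$strand, n))

  # Rows sharing a MotifName end up in the same list element
  split(gr, rep(dat$MotifName, n))
}

# Shortened copy of the collapsed row from myFile.txt (first three ranges only)
collapsed <- read.table(header = TRUE, stringsAsFactors = FALSE,
  text = "MotifName strand targetName targetStart targetEnd
GATA1 * chr15,chr7,chrX 48847536,127527717,102379250 48848140,127528321,102379672")

# First two rows of myFile2.txt (one chromosome per row)
perRow <- read.table(header = TRUE, stringsAsFactors = FALSE,
  text = "MotifName strand targetName targetStart targetEnd
GATA1 * chr4 159756456,6910762 159757078,6911248
GATA1 * chr1 87230642 87231449")

grlCollapsed <- motifsToGRangesList(collapsed)
grlPerRow    <- motifsToGRangesList(perRow)
```

With these inputs, both calls should return a GRangesList of length 1 whose single GATA1 element holds three ranges, each carrying its own seqname. For the real files, replace the inline tables with read.table("myFile.txt", header=TRUE, stringsAsFactors=FALSE).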