Currently I read BED files as a set of GRanges objects, but I want to process the BED files row by row in R for a meta-analysis. Given BED files that contain more than 20 thousand intervals with 2 metadata columns, the objective is to pull out a single genomic interval from the original BED file and compare it against the set of intervals from a second BED file, to detect overlapping regions using an interval tree algorithm. During the process, only one genomic interval should be handled at a time. How can I extract an interval from the original BED file, do some analysis on it, and then write it to a completely new BED file at the end? Can anyone help me or point me to how to implement this in R? Thanks to all bioinformatics fans!
This is what my sample BED file looks like:
      seqnames           ranges strand |        name     score
         <Rle>        <IRanges>  <Rle> | <character> <numeric>
  [1]     chr1 [ 32727,  32817]      * | MACS_peak_1      8.69
  [2]     chr1 [ 52489,  52552]      * | MACS_peak_2      4.26
  [3]     chr1 [ 65527,  65590]      * | MACS_peak_3      4.19
  [4]     chr1 [ 65773,  65904]      * | MACS_peak_4      2.02
  [5]     chr1 [ 66001,  66117]      * | MACS_peak_5      5.66
  [6]     chr1 [115700, 115769]      * | MACS_peak_6     10.30
(a) This is really a duplicate of your earlier question "How to search overlapped peak regions in parallel for multiple bed file in R?" and (b) also a duplicate of the discussion we had on StackOverflow. You could select a single range and
findOverlaps(query[1], subject)
but that's inefficient, and you'd do better to figure out what I tried to convey earlier. Maybe someone else can answer your question in a way that you'll understand, or understand the part of the question that I apparently do not?

Sorry for that. I will pay more attention to how I ask questions in the future. Thanks a lot.
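For readers wondering what the efficient alternative mentioned in the comment above might look like, here is a minimal sketch, assuming the GenomicRanges package is installed. The `query` coordinates are taken from the sample in the question; the `subject` intervals are made up purely for illustration. It passes the whole query set to `findOverlaps()` in one call instead of one range at a time.

```r
library(GenomicRanges)

# First three peaks from the sample shown in the question
query <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(32727, 52489, 65527),
                     end   = c(32817, 52552, 65590)),
  name     = c("MACS_peak_1", "MACS_peak_2", "MACS_peak_3"),
  score    = c(8.69, 4.26, 4.19)
)

# Hypothetical second set of intervals (NOT from the original post)
subject <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(32800, 65500, 90000),
                     end   = c(32900, 65600, 90100))
)

# One vectorized call handles every query interval at once
hits <- findOverlaps(query, subject)

# Indices of the overlapping pairs
q_idx <- queryHits(hits)    # which query rows overlap something
s_idx <- subjectHits(hits)  # which subject rows they overlap

# Query intervals with at least one overlap, metadata columns retained
overlapping <- query[unique(q_idx)]
```

Each entry of `hits` still refers to one specific query row, so per-interval analysis remains possible afterwards by grouping on `queryHits(hits)`.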
To extract each row (genomic interval) from a BED file one by one in R, you can loop over the rows of the file and extract the interval information for each row.
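The loop described above could be sketched roughly as follows, assuming the GenomicRanges package and that the BED file has already been loaded as a GRanges object. The toy `query` reuses coordinates from the question, while `subject` and the output file name are hypothetical. Note that, as pointed out in the comments, this is slower than a single vectorized `findOverlaps()` call on the full object.

```r
library(GenomicRanges)

# Toy stand-in for the first BED file (coordinates from the question)
query <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(32727, 52489, 65527),
                     end   = c(32817, 52552, 65590)),
  name     = c("MACS_peak_1", "MACS_peak_2", "MACS_peak_3"),
  score    = c(8.69, 4.26, 4.19)
)

# Hypothetical stand-in for the second BED file
subject <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(32800, 65500, 90000),
                     end   = c(32900, 65600, 90100))
)

# Record, for each query row, whether it overlaps anything in `subject`
keep <- logical(length(query))

for (i in seq_along(query)) {
  interval <- query[i]                       # length-1 GRanges, metadata kept
  hits     <- findOverlaps(interval, subject)
  # ... any per-interval analysis would go here ...
  keep[i]  <- length(hits) > 0
}

# Collect the retained intervals into a new GRanges object
result <- query[keep]

# Writing to a new BED file could be done with rtracklayer, for example:
# rtracklayer::export(result, "overlapping_peaks.bed")  # hypothetical path
```

Subsetting with `query[i]` keeps the `name` and `score` columns attached to the interval, so they carry through to `result` and to any exported file.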