addyS
@addys-11305
Is there an R package to identify long stretches of gaps (Ns) in the human genome (hg19)?
The 'masked' versions of the reference genomes (e.g. http://bioconductor.org/packages/BSgenome.Hsapiens.UCSC.hg19.masked/) come with a mask that represents the gaps in the assembly.
You haven't said what you're trying to do, but you can get the locations of all the gaps on chromosome 1 by doing something like:
library(BSgenome.Hsapiens.UCSC.hg19.masked)
masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]
NormalIRanges object with 39 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1     10000     10000
   [2]    177418    227417     50000
   [3]    267720    317719     50000
   [4]    471369    521368     50000
   [5]   2634221   2684220     50000
   ...       ...       ...       ...
  [35] 206332222 206482221    150000
  [36] 223747847 223797846     50000
  [37] 235192212 235242211     50000
  [38] 248908211 249058210    150000
  [39] 249240622 249250621     10000
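If the goal is specifically the long gaps, across more than one chromosome, the same mask can be collected per chromosome and filtered by width. The following is only a sketch under some assumptions that are not in the original answer: the 100 kb cutoff (`min_width`), the restriction to chr1-22, X and Y, and the use of GenomicRanges to hold the result are all illustrative choices.

```r
library(BSgenome.Hsapiens.UCSC.hg19.masked)
library(GenomicRanges)

genome <- BSgenome.Hsapiens.UCSC.hg19.masked

# Hypothetical threshold for what counts as a "long" gap; adjust as needed.
min_width <- 100000L

# Assumption: only the primary chromosomes are of interest.
chroms <- paste0("chr", c(1:22, "X", "Y"))

# Pull the assembly-gap mask for each chromosome and tag it with its name.
gap_list <- lapply(chroms, function(chr) {
  gaps <- as(masks(genome[[chr]])[["AGAPS"]], "IRanges")
  GRanges(seqnames = chr, ranges = gaps)
})

# Combine into one GRanges object and keep only the long gaps.
all_gaps <- do.call(c, gap_list)
long_gaps <- all_gaps[width(all_gaps) >= min_width]
long_gaps
```

For chr1 this should retain at least the two 150000 bp gaps visible in the output above; a lower `min_width` would also keep the 50000 bp ones.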
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.