addyS
Is there an R package to identify long stretches of gaps (Ns) in the human genome (hg19)?
The 'masked' versions of the reference genomes (e.g. http://bioconductor.org/packages/BSgenome.Hsapiens.UCSC.hg19.masked/) come with a mask that represents the gaps in the assembly.
You haven't said what you're trying to do, but you can get the locations of all the gaps on chromosome 1 by doing something like:
library(BSgenome.Hsapiens.UCSC.hg19.masked)
masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]
NormalIRanges object with 39 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1     10000     10000
   [2]    177418    227417     50000
   [3]    267720    317719     50000
   [4]    471369    521368     50000
   [5]   2634221   2684220     50000
   ...       ...       ...       ...
  [35] 206332222 206482221    150000
  [36] 223747847 223797846     50000
  [37] 235192212 235242211     50000
  [38] 248908211 249058210    150000
  [39] 249240622 249250621     10000
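If by "long" you mean gaps above some size, one option is to filter those ranges by width. The following is only a sketch: the 100000 bp cutoff and the variable names (min_width, gaps_chr1, long_gaps) are my own choices for illustration, not anything defined by the package, and it assumes BSgenome.Hsapiens.UCSC.hg19.masked is installed.

```r
library(BSgenome.Hsapiens.UCSC.hg19.masked)

# Assembly-gap mask for chr1, extracted exactly as in the call above.
agaps <- masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]

# Copy the coordinates into a plain IRanges object so it can be subset freely.
gaps_chr1 <- IRanges(start = start(agaps), end = end(agaps))

# Hypothetical cutoff for what counts as a "long" gap; adjust to taste.
min_width <- 100000L

# Keep only the gaps that are at least min_width bases wide.
long_gaps <- gaps_chr1[width(gaps_chr1) >= min_width]
long_gaps
```

The same steps should work for the other chromosomes by swapping chr1 for another sequence name.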