Identifying long stretches of Ns in hg19
addyS • 0
@addys-11305

Is there an R package to identify long stretches of gaps (Ns) in the human genome (hg19)?

bsgenome.hsapiens.ucsc.hg19 hg19 • 1.1k views
Mike Smith ★ 6.6k
@mike-smith
EMBL Heidelberg

The 'masked' versions of the reference genomes (e.g. http://bioconductor.org/packages/BSgenome.Hsapiens.UCSC.hg19.masked/) come with a mask that represents the gaps in the assembly.

You haven't said what you're trying to do, but you can get the locations of all the gaps on chromosome 1 by doing something like:

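# Load the masked hg19 genome (sequences plus their built-in masks)
library(BSgenome.Hsapiens.UCSC.hg19.masked)
# "AGAPS" is the assembly-gaps mask; this returns its ranges on chr1
masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]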
NormalIRanges object with 39 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1     10000     10000
   [2]    177418    227417     50000
   [3]    267720    317719     50000
   [4]    471369    521368     50000
   [5]   2634221   2684220     50000
   ...       ...       ...       ...
  [35] 206332222 206482221    150000
  [36] 223747847 223797846     50000
  [37] 235192212 235242211     50000
  [38] 248908211 249058210    150000
  [39] 249240622 249250621     10000
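
If you need the gaps genome-wide rather than for a single chromosome, the same call can be repeated per chromosome and the results combined. The following is only a rough sketch, assuming the GenomicRanges package is installed; the variable names (chroms, gap_list, all_gaps, long_gaps) and the 50,000 bp cutoff for a "long" gap are arbitrary choices for illustration, not anything defined by the package.

```r
library(BSgenome.Hsapiens.UCSC.hg19.masked)
library(GenomicRanges)

genome <- BSgenome.Hsapiens.UCSC.hg19.masked

# Assumption: only the canonical chromosomes are of interest
chroms <- paste0("chr", c(1:22, "X", "Y"))

# Pull the assembly-gap mask for each chromosome and wrap it in a GRanges
gap_list <- lapply(chroms, function(chr) {
  gaps_ir <- masks(genome[[chr]])[["AGAPS"]]
  GRanges(seqnames = rep(chr, length(gaps_ir)),
          ranges = IRanges(start = start(gaps_ir), end = end(gaps_ir)))
})

# Combine the per-chromosome results into a single GRanges object
all_gaps <- do.call(c, gap_list)

# Keep only gaps at or above the (arbitrary) length cutoff
min_width <- 50000
long_gaps <- all_gaps[width(all_gaps) >= min_width]
long_gaps
```

Note that this reports the annotated assembly gaps; the masked package stores other ambiguous stretches within contigs in a separate mask ("AMB"), which you may want to inspect the same way depending on what you are trying to do.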