Question: Identifying long stretches of Ns in hg19
addyS wrote (15 days ago):

Is there an R package to identify long stretches of gaps (Ns) in the human genome (hg19)?

Mike Smith (EMBL Heidelberg / de.NBI) wrote (15 days ago):

The 'masked' versions of the reference genomes (e.g. http://bioconductor.org/packages/BSgenome.Hsapiens.UCSC.hg19.masked/) come with built-in masks, one of which (AGAPS) marks the gaps in the assembly.

You haven't said what you're trying to do, but you can get the locations of all the gaps on chromosome 1 by doing something like:

library(BSgenome.Hsapiens.UCSC.hg19.masked)
# AGAPS is the assembly-gap mask; extract it for chromosome 1
masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]
NormalIRanges object with 39 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1     10000     10000
   [2]    177418    227417     50000
   [3]    267720    317719     50000
   [4]    471369    521368     50000
   [5]   2634221   2684220     50000
   ...       ...       ...       ...
  [35] 206332222 206482221    150000
  [36] 223747847 223797846     50000
  [37] 235192212 235242211     50000
  [38] 248908211 249058210    150000
  [39] 249240622 249250621     10000
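
If you ever need the same kind of start/end/width table for a sequence that has no precomputed mask, the runs of Ns can also be found directly from the sequence. The following is only a minimal base R sketch, not part of the BSgenome API: the function name `find_n_runs` and the `min_width` argument are made up for illustration, and the input is a toy string rather than real hg19 data.

```r
# Hypothetical helper (not from any package): report runs of "N"
# (case-insensitive) that are at least `min_width` bases long.
# `seq` is a single character string holding the sequence.
find_n_runs <- function(seq, min_width = 1L) {
  # Split into single characters and flag the N positions
  bases <- strsplit(toupper(seq), "")[[1]]
  runs  <- rle(bases == "N")

  # Convert run lengths into 1-based, inclusive start/end coordinates
  ends   <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L

  # Keep only runs that are Ns and long enough
  keep <- runs$values & runs$lengths >= min_width
  data.frame(start = starts[keep],
             end   = ends[keep],
             width = runs$lengths[keep])
}

# Toy sequence for illustration only
toy <- "NNNNACGTNNACGTNNNNNN"
find_n_runs(toy, min_width = 4L)
# two rows: 1-4 (width 4) and 15-20 (width 6); the 2-base run is dropped
```

Splitting a sequence into single characters is memory-hungry, so this is only practical for short sequences. For whole hg19 chromosomes the precomputed AGAPS mask shown above is likely the better route.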