Question: Identifying long stretches of Ns in hg19
addyS wrote (13 months ago):

Is there an R package to identify long stretches of gaps (Ns) in the human genome (hg19)?

Mike Smith (EMBL Heidelberg / de.NBI) wrote (13 months ago):

The 'masked' versions of the reference genomes (e.g. http://bioconductor.org/packages/BSgenome.Hsapiens.UCSC.hg19.masked/) come with a mask that represents the gaps in the assembly.

You haven't said what you're trying to do, but you can get the locations of all the gaps on chromosome 1 by doing something like:

library(BSgenome.Hsapiens.UCSC.hg19.masked)
masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]
NormalIRanges object with 39 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1     10000     10000
   [2]    177418    227417     50000
   [3]    267720    317719     50000
   [4]    471369    521368     50000
   [5]   2634221   2684220     50000
   ...       ...       ...       ...
  [35] 206332222 206482221    150000
  [36] 223747847 223797846     50000
  [37] 235192212 235242211     50000
  [38] 248908211 249058210    150000
  [39] 249240622 249250621     10000
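
Since the question asks specifically about long stretches, one possible follow-up is to filter those ranges by width. This is only a minimal sketch: the 100 kb cutoff and the names `gaps`, `min_width` and `long_gaps` are assumptions chosen for illustration, not anything defined by the package.

```r
library(BSgenome.Hsapiens.UCSC.hg19.masked)

# Assembly gaps on chr1, retrieved with the same call as above
gaps <- masks(BSgenome.Hsapiens.UCSC.hg19.masked$chr1)[["AGAPS"]]

# Hypothetical cutoff for what counts as "long"; adjust to your needs
min_width <- 100000L

# Keep only the gaps that are at least min_width bases wide
long_gaps <- gaps[width(gaps) >= min_width]
long_gaps
```

The same filter should work for any other chromosome by swapping `chr1` for the chromosome of interest.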