Question: Overlap the snps
gravatar for anastasiya-terskih
5.1 years ago by
Russian Federation
anastasiya-terskih10 wrote:

Hello! Could you help me please.

I use  predictCoding() to get information about nucleotide substitutions:

> nonsyn <- predictCoding(vcf_mod, txdb, Hsapiens)

Then I need to overlap the GRanges I've got from predictCoding() with the positition in
the SNPloc package:

> library(SNPlocs.Hsapiens.dbSNP.20100427)

Get the locations and alleles of all SNPs on chromosomes 1:

> ch1snps <- getSNPlocs("ch1")
> ch1gr <-getSNPlocs("ch1", as.GRanges=TRUE)

Overlap the snps with my ranges from predictCoding():

> mysnps <- GRanges("ch1", IRanges(865692:249150116, width=1))

When 865692 is the first start position for chr1 , 249150116 is the last start position for chr1.

> hits <- findOverlaps(mysnps, ch1gr)
> hits
Hits of length 1418032
queryLength: 242470673
subjectLength: 1422439

Pull out the snp that was hit:
> ch1gr[subjectHits(hits),]

GRanges with 1418032 ranges and 2 metadata columns

I don't understand which input is used for GRanges?I want to apply GRanges to my nonsyn.

my nonsyn table contains 20859 rows for all chromosomes.

i need to overlap only ranges for chr2 in nonsyn (2294 rows), but hits contain too many overlaps.

I know how overlap 1 range:

> mysnps <- GRanges("ch1", IRanges(865692, width=1))

but how overlape all ranges for chr1 from nonsyn?

Thank you!

iranges granges overlap snps • 1.0k views
ADD COMMENTlink modified 5.1 years ago by Michael Lawrence11k • written 5.1 years ago by anastasiya-terskih10
Answer: Overlap the snps
gravatar for Michael Lawrence
5.1 years ago by
United States
Michael Lawrence11k wrote:

If all you want to know is whether a coding variant is a SNP, then you might do this:

vr_mod <- as(vcf_mod, "VRanges")
vr_dbsnp <- readVcfAsVRanges("dbsnp.vcf", genome="hg19")
# may need to adapt your seqnameStyle here so that things match
# then
vr_mod$dbSNP <- vr_mod %in% vr_dbsnp
vr_coding <- predictCoding(vr_mod, txdb, Hsapiens)

Then you have your coding predictions and dbSNP concordance status all in one place. This does require downloading and reading the dbSNP VCF file. It would also be possible to convert the SNPlocs data to a VRanges, but it's a bit painful with the ambiguity codes. I think this part of Bioconductor needs some attention next release cycle.

ADD COMMENTlink written 5.1 years ago by Michael Lawrence11k
Answer: Overlap the snps
gravatar for Hervé Pagès
5.1 years ago by
Hervé Pagès ♦♦ 14k
United States
Hervé Pagès ♦♦ 14k wrote:

Hi Anastasiya,

Here is how you can proceed to map the genomic ranges returned by predictCoding() to the SNPs from the SNPlocs package.

nonsyn <- predictCoding(vcf_mod, txdb, Hsapiens)
ch1gr <- getSNPlocs("ch1", as.GRanges=TRUE)
seqlevelsStyle(ch1gr) <- "UCSC"
genome(ch1gr) <- "hg19"
hits <- findOverlaps(nonsyn, ch1gr)

Before we go further, we check that the mapping from nonsyn to ch1gr is not ambiguous, i.e. that the genomic ranges returned by predictCoding() are mapped to at most 1 SNP from the SNPlocs package:

stopifnot(sum(duplicated(queryHits(hits))) == 0)

Then we compute the mapping from nonsyn to ch1gr. This will be an integer vector parallel to nonsyn i.e. of the same length as nonsyn:

nonsyn_to_ch1gr <- as.integer(as(hits, "List"))

Note that if we didn't care about the risk of ambiguous mapping, we could have obtained nonsyn_to_ch1gr directly with findOverlaps(nonsyn, ch1gr, select="first")

Now we can use nonsyn_to_ch1gr to add RefSNP_id and alleles_as_ambig metadata column to nonsyn:

mcols(nonsyn)$RefSNP_id <- mcols(ch1gr)$RefSNP_id[nonsyn_to_ch1gr]
mcols(nonsyn)$alleles_as_ambig <- mcols(ch1gr)$alleles_as_ambig[nonsyn_to_ch1gr]

2 notes:

  1. Not all ranges in nonsyn will necessarily be mapped to a SNP from the SNPlocs package. In that case, the RefSNP_id and alleles_as_ambig metadata columns added to nonsyn will contain NAs.
  2. The alleles in the alleles_as_ambig metadata column are always reported with respect to the plus strand.

Hope this helps,




ADD COMMENTlink written 5.1 years ago by Hervé Pagès ♦♦ 14k

Thank you very much!It works)

ADD REPLYlink written 5.1 years ago by anastasiya-terskih10
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 412 users visited in the last hour