Hello,
This is a follow-up question to my previous post on rsID-to-HGNC mapping.
I am having trouble retrieving positional information for a set of rsids, some of which have no match in the curated dbSNP data accessed via the snpid2grange() and rsidsToGRanges() functions.
I have done test runs on smaller subsets of rsids; some return results while others produce an error, e.g.:
#test 1:
> library(stringi)  # provides stri_sub()
> library(SNPlocs.Hsapiens.dbSNP.20120608)
> rsIDsSUBSET <- rsidList[1:100, 1]
> rsids <- stri_sub(rsIDsSUBSET, 3)  # drop the leading "rs" prefix
> gr <- snpid2grange(SNPlocs.Hsapiens.dbSNP.20120608, rsids)
# Output: Error in .snpid2rowidx(x, snpid) :
  SNP id(s) not found: 35418599, 35367238, 4438386, 28833602, 9730959, 28451502, 553358
#test 2:
> rsIDsSUBSET <- rsidList[10000:10200, 1]
> rsids <- stri_sub(rsIDsSUBSET, 3)  # drop the leading "rs" prefix
> gr <- snpid2grange(SNPlocs.Hsapiens.dbSNP.20120608, rsids)
# Output: GRanges with 201 ranges and 2 metadata columns
It is difficult to tell whether the error is raised when ANY rsid is missing from dbSNP or only when the number missing exceeds some threshold. Either way, since I need to retrieve data for a list of 1.9M rsids, finding and removing the unmatched ones by hand is not feasible.
Is there any way to get either of these functions to simply omit the rsids for which an entry is missing in dbSNP?
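To illustrate the behaviour I am after, here is a minimal base-R sketch of one possible workaround: catch the error, read the missing IDs out of the message, and retry without them. Note that mock_lookup() and known_ids are hypothetical stand-ins, not the real snpid2grange() or dbSNP data, and the sketch assumes the error message lists every missing ID in full, which may not hold for the real function.

```r
# Hypothetical reference set standing in for the IDs present in dbSNP.
known_ids <- c("111", "222", "333")

# Hypothetical stand-in for snpid2grange(): fails if any ID is unknown,
# using an error message shaped like the one shown in test 1.
mock_lookup <- function(ids) {
  missing <- ids[!ids %in% known_ids]
  if (length(missing) > 0) {
    stop("SNP id(s) not found: ", paste(missing, collapse = ", "))
  }
  data.frame(id = ids, stringsAsFactors = FALSE)
}

# Retry the lookup, dropping whatever IDs the error message reports.
lookup_dropping_missing <- function(ids, lookup = mock_lookup) {
  repeat {
    if (length(ids) == 0) return(NULL)  # nothing left to look up
    result <- tryCatch(lookup(ids), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    msg <- conditionMessage(result)
    # Re-raise anything that is not the "not found" error.
    if (!grepl("not found:", msg, fixed = TRUE)) stop(result)
    missing <- trimws(strsplit(sub(".*not found:\\s*", "", msg), ",")[[1]])
    remaining <- ids[!ids %in% missing]
    # Guard against looping forever if the message could not be parsed.
    if (length(remaining) == length(ids)) stop(result)
    ids <- remaining
  }
}

res <- lookup_dropping_missing(c("111", "999", "333", "888"))
print(res$id)
```

With 1.9M rsids this would presumably have to be run in chunks, and it still depends on parsing an error message, so a built-in option to skip unmatched rsids would be much preferable.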
Thank you,
Kathleen