How often are SNPlocs.Hsapiens.dbSNP-* packages released, and why that time scale?
Ramiro Magno ▴ 100
@ramiro-magno-12376
Last seen 5.4 years ago
CBMR, Faro, Portugal

I am interested in accessing SNP annotations from dbSNP using a package like SNPlocs.Hsapiens.dbSNP-*. It seems, however, that these packages are not updated for every dbSNP build.

  1. Why are these packages not built more often?
  2. If I wanted to build a SNPlocs.Hsapiens.dbSNP-* package for the most recent build of the NCBI SNP database, what would I need to do? Would it suffice to run the tools described in the package's SNPlocs.Hsapiens.dbSNP***.GRCh38/inst/tools/README.TXT?
  3. If I do build a new SNPlocs.Hsapiens.dbSNP-* package, will it still work nicely with BSgenome? Namely, could I still inject SNPs and have them land at the correct locations?

Thank you very much in advance.

SNPlocs SNP dbsnp • 1.7k views
@herve-pages-1542
Last seen 13 hours ago
Seattle, WA, United States

Hi Ramiro,

FWIW I recently made a SNPlocs package for dbSNP Build 149 (the latest dbSNP build) but it's only available in BioC devel (i.e. BioC 3.5, requires R 3.4):

https://bioconductor.org/packages/SNPlocs.Hsapiens.dbSNP149.GRCh38

I actually highly recommend that you upgrade your installation to use BioC devel if you're planning to do any serious work with the SNPlocs package because these packages have been refactored to allow much faster data access.

There is currently no established schedule for updating these packages. They are big, making them is time-consuming, and they are not heavily used, so it's hard to keep up with every new dbSNP build, and it would also be hard to justify spending too many resources on doing that (we have to choose our priorities). However, if the community starts to express more interest in these packages, I will update them more often, e.g. make a new one for every other dbSNP build.

You could follow the instructions in SNPlocs.Hsapiens.dbSNP***.GRCh38/inst/tools/README.TXT to make your own SNPlocs package. Some users have done it before, e.g. https://bioconductor.org/packages/SNPlocs.Hsapiens.dbSNP142.GRCh37. But as I said, this procedure can be tedious and time-consuming. If you decide to do so, I would highly recommend following the procedure described in the README.TXT file of the SNPlocs.Hsapiens.dbSNP149.GRCh38 package: it describes the most up-to-date version of the procedure, which changed significantly in BioC 3.5. In particular it's faster now: it takes about 3h instead of 14h. And yes, the resulting package should work for injection in a BSgenome object, and the SNPs should land at the correct positions.

H.


Thank you for your quick reply.

May I ask another thing? For each dbSNP build, NCBI provides a file, RsMergeArch.bcp.gz, that contains a translation table for SNPs that have been merged into new rsIDs. Is this kind of merge resolution taken into account in SNPlocs.Hsapiens.dbSNP149.GRCh38, or do I need to translate my old SNP IDs before passing them to SNPlocs' functions?


Hi,

I didn't know about RsMergeArch.bcp.gz. Sounds like a valuable resource. Right now this information is not included in the SNPlocs packages and the snpsById() extractor doesn't perform any translation of the supplied ids.
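Since the translation has to happen before the lookup, here is a minimal base-R sketch of what that step could look like. It is an illustration only, not something the SNPlocs packages provide: `translate_rsids()` is a hypothetical helper, the merge table is toy data, and the column names `rsHigh` (retired ID) and `rsCurrent` (current ID) are assumptions about the RsMergeArch layout that should be checked against NCBI's schema description.

```r
# Minimal sketch: map retired rsIDs to their current rsIDs before a lookup.
# ASSUMPTION: the merge table offers one column with the retired ID and one
# with the current ID, both stored without the "rs" prefix. The names used
# here (rsHigh, rsCurrent) are assumed, not confirmed in this thread.

# Toy merge table standing in for the real RsMergeArch.bcp.gz content.
# The real file could presumably be loaded with something like
#   read.delim(gzfile("RsMergeArch.bcp.gz"), header = FALSE)
# after which the relevant columns would have to be identified and named.
merge_tab <- data.frame(
  rsHigh    = c("1001", "1002", "1003"),
  rsCurrent = c("55",   "55",   "77"),
  stringsAsFactors = FALSE
)

# translate_rsids() is a hypothetical helper, not part of any SNPlocs package.
translate_rsids <- function(ids, merge_tab) {
  bare <- sub("^rs", "", ids)             # strip the "rs" prefix if present
  hit  <- match(bare, merge_tab$rsHigh)   # NA for IDs that were never merged
  merged <- !is.na(hit)
  bare[merged] <- merge_tab$rsCurrent[hit[merged]]
  paste0("rs", bare)                      # unmerged IDs pass through unchanged
}

old_ids <- c("rs1001", "rs42", "rs1003")
new_ids <- translate_rsids(old_ids, merge_tab)
print(new_ids)
```

The translated vector (`new_ids` here) is what would then be supplied as the ids argument of snpsById().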

H.


Hi Hervé,

Thanks anyways. I am going to install R devel and your recent SNPlocs package.

RM
