Hello, sorry for the probably silly question, but I've tried for days now. I'm a beginner. I have a list of genomic coordinates in GenomicRanges format, e.g.
> coords_GR
GRanges object with 3320 ranges and 0 metadata columns:
         seqnames            ranges strand
            <Rle>         <IRanges>  <Rle>
     [1]        1           1950293      *
     [2]        1 18336405-20142656      *
     [3]        1          29684252      *
     [4]        1 71218722-73861224      *
     [5]        1 80888506-83526065      *
     ...      ...               ...    ...
  [3316]       20    716956-2488855      *
  [3317]       21 41901419-44757190      *
  [3318]       22 29255810-31043932      *
  [3319]       22 35134992-38911889      *
  [3320]        X         154717327      *
-------
seqinfo: 24 sequences from an unspecified genome; no seqlengths
I would like to annotate all SNVs that fall within these ranges, obtaining information such as rs ID, MAF and, for exonic variants, whether they are synonymous or non-synonymous. It would also be good to annotate non-SNP variation such as indels, CNVs, deletions, etc. for potential protein-coding or clinical impact. I found a number of tools that can do this for genotyping data in VCF format (such as Ensembl VEP), but no tools that can do it based on genomic coordinates alone.
Many thanks for your help
Emanuele
Dear Hervé, this is amazing, thank you so much. Bioconductor is great but so difficult to start with! At least for me. When you say "Then use your preferred tool for retrieving details about the rs ids.", which tools can you recommend for annotating protein-coding effects such as deletions, frameshifts, truncations, etc.?
Many thanks
Emanuele
You could try Annovar, the UCSC Genome Browser, or any of the other choices listed by dbNSFP.