I'm trying to get gene names for the chromosomal location of SNPs (i.e. not gene expression associated with the SNP, etc.; literally just which gene the SNP falls in) using dbSNP IDs (the ones that start with "rs"). When I supply a long list of attributes to return, I get no results (for this arbitrary SNP), but when I ask only for the ENSG gene ID, I get a result. I've looked through the documentation for a flag to change, but have been unable to find one. Thanks!
mart.snps <- useMart('ENSEMBL_MART_SNP', 'hsapiens_snp')
snp = "rs368991480"
getBM(attributes = c("refsnp_id", "ensembl_gene_stable_id", "associated_gene",
                     "chr_name", "chrom_start", "chrom_end", "chrom_strand"),
      filters = "snp_filter",
      values = snp,
      mart = mart.snps)
# returns 0 rows
getBM(attributes = c("refsnp_id", "ensembl_gene_stable_id"),
      filters = "snp_filter",
      values = snp,
      mart = mart.snps)
# returns 1 row
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X Yosemite 10.10.5
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.30.0
loaded via a namespace (and not attached):
[1] IRanges_2.8.1 parallel_3.3.2 DBI_0.5-1 tools_3.3.2 RCurl_1.95-4.8 memoise_1.0.0 Rcpp_0.12.8
[8] Biobase_2.34.0 AnnotationDbi_1.36.0 RSQLite_1.1-1 S4Vectors_0.12.1 BiocGenerics_0.20.0 digest_0.6.11 stats4_3.3.2
[15] bitops_1.0-6 XML_3.98-1.5
Yes, it smelled like a join issue :) When I looked earlier, I checked to make sure that all the attributes were on the same "page" of attributes (as per attributePages()) to debug the join issue, saw that they were, and moved along. Now I see that there are multiple attributes named refsnp_id, chr_name, etc., and so maybe a join is [effectively] being forced where it's not needed?
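For anyone who wants to check the duplicate-name situation themselves, here is a rough sketch of how one might see which pages each requested attribute appears on. It runs on a toy data frame with the same `name` and `page` columns that listAttributes() returns; the page values below are made up for illustration and are not the real Ensembl layout, and `pages_for` is a hypothetical helper, not a biomaRt function.

```r
# Toy stand-in for listAttributes(mart.snps). The page assignments here are
# hypothetical and only illustrate the "same name on several pages" case.
attrs <- data.frame(
  name = c("refsnp_id", "chr_name", "ensembl_gene_stable_id",
           "refsnp_id", "chr_name", "associated_gene"),
  page = c("snp", "snp", "snp",
           "other_page", "other_page", "other_page"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each requested attribute, list the pages it is on.
pages_for <- function(attrs, wanted) {
  lapply(setNames(wanted, wanted),
         function(a) unique(attrs$page[attrs$name == a]))
}

wanted <- c("refsnp_id", "ensembl_gene_stable_id", "associated_gene")
pages <- pages_for(attrs, wanted)

# Attributes that appear on more than one page are the ambiguous ones.
ambiguous <- names(pages)[lengths(pages) > 1]
print(pages)
print(ambiguous)
```

With a live connection, replacing the toy data frame with `listAttributes(mart.snps)` should give the real picture, assuming the installed biomaRt version includes the `page` column in that output.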
The block starting at line 448 of https://github.com/Bioconductor-mirror/biomaRt/blob/master/R/biomaRt.R should either prevent or object to this, but it appears to be wrapped in an "if(false)".
I guess, to summarize - bug report: listAttributes() appears to tell me I can do something I can't do, which silently fails.
Good tip about trying to perform the query on the Biomart site, thanks. I don't get the hourglassing, I just get the empty query, which is the same result in the long run, I suppose. Well, you're right that it's a phenotype associated gene anyway, so I guess it's moot. Thanks!
Thank you both for looking into this. James' suggestion of trying the query on the Ensembl web interface is exactly how I debug a lot of the issues that get reported. At the moment I find it much easier than trying to process the interim pages returned by the biomaRt code. I'll take a look at the code block that can never be run, try to understand why it was blanked out and hopefully write something to catch this problem in the future.
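As a very rough idea of what such a check might look like (this is only a sketch, not the disabled block in biomaRt.R and not an existing biomaRt function), one could refuse to run a query unless a single attribute page contains every requested attribute. It assumes the empty result really does come from mixing attributes across pages. `check_single_page` and the toy page assignments are hypothetical.

```r
# Toy stand-in for listAttributes(mart); page values are made up.
attrs <- data.frame(
  name = c("refsnp_id", "chr_name", "ensembl_gene_stable_id",
           "refsnp_id", "chr_name", "associated_gene"),
  page = c("snp", "snp", "snp",
           "other_page", "other_page", "other_page"),
  stringsAsFactors = FALSE
)

# Hypothetical guard: error out unless at least one page holds all of
# the requested attributes; otherwise return the name(s) of such pages.
check_single_page <- function(attrs, wanted) {
  unknown <- setdiff(wanted, attrs$name)
  if (length(unknown) > 0) {
    stop("Unknown attribute(s): ", paste(unknown, collapse = ", "))
  }
  by_page <- split(attrs$name, attrs$page)
  ok <- vapply(by_page, function(p) all(wanted %in% p), logical(1))
  if (!any(ok)) {
    stop("No single attribute page contains all of: ",
         paste(wanted, collapse = ", "))
  }
  names(by_page)[ok]
}

# Both attributes sit together on at least one page, so this passes.
check_single_page(attrs, c("refsnp_id", "chr_name"))

# These two never share a page in the toy data, so this raises an error
# instead of silently returning nothing.
res <- tryCatch(
  check_single_page(attrs, c("ensembl_gene_stable_id", "associated_gene")),
  error = function(e) conditionMessage(e)
)
print(res)
```

The point of failing loudly is that the user sees which combination of attributes is the problem, rather than getting a 0-row data frame back.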