I'm trying to match a list of mouse UniProt IDs with Ensembl protein, transcript, and gene IDs.
So I downloaded the following with biomaRt:
library(biomaRt)
mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
mart.df <- getBM(attributes = c("uniprot_swissprot", "uniprot_sptrembl", "ensembl_gene_id", "external_gene_name", "description"), mart = mart)
However, many of the UniProt mouse IDs in my list have no match in the mart.df I downloaded.
For example, Q922S4 is in my data and has a UniProt page (http://www.uniprot.org/uniprot/Q922S4), but it matches neither mart.df$uniprot_swissprot nor mart.df$uniprot_sptrembl. However, if I follow the cross-ref link to UCSC (http://genome.ucsc.edu/cgi-bin/hgGene?hgg_gene=uc009ioq.3&org=mouse) and from there follow the cross-ref link to Ensembl (http://uswest.ensembl.org/Mus_musculus/Gene/Summary?g=ENSMUSG00000030653;r=7:101410886-101512819;t=ENSMUST00000084894), the latter page links to a different UniProt accession, F7D3W5 (http://www.uniprot.org/uniprot/F7D3W5). Unlike Q922S4, F7D3W5 is an unreviewed entry.
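To make the mismatch concrete, here is a minimal sketch of the kind of check involved. It does not query Ensembl: the one-row data frame is only a toy stand-in for mart.df built from the example above (the empty uniprot_swissprot value is an assumption), and my_ids is a hypothetical placeholder for my real list of accessions.

```r
# Toy stand-in for mart.df: a single row based on the example above.
# The empty uniprot_swissprot value is an assumption, not a real query result.
mart.df <- data.frame(
  uniprot_swissprot = "",
  uniprot_sptrembl  = "F7D3W5",
  ensembl_gene_id   = "ENSMUSG00000030653",
  stringsAsFactors  = FALSE
)

# Hypothetical placeholder for my list of UniProt accessions.
my_ids <- c("Q922S4", "F7D3W5")

# An ID counts as matched if it appears in either UniProt column.
known_ids <- unique(c(mart.df$uniprot_swissprot, mart.df$uniprot_sptrembl))
known_ids <- known_ids[!is.na(known_ids) & known_ids != ""]

# IDs from my list that appear in neither column.
unmatched <- setdiff(my_ids, known_ids)
print(unmatched)
```

With the real mart.df in place of the toy data frame, unmatched would hold every accession from my list that biomaRt did not return in either column.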
So my questions are:
1. Is there any way to download both reviewed and unreviewed UniProt proteins from biomaRt, so that I can be sure to match all of my data?
2. Why does biomaRt hold the unreviewed accession rather than the reviewed one?