Question: biomaRt getBM() function does not recognize gene ID character vector values generated from csv file for annotation
12 months ago by
agdif0
agdif0 wrote:

I have a multi-column csv file. The first column contains Ensembl IDs that I would like to annotate using biomaRt in RStudio.

After reading the csv file into R as a data frame, I converted the first column with the IDs to a character vector, since the values parameter of getBM() expects a vector (or a list of vectors) as its argument. However, getBM() returns 0 observations of 4 variables, so it does not seem to recognize my character vector as valid values to assign the annotation to.

Any input as to how this can be resolved? Does it have to do with the data frame or character vector? 

(file structure and R code below)

rbh,id,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj
ENSACAP00000006835,gopAga1_00006538-RA,4.014843684,22.3613989,3.478310891,6.428809729,1.29E-10,1.58E-06
ENSACAP00000013416,gopAga1_00003775-RA,5.311678741,-3.207734101,0.845538991,-3.793715173,0.00014841,0.036437969
ENSACAP00000021108,gopAga1_00009907-RA,13.1840533,-2.49788257,0.67830511,-3.682535384,0.000230926,0.044218683
ENSACAP00000006847,gopAga1_00001219-RA,16.53058893,-1.282170299,0.351344313,-3.64932703,0.000262928,0.048092316
ENSACAP00000020399,gopAga1_00019120-RA,23.57386411,2.167299405,0.537139555,4.034890717,5.46E-05,0.021658211

#load matched id file into R as csv
dat=read.csv("/Users/cindyxu/Desktop/test/rbh_healthy_vs_unhealthy.csv", header=TRUE)

#convert the column with gene IDs to a character vector
geneid=as.character(dat$rbh)

#load biomart and acar dataset
library("biomaRt")
ensembl=useMart("ensembl", dataset="acarolinensis_gene_ensembl")
genemap=getBM(attributes=c("ensembl_gene_id", "entrezgene", "hgnc_symbol", "description"), filters="ensembl_gene_id", values=geneid, mart=ensembl)

Output: genemap 0 obs. of 4 variables

 

Answer: biomaRt getBM() function does not recognize gene ID character vector values gene
12 months ago by
Mike Smith3.8k
EMBL Heidelberg / de.NBI
Mike Smith3.8k wrote:

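Your rbh column contains Ensembl protein IDs (note the ENSACAP prefix), not gene IDs (which would start with ENSACAG), so the ensembl_gene_id filter matches nothing. You need to filter on ensembl_peptide_id instead, e.g.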

> getBM(attributes=c("ensembl_gene_id", "entrezgene", "hgnc_symbol", "description"),
+       filters="ensembl_peptide_id", 
+       values=geneid, mart=ensembl)

     ensembl_gene_id entrezgene hgnc_symbol                                                     description
1 ENSACAG00000006713  100559986       FOXP2             forkhead box P2 [Source:HGNC Symbol;Acc:HGNC:13875]
2 ENSACAG00000006970  100560550       IFT46 intraflagellar transport 46 [Source:HGNC Symbol;Acc:HGNC:26146]
3 ENSACAG00000013687  100552532                                                                            
4 ENSACAG00000023622         NA                                                                            
5 ENSACAG00000026443         NA
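
A quick sanity check before querying could catch this kind of mismatch. The sketch below is only an illustration: guess_ensembl_filter is a hypothetical helper, not part of biomaRt. It assumes your IDs follow the usual Ensembl stable ID convention, where the letter just before the digits marks the feature type (G = gene, T = transcript, P = protein).

```r
# Hypothetical helper (not a biomaRt function): pick a filter name
# from the feature letter of Ensembl stable IDs.
guess_ensembl_filter <- function(ids) {
  ids <- as.character(ids)
  # ENS + optional species prefix + feature letter + digits (+ optional version)
  pattern <- "^ENS[A-Z]*([GTP])[0-9]+(\\.[0-9]+)?$"
  if (!all(grepl(pattern, ids))) {
    stop("some values do not look like Ensembl gene/transcript/protein IDs")
  }
  # Extract the feature letter and make sure all IDs are of one type
  feature <- unique(sub(pattern, "\\1", ids))
  if (length(feature) != 1) {
    stop("mixed ID types found: ", paste(feature, collapse = ", "))
  }
  switch(feature,
         G = "ensembl_gene_id",
         T = "ensembl_transcript_id",
         P = "ensembl_peptide_id")
}

# Two of the IDs from the rbh column in the question
geneid <- c("ENSACAP00000006835", "ENSACAP00000013416")
guess_ensembl_filter(geneid)
```

You could then pass the result as filters= in getBM(). If you want to join the annotation back onto your table, add "ensembl_peptide_id" to attributes as well, since getBM() does not return rows in the order of values, and then use merge() on that column. Attribute names can also change between Ensembl releases (entrezgene appears to have become entrezgene_id in newer ones), so listAttributes(ensembl) is worth checking if a query errors out.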
