Question: Converting gene symbol list to Entrez IDs
imalumberjack wrote, 13 months ago:

Hello all, 

I'm not very experienced with Bioconductor and R, and I am struggling to convert a list of gene symbols, which I've read into R from a .csv file, into their corresponding Entrez ID(s). Does anyone have any tips on how to approach this? The code I've been trying to use is the following:

> prog <- read.csv(file="mydata.csv", header=TRUE, sep="/")

> gns<-select(org.Hs.eg.db, prog, c("ENTREZID","GENENAME"))

Error in .testForValidKeys(x, keys, keytype, fks) :

  None of the keys entered are valid keys for 'ENTREZID'. Please use the keys method to see a listing of valid arguments.

Many thanks for your help!


Where did you get the list of gene symbols from? From a published paper? I ask this because many published sources include gene symbols that are no longer current official symbols.

Your file has a "csv" extension, suggesting that it is a comma-separated file, but then you specify sep="/". What gives with that? Can you show us the first few lines of your file? Does your data file have a column containing gene symbols?

What will you do with the Entrez Gene Ids when you get them? What will be the next step?

Comment written 13 months ago by Gordon Smyth
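
One way to answer the questions about the file layout is to look at the raw lines before parsing them. The sketch below is illustrative only: it writes a small stand-in file (the column names `symbol` and `score` and the gene symbols are assumptions, not the poster's data), prints the first few lines, and then reads the file with the separator those lines actually show.

```r
# Stand-in for mydata.csv; the columns and symbols here are hypothetical.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("symbol,score", "TP53,1.2", "BRCA1,0.7", "EGFR,2.4"), csv_path)

# Look at the raw text first: this shows which separator the file really uses.
first_lines <- readLines(csv_path, n = 3)
print(first_lines)

# Read with the separator seen above (a comma here, not "/").
prog <- read.csv(csv_path, header = TRUE, sep = ",", stringsAsFactors = FALSE)

# Confirm that a gene-symbol column exists and is a character vector.
str(prog)
symbols <- prog$symbol
```

If a comma-separated file like this one were read with sep="/", each whole line would land in a single column, so no column would hold clean gene symbols.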
Answer: Converting gene symbol list to Entrez IDs
James W. MacDonald (United States) wrote, 13 months ago:

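You are passing a data.frame to select, rather than a character vector. Presumably one of the columns of prog contains the gene symbols, so you should subset to that column. Because the keys are symbols rather than Entrez Gene IDs, you will also need keytype = "SYMBOL"; select assumes Entrez Gene IDs by default, which is why the error mentions 'ENTREZID'. Also note that the default of read.csv (in R versions before 4.0.0) is to convert strings to factors, so you should probably include stringsAsFactors = FALSE in your call to read.csv.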
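
Put together, the fix might look like the sketch below. It assumes the symbols sit in a column named `symbol` (a hypothetical name; substitute the real one) and uses a small temporary file in place of mydata.csv. The `keytype = "SYMBOL"` argument tells select that the keys are gene symbols; without it the keys are treated as Entrez Gene IDs, which matches the error message in the question.

```r
library(org.Hs.eg.db)  # human annotation package; AnnotationDbi comes with it

# Temporary stand-in for mydata.csv; column name and genes are hypothetical.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("symbol", "TP53", "BRCA1", "EGFR"), csv_path)
prog <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

# Pass a character vector of keys, not the whole data.frame.
symbols <- unique(prog$symbol)

# Namespace-qualify select() in case another package (e.g. dplyr) masks it.
gns <- AnnotationDbi::select(org.Hs.eg.db,
                             keys = symbols,
                             columns = c("ENTREZID", "GENENAME"),
                             keytype = "SYMBOL")
head(gns)

# Symbols without a match are expected to come back with NA in ENTREZID.
unmapped <- gns$SYMBOL[is.na(gns$ENTREZID)]
```

Checking `unmapped` is a cheap way to spot symbols that are outdated or misspelled before moving on to downstream analysis.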

Answer: Converting gene symbol list to Entrez IDs
mat149 wrote, 13 months ago:

Here is a code chunk that I use to convert zebrafish gene symbols to Entrez gene IDs:

("t" in this case is a character vector of a few genes that I'm interested in, but you can use the gene-symbol column of your "read.csv" object instead)

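library(org.Dr.eg.db)
# List the identifier types this annotation package can map between
keytypes(org.Dr.eg.db)
library(clusterProfiler)

t <- c("lepa","lepr","lepb","leprot")
# Map symbols to Entrez IDs plus a few extra annotation columns
et <- bitr(t, fromType="SYMBOL", toType=c("ENTREZID","PATH","GO","ALIAS","GENENAME"), OrgDb="org.Dr.eg.db")
head(et)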

and the reverse:

tt<-c("100150233","567241","564348","550484")
ett <- bitr(tt, fromType="ENTREZID", toType="SYMBOL", OrgDb="org.Dr.eg.db")
head(ett)