Convert gene symbols to Ensembl IDs


michelle_low ▴ 50

@michelle_low-5267

Last seen 9.6 years ago

Hi all,

I have a list of gene symbols generated from the differential expression analysis below. How do I convert these symbols to Ensembl IDs? Thanks.

Regards,
Michelle

R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

> library(affy)
> library(limma)
> pd=read.AnnotatedDataFrame("phenodata.txt",header=TRUE,sep="",row.names=1)
> a=ReadAffy(filenames=rownames(pData(pd)),phenoData=pd,verbose=TRUE)
1 reading Control-1.cel ...instantiating an AffyBatch (intensity a 1004004x6 matrix)...done.
Reading in : Control-1.cel
Reading in : Control-2.cel
Reading in : Dicer-1.cel
Reading in : Dicer-2.cel
Reading in : Drosha-1.cel
Reading in : Drosha-2.cel
> x=rma(a)
Loading required package: AnnotationDbi
Background correcting
Normalizing
Calculating Expression
Warning message:
package ‘AnnotationDbi’ was built under R version 2.14.2
> c=paste(pd$treatment,pd$n,sep="")
> f=factor(c)
> design=model.matrix(~0+f)
> colnames(design)=levels(f)
> fit=lmFit(x,design)
> library(mouse4302.db)
Loading required package: org.Mm.eg.db
Loading required package: DBI
Warning messages:
1: package ‘RSQLite’ was built under R version 2.14.2
2: package ‘DBI’ was built under R version 2.14.2
> library(annotate)
Warning message:
package ‘annotate’ was built under R version 2.14.2
> fit$genes$Symbol <- getSYMBOL(fit$genes$ID,"mouse4302.db")
> contrast.matrix=makeContrasts(E1="present-absent.Dicer",E2="present-absent.Drosha",E3="absent.Drosha-absent.Dicer",levels=design)
> fit2=contrasts.fit(fit,contrast.matrix)
> fit2=eBayes(fit2)
> results1 <- topTable(fit2, coef=1, p.value=0.0001, number=nrow(fit2))
> write.table(results1, file="control-Dicer5.txt")
> results2 <- topTable(fit2, coef=2, p.value=0.0001, number=nrow(fit2))
> write.table(results2, file="control-Drosha5.txt")
> results3 <- topTable(fit2, coef=3, p.value=0.0001, number=nrow(fit2))
> results=decideTests(fit2)
> summary(results2)
> b=venncounts(results2)
> print(b)
> vennDiagram(results)
> a=vennDiagram(results,include=c("up","down"),counts.col=c("red","green"))


updated 12.0 years ago by Fred Boehm ▴ 20 • written 12.0 years ago by michelle_low ▴ 50


Fred Boehm ▴ 20

@fred-boehm-5269

Last seen 9.6 years ago

Greetings, Michelle,

I haven't worked with mouse data, but I think that the function getBM() in the Bioconductor package biomaRt can help. For instance, one could use the code below (replacing mySymbols with the vector of symbols that interest you) to output a data.frame with both the Ensembl gene ID and the MGI symbol.

The creators of biomaRt have generated some nice tutorial materials and posted them at:
http://www.bioconductor.org/packages/2.2/bioc/html/biomaRt.html

If I've misinterpreted your question, you may be able to find the answer by viewing the biomaRt materials. I hope that this helps.

Cheers,
Fred
--------------
# Install biomaRt. biocLite() was the installer when this was posted;
# current Bioconductor releases use BiocManager::install("biomaRt") instead.
source("http://bioconductor.org/biocLite.R")
biocLite("biomaRt")

library(biomaRt)
# Connect to the Ensembl mart and select the mouse gene dataset.
mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# List the filters and attributes available for this dataset.
listFilters(mouse)
listAttributes(mouse)
mySymbols <- "2310015A10Rik" # mySymbols is a vector of MGI symbols.
# Return a data.frame pairing each MGI symbol with its Ensembl gene ID.
getBM(attributes = c("ensembl_gene_id", "mgi_symbol"), filters = "mgi_symbol", values = mySymbols, mart = mouse)
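A mapping table like the one getBM() returns still has to be joined back onto the results, and a symbol can map to no Ensembl ID or to several. The following is a minimal base-R sketch of that join, not part of the original reply. The two data frames, the gene names and the ID strings are made-up placeholders standing in for topTable() and getBM() output; they are not real annotations.

```r
# Hypothetical stand-in for a topTable() result that carries a Symbol column.
results <- data.frame(
  Symbol = c("GeneA", "GeneB", "GeneC"),
  logFC  = c(1.8, -2.1, 0.9),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for getBM() output:
# GeneB maps to two IDs, GeneC has no entry at all.
mapping <- data.frame(
  ensembl_gene_id = c("ENS_PLACEHOLDER_1", "ENS_PLACEHOLDER_2", "ENS_PLACEHOLDER_3"),
  mgi_symbol      = c("GeneA", "GeneB", "GeneB"),
  stringsAsFactors = FALSE
)

# Left join: all.x = TRUE keeps symbols without a match (their ID becomes NA);
# symbols with several IDs produce one row per ID.
annotated <- merge(results, mapping,
                   by.x = "Symbol", by.y = "mgi_symbol",
                   all.x = TRUE, sort = FALSE)

# Alternative: keep one row per symbol by taking only the first matching ID.
results$Ensembl <- mapping$ensembl_gene_id[match(results$Symbol, mapping$mgi_symbol)]

print(annotated)
print(results)
```

Which of the two forms is appropriate depends on the downstream tool: the merged table preserves every mapping, while the match() form silently drops all but the first ID for a symbol.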

written 12.0 years ago by Fred Boehm ▴ 20


On 05/06/2012 07:23 AM, Fred Boehm wrote:
> Greetings, Michelle,
>
> I haven't worked with mouse data, but I think that the function getBM()
> in the bioconductor package biomaRt can help.

also

library(mouse4302.db)
unlist(mget(fit$genes$ID, mouse4302ENSEMBL, ifnotfound=NA))

or even better in R 2.14.0 or greater

select(mouse4302.db, ids, "ENSEMBL")

(see ?select, ?keys, ?cols, ?keytype)

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
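In later AnnotationDbi releases the cols argument of select() was renamed columns, and the same interface also works on the organism package org.Mm.eg.db when starting from gene symbols rather than probe IDs. The sketch below is an illustration under stated assumptions, not part of the original reply: it assumes org.Mm.eg.db and AnnotationDbi are installed and that the inputs are official mouse gene symbols. The wrapper name symbols_to_ensembl is hypothetical, and it returns NULL rather than failing when the packages are missing.

```r
# Hypothetical helper: map mouse gene symbols to Ensembl gene IDs.
symbols_to_ensembl <- function(symbols) {
  # Assumption: the org.Mm.eg.db annotation package is installed.
  if (!requireNamespace("org.Mm.eg.db", quietly = TRUE) ||
      !requireNamespace("AnnotationDbi", quietly = TRUE)) {
    message("org.Mm.eg.db / AnnotationDbi not installed; returning NULL")
    return(NULL)
  }
  # select() returns a data.frame with one row per (symbol, Ensembl ID) pair,
  # so a symbol with several IDs appears on several rows.
  AnnotationDbi::select(org.Mm.eg.db::org.Mm.eg.db,
                        keys    = unique(symbols),
                        columns = "ENSEMBL",
                        keytype = "SYMBOL")
}

# Example call with two mouse symbols matching the knockouts in the question.
mapped <- symbols_to_ensembl(c("Dicer1", "Drosha"))
print(mapped)
```

For the chip package itself, the equivalent call in the current argument style would presumably be select(mouse4302.db, keys = fit$genes$ID, columns = "ENSEMBL", keytype = "PROBEID"), which maps probe IDs directly and skips the symbol step.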

written 12.0 years ago by Martin Morgan 25k


Thanks Fred and Martin.

written 12.0 years ago by michelle_low ▴ 50
