Convert gene symbols to Ensembl IDs
michelle_low ▴ 50
Hi all,

I have a list of gene symbols generated from the differential expression analysis below. How do I convert these symbols to Ensembl IDs? Thanks.

Regards,
Michelle

R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

> library(affy)
> library(limma)
> pd=read.AnnotatedDataFrame("phenodata.txt",header=TRUE,sep="",row.names=1)
> a=ReadAffy(filenames=rownames(pData(pd)),phenoData=pd,verbose=TRUE)
1 reading Control-1.cel ...instantiating an AffyBatch (intensity a 1004004x6 matrix)...done.
Reading in : Control-1.cel
Reading in : Control-2.cel
Reading in : Dicer-1.cel
Reading in : Dicer-2.cel
Reading in : Drosha-1.cel
Reading in : Drosha-2.cel
> x=rma(a)
Loading required package: AnnotationDbi
Background correcting
Normalizing
Calculating Expression
Warning message:
package ‘AnnotationDbi’ was built under R version 2.14.2
> c=paste(pd$treatment,pd$n,sep="")
> f=factor(c)
> design=model.matrix(~0+f)
> colnames(design)=levels(f)
> fit=lmFit(x,design)
> library(mouse4302.db)
Loading required package: org.Mm.eg.db
Loading required package: DBI
Warning messages:
1: package ‘RSQLite’ was built under R version 2.14.2
2: package ‘DBI’ was built under R version 2.14.2
> library(annotate)
Warning message:
package ‘annotate’ was built under R version 2.14.2
> fit$genes$Symbol <- getSYMBOL(fit$genes$ID,"mouse4302.db")
> contrast.matrix=makeContrasts(E1="present-absent.Dicer",E2="present-absent.Drosha",E3="absent.Drosha-absent.Dicer",levels=design)
> fit2=contrasts.fit(fit,contrast.matrix)
> fit2=eBayes(fit2)
> results1 <- topTable(fit2, coef=1, p.value=0.0001, number=nrow(fit2))
> write.table(results1, file="control-Dicer5.txt")
> results2 <- topTable(fit2, coef=2, p.value=0.0001, number=nrow(fit2))
> write.table(results2, file="control-Drosha5.txt")
> results3 <- topTable(fit2, coef=3, p.value=0.0001, number=nrow(fit2))
> results=decideTests(fit2)
> summary(results2)
> b=venncounts(results2)
> print(b)
> vennDiagram(results)
> a=vennDiagram(results,include=c("up","down"),counts.col=c("red","green"))
Fred Boehm ▴ 20
Greetings, Michelle,

I haven't worked with mouse data, but I think that the function getBM() in the Bioconductor package biomaRt can help.

For instance, one could use the code below (replacing mySymbols with the vector of symbols that interest you) to output a data.frame with both the Ensembl gene ID and the MGI symbol.

The creators of biomaRt have generated some nice tutorial materials and posted them at:

http://www.bioconductor.org/packages/2.2/bioc/html/biomaRt.html

If I've misinterpreted your question, you may be able to find the answer by viewing the biomaRt materials.

I hope that this helps.

Cheers,
Fred
--------------

source("http://bioconductor.org/biocLite.R")
biocLite("biomaRt")

library(biomaRt)
mouse = useMart("ensembl", dataset = "mmusculus_gene_ensembl")
listFilters(mouse)
listAttributes(mouse)
mySymbols <- "2310015A10Rik" # mySymbols is a vector of MGI symbols.
getBM(attributes=c("ensembl_gene_id", "mgi_symbol"), filters="mgi_symbol", values=mySymbols, mart=mouse)

On 5/6/12 5:41 AM, michelle_low wrote:
> Hi all,
>
> I have a list of gene symbols generated from the differential expression analysis below. How do I convert these symbols to Ensembl IDs? Thanks
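The getBM() call above returns a two-column data.frame, so one remaining step is joining it back onto the topTable output. The following is a minimal sketch of that step, not a tested pipeline: `fetch_symbol_map` and `add_ensembl` are hypothetical helper names, the toy values are made-up placeholders rather than real genes or IDs, and the query function assumes biomaRt is installed and the Ensembl BioMart service is reachable (on recent Bioconductor releases installation goes through `BiocManager::install("biomaRt")` rather than biocLite).

```r
# Sketch only. Assumes the biomaRt package is installed and the Ensembl
# BioMart service is reachable; the helper names here are hypothetical.

# Query Ensembl for the IDs that match a vector of MGI symbols.
# Not called below, because it needs network access.
fetch_symbol_map <- function(symbols) {
  mouse <- biomaRt::useMart("ensembl", dataset = "mmusculus_gene_ensembl")
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "mgi_symbol"),
    filters    = "mgi_symbol",
    values     = unique(symbols[!is.na(symbols)]),
    mart       = mouse
  )
}

# Join a topTable-style data frame to the mapping by gene symbol.
# all.x = TRUE keeps genes that have no Ensembl match (their ID is NA);
# a symbol that maps to several Ensembl IDs yields several rows.
add_ensembl <- function(tab, map, symbol_col = "Symbol") {
  merge(tab, map, by.x = symbol_col, by.y = "mgi_symbol", all.x = TRUE)
}

# Offline illustration with made-up placeholder values (not real genes or IDs).
toy_tab <- data.frame(Symbol = c("GeneA", "GeneB", "GeneC"),
                      logFC  = c(1.2, -0.8, 2.1),
                      stringsAsFactors = FALSE)
toy_map <- data.frame(ensembl_gene_id = c("ID_1", "ID_2", "ID_3"),
                      mgi_symbol      = c("GeneA", "GeneB", "GeneB"),
                      stringsAsFactors = FALSE)
toy_out <- add_ensembl(toy_tab, toy_map)
print(toy_out)

# With network access, the real mapping could be obtained along these lines:
# map      <- fetch_symbol_map(results1$Symbol)
# results1 <- add_ensembl(results1, map)
```

The left join is a deliberate choice here: symbols without an Ensembl match stay in the table with NA, so the number of unmapped genes remains visible instead of silently shrinking the result.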
On 05/06/2012 07:23 AM, Fred Boehm wrote:
> Greetings, Michelle,
>
> I haven't worked with mouse data, but I think that the function getBM()
> in the bioconductor package biomaRt can help.

also

library(mouse4302.db)
unlist(mget(fit$genes$ID, mouse4302ENSEMBL, ifnotfound=NA))

or even better in R 2.14.0 or greater

select(mouse4302.db, ids, "ENSEMBL")  # ids: a vector of probe IDs, e.g. fit$genes$ID

(see ?select, ?keys, ?cols, ?keytype)

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793
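The select() approach can start from probe IDs (through the chip package) or directly from gene symbols (through the organism package), and either way it may return several rows for one key. Below is a hedged sketch of both lookups plus a small base-R step that collapses one-to-many results. The helper names are hypothetical, the toy values are placeholders rather than real probe or Ensembl IDs, and the `columns` argument name is the one used by newer AnnotationDbi releases (the `?cols` help page mentioned above belongs to the older naming).

```r
# Sketch only. Assumes AnnotationDbi, mouse4302.db and org.Mm.eg.db are
# installed; the helper names are hypothetical. The two lookup functions
# are defined but not called, since they need those annotation packages.

# Probe ID -> symbol and Ensembl ID, through the chip annotation package.
probe_to_ensembl <- function(probe_ids) {
  AnnotationDbi::select(mouse4302.db::mouse4302.db,
                        keys    = probe_ids,
                        columns = c("SYMBOL", "ENSEMBL"),
                        keytype = "PROBEID")
}

# Gene symbol -> Ensembl ID, through the organism annotation package.
symbol_to_ensembl <- function(symbols) {
  AnnotationDbi::select(org.Mm.eg.db::org.Mm.eg.db,
                        keys    = unique(symbols[!is.na(symbols)]),
                        columns = "ENSEMBL",
                        keytype = "SYMBOL")
}

# select() returns one row per key/ID pair, so a key can appear on
# several rows. Collapse to one ";"-separated string per key; keys whose
# IDs are all NA stay NA.
collapse_ids <- function(map, key, value) {
  groups <- split(map[[value]], map[[key]])
  vapply(groups, function(v) {
    v <- unique(v[!is.na(v)])
    if (length(v) == 0) NA_character_ else paste(v, collapse = ";")
  }, character(1))
}

# Offline illustration with placeholder values (not real probe or Ensembl IDs).
toy <- data.frame(PROBEID = c("p1", "p1", "p2", "p3"),
                  ENSEMBL = c("ID_1", "ID_2", "ID_3", NA),
                  stringsAsFactors = FALSE)
collapsed <- collapse_ids(toy, "PROBEID", "ENSEMBL")
print(collapsed)
```

Collapsing is only one way to handle multiple matches; keeping the long format, or picking a single ID per key, may suit a downstream analysis better.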
Thanks Fred and Martin.

---- On Sun, 06 May 2012 08:28:58 -0700 Martin Morgan <mtmorgan@fhcrc.org> wrote ----
> also
>
> library(mouse4302.db)
> unlist(mget(fit$genes$ID, mouse4302ENSEMBL, ifnotfound=NA))
>
> or even better in R 2.14.0 or greater
>
> select(mouse4302.db, ids, "ENSEMBL")
>
> (see ?select, ?keys, ?cols, ?keytype)