using biomaRt to find all genes with kinase activity (and perhaps a Gene Ontology question too)
Andrew Yee ▴ 350
@andrew-yee-2667
Last seen 7.3 years ago
Apologies if this is a naive question, but I've been trying to use biomaRt to retrieve all genes with kinase activity.

I use the GO term GO:0016301 as follows:

    library(biomaRt)
    ensembl = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
    results <- getBM(c("entrezgene", "hgnc_symbol", "unigene"), filters = "go",
                     values = "GO:0016301", mart = ensembl)

This returns the expected genes, e.g. EGFR. However, it doesn't return genes like ATM, which in Gene Ontology is listed under a child of that GO term:

http://amigo.geneontology.org/cgi-bin/amigo/term-assoc.cgi?gptype=all&speciesdb=all&taxid=9606&evcode=all&term_assocs=all&term=GO%3A0016301&session_id=7978amigo1229194926&action=filter

I would have thought that this query would retrieve genes annotated with that GO term as well as genes annotated with its child terms.

As a follow-up, is there a filter by which I can just search for "kinase" with biomaRt?

Many thanks,
Andrew
@james-w-macdonald-5106
Last seen 1 day ago
United States
Hi Andrew,

I don't know if you can get all the children from biomaRt, but you can get them from the org.Hs.eg.db package:

    > library(org.Hs.eg.db)
    > all.egs <- mget("GO:0016301", org.Hs.egGO2ALLEGS)
    > all.egs <- unique(all.egs[[1]])
    > all.dat <- data.frame("Entrez Gene" = all.egs,
    +     "Symbol" = unlist(mget(all.egs, org.Hs.egSYMBOL, ifnotfound=NA)),
    +     "UniGene" = sapply(mget(all.egs, org.Hs.egUNIGENE, ifnotfound=NA),
    +                        paste, collapse=","))
    > head(all.dat)
       Entrez.Gene Symbol   UniGene
    25          25   ABL1 Hs.431048
    27          27   ABL2 Hs.159472
    90          90  ACVR1 Hs.470316
    91          91 ACVR1B Hs.438918
    92          92 ACVR2A Hs.470174
    93          93 ACVR2B Hs.174273
    > dim(all.dat)
    [1] 796   3

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-5646
734-936-8662
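On a current Bioconductor release, the same lookup can also be sketched with the select() interface from AnnotationDbi. This is only a minimal sketch and not part of the original answer: it assumes org.Hs.eg.db is installed and that its GOALL key type (a term plus all of its descendant terms) and GO key type (direct annotations only) are available. The variable names are illustrative.

```r
library(org.Hs.eg.db)

term <- "GO:0016301"  # kinase activity

# GOALL: genes annotated to the term itself or to any of its descendant terms
all_hits <- AnnotationDbi::select(org.Hs.eg.db, keys = term, keytype = "GOALL",
                                  columns = c("ENTREZID", "SYMBOL"))

# GO: genes annotated directly to the term only
direct_hits <- AnnotationDbi::select(org.Hs.eg.db, keys = term, keytype = "GO",
                                     columns = c("ENTREZID", "SYMBOL"))

# select() can return several rows per gene (one per evidence code),
# so reduce to one row per gene
kinase_genes <- unique(all_hits[, c("ENTREZID", "SYMBOL")])

# Compare the size of the direct set with the expanded set
n_all <- length(unique(all_hits$ENTREZID))
n_direct <- length(unique(direct_hits$ENTREZID))
cat("direct:", n_direct, " with descendants:", n_all, "\n")

# Check whether a particular gene, e.g. ATM, is in the expanded set
"ATM" %in% kinase_genes$SYMBOL
```

The gap between the two counts is the set of genes a direct-term query misses. The exact numbers depend on the annotation release, so they will not match the 796 rows shown above.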
Thanks for the suggestion!

Andrew
Hi James, Andrew,

To get all children of the GO term GO:0016301, you can use

    library("GO.db")
    ch = GOMFCHILDREN[["GO:0016301"]]

You can supply that in the 'values' argument of your call to getBM. To see what they are, use e.g.

    lapply(ch, function(x) GOTERM[[x]])

Best wishes
Wolfgang

------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber
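One point worth spelling out: in GO.db, GOMFCHILDREN holds only the direct child terms, while GOMFOFFSPRING holds descendants at every depth. A gene annotated to a grandchild term is therefore still missed if only the children are queried. The sketch below expands the term list with GOMFOFFSPRING. The helper genes_for_terms() is hypothetical and is defined but not called here, since it needs network access. Its attribute and filter names are taken from the original post and may differ in current Ensembl releases, so check listAttributes() and listFilters() first.

```r
library(GO.db)

term <- "GO:0016301"  # kinase activity

# Direct child terms only (one level down)
children <- GOMFCHILDREN[[term]]

# All descendant terms, at any depth
offspring <- GOMFOFFSPRING[[term]]

# Include the parent term itself so directly annotated genes are kept
query_terms <- unique(c(term, offspring))

# Readable names for the first few terms
sapply(mget(head(query_terms), GOTERM), Term)

# Hypothetical helper: pass the expanded term list to biomaRt.
# Assumption: the "go" filter and these attribute names still exist
# in the Ensembl release you connect to.
genes_for_terms <- function(terms, mart) {
  biomaRt::getBM(attributes = c("entrezgene", "hgnc_symbol"),
                 filters = "go", values = terms, mart = mart)
}

# Usage (requires network access):
# ensembl <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# results <- genes_for_terms(query_terms, ensembl)
```

Doing the expansion locally keeps the biomaRt query a plain list of term IDs, so it does not depend on the mart handling the ontology hierarchy itself.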