Any changes in Biomart regarding GO terms? Script crashing today
@jdelasherasedacuk-1189
United Kingdom
I am unable to run a script I have been using quite happily in the past, so I suspect a change in the "outside world", but I can't find any news about it.

It appears that the usual attributes for GO terms and IDs (CC, BP and MF) are not available. (?)

If you try this quick example:

library(biomaRt)
ensembl=useMart("ensembl")
dataset="hsapiens_gene_ensembl"
ensembl=useDataset(dataset, mart=ensembl)

attributes=c("entrezgene",
             "go_cellular_component_id",
             "go_cellular_component__dm_name_1006")

getBM(attributes=attributes,
      filters="entrezgene",
      values=1, mart=ensembl)

It fails with the error:

Error in getBM(attributes = attributes, filters = "entrezgene", values = 1, :
  Invalid attribute(s): go_cellular_component_id, go_cellular_component__dm_name_1006
Please use the function 'listAttributes' to get valid attribute names

and if you list the attributes...

listAttributes(ensembl)

you can see they're indeed not listed...

am I right thinking this has nothing to do with the biomaRt R side of things?

Jose

--
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507090
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
GO biomaRt • 2.8k views
@steffen-durinck-4465
Hi Jose,

It looks like the Ensembl BioMart recently got updated to version 62 and this included some changes in the attributes.

> mart=useMart('ensembl',dataset='hsapiens_gene_ensembl')
> att = listAttributes(mart)
> att[grep('go',att[,1]),]
   name
24 go_id
27 go_linkage_type

Possibly the three go ontologies are now combined in only one attribute go_id?
If you check the release notes of Ensembl 62 (http://www.ensembl.org/info/website/news/index.html) you'll see:

'Gene Ontology Filter and Attribute section has been modified and all three domains have now been merged into one'

If needed you can always connect to the previous Ensembl version using:

> ensembl61=useMart('ENSEMBL_MART_ENSEMBL',dataset='hsapiens_gene_ensembl', host='feb2011.archive.ensembl.org')
> att = listAttributes(ensembl61)
> att[grep('go',att[,1]),]
   name
24 go_biological_process_id
27 go_biological_process_linkage_type
28 go_cellular_component_id
29 go_cellular_component__dm_name_1006
30 go_cellular_component__dm_definition_1006
31 go_cellular_component_linkage_type
32 go_molecular_function_id
33 go_molecular_function__dm_name_1006
34 go_molecular_function__dm_definition_1006
35 go_molecular_function_linkage_type

Cheers,
Steffen

(In reply to j.delasheras at ed.ac.uk, Fri, May 13, 2011 at 8:04 AM.)
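Since attribute names can change between Ensembl releases, a script may want to check its attribute list before calling getBM. The following is only a minimal sketch: `missing_attributes` is a hypothetical helper (not part of biomaRt), and `att` is a hand-made stand-in for the data frame returned by listAttributes(mart), so that the example runs without a network connection.

```r
# Hypothetical helper (not a biomaRt function): report which requested
# attributes are absent from a mart. `available` is assumed to look like
# the data frame returned by listAttributes(mart), whose first column
# holds the attribute names.
missing_attributes <- function(wanted, available) {
  setdiff(wanted, as.character(available[[1]]))
}

# Toy stand-in for listAttributes(mart) after the Ensembl 62 update.
# Only a few attribute names mentioned in this thread are included.
att <- data.frame(
  name = c("entrezgene", "go_id", "go_linkage_type"),
  stringsAsFactors = FALSE
)

wanted <- c("entrezgene",
            "go_cellular_component_id",
            "go_cellular_component__dm_name_1006")

absent <- missing_attributes(wanted, att)
if (length(absent) > 0) {
  message("Not available in this mart: ", paste(absent, collapse = ", "))
}
```

In a real script, `att` would come from listAttributes(mart), and the script could stop with a clear message, or fall back to an archive host, when `absent` is not empty.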
Hi Steffen,

thank you for your fast reply. Indeed, after I posted my message I looked into the attributes and found that there are "generic" GO attributes. Their descriptions include a "(bp)", which at first I took to mean "biological process", so I was looking for the (mf) and (cc) counterparts. Then I noticed the "namespace_1003" attribute, which returns BP, MF or CC. So yes indeed, they've all been merged, and you have to use the namespace attribute if you want to extract CC, BP or MF info.

I didn't know how to connect to archived versions, so that's very useful, thank you! It means I can quickly adapt the old scripts to continue working straight away, and I'll fix them properly when I have time.

Thank you!

regards,
Jose

(In reply to Steffen Durinck, durinck.steffen at gene.com, Fri, 13 May 2011 08:29:07 -0700.)
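The namespace-based extraction described above can be sketched on a toy data frame. This is an illustration under assumptions, not output from BioMart: the rows are hand-made, and the domain strings in `namespace_1003` (for example "cellular_component") are assumed values that should be checked against what the query actually returns.

```r
# Toy data shaped like a getBM() result for the merged GO attributes.
# The rows are illustrative only, and the namespace strings are an
# assumption about how the GO domain is spelled in namespace_1003.
go <- data.frame(
  entrezgene = c(1, 1, 1, 1),
  go_id = c("GO:0005575", "GO:0008150", "GO:0003674", "GO:0005576"),
  namespace_1003 = c("cellular_component", "biological_process",
                     "molecular_function", "cellular_component"),
  stringsAsFactors = FALSE
)

# One data frame per GO domain, recreating the old CC / BP / MF split.
by_domain <- split(go, go$namespace_1003)

# Equivalent of what the old go_cellular_component_id attribute provided.
cc <- by_domain[["cellular_component"]]
print(cc)
```

Using split() keeps all three domains available from a single query, so an old script that made separate CC, BP and MF requests can be adapted by swapping each request for a lookup in `by_domain`.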
@rhoda-kinsella-3200
Hi Jose

The three GO sections have been merged into one. See here for news about Ensembl BioMart changes:

http://www.ensembl.org/info/website/news/index.html#team-Mart

These are the names of the attributes you now need to look for:

<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="0" count="" datasetConfigVersion="0.6">
  <Dataset name="hsapiens_gene_ensembl" interface="default">
    <Attribute name="go_id"/>
    <Attribute name="name_1006"/>
    <Attribute name="definition_1006"/>
    <Attribute name="go_linkage_type"/>
    <Attribute name="namespace_1003"/>
  </Dataset>
</Query>

These attribute names correspond to:

go_id             GO Term Accession
name_1006         GO Term Name
definition_1006   GO Term Definition
go_linkage_type   GO Term Evidence Code
namespace_1003    GO domain

I will give the internal names (e.g. name_1006) a more descriptive and meaningful name for the next release. Apologies for any confusion caused.

Regards
Rhoda
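For reference, the original example could be adapted to the merged attributes roughly as follows. This is a sketch, assuming the attribute names quoted in this thread for Ensembl 62 are still valid on the mart being queried; `fetch_go` and `go_attributes` are hypothetical names introduced here. The actual query needs biomaRt and a network connection, so it is shown as commented usage only.

```r
# Attribute names as given in this thread for Ensembl 62. Later releases
# may rename them, so check listAttributes(mart) before relying on them.
go_attributes <- c("entrezgene", "go_id", "name_1006", "namespace_1003")

# Hypothetical wrapper: fetch merged GO annotation for Entrez Gene IDs.
# getBM() is only called when this function is invoked.
fetch_go <- function(entrez_ids, mart) {
  biomaRt::getBM(attributes = go_attributes,
                 filters = "entrezgene",
                 values = entrez_ids,
                 mart = mart)
}

# Usage (requires biomaRt and a network connection, so it is not run here):
# ensembl <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# go <- fetch_go(1, ensembl)
# go[go$namespace_1003 == "cellular_component", ]  # assumed domain string
```

Requesting `namespace_1003` alongside `go_id` and `name_1006` is what allows the result to be narrowed to a single domain afterwards, which replaces the separate per-domain attributes used before the merge.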
Hi Rhoda,

thank you for the explanation.

I notice the descriptions of the attributes contain a "(bp)":

GO Term Accession (bp)
GO Term Name (bp)
GO Term Definition (bp)
GO Term Evidence Code (bp)

Initially I thought it meant they referred to Biological Process and that CC and MF were missing... until I tried it and realised they are generic. It would probably be good to remove that if it's a leftover, or clarify its meaning otherwise.

Is there a mailing list, for instance, that I can subscribe to so that I find out about these kinds of changes in advance, rather than when I'm trying to do some work and my scripts cease to function? :)

Thank you for your help!

regards,
Jose

(In reply to rhoda at ebi.ac.uk, Fri, 13 May 2011 16:40:00 +0100 (BST).)