Question: GSEABase error in parsing msigdb_v2.5.xml
11.5 years ago by
Vladimir Morozov wrote:
Hi,

I get an error reading the latest version of the Broad MSigDB. Is it supposed to work?

> gss <- getBroadSets('/data/PathDB/msigdb_v2.5.xml')
Error: 'getBroadSets' failed to create gene sets:
  invalid BroadCollection category: 'c5'
> traceback()
6: stop("'getBroadSets' failed to create gene sets:\n  ",
       conditionMessage(err), call. = FALSE)
5: value[[3]](cond)
4: tryCatchOne(expr, names, parentenv, handlers[[1]])
3: tryCatchList(expr, classes, parentenv, handlers)
2: tryCatch({
       geneSets <- unlist(mapply(.fromXML, uri, "//GENESET", factories,
           SIMPLIFY = FALSE, USE.NAMES = FALSE))
   }, error = function(err) {
       stop("'getBroadSets' failed to create gene sets:\n  ",
           conditionMessage(err), call. = FALSE)
   })
1: getBroadSets("/data/PathDB/msigdb_v2.5.xml")
> packageDescription('GSEABase')
Package: GSEABase
Type: Package
Title: Gene set enrichment data structures and methods
Version: 1.2.0
Author: Martin Morgan, Seth Falcon, Robert Gentleman
Maintainer: Biocore Team c/o BioC user list <bioconductor@stat.math.ethz.ch>
Description: This package provides classes and methods to support Gene
        Set Enrichment Analysis (GSEA).
License: Artistic-2.0
Depends: R (>= 2.6.0), methods, AnnotationDbi, Biobase, annotate
Suggests: Ruuid, hgu95av2.db, GO.db, org.Hs.eg.db
Imports: methods, XML, graph
LazyLoad: yes
biocViews: Infrastructure, Statistics
Collate: utilities.R AAA.R AllClasses.R AllGenerics.R getObjects.R
        methods-CollectionType.R methods-ExpressionSet.R
        methods-GeneColorSet.R methods-GeneIdentifierType.R
        methods-GeneSet.R methods-GeneSetCollection.R
        methods-OBOCollection.R zzz.R
Packaged: Wed Apr 30 02:43:40 2008; biocbuild
Built: R 2.7.0; ; 2008-05-14 16:18:51; unix

-- File: /usr/local/lib64/R/library/GSEABase/Meta/package.rds

However, getBroadSets('/data/PathDB/msigdb_v2.1.xml') works, and I don't see obvious signs of corruption in the 2.5 XML file:

[rstats:GeneLogic070523] head -n 2 /data/PathDB/*.xml
==> /data/PathDB/msigdb_v2.1.xml <==

==> /data/PathDB/msigdb_v2.5.xml <==

tail -n 2 /data/PathDB/*.xml
==> /data/PathDB/msigdb_v2.1.xml <==
<GENESET STANDARD_NAME="GNF2_ZAP70" SYSTEMATIC_NAME="c4:526"
  ORGANISM="Human" CHIP="GENE_SYMBOL" CATEGORY_CODE="c4"
  CONTRIBUTOR="Broad Institute" CONTRIBUTOR_ORG="Broad Institute"
  DESCRIPTION_BRIEF="Neighborhood of ZAP70"
  DESCRIPTION_FULL="Neighborhood of ZAP70 zeta-chain (TCR) associated protein kinase 70kDa in the GNF2 expression compendium"
  TAGS=""
  MEMBERS="ZAP70,PTPN4,UNC84B,TUSC4,CTSW,RARRES3,BTN3A2,NKG7,PRKCH,KLRK1,BTN3A3,MYBL1,GZMA,ARL4C,SH2D1A,TXK,CD7,RORA,CD247,IL18RAP,CD96,RASGRP1,GZMM,TRD@,MATK,ITGAL,KLRB1"
  MEMBERS_SYMBOLIZED="ZAP70,PTPN4,UNC84B,TUSC4,CTSW,RARRES3,BTN3A2,NKG7,PRKCH,KLRK1,BTN3A3,MYBL1,GZMA,ARL4C,SH2D1A,TXK,CD7,RORA,CD247,IL18RAP,CD96,RASGRP1,GZMM,TRD@,MATK,ITGAL,KLRB1"/>
</MSIGDB>

==> /data/PathDB/msigdb_v2.5.xml <==
<GENESET STANDARD_NAME="INOSITOL_OR_PHOSPHATIDYLINOSITOL_KINASE_ACTIVITY"
  SYSTEMATIC_NAME="c5:1203" ORGANISM="Homo sapiens"
  AUTHORS="Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC,Richardson JE, Ringwald M, Rubin GM, Sherlock G."
  EXTERNAL_DETAILS_URL="http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?view=details&search_constraint=terms&depth=0&query=GO:0004428"
  CHIP="GENE_SYMBOL" CATEGORY_CODE="c5"
  CONTRIBUTOR="Gene Ontology" CONTRIBUTOR_ORG="Gene Ontology"
  DESCRIPTION_BRIEF="Genes annotated by the GO term GO:0004428. Catalysis of the phosphorylation of myo-inositol (1,2,3,5/4,6-cyclohexanehexol) or a phosphatidylinositol."
  DESCRIPTION_FULL="" TAGS="Molecular function"
  MEMBERS="FXN,SMG1,PIP4K2B,PIP5K3,ATM,PIK3C2A,PIK3C3,PIK3CA,PIK3CB,PIK3CG,PIK3R2,PIK3R3,IPPK,PI4KA,PI4KB,PI4K2A,ITPKA,ITPKB"
  MEMBERS_SYMBOLIZED="FXN,SMG1,PIP4K2B,PIP5K3,ATM,PIK3C2A,PIK3C3,PIK3CA,PIK3CB,PIK3CG,PIK3R2,PIK3R3,IPPK,PI4KA,PI4KB,PI4K2A,ITPKA,ITPKB"/>
</MSIGDB>

Best,
Vlad

Vladimir Morozov
ALS Therapy Development Institute
Answer: GSEABase error in parsing msigdb_v2.5.xml
11.5 years ago by
Martin Morgan wrote:
Thanks Vladimir for the report, more below...

"Vladimir Morozov" <vmorozov at als.net> writes:

> Hi,
>
> I get an error reading the latest version of the Broad MSigDB. Is it
> supposed to work?
>
>> gss <- getBroadSets('/data/PathDB/msigdb_v2.5.xml')
> Error: 'getBroadSets' failed to create gene sets:
>   invalid BroadCollection category: 'c5'

The Broad added a category; I've updated GSEABase in both the devel and release branches. The update should be available with biocLite after 12 noon Friday; look for GSEABase 1.2.1 in the release.

One aspect that is a little unsatisfactory is that the subcategories (CC/BP/MF for c5, for instance) are not encoded in the XML, and so are not present in the gene sets.

Martin

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
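Since the failure comes from a category code the installed GSEABase does not recognise, it can help to see which codes a given MSigDB file actually contains before calling getBroadSets(). The following is only a minimal sketch in base R, not part of GSEABase: the helper name tally_category_codes is hypothetical, the two-record demo file is a stand-in for the real msigdb_v2.5.xml, and the list of previously accepted categories (c1 to c4) is an assumption inferred from the error message and from the v2.1 file loading without problems.

```r
# Tally the CATEGORY_CODE attribute of every gene set in an MSigDB XML file.
# Uses base R regular expressions only (no XML parser), so it does not depend
# on GSEABase being able to read the file.
tally_category_codes <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # Find every CATEGORY_CODE="..." attribute, ignoring attribute-name case.
  hits <- unlist(regmatches(
    lines,
    gregexpr('CATEGORY_CODE="[^"]*"', lines, ignore.case = TRUE)
  ))
  # Keep only the quoted value, e.g. c5.
  codes <- sub('^[^"]*"([^"]*)"$', "\\1", hits)
  table(codes)
}

# Hypothetical two-record file standing in for msigdb_v2.5.xml.
demo <- tempfile(fileext = ".xml")
writeLines(c(
  '<MSIGDB>',
  '<GENESET STANDARD_NAME="GNF2_ZAP70" CATEGORY_CODE="c4" MEMBERS="ZAP70,PTPN4"/>',
  '<GENESET STANDARD_NAME="INOSITOL_OR_PHOSPHATIDYLINOSITOL_KINASE_ACTIVITY" CATEGORY_CODE="c5" MEMBERS="FXN,SMG1"/>',
  '</MSIGDB>'
), demo)

counts <- tally_category_codes(demo)

# Assumption: the categories accepted before the fix were c1 to c4.
known <- c("c1", "c2", "c3", "c4")
unknown <- setdiff(names(counts), known)

print(counts)
print(unknown)
```

Any code reported in unknown points at a category the older package would likely reject. Once the updated release is installed, packageDescription('GSEABase') should show version 1.2.1, and rerunning getBroadSets() on the 2.5 file is the real check.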