Question: GOseq with ce6
Hi all,

Our RNA-seq pipeline uses GOseq, and we are thankful to the authors for offering this great tool.

GOseq crashed on us today on C. elegans data with the error:

    "Couldn't grab GO categories automatically. Please manually specify."

The workaround is to use the non-native mechanism.

After investigating, the error is thrown when the function getgo() cannot reliably identify the org package associated with the ce6 genome. There is a collision with the yeast org package.

So for fun I listed all other genomes for which that would not work:

    library(goseq, quietly = TRUE)
    genomes = sort(supportedGenomes()$db)
    .ORG_PACKAGES = goseq:::.ORG_PACKAGES

    # For each genome, strip the version digits and look for a unique org package
    supported = sapply(genomes, function(genome) {
      orgstring = as.character(.ORG_PACKAGES[grep(gsub("[0-9]+", "", genome),
                                                  names(.ORG_PACKAGES),
                                                  ignore.case = TRUE)])
      if (length(orgstring) != 1) {
        #stop("Couldn't grab GO categories automatically. Please manually specify.")
        return(FALSE)
      } else {
        return(TRUE)
      }
    })
    # Genomes with zero or several matching org packages
    print(names(supported[!supported]))

which prints:

      [1] "ailMel1" "allMis1" "anoCar1" "anoCar2" "apiMel1" "apiMel2" "aplCal1"
      [8] "braFlo1" "caeJap1" "caePb1"  "caePb2"  "caeRem2" "caeRem3" "calJac1"
     [15] "calJac3" "cavPor3" "cb1"     "cb3"     "ce10"    "ce2"     "ce4"
     [22] "ce6"     "cerSim1" "choHof1" "chrPic1" "ci1"     "ci2"     "dasNov3"
     [29] "dipOrd1" "dp2"     "dp3"     "droAna1" "droAna2" "droEre1" "droGri1"
     [36] "droMoj1" "droMoj2" "droPer1" "droSec1" "droSim1" "droVir1" "droVir2"
     [43] "droYak1" "droYak2" "echTel1" "echTel2" "equCab1" "equCab2" "eriEur1"
     [50] "felCat3" "felCat4" "felCat5" "fr1"     "fr2"     "fr3"     "gadMor1"
     [57] "gasAcu1" "geoFor1" "gorGor3" "hetGla1" "hetGla2" "latCha1" "loxAfr3"
     [64] "macEug2" "melGal1" "melUnd1" "micMur1" "monDom1" "monDom4" "monDom5"
     [71] "musFur1" "myoLuc2" "nomLeu1" "nomLeu2" "nomLeu3" "ochPri2" "oreNil2"
     [78] "ornAna1" "oryCun2" "oryLat2" "otoGar3" "oviAri1" "oviAri3" "papAnu2"
     [85] "papHam1" "petMar1" "petMar2" "ponAbe2" "priPac1" "proCap1" "pteVam1"
     [92] "saiBol1" "sarHar1" "sorAra1" "speTri2" "strPur1" "strPur2" "susScr2"
     [99] "susScr3" "taeGut1" "tarSyr1" "tetNig1" "tetNig2" "triMan1" "tupBel1"
    [106] "turTru2" "vicPac1" "vicPac2"

I just thought it was odd for the package not to work out of the box for the model organism C. elegans. It also wouldn't work for pig, and maybe more.

Thank you!

written 4.5 years ago by François Lefebvre

4.5 years ago, Nadia Davidson (Australia) wrote:

Dear François,

Thank you for reporting this. You've found a bug in the getgo function. For C.
elegans, two .ORG_PACKAGES entries were being returned, because "sacCer" also contains the string "ce" and grep was being used to match the strings. I've fixed this in the development version of goseq. For pig, the issue was different but related, and it has also been fixed.

Many of the other genomes you listed above are still not supported, but that depends on which organisms have "org.*" annotation packages in Bioconductor. "supportedGenomes" is perhaps a bit misleading in this respect, as only the automatic lookup of gene lengths is supported for those genomes.

Please let us know if you find any more issues.

Cheers,
Nadia.
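
For readers who want to see the collision in isolation, here is a minimal base-R sketch of the behaviour described above. It does not use goseq itself: `org_packages` is a small illustrative stand-in for `goseq:::.ORG_PACKAGES` (not the package's real table), and `match_org_names` is a hypothetical helper written for this example. The anchored pattern is just one possible way to avoid the collision; it is not necessarily how the fix in the development version works.

```r
# Illustrative stand-in for goseq:::.ORG_PACKAGES; NOT the package's real table.
org_packages <- c(ce = "org.Ce.eg", hg = "org.Hs.eg", sacCer = "org.Sc.sgd")

# Hypothetical helper: return the lookup-table names that match a genome ID.
# Assumes genome IDs are plain alphanumeric strings (no regex metacharacters).
match_org_names <- function(genome, table, anchored = FALSE) {
  prefix <- gsub("[0-9]+", "", genome)   # strip version digits: "ce6" -> "ce"
  pattern <- if (anchored) paste0("^", prefix, "$") else prefix
  grep(pattern, names(table), ignore.case = TRUE, value = TRUE)
}

# Substring match: "ce" is also found inside "sacCer", so two names come back.
loose <- match_org_names("ce6", org_packages)

# Whole-name match: only the entry whose name is exactly "ce" comes back.
strict <- match_org_names("ce6", org_packages, anchored = TRUE)

print(loose)
print(strict)
```

With the substring pattern, the length check in the question's loop (`length(orgstring) != 1`) fails for ce6 because two names match; with the anchored pattern exactly one name matches.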