Question: GOseq with ce6
François Lefebvre (Canada) wrote, 3.6 years ago:
Hi all,

Our RNA-seq pipeline uses GOseq, and we are thankful to the authors for offering this great tool.

GOseq crashed on us today on C. elegans data with the error:

    "Couldn't grab GO categories automatically. Please manually specify."

The workaround is to use the non-native mechanism.

After investigating, I found that the error is thrown when the function getgo() cannot unambiguously identify the org package associated with the ce6 genome. There is a collision with the yeast org package.

So for fun I listed all the other genomes for which the lookup would fail:

    library(goseq, quietly = TRUE)
    genomes = sort(supportedGenomes()$db)
    .ORG_PACKAGES = goseq:::.ORG_PACKAGES

    supported = sapply(genomes, function(genome) {
      # Strip the version digits (e.g. "ce6" -> "ce") and grep the org package names
      orgstring = as.character(.ORG_PACKAGES[grep(gsub("[0-9]+", "", genome),
                                                  names(.ORG_PACKAGES), ignore.case = TRUE)])
      if (length(orgstring) != 1) {
        # stop("Couldn't grab GO categories automatically. Please manually specify.")
        return(FALSE)
      } else {
        return(TRUE)
      }
    })
    print(names(supported[!supported]))

which prints:

      [1] "ailMel1" "allMis1" "anoCar1" "anoCar2" "apiMel1" "apiMel2" "aplCal1"
      [8] "braFlo1" "caeJap1" "caePb1"  "caePb2"  "caeRem2" "caeRem3" "calJac1"
     [15] "calJac3" "cavPor3" "cb1"     "cb3"     "ce10"    "ce2"     "ce4"
     [22] "ce6"     "cerSim1" "choHof1" "chrPic1" "ci1"     "ci2"     "dasNov3"
     [29] "dipOrd1" "dp2"     "dp3"     "droAna1" "droAna2" "droEre1" "droGri1"
     [36] "droMoj1" "droMoj2" "droPer1" "droSec1" "droSim1" "droVir1" "droVir2"
     [43] "droYak1" "droYak2" "echTel1" "echTel2" "equCab1" "equCab2" "eriEur1"
     [50] "felCat3" "felCat4" "felCat5" "fr1"     "fr2"     "fr3"     "gadMor1"
     [57] "gasAcu1" "geoFor1" "gorGor3" "hetGla1" "hetGla2" "latCha1" "loxAfr3"
     [64] "macEug2" "melGal1" "melUnd1" "micMur1" "monDom1" "monDom4" "monDom5"
     [71] "musFur1" "myoLuc2" "nomLeu1" "nomLeu2" "nomLeu3" "ochPri2" "oreNil2"
     [78] "ornAna1" "oryCun2" "oryLat2" "otoGar3" "oviAri1" "oviAri3" "papAnu2"
     [85] "papHam1" "petMar1" "petMar2" "ponAbe2" "priPac1" "proCap1" "pteVam1"
     [92] "saiBol1" "sarHar1" "sorAra1" "speTri2" "strPur1" "strPur2" "susScr2"
     [99] "susScr3" "taeGut1" "tarSyr1" "tetNig1" "tetNig2" "triMan1" "tupBel1"
    [106] "turTru2" "vicPac1" "vicPac2"

I just thought it was odd for the package not to work out of the box for the model organism C. elegans. It also wouldn't work for pig, and maybe more.

Thank you!
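The "non-native mechanism" mentioned in the post presumably means supplying the gene-to-category mapping by hand instead of letting getgo() look it up from the genome name. The sketch below only shows the shape of such a mapping in base R. The gene names and the `annotation` table are hypothetical placeholders and the GO IDs are used purely as examples; the commented-out call assumes a `pwf` object already produced by goseq's nullp().

```r
# Hypothetical gene-to-GO table; in practice this would come from an
# annotation source of your choice. All values here are placeholders.
annotation <- data.frame(
  gene     = c("geneA", "geneA", "geneB", "geneC"),
  category = c("GO:0000001", "GO:0000002", "GO:0000001", "GO:0000003"),
  stringsAsFactors = FALSE
)

# Named list with one entry per gene, each a character vector of categories.
gene2cat <- split(annotation$category, annotation$gene)
print(gene2cat)

# Assuming `pwf` was produced by goseq::nullp(), the mapping could then be
# passed explicitly rather than looked up from the genome name:
# results <- goseq::goseq(pwf, gene2cat = gene2cat)
```

Passing the mapping explicitly sidesteps the genome-name lookup, so it should work regardless of whether an org package can be resolved automatically.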
Nadia Davidson (Australia) wrote, 3.6 years ago:
François Lefebvre <lefebvrf at ...> writes:
>
> Hi all,
>
> Our RNA-seq pipeline uses GOseq, and we are thankful to the authors for
> offering this great tool.
>
> GOseq crashed on us today on C. elegans data with the error:
>
> "Couldn't grab GO categories automatically. Please manually specify."
>
> The workaround is to use the non-native mechanism.
>
> After investigating, the error is thrown after the function getgo() cannot
> reliably identify the org package associated to the Ce6 genome. There is a
> collision with the yeast org package.
>
> So for fun I listed all other genomes for which that would not work:
>
> library(goseq, quietly = TRUE)
> genomes = sort(supportedGenomes()$db)
> .ORG_PACKAGES = goseq:::.ORG_PACKAGES
>
> supported = sapply(genomes, function(genome)
> {
>   orgstring = as.character(.ORG_PACKAGES[grep(gsub("[0-9]+", "", genome),
>     names(.ORG_PACKAGES), ignore.case = TRUE)])
>   if (length(orgstring) != 1) {
>     #stop("Couldn't grab GO categories automatically. Please manually specify.")
>     return(FALSE)
>   }else{
>     return(TRUE)
>   }
> })
> print(names(supported[!supported]))
> ...
>
> I just thought it was odd for the package not to work out of the box for
> the model organism C. elegans. It also wouldn't work for pig, and maybe
> more.
>
> Thank you!

Dear François,

Thank you for reporting this. You've found a bug in the getgo function. For C. elegans, two entries of .ORG_PACKAGES were being returned, because grep was being used to match strings and "sacCer" also contains the string "ce". I've fixed this in the development version of goseq. For pig, the issue was different but related, and it has also been fixed.

Many of the other genomes you listed above are still not supported, but that depends on which organisms have "org.*" annotation packages in Bioconductor. "supportedGenomes" is perhaps a bit misleading in this respect, as the only thing it reports as supported is the automatic lookup of gene lengths.

Please let us know if you find any more issues.

Cheers,
Nadia.
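The collision described in this answer can be reproduced in base R without goseq installed. The following is a minimal sketch: `org_packages` is a hypothetical, hand-written stand-in for a small subset of goseq's internal table, and `lookup_exact` is just one possible way to avoid the substring match, not necessarily the fix that went into the development version.

```r
# Hypothetical stand-in for a small subset of goseq:::.ORG_PACKAGES.
org_packages <- c(ce = "org.Ce.eg", sacCer = "org.Sc.sgd",
                  hg = "org.Hs.eg", mm = "org.Mm.eg")

# Strip the version digits from a genome ID, e.g. "ce6" -> "ce".
genome_prefix <- function(genome) gsub("[0-9]+", "", genome)

# Substring matching with grep, as in the code quoted above:
# the pattern "ce" also matches inside "sacCer" when case is ignored.
lookup_substring <- function(genome) {
  hits <- grep(genome_prefix(genome), names(org_packages), ignore.case = TRUE)
  unname(org_packages[hits])
}

# Whole-name comparison (an assumed alternative, for illustration only).
lookup_exact <- function(genome) {
  hits <- which(tolower(names(org_packages)) == tolower(genome_prefix(genome)))
  unname(org_packages[hits])
}

print(lookup_substring("ce6"))  # two matches, so length != 1 and the lookup fails
print(lookup_exact("ce6"))      # a single match
```

With the substring version, "ce6" resolves to both the worm and the yeast entries, which is what triggers the "Couldn't grab GO categories automatically" error; comparing whole names leaves only the worm entry.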