GOseq with ce6
@francois-lefebvre-4696
Last seen 3.9 years ago
Canada
Hi all,

Our RNA-seq pipeline uses GOseq, and we are thankful to the authors for offering this great tool.

GOseq crashed on us today on C. elegans data with the error:

    "Couldn't grab GO categories automatically. Please manually specify."

The workaround is to use the non-native mechanism.

After investigating, I found that the error is thrown when the function getgo() cannot reliably identify the org package associated with the ce6 genome. There is a collision with the yeast org package.

So for fun I listed all other genomes for which that would not work:

    library(goseq, quietly = TRUE)
    genomes = sort(supportedGenomes()$db)
    .ORG_PACKAGES = goseq:::.ORG_PACKAGES

    # For each genome, repeat the lookup that getgo() performs:
    # strip the digits from the genome ID and grep the result
    # against the names of .ORG_PACKAGES.
    supported = sapply(genomes, function(genome) {
      orgstring = as.character(.ORG_PACKAGES[grep(gsub("[0-9]+", "", genome),
                                                  names(.ORG_PACKAGES), ignore.case = TRUE)])
      # Anything other than exactly one match makes getgo() fail.
      if (length(orgstring) != 1) {
        #stop("Couldn't grab GO categories automatically. Please manually specify.")
        return(FALSE)
      } else {
        return(TRUE)
      }
    })
    print(names(supported[!supported]))

which prints:

      [1] "ailMel1" "allMis1" "anoCar1" "anoCar2" "apiMel1" "apiMel2" "aplCal1"
      [8] "braFlo1" "caeJap1" "caePb1"  "caePb2"  "caeRem2" "caeRem3" "calJac1"
     [15] "calJac3" "cavPor3" "cb1"     "cb3"     "ce10"    "ce2"     "ce4"
     [22] "ce6"     "cerSim1" "choHof1" "chrPic1" "ci1"     "ci2"     "dasNov3"
     [29] "dipOrd1" "dp2"     "dp3"     "droAna1" "droAna2" "droEre1" "droGri1"
     [36] "droMoj1" "droMoj2" "droPer1" "droSec1" "droSim1" "droVir1" "droVir2"
     [43] "droYak1" "droYak2" "echTel1" "echTel2" "equCab1" "equCab2" "eriEur1"
     [50] "felCat3" "felCat4" "felCat5" "fr1"     "fr2"     "fr3"     "gadMor1"
     [57] "gasAcu1" "geoFor1" "gorGor3" "hetGla1" "hetGla2" "latCha1" "loxAfr3"
     [64] "macEug2" "melGal1" "melUnd1" "micMur1" "monDom1" "monDom4" "monDom5"
     [71] "musFur1" "myoLuc2" "nomLeu1" "nomLeu2" "nomLeu3" "ochPri2" "oreNil2"
     [78] "ornAna1" "oryCun2" "oryLat2" "otoGar3" "oviAri1" "oviAri3" "papAnu2"
     [85] "papHam1" "petMar1" "petMar2" "ponAbe2" "priPac1" "proCap1" "pteVam1"
     [92] "saiBol1" "sarHar1" "sorAra1" "speTri2" "strPur1" "strPur2" "susScr2"
     [99] "susScr3" "taeGut1" "tarSyr1" "tetNig1" "tetNig2" "triMan1" "tupBel1"
    [106] "turTru2" "vicPac1" "vicPac2"

I just thought it was odd for the package not to work out of the box for the model organism C. elegans. It also wouldn't work for pig, and maybe more.

Thank you!
GO Yeast Organism goseq genomes • 1.7k views
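The "non-native mechanism" mentioned in the post presumably means supplying the gene-to-category mapping yourself through the gene2cat argument of goseq() instead of letting getgo() look it up. A minimal sketch of building such a mapping in base R follows. The annotation table, gene IDs and GO terms are placeholders made up for illustration, and de_genes and gene_lengths in the comments are hypothetical objects.

```r
# Toy gene-to-GO annotation; the gene IDs and GO terms are placeholders
# for illustration and are not real C. elegans annotations.
annotation <- data.frame(
  gene = c("geneA", "geneA", "geneB", "geneC"),
  go   = c("GO:0000001", "GO:0000002", "GO:0000001", "GO:0000003"),
  stringsAsFactors = FALSE
)

# Named list with one character vector of GO IDs per gene.
gene2cat <- split(annotation$go, annotation$gene)

# With goseq loaded, the mapping would then be passed explicitly, e.g.:
#   pwf <- nullp(de_genes, bias.data = gene_lengths)
#   res <- goseq(pwf, gene2cat = gene2cat)
# (de_genes and gene_lengths are hypothetical objects.)
```

Passing the mapping explicitly sidesteps the automatic org package lookup, so the name collision described above should not come into play.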
@nadia-davidson-5739
Last seen 5.6 years ago
Australia
François Lefebvre <lefebvrf at ...> writes:
>
> Hi all,
>
> Our RNA-seq pipeline uses GOseq, and we are thankful to the authors for
> offering this great tool.
>
> GOseq crashed on us today on C. elegans data with the error:
>
> "Couldn't grab GO categories automatically. Please manually specify."
>
> The workaround is to use the non-native mechanism.
>
> After investigating, the error is thrown after the function getgo() cannot
> reliably identify the org package associated to the Ce6 genome. There is a
> collision with the yeast org package.
>
> So for fun I listed all other genomes for which that would not work:
>
> library(goseq,quietly = TRUE)
> genomes = sort(supportedGenomes()$db)
> .ORG_PACKAGES = goseq:::.ORG_PACKAGES
>
> supported = sapply(genomes,function(genome)
> {
>   orgstring = as.character(.ORG_PACKAGES[grep(gsub("[0-9]+", "", genome),
>     names(.ORG_PACKAGES), ignore.case = TRUE)])
>   if (length(orgstring) != 1) {
>     #stop("Couldn't grab GO categories automatically. Please manually specify.")
>     return(FALSE)
>   }else{
>     return(TRUE)
>   }
> })
> print(names(supported[!supported]))
> ...
>
> I just thought it was odd for the package not to work out of the box for
> the model organism C. elegans. It also wouldn't work for pig, and maybe
> more.
>
> Thank you!

Dear François,

Thank you for reporting this. You've found a bug in the getgo function. For C. elegans, two .ORG_PACKAGES entries were being returned because "sacCer" also has the string "ce" within it and grep was being used to match strings. I've fixed this in the development version of goseq. For pig, the issue was different but related, and it has also been fixed.

Many of the other genomes you listed above are still not supported, but this has to do with which organisms have "org.*" annotation packages in Bioconductor. "supportedGenomes" is perhaps a bit misleading in this way, as it's just the automatic lookup of gene lengths that is supported.

Please let us know if you find any more issues.

Cheers,
Nadia.
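The collision described in the reply can be reproduced in base R without goseq installed. The sketch below uses a three-entry stand-in table, which is an assumption and not the actual contents of goseq:::.ORG_PACKAGES. The anchored pattern is only one possible way to avoid the double match and is not necessarily the fix that went into the development version.

```r
# Illustrative stand-in for the genome-prefix -> org package table.
# The real table is goseq:::.ORG_PACKAGES; these three entries are assumed.
org_packages <- c(ce = "org.Ce.eg", sacCer = "org.Sc.sgd", hg = "org.Hs.eg")

# Strip the version digits from a genome ID, e.g. "ce6" -> "ce".
prefix <- gsub("[0-9]+", "", "ce6")

# Substring match: "ce" also occurs inside "sacCer" when case is ignored,
# so two entries come back and the length-1 check fails.
substring_hits <- grep(prefix, names(org_packages), ignore.case = TRUE)
print(names(org_packages)[substring_hits])

# Anchored match: only a name equal to the whole prefix is returned.
exact_hits <- grep(paste0("^", prefix, "$"), names(org_packages),
                   ignore.case = TRUE)
print(names(org_packages)[exact_hits])
```

The first lookup returns both "ce" and "sacCer", while the anchored lookup returns only "ce".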