making use of the Apis mellifera BeeBase assembly 4 data in goseq
@herve-pages-1542
Hello Vanessa,

On 02/24/2012 10:45 AM, Corby, Vanessa wrote:
> Hello Hervé and Matt,
>
> After looking through the Bioconductor documentation for the BeeBase
> assembly 4 package Hervé posted (information on the Apis 4 annotation
> stored in Biostrings objects), the documentation for the org.Hs.eg.db
> Annotation database documentation, the bioconductor mailing list, the
> BSgenome documentation, and the goseq documentation, I am still very
> confused about whether I can use the assembly 4 package that Hervé
> posted in goseq.

Just to clarify, goseq is not my package so I can't "post" anything in it, whatever that means. I assume you are talking about the BSgenome.Amellifera.BeeBase.assembly4 package that I made and that is part of Bioconductor.

> The reason that I want to use the assembly 4 data is
> that I would presume that it will have more current information than the
> natively supported (by goseq) Apis release 2.

It's a more recent assembly so I would expect it to be more accurate (i.e. closer to reality).

>
> So, here are my questions:
>
> 1. Will release 4 offer much improvement over release 2? If this is not
> the case, then the next two questions are moot.

It's just a more recent assembly, with all that that implies.

>
> 2. Do I need to get information on the transcript lengths and the
> associations between the geneids and GO terms for the Apis 4 release and
> build 2 new files of this information for goseq to use?

I'm not familiar with the goseq package so I'll let Matt answer this.

> Is that
> information available (perhaps through UCSC or Baylor's site for the
> honeybee projects)? Can I use Bioconductor for this if I have the
> annotation database file Hervé posted?

The BSgenome.Amellifera.BeeBase.assembly4 package only contains the DNA sequences of the Apis 4 release. It does *not* contain annotations for this assembly.
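To see for yourself that the package holds sequences only, you can load it and look at what it exposes. This is a minimal sketch, not something from the thread: it assumes the package is installed, and it uses `getBSgenome()` from BSgenome so that no particular name for the genome object has to be assumed.

```r
# Sketch only: inspect the assembly 4 sequence package, if it is installed.
pkg <- "BSgenome.Amellifera.BeeBase.assembly4"

if (requireNamespace(pkg, quietly = TRUE)) {
  # Loading the data package also attaches BSgenome and its dependencies.
  library(pkg, character.only = TRUE)
  genome <- getBSgenome(pkg)

  # Sequence names and lengths are all the object describes.
  print(head(seqlengths(genome)))

  # Each entry is a plain DNA sequence; there are no gene models,
  # transcript lengths, or GO terms anywhere in the object.
  print(genome[[seqnames(genome)[1]]])
} else {
  message(pkg, " is not installed; nothing to inspect.")
}
```

Anything goseq needs beyond raw sequence (gene lengths, GO categories) therefore has to come from a separate annotation source.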
One advantage of using the BSgenome.Amellifera.UCSC.apiMel2 package instead is that you have easy access to a world of annotations for this genome through the UCSC genome browser. Too bad that the UCSC folks have no plans to support apiMel4:

https://lists.soe.ucsc.edu/pipermail/genome/2007-October/014763.html

apiMel2 is 7 years old now! Note that the GenomicFeatures and rtracklayer packages make it really simple to import and work with those annotations in R/Bioconductor.

>
> 3. Do I just have to rename the Apis 4 genome package that Hervé posted
> in order to use it in goseq (I see that there are several naming
> conventions on the Annotation Data packages)?

I'll let Matt answer this.

>
> You can see that some of these questions are more appropriate for Hervé
> and some for Matt, so I decided to email both of you. Some of these
> issues arise simply because I've only been successful with the example
> in the goseq documentation (using the org.Hs.eg.db Annotation database).
> Others arise because I am just very new to R and the Bioconductor packages.

For what it's worth, I don't think there is any org.* package for Bee (it would probably be named something like org.Am.eg.db if there was one). And if there was one, you would need to double-check that the annotations in it are actually compatible with whatever genome assembly you finally decide to use.

>
> Thanks for any help you can offer. And apologies if this is the 100th
> time you've received an email about this from newbies such as myself.

No problem. Wish I could help more.

I'm cc'ing the Bioconductor mailing list (hope you don't mind). It's a better place to ask questions like this, as other people might be able to help and the whole discussion will be archived and searchable for further reference.

Cheers,
H.

>
> Vanessa Corby-Harris
> Research Molecular Biologist
> USDA-ARS
> Carl Hayden Bee Research Center
> 2000 E. Allen Rd., Tucson, AZ 85719
> (520) 647-9269
>
> This electronic message contains information generated by the USDA
> solely for the intended recipients. Any unauthorized interception of
> this message or the use or disclosure of the information it contains may
> violate the law and subject the violator to civil or criminal penalties.
> If you believe you have received this message in error, please notify
> the sender and delete the email immediately.

-- 
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
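On question 2, which is left open above: as far as the goseq documentation describes, `nullp()` accepts a `bias.data` vector of gene lengths and `goseq()` accepts a `gene2cat` mapping, so a genome does not have to be natively supported. The sketch below shows one way to prepare those two inputs and to check that the annotation IDs match the genes you tested. The tables, the "GB" gene IDs, and the `run_goseq()` helper are hypothetical placeholders; real values would have to come from an annotation built on the same assembly as your reads.

```r
# Hypothetical inputs: replace with tables derived from an assembly 4 annotation.
tx_lengths <- data.frame(
  gene_id = c("GB10001", "GB10001", "GB10002", "GB10003"),
  tx_len  = c(1200, 1500, 800, 2300),
  stringsAsFactors = FALSE
)
gene2go <- data.frame(
  gene_id = c("GB10001", "GB10002", "GB10002", "GB99999"),
  go_id   = c("GO:0006412", "GO:0005524", "GO:0006468", "GO:0008152"),
  stringsAsFactors = FALSE
)
# Named 0/1 vector: 1 = differentially expressed, names = gene IDs.
de_calls <- c(GB10001 = 1, GB10002 = 0, GB10003 = 1)

# Collapse transcripts to one length per gene (median is one common choice).
gene_len <- tapply(tx_lengths$tx_len, tx_lengths$gene_id, median)

# Align lengths to the order of the DE vector; fail early on missing genes.
bias <- as.numeric(gene_len[names(de_calls)])
stopifnot(!anyNA(bias))

# Compatibility check: GO rows whose gene ID was never assayed point to an
# ID or assembly mismatch and are dropped here.
unmatched  <- setdiff(gene2go$gene_id, names(de_calls))
gene2go_ok <- gene2go[gene2go$gene_id %in% names(de_calls), ]

# Hypothetical wrapper around the two goseq steps. It is not called here:
# the toy data above are far too small to fit the length-bias curve.
run_goseq <- function(de, bias, gene2cat) {
  pwf <- goseq::nullp(de, bias.data = bias, plot.fit = FALSE)
  goseq::goseq(pwf, gene2cat = gene2cat)
}
```

With real data you would call `run_goseq(de_calls, bias, gene2go_ok)`. A large `unmatched` set is a sign that the annotation and the assembly (or gene set release) do not agree, which is the compatibility problem mentioned above.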
Annotation GO Cancer BSgenome Biostrings rtracklayer GenomicFeatures goseq