is database RefSeq achievable from any Bioconductor package
@steve-lianoglou-2771
Last seen 14 months ago
United States
Hi,

On Mon, May 31, 2010 at 8:40 AM, <mauede at alice.it> wrote:
> The Biologist we work with has brought my attention to some misalignment between
> Ensembl and RefSeq with regard to the length and position of 3'UTR sequences.

I'd like to comment on this, but I'm not sure I'd provide any useful information w/o more details from you.

But just one point: the RefSeq gene annotations and the Ensembl gene annotations are not necessarily the same, so what you say here isn't all that surprising.

A quick example: the number (and "character") of isoforms per "gene" often differs between the two sources.

If you really want to turn your world view upside down, check out the AceView annotations some day ... Just thought I'd mention ...

> I have been querying Ensembl many times through biomaRt.
> I wonder whether I can reach RefSeq data through biomaRt or any other Bioconductor package.
> Unfortunately, I cannot find RefSeq in the list of databases obtained through function listMarts()

You can download the gene annotation tracks from the UCSC table browser and parse them out to get your 3'UTRs.

I know the GenomicFeatures package has code to download and parse these (what used to be called) 'knownGene' tables automagically and dump them into an SQLite db, so you can:

(i) look at that code to find inspiration
(ii) just let the GF package do its thing and work w/ the resulting database
(iii) d/l the table manually and parse out the relevant coordinate info by yourself.

I'm at a loss to offer you (i) any other packages to help you do this automatically, or (ii) another source to find the info you need from.

Hope that helps,

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
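For option (iii), a rough sketch of what "parse out the relevant coordinate info" could look like. It is written in Python rather than R only to keep it dependency-free. It assumes a tab-separated dump in the UCSC refGene layout (leading bin column, then name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) with 0-based, half-open coordinates. The function name `three_prime_utr` and the example row are made up for illustration; check the column order of the table you actually download (knownGene, for instance, has no bin column).

```python
from typing import List, Tuple


def three_prime_utr(row: str) -> Tuple[str, str, str, List[Tuple[int, int]]]:
    """Return (name, chrom, strand, 3'UTR exon pieces) for one refGene-style row.

    Assumes the UCSC refGene column order with a leading 'bin' column and
    0-based, half-open coordinates. Hypothetical helper, not part of any package.
    """
    f = row.rstrip("\n").split("\t")
    name, chrom, strand = f[1], f[2], f[3]
    tx_start, tx_end, cds_start, cds_end = (int(x) for x in f[4:8])
    # exonStarts / exonEnds are comma-separated lists with a trailing comma
    starts = [int(x) for x in f[9].rstrip(",").split(",")]
    ends = [int(x) for x in f[10].rstrip(",").split(",")]

    # UCSC marks non-coding transcripts with cdsStart == cdsEnd: no UTR to report
    if cds_start == cds_end:
        return name, chrom, strand, []

    # On '+' the 3'UTR lies after cdsEnd; on '-' it lies before cdsStart
    lo, hi = (cds_end, tx_end) if strand == "+" else (tx_start, cds_start)

    # Clip every exon to the [lo, hi) window and keep the non-empty pieces
    utr = [(max(s, lo), min(e, hi))
           for s, e in zip(starts, ends)
           if max(s, lo) < min(e, hi)]
    return name, chrom, strand, utr


# Made-up row for illustration only (not a real RefSeq record)
example = "\t".join(["0", "NM_TEST1", "chr1", "+", "100", "1000", "200", "750",
                     "3", "100,400,700,", "300,600,1000,"])
print(three_prime_utr(example))
```

Running the same function over an Ensembl-derived table in the same layout would let you compare the two sets of 3'UTR coordinates transcript by transcript.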
Annotation Cancer biomaRt GenomicFeatures
@kasper-daniel-hansen-2979
Last seen 10 months ago
United States
On Tue, Jun 1, 2010 at 10:37 AM, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
> Hi,
>
> On Mon, May 31, 2010 at 8:40 AM, <mauede at alice.it> wrote:
>> The Biologist we work with has brought my attention to some misalignment between
>> Ensembl and RefSeq with regard to the length and position of 3'UTR sequences.
>
> I'd like to comment on this, but I'm not sure I'd provide any useful
> information w/o more details from you.
>
> But just one point: the RefSeq gene annotations and the Ensembl gene
> annotations are not necessarily the same, so what you say here isn't
> all that surprising.
>
> A quick example: the number (and "character") of isoforms per "gene"
> often differs between the two sources.

An additional comment: the definition of UTR and coding region requires that you know what part of the transcript is actually translated. This is well known for the canonical transcript of most genes in well-annotated organisms. But it is much less well known for alternative transcripts from the same gene, even for a well-annotated organism such as Drosophila (this is based on the not-newest version of FlyBase).

Note that this (= defining coding region and UTRs) is actually surprisingly hard to do computationally (it involves a lot of guess work).

For more detail on this, for Drosophila, you can read parts of

Hansen KD, Lareau LF, Blanchette M, Green RE, Meng Q, et al. 2009 Genome-Wide Identification of Alternative Splice Forms Down-Regulated by Nonsense-Mediated mRNA Decay in Drosophila. PLoS Genet 5(6): e1000525. doi:10.1371/journal.pgen.1000525
http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000525

Especially the "Reannotating coding regions reveals distinct features of NMD-target isoforms" subsection of the results. This proved to be essential for this particular paper. Fixing up the mistakes in FlyBase made our results interpretable instead of just looking like noise.

Kasper
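To make the "guess work" concrete, here is a toy sketch of one common heuristic: take the longest open reading frame (ATG to the first in-frame stop) on the transcript as the coding region and call everything downstream of its stop codon the 3'UTR. This is only an illustration of why the UTR boundary depends on a guess about translation. It is not the reannotation procedure used in the paper above, and the function names are hypothetical.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest ATG..stop ORF, 0-based half-open.

    Scans the three forward frames of a spliced transcript sequence.
    Returns None when no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens an ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3                    # include the stop codon
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                start = None                   # look for the next ORF in this frame
    return best


def guessed_three_prime_utr(seq: str) -> Optional[str]:
    """Sequence downstream of the longest ORF, or None if no ORF was found."""
    orf = longest_orf(seq)
    return None if orf is None else seq[orf[1]:]


# Made-up transcript for illustration only
print(guessed_three_prime_utr("CCATGAAATTTTAGGGCC"))
```

Pick a different start codon for an alternative isoform and the stop codon, and with it the 3'UTR, moves; two annotation sources making different guesses will report different UTR lengths for the same transcript.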