Question: IRanges, GenomicRanges, GenomicFeatures?
9.0 years ago by Oleg Moskvin (United States):
Hello list members,

For an RNA-seq analysis, what would you suggest for converting raw sequence-based read coverage to annotated ORF-based coverage when the genome of interest is NOT supported by either UCSC or Ensembl, which means that creating a TranscriptDB object in the straightforward way (i.e. according to the GenomicFeatures pipeline) is impossible?

What would you recommend for importing a .gff file (containing the annotation of a particular genome, from GenBank) into R/Bioconductor to eventually generate a gene-centric countTable readable by packages like DESeq?

Thank you!
Oleg
Answer: IRanges, GenomicRanges, GenomicFeatures?
9.0 years ago by Steve Lianoglou (Denali):
Hi,

Assuming I've understood your question and how your data are available to you, here is one (maybe too simple) approach:

I think I'd parse the GFF into a GRangesList object, where each item of the list is a GRanges object that stores the exon structure of one of your transcripts (or genes), which I'm assuming is what's in your GFF file.

If you had your RNA-seq data in its own GRanges object, you could then run countOverlaps between your reads and the GRangesList of transcripts pretty easily, and use the result to create your countTable.

Hope that helps,
-steve

PS: I think rtracklayer has some facilities for importing GFF files, which might be helpful to you.

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
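To make the countOverlaps suggestion concrete, here is a minimal sketch on toy data. The gene names, coordinates, and read positions are all made up for illustration, and a real analysis would load reads from an alignment file rather than typing them in. It only assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Toy annotation (hypothetical): three exons belonging to two genes
exons <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(100, 300, 1000), end = c(200, 400, 1200)),
  strand   = "+",
  gene_id  = c("geneA", "geneA", "geneB")
)

# One GRanges per gene, collected in a GRangesList
exons_by_gene <- split(exons, exons$gene_id)

# Toy reads (hypothetical): five 36-bp reads on the same chromosome
reads <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(150, 350, 390, 1100, 5000), width = 36),
  strand   = "+"
)

# For each gene, count the reads that overlap any of its exons
counts <- countOverlaps(exons_by_gene, reads)

# One column per sample; further samples would be added as extra columns
count_table <- data.frame(sample1 = as.integer(counts),
                          row.names = names(exons_by_gene))
print(count_table)
```

Note that a read overlapping two genes is counted once for each of them, so overlapping annotations may need extra handling before the table goes to DESeq.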
On Sun, Oct 31, 2010 at 10:19 PM, Steve Lianoglou wrote:
> ps - I think rtracklayer has some facilities to import GFF files,
> which might be helpful to you.

Right. Just import(asRangedData=FALSE) and then split() it into a GRangesList.
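A rough sketch of the import-then-split step, written against a recent rtracklayer, where import() returns a GRanges by default (so, as far as I know, the asRangedData argument from the 2010 API is no longer needed). The tiny GFF3 file, the choice of CDS features, and the locus_tag attribute used for grouping are assumptions for illustration; the right feature type and grouping attribute depend on what your GenBank-derived file actually contains.

```r
library(rtracklayer)

# Write a tiny, made-up GFF3 file so the example is self-contained
gff_lines <- c(
  "##gff-version 3",
  "chr1\tGenBank\tgene\t100\t400\t.\t+\t.\tID=gene1;locus_tag=geneA",
  "chr1\tGenBank\tCDS\t100\t200\t.\t+\t0\tID=cds1;locus_tag=geneA",
  "chr1\tGenBank\tCDS\t300\t400\t.\t+\t0\tID=cds2;locus_tag=geneA",
  "chr1\tGenBank\tCDS\t1000\t1200\t.\t+\t0\tID=cds3;locus_tag=geneB"
)
gff_file <- tempfile(fileext = ".gff3")
writeLines(gff_lines, gff_file)

# Import the annotation as a GRanges; GFF attributes become metadata columns
annot <- import(gff_file)

# Keep only the coding features (assumed here to be the rows of interest)
cds <- annot[annot$type == "CDS"]

# Split into a GRangesList with one element per gene
cds_by_gene <- split(cds, cds$locus_tag)
print(cds_by_gene)
```

The resulting GRangesList can then be passed to countOverlaps together with a GRanges of reads.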
Reply written 9.0 years ago by Michael Lawrence