Problems easyRNASeq
@yates-steven-a-5218
Last seen 9.6 years ago
Dear Sir/Madam,

I am learning how to use the easyRNASeq package for Bioconductor and have a question. The organisms I am working with have no comprehensive genome information (or any prior sequencing), so in effect I will be creating a de novo transcriptome. There is therefore no annotation file available for me to use; all I have is a list of transcripts. How will this work with the package? I am quite happy for the results to be reads per transcript etc. Is it necessary to create an annotation file (GFF) for this purpose? I have created a GTF file using Cufflinks, which should be OK?

The second problem I am encountering is chrSizes: how do I get around the chromosome sizes problem? Any advice would be appreciated.

Many thanks,
Steven Yates
Annotation PROcess easyRNASeq • 801 views
@delhommeemblde-3232
Last seen 9.6 years ago
Hi Steven,

Using a transcriptome-only annotation is something I haven't done yet, but it should not be problematic. I assume you will have an alignment of your reads against your generated transcriptome, right?

If your transcripts are unique, i.e. none of them are isoforms of each other, all you need to do to get a count table is to work out how many reads align to each transcript, and that information is already in your BAM file. You would not need the overhead of easyRNASeq for that. Reading in your BAM file with the Rsamtools scanBam function (with the appropriate ScanBamParam parameters) and tabulating the reference sequence names (the rname field, which holds the transcript each read aligned to) should be pretty straightforward and give you what you need.

Now, if your transcripts are not unique, i.e. you do have isoforms in your data, some additional processing is needed, and easyRNASeq might come in handy to avoid counting reads several times. An important parameter in that case is how you run your aligner. To ensure that reads can match several isoforms, you need to allow multiple mappings. It would be worth estimating the highest number of isoforms you have per gene and using that as the threshold for your aligner, so that it returns neither unique hits only nor too many hits per read.

I would need to think a bit more about how to prepare the data for easyRNASeq, and whether it makes sense to use it in that setup. A data excerpt would help too in that case.
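As a rough illustration of the counting step for the unique-transcript case, here is a minimal Python sketch (it is not the Rsamtools route itself). It assumes the alignments are available as SAM text, for example from `samtools view -h`, and it tabulates the reference name column (column 3, which scanBam returns as rname), since that column names the transcript each read aligned to. The function name, the example records, and the choice to skip unmapped, secondary, and supplementary records are assumptions made for illustration.

```python
from collections import Counter


def count_reads_per_transcript(sam_lines):
    """Tabulate primary, mapped alignments per reference (transcript) name.

    `sam_lines` is any iterable of SAM-formatted text lines.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, @PG, ...).
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        rname = fields[2]       # column 3: reference (transcript) name
        # Assumption: drop unmapped reads (0x4) and reads with no reference.
        if flag & 0x4 or rname == "*":
            continue
        # Assumption: drop secondary (0x100) and supplementary (0x800)
        # records so that each read contributes a single count.
        if flag & 0x900:
            continue
        counts[rname] += 1
    return counts


# Hypothetical records, only to show the expected input shape.
example = [
    "@SQ\tSN:tx1\tLN:1500",
    "r1\t0\ttx1\t10\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t16\ttx1\t200\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t0\ttx2\t5\t60\t50M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",      # unmapped, ignored
    "r1\t256\ttx2\t40\t0\t50M\t*\t0\t0\t*\t*",  # secondary, ignored
]
counts = count_reads_per_transcript(example)
for transcript, n in sorted(counts.items()):
    print(transcript, n)
```

With paired-end data this counts each mate separately, so the totals are alignments per transcript rather than fragments. If isoforms are present and multiple mappings are allowed, the secondary-record filter would need rethinking.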
Let me know which situation applies to you before we take the matter further.

Cheers,
Nico

---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------

On 12 Apr 2012, at 10:07, Yates, Steven A wrote:
> [...]