Any package to do gene expression value estimation?

wang peter ★ 2.0k

@wang-peter-4647

Last seen 11.3 years ago

Dear all,

I used all of my RNA-seq data to assemble the transcriptome. Is there an R package or other software that can map the reads of each sample to the assembled transcriptome and also estimate gene expression values?

Thank you,
Shan Gao

• 1.9k views

ADD COMMENT • link updated 14.2 years ago by Wei Shi ★ 3.6k • written 14.2 years ago by wang peter ★ 2.0k

Martin Morgan 25k

@martin-morgan-1513

Last seen 11 months ago

United States

On 10/24/2011 11:27 AM, wang peter wrote:

> i used all the RNA-seq data to assemble the transcriptome.
> but can any R package or software map the data of each sample to the
> transcriptome and also estimate gene expression values?

Hi -- your question is quite vague.

Alignment via Rsubread / Biostrings::matchPDict / third party.
Count overlaps via GenomicRanges readGappedAlignments, countOverlaps, summarizeExperiment (in 'devel').
Differential representation via edgeR, DESeq, DEXSeq and others at http://bioconductor.org/packages/release/BiocViews.html#___RNAseq

Hope that helps,
Martin

> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
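
To make the "count overlaps" step concrete, here is a minimal Python sketch of the idea: given aligned read intervals and feature (gene) intervals, count how many reads touch each feature. It is only an illustration of the concept and not how GenomicRanges::countOverlaps is implemented. The function name `count_overlaps`, the feature names, and the coordinates are made up for the example, and 1-based inclusive coordinates are an assumption.

```python
def count_overlaps(features, reads):
    """Count, for each feature, the reads that overlap it.

    features: dict mapping a feature name to (chrom, start, end)
    reads:    iterable of (chrom, start, end) aligned read intervals
    Coordinates are assumed to be 1-based and inclusive.
    """
    counts = {name: 0 for name in features}
    for r_chrom, r_start, r_end in reads:
        for name, (f_chrom, f_start, f_end) in features.items():
            # Two closed intervals overlap when each starts before the other ends.
            if r_chrom == f_chrom and r_start <= f_end and f_start <= r_end:
                counts[name] += 1
    return counts


# Hypothetical toy data: two genes and five 36 bp reads.
features = {"geneA": ("chr1", 100, 500), "geneB": ("chr1", 800, 1200)}
reads = [
    ("chr1", 90, 125),     # overlaps the start of geneA
    ("chr1", 480, 515),    # overlaps the end of geneA
    ("chr1", 600, 635),    # falls between the genes
    ("chr1", 1190, 1225),  # overlaps the end of geneB
    ("chr2", 100, 135),    # different chromosome
]
counts = count_overlaps(features, reads)
print(counts)
```

The nested loop is quadratic and only suitable for toy input; real tools use interval indexes for this step. The resulting table of counts per feature is the kind of input the differential-representation packages work from.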

ADD COMMENT • link 14.2 years ago Martin Morgan 25k

Dear Martin,

Your answer is quite clear. I will try Biostrings::matchPDict to map the reads of each sample to the transcriptome assembled from those samples.

What worries me is the efficiency. I could also use BWA or Bowtie to do this. Will Biostrings::matchPDict be very slow?

Thank you,
Shan Gao

ADD REPLY • link 14.2 years ago wang peter ★ 2.0k

On 10/25/2011 07:05 AM, wang peter wrote:

> but what i worry is the efficiency. i can also use BWA or bowtie to do so.
> i do not know if Biostrings::matchPDict will be very slow?

matchPDict would not be my first choice; it will consume a lot of memory and will not be as flexible as some aligners.

Martin

ADD REPLY • link 14.2 years ago Martin Morgan 25k

Thank you.

So the best method is to use BWA/Bowtie to get the SAM file, then use an R package to analyze the SAM file and extract the information I want. But how do I deal with huge SAM files? Do I need to write a Perl script to split the SAM file, process the pieces in R, and then combine them? Are there any other good ways?

Shan Gao

ADD REPLY • link 14.2 years ago wang peter ★ 2.0k

On 10/25/2011 07:43 AM, wang peter wrote:

> but, how to deal with huge sam files?
> do i need to write a perl script to split the sam file, process the
> pieces in R, then combine them?
> any other good ways?

Use the Rsamtools package. Make your aligner produce BAM files, or use Rsamtools::asBam to convert SAM to BAM; BAM files are faster to load and more flexible. For many purposes GenomicRanges::readGappedAlignments is sufficient; use Rsamtools::scanBam for maximum flexibility.

Martin
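
On the question of splitting a huge SAM file: SAM is line-oriented text, so it can be processed one record at a time without splitting it or loading it whole. The Python sketch below illustrates that idea only; it is not a substitute for the Rsamtools route suggested above. It assumes reads were aligned to the assembled transcriptome, so the reference name (column 3) is a transcript ID, and it counts mapped reads per transcript. The function name and the toy records (`tx1`, `tx2`, `r1`...) are hypothetical, and multi-mapping or secondary alignments are not handled.

```python
import io
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Stream SAM records and count mapped reads per reference sequence."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # SAM records have 11 mandatory fields; skip malformed lines
        flag = int(fields[1])
        rname = fields[2]
        if flag & 0x4 or rname == "*":
            continue  # FLAG bit 0x4 marks an unmapped read
        counts[rname] += 1
    return counts


# Hypothetical toy SAM content; a real run would pass an open file handle,
# e.g. `with open("sample.sam") as fh:` (file name is an assumption).
demo_sam = io.StringIO(
    "@HD\tVN:1.0\tSO:unsorted\n"
    "@SQ\tSN:tx1\tLN:1000\n"
    "@SQ\tSN:tx2\tLN:2000\n"
    "r1\t0\ttx1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t16\ttx1\t50\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "r4\t0\ttx2\t7\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
)
counts = count_reads_per_reference(demo_sam)
print(dict(counts))
```

Memory use depends on the number of distinct reference names, not on the file size, which is why no splitting step is needed for a simple tally like this.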

ADD REPLY • link 14.2 years ago Martin Morgan 25k

Hi Shan Gao,

Neither BWA nor Bowtie can map junction reads, which span two or more exons. You may try the Subread aligner implemented in the Rsubread package, which can map junction reads in addition to exonic reads. The percentage of junction reads in an RNA-seq dataset is typically around 20%, so you should be able to map ~20% more reads using Subread than with BWA or Bowtie.

The Rsubread package also includes a function called featureCounts, which counts the number of reads falling into each gene or each exon and returns a list object containing a table of read counts along with annotation information.

If you are worried about efficiency, this might be the package for you. Rsubread is at least twice as fast as Bowtie (and a lot faster than BWA). The featureCounts() function takes only ~2 minutes to summarize mapping information from a SAM format file into a table of read counts.

Cheers,
Wei

On Oct 26, 2011, at 1:05 AM, wang peter wrote:

> but what i worry is the efficiency. i can also use BWA or bowtie to do so.
> i do not know if Biostrings::matchPDict will be very slow?
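
To illustrate what a junction read looks like once it has been aligned to a genome: in the SAM specification, the CIGAR operation N marks a skipped reference region, which for RNA-seq alignments corresponds to an intron. The Python sketch below splits an alignment into its contiguous reference blocks at N operations. It is a generic illustration based on the SAM format and makes no claim about the output of any particular aligner; the function name and the example positions are hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return reference blocks (start, end), 1-based inclusive, split at N."""
    blocks = []
    ref = pos            # next reference position to be consumed
    block_start = None   # start of the block currently being built
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "MD=X":
            # These operations consume the reference within a block.
            if block_start is None:
                block_start = ref
            ref += length
        elif op == "N":
            # Skipped region (intron): close the current block and jump ahead.
            if block_start is not None:
                blocks.append((block_start, ref - 1))
                block_start = None
            ref += length
        # I, S, H and P do not consume the reference.
    if block_start is not None:
        blocks.append((block_start, ref - 1))
    return blocks


exonic = aligned_blocks(100, "36M")           # read within one exon
junction = aligned_blocks(100, "20M500N16M")  # read spanning a 500 bp intron
print(exonic, junction)
print("junction read:", len(junction) > 1)
```

A read with more than one block spans at least one exon junction, which is the case an aligner without junction support cannot represent as a single contiguous match.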

ADD REPLY • link 14.2 years ago Wei Shi ★ 3.6k

jason0701 ▴ 190

@jason0701-3921

Last seen 6.1 years ago

Hi Shan,

There are many online resources to help you get started. For example:

http://www.ebi.ac.uk/fg/general/events/EMBO2010/presentations/day4/Angela/2010_EMBO.pdf
http://www.bioconductor.org/help/course-materials/2011/

Best,
Jason

ADD COMMENT • link 14.2 years ago jason0701 ▴ 190

Thank you. But the BAM file is still very large, more than 10 GB. I think I must read it in parts in R. Any good ideas?

Shan Gao

ADD REPLY • link 14.2 years ago wang peter ★ 2.0k

Hi, Shangao.

You should demonstrate that you have followed up on previous replies and have read the relevant documentation before asking the list to provide answers. In particular, please read the documentation for scanBam and readGappedAlignments, as Martin has suggested. You should TRY these functions on your BAM files. A 10GB BAM file is not large, so with a little reading, you may find that folks have thought about your perceived problem ("large" BAM files) rather carefully.

I hope my comments above are taken as constructive; they are meant that way and not meant to be rude.

Sean

On Tue, Oct 25, 2011 at 12:09 PM, wang peter wrote:

> thank u.
> but the bam file is still very large, more than 10G
> i think i must read them partly by R.
> so any good idea?

ADD REPLY • link 14.2 years ago Sean Davis 21k

Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 5 months ago

Australia/Melbourne

Even if you map your reads to the transcriptome, you still have the problem of junction read mapping, because the annotated transcriptome is unlikely to include all the alternative isoforms of each gene. Junction reads that contain exon junctions not included in your transcriptome cannot be mapped unless you use a junction-aware read aligner.

Wei

On Oct 26, 2011, at 9:07 AM, wang peter wrote:

> thx for your reply.
> but i map those reads to the transcriptome, not the genome.
> i think i do not need to consider the junctions.
> so i care more about the efficiency
>
> shangao

ADD COMMENT • link 14.2 years ago Wei Shi ★ 3.6k
