Dear Sir/Madam,
I am a postdoctoral researcher at Chris Mason's lab at the ICB Cornell
Medical College in NYC.
I would be very interested in using your R package for RNA-seq
analysis of some raw data we have.
We went through the manual quickly, and it is not clear to me how to
start the analysis, i.e., what the input file should be.
To be more specific: given the fastq file (or binary fastq.tbz), could
we use it as input for voom, and then
use the result for limma? Or should we first align our raw fastq data
and then use the sam or bam files as
input for voom or limma? How should I proceed to start an
analysis from raw fastq files?
I would highly appreciate an answer at your earliest convenience.
Thank you very much in advance for your attention,
--
Best,
Pedro
Pedro Blecua, Ph.D.
Postdoctoral Associate
Weill Medical College of Cornell University
Institute for Computational Biomedicine
Department of Physiology and Biophysics
1305 York Avenue, Box 140,
New York, NY 10021
Office: 212 746 4237
Fax: 212 746 8690
Hi Pedro,
On 1/7/2013 3:47 PM, Pedro Blecua wrote:
> Dear Sir/Madam,
>
> I am a postdoctoral researcher at Chris Mason's lab at the ICB Cornell
> Medical College in NYC.
> I would be very interested in using your R package for RNA-seq
> analysis of some raw data we have.
> We went through the manual quickly, and it is not clear to me how to
> start the analysis, i.e., what the input file should be.
>
> To be more specific: given the fastq file (or binary fastq.tbz), could
> we use it as input for voom, and then
> use the result for limma? Or should we first align our raw fastq data
> and then use the sam or bam files as
> input for voom or limma? How should I proceed to start an
> analysis from raw fastq files?
You need to align using a gapped aligner (bowtie2, gsnap, etc.), and then
use the resulting bam file to get counts per transcript, which is the
input to voom.
Once you have the aligned data, you can use GenomicFeatures and the
appropriate TxDb package to get the counts using
summarizeOverlaps(). Given aligned bam files, I usually do something
like
library(Rsamtools)
library(GenomicFeatures)
library(GenomicAlignments) ## summarizeOverlaps() lives here in current Bioconductor releases
library(limma) ## provides voom()
bflst <- BamFileList(<character vector of bam files, including path if not in working dir>)
library(TxDb.Hsapiens.UCSC.hg19.knownGene) ## substitute applicable species here
feat <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, by = "gene")
olaps <- summarizeOverlaps(feat, bflst)
then you can do
counts <- assays(olaps)$counts
voom(counts)
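To illustrate how the voom output then feeds into limma, here is a minimal sketch. It uses a simulated count matrix as a stand-in for the real one, and the two-group layout (three "control" and three "treated" samples) is purely an assumption for illustration; substitute your own counts and design.

```r
library(limma)

## Simulated stand-in for a real gene-by-sample count matrix:
## 200 genes x 6 samples (illustrative only).
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200),
                                 paste0("sample", 1:6)))

## Hypothetical two-group design: three controls, three treated.
group <- factor(rep(c("control", "treated"), each = 3))
design <- model.matrix(~ group)

## voom() converts counts to log2-CPM and estimates precision weights.
v <- voom(counts, design)

## The voom output goes straight into the usual limma steps.
fit <- lmFit(v, design)
fit <- eBayes(fit)
top <- topTable(fit, coef = "grouptreated", number = 10)
```

Passing the design matrix to voom() lets it estimate the mean-variance trend from the residuals of the model you actually intend to fit.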
Best,
Jim
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
Dear Pedro,
You'll need to align your fastq data, summarize read counts to genes,
and then run limma voom or edgeR for expression analysis.
We have developed a pipeline for doing this which allows you to
complete the entire analysis in R. This pipeline starts with the
Rsubread package for alignment (see the Rsubread vignette for more
details, but briefly the functions 'buildindex' and 'align' are used
for index building and read mapping, respectively), and then uses the
featureCounts function in the same package to do the read
summarization. The summarized read counts can then be fed into limma
or edgeR for expression analysis.
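A rough sketch of those steps follows. The file names, the index name "ref_index", and the choice of the built-in hg19 annotation are all placeholders assumed for illustration; the Rsubread vignette is the authoritative reference for the options.

```r
library(Rsubread)

## Hypothetical file names; replace with your own reference and reads.
ref_fasta   <- "genome.fa"
fastq_files <- c("sample1.fastq.gz", "sample2.fastq.gz")
bam_files   <- sub("\\.fastq\\.gz$", ".bam", fastq_files)

## Wrapped in a function so nothing runs until the files exist.
run_pipeline <- function() {
  ## Build the index once per reference genome.
  buildindex(basename = "ref_index", reference = ref_fasta)
  ## Map the reads; one BAM file is written per fastq file.
  align(index = "ref_index", readfile1 = fastq_files,
        output_file = bam_files)
  ## Summarize reads to genes using the built-in hg19 annotation (assumed).
  fc <- featureCounts(files = bam_files, annot.inbuilt = "hg19")
  fc$counts  ## gene-by-sample count matrix for limma voom or edgeR
}
```

Calling run_pipeline() would return the count matrix, which can then be passed to voom() or to edgeR.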
Cheers,
Wei