Question: Ballgown vs (voom,edgeR,DESeq,limma)
François Lefebvre (Canada) wrote 4.7 years ago:

Hi all, How would you use the ballgown package in conjunction with voom, edgeR, DESeq, limma?

The bioRxiv paper seems to claim that ballgown bridges the gap between Cufflinks and tools like limma, voom, edgeR, and DESeq.

I don’t understand how voom, edgeR and DESeq can be used at the gene or transcript level, since these require raw counts, which ballgown/Cufflinks do not return (unlike estimates from RSEM or Sailfish, for instance). That is, unless one is ready to feed FPKM values to voom(), but that looks incorrect to me.

As for limma or the ballgown model, one has to accept working on the log2(1+FPKM) scale. But then one wonders why one would use ballgown in the first place, when we can just parse the output of Cuffnorm and feed it into limma.

Thanks!

This is a useful comment on this issue.

http://permalink.gmane.org/gmane.science.biology.informatics.conductor/48283

Comment written 4.0 years ago by matthew.hindle
Answer: Ballgown vs (voom,edgeR,DESeq,limma)
Alyssa Frazee (San Francisco, CA, USA) wrote 4.6 years ago:

Hi Francois,

Ballgown objects do contain read counts at the exon level. These are calculated with Tablemaker, the preprocessor we released to parse Cufflinks assemblies. So you can use edgeR, DESeq, voom, or other count-based methods on those counts. These exon counts are not meant for transcript-level analysis. If you want gene counts for your Cufflinks assembly, you can use existing gene counting functions (e.g. summarizeOverlaps) with your alignments plus the "merged.gtf" Cufflinks file.
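To make the exon-level route concrete, here is a minimal sketch, not taken from the preprint. It assumes the small example object that ships with the ballgown package (`data(bg)`) and a made-up two-group design; the group labels and the zero-count filter are illustrative choices only, and the same pattern would apply to DESeq or edgeR's own tests.

```r
library(ballgown)
library(edgeR)   # DGEList(), calcNormFactors()
library(limma)   # voom(), lmFit(), eBayes(), topTable()

# Example ballgown object shipped with the package (stand-in for your own data)
data(bg)

# Exon-by-sample matrix of read counts computed by Tablemaker
exon_counts <- eexpr(bg, "rcount")

# Hypothetical two-group design: first half of the samples vs. second half
group  <- factor(rep(c("A", "B"), each = ncol(exon_counts) / 2))
design <- model.matrix(~ group)

# Drop exons with no reads at all (illustrative filter, not a recommendation)
keep <- rowSums(exon_counts) > 0

# Standard count-based workflow: normalize, voom-transform, fit, moderate
dge <- DGEList(counts = exon_counts[keep, ])
dge <- calcNormFactors(dge)
v   <- voom(dge, design)
fit <- eBayes(lmFit(v, design))

# Top exons for the group effect (second column of the design matrix)
topTable(fit, coef = 2)
```

With real data, `bg` would come from running `ballgown()` on your Tablemaker output, and `group` from your actual sample annotation.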

For transcript-level analysis: you are of course welcome to parse Cuffnorm output, load it into R, and feed it into limma. Running Tablemaker, calling the "ballgown()" function, and then extracting expression measurements with texpr() is basically equivalent to that. There are a few advantages to using ballgown:
(1) The ballgown() function is the parser, so you don't have to write it yourself.
(2) The expression measurements are connected to the assembly structure, in efficient GRanges/GRangesList format.
(3) ballgown provides functions for plotting transcript structure/abundances and matching assembled transcripts to annotation.
(4) The linear modeling in ballgown (stattest() function) specifies the models to compare, does library size normalization, and adjusts p-values for multiple testing correction by default. This is all totally possible with limma, of course, but we have wrapped this into one function call. The idea behind stattest() was to provide a drop-in replacement for Cuffdiff, whose users don't need to specify models, normalization, etc. We show in our preprint that using these models (log FPKM values fed to ballgown/limma) can accurately detect differential transcript expression.
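The two transcript-level routes described above can be sketched side by side. This is only an illustration under stated assumptions: it uses the example object from the ballgown package (`data(bg)`) and an invented `group` covariate, and the hand-written limma fit omits the library-size adjustment that stattest() applies, so the two result tables need not agree exactly.

```r
library(ballgown)
library(limma)

# Example ballgown object shipped with the package (stand-in for your own data)
data(bg)

# Hypothetical phenotype table: two groups of 10 samples
pData(bg) <- data.frame(id = sampleNames(bg),
                        group = rep(c(1, 0), each = 10))

# Route 1: one call; model comparison, normalization and q-values are handled for you
res_bg <- stattest(bg, feature = "transcript", meas = "FPKM",
                   covariate = "group")
head(res_bg)

# Route 2: by hand with limma on the log2(FPKM + 1) scale
fpkm     <- texpr(bg, meas = "FPKM")   # transcript-by-sample FPKM matrix
log_fpkm <- log2(fpkm + 1)
design   <- model.matrix(~ group, data = pData(bg))
fit      <- eBayes(lmFit(log_fpkm, design))
topTable(fit, coef = "group")
```

Route 2 is roughly what you would write after parsing Cuffnorm output yourself; Route 1 is the Cuffdiff-style shortcut.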

Ballgown's contribution is the software infrastructure connecting Cufflinks assemblies to R (so users don't have to write their own parsers; the paper also shows that linear modeling of transcript FPKM gives appropriate DE results). 

Hope this helps!