21 months ago by

Australia/Melbourne/Monash University Bioinformatics Platform

Because RNA-seq is count based, in a gene-level analysis the noise in an individual gene count can be assumed to be Poisson (variance equal to the mean). This is an estimate of "technical variation", but it does not include "biological variation". It is the equivalent of the Kallisto bootstrapping method, and considerably simpler, because at the gene level there is no confusion as to which transcript a read is assigned to.
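To make the Poisson assumption concrete, here is a minimal sketch in plain Python (not taken from limma or Kallisto). The function name `poisson_technical_noise` is made up for illustration, and the log-scale variance uses a first-order delta-method approximation, so it only makes sense for counts that are not too small.

```python
import math

def poisson_technical_noise(mean_count):
    """Technical noise implied by a Poisson model for one gene-level count."""
    variance = mean_count                    # Poisson: variance equals the mean
    cv = math.sqrt(variance) / mean_count    # coefficient of variation = 1/sqrt(mean)
    # Delta method: Var(log2 X) ~= Var(X) / (mean * ln 2)^2
    var_log2 = variance / (mean_count * math.log(2)) ** 2
    return cv, var_log2

for mu in (10, 100, 1000):
    cv, var_log2 = poisson_technical_noise(mu)
    print(f"mean={mu:5d}  CV={cv:.3f}  approx var(log2 count)={var_log2:.5f}")
```

The relative noise shrinks as the count grows, which is why low-count genes carry less information on the log scale. None of this says anything about biological variation between replicates.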

In limma, voom provides precision weights. As Ryan Thompson pointed out, these are simply the inverse of the variances. These voom weights will also include the biological variation component. (When using Kallisto confidence intervals instead, one might also need to estimate the amount of biological variance before using limma.)

In terms of confidence intervals in the final result of a differential expression analysis: limma's topTable function can provide confidence intervals on log fold change (via its confint argument), but note that these are not adjusted for multiple testing. limma and edgeR also provide the TREAT method for finding genes with fold change exceeding some specified amount, and these do provide FDR control.
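For readers unsure what "adjusted for multiple testing" means in practice, here is a minimal plain-Python sketch of the Benjamini-Hochberg adjustment, which is the default p-value adjustment in topTable. It is not limma's code and it does not reproduce TREAT itself. It only shows the FDR step that is applied to p-values and that the per-gene confidence intervals do not receive.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four genes
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```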

For what it's worth, I wrote some Bioc-friendly input parsers for the quantification files (to a simple matrix, or to a SummarizedExperiment) and for the bootstraps (using the great rhdf5 package). I'm happy to accept feedback on these. It would be good as a community to use a consistent set of tools, so there's only one collection of bugs.

One useful feature would be to convert the confidence intervals into weights on the logCPM values, so that one could use them in limma. (Unfortunately, I don't know enough of the mathematics to know how to do that conversion.)

Exactly what I had in mind.

Reading further, a precision weight is simply the inverse of the estimated variance. So I guess you would just compute the variance of the bootstrap estimates of normalized logCPM for each feature and then take the inverse as the weight.
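A rough sketch of that idea in plain Python, for a single sample. The function names, the feature names, the bootstrap counts and the fixed library size are all hypothetical, and the offsets in `log2_cpm` (0.5 on the count, 1 on the library size) follow the convention voom uses for its log-CPM values. This is an illustration of "inverse of the bootstrap variance", not an established pipeline.

```python
import math
import statistics

def log2_cpm(count, lib_size, prior=0.5):
    """log2 counts per million, with small offsets to avoid log(0)."""
    return math.log2((count + prior) / (lib_size + 1.0) * 1e6)

def bootstrap_precision_weights(boot_counts, lib_size, floor=1e-8):
    """boot_counts maps feature -> list of bootstrap count estimates (one sample)."""
    weights = {}
    for feature, counts in boot_counts.items():
        values = [log2_cpm(c, lib_size) for c in counts]
        var = statistics.variance(values)         # sample variance across bootstraps
        weights[feature] = 1.0 / max(var, floor)  # floor guards against zero variance
    return weights

# Hypothetical bootstrap estimates for two transcripts in one sample
boot = {
    "tx_stable": [100.0, 102.0, 98.0, 101.0, 99.0],
    "tx_ambiguous": [100.0, 40.0, 160.0, 70.0, 130.0],
}
weights = bootstrap_precision_weights(boot, lib_size=1e6)
for feature, w in weights.items():
    print(feature, round(w, 2))
```

The transcript whose bootstrap estimates jump around gets a much smaller weight. Note that these weights capture only the quantification (technical) uncertainty, so the biological variance mentioned in the answer would still have to be estimated separately.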

Kallisto looks promising. I'd check out some of the comparisons between it and Salmon: same principle, different implementations.

http://sjcockell.me/2015/05/18/alignment-free-transcriptome-quantification/

Both authors got involved.
