Question: Can tximport be used to mitigate sample specific length biases?
21 days ago by abf0

A recent publication in PLoS Biology documents sample-specific biases related to gene length in differential expression analyses:

Recurrent functional misinterpretation of RNA-seq data caused by sample-specific gene length bias

Could this issue be mitigated by using a transcript-aware quantification tool such as Salmon, StringTie, kallisto, or RSEM, and then calculating offsets with tximport?

Answer: Can tximport be used to mitigate sample specific length biases?
21 days ago by Michael Love (United States)

Thanks for posting. I think the sample-specific biases shown in the paper could be addressed through tximport's effective length offset, provided the upstream quantification method captures the bias in one of the sample-specific terms it estimates.
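To make the idea of an effective length offset concrete, here is a minimal base R sketch. It assumes a toy genes-by-samples matrix of average effective lengths (made-up numbers standing in for the `length` matrix that tximport returns) and centers it per gene on the log scale, so that only sample-to-sample differences in effective length remain. This mirrors the first step of the offset recipe in the tximport vignette as I understand it; the library-size part is left out, and the names `eff_len` and `len_offset` are purely illustrative.

```r
# Toy genes x samples matrix of average effective transcript lengths
# (made-up values standing in for txi$length from tximport).
eff_len <- matrix(
  c(1000, 1000, 1000,   # gene1: same effective length in every sample
     500,  550,  450,   # gene2: moderate sample-specific differences
    3000, 2400, 3600),  # gene3: larger sample-specific differences
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3))
)

# Divide each row by its geometric mean, then take logs.
# The result is a per-gene, per-sample offset on the log scale
# that sums to zero within each gene.
geo_mean <- exp(rowMeans(log(eff_len)))
len_offset <- log(eff_len / geo_mean)

print(round(len_offset, 3))
```

A gene whose effective length is identical across samples gets an offset of zero everywhere, so the offset only matters where the upstream tool estimated sample-specific differences.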

I'm familiar with Salmon, which estimates a fragment length distribution (FLD) term by default and has an optional position bias term that can be estimated per sample (--posBias). The positional bias model is flexible across short and long transcripts because it bins transcripts by length, as suggested by Roberts (2011). I believe these two terms should capture the effects on downstream gene counts and gene lengths seen in this paper. I believe RSEM also has an optional sample-specific positional bias term, and most methods have a sample-specific FLD term.
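To illustrate why a sample-specific FLD can produce a length-dependent effect, here is a rough base R sketch. It uses the textbook definition of effective length (the expected number of positions at which a fragment can start) with a Gaussian FLD; both are simplifying assumptions, and this is not Salmon's actual implementation, which estimates the FLD empirically. The function `effective_length` and the two samples are hypothetical.

```r
# Effective length of a transcript of length L under a fragment length
# distribution (FLD): the expected number of valid fragment start
# positions, averaged over fragment lengths that fit in the transcript.
effective_length <- function(L, fld_mean, fld_sd, max_frag = 1000) {
  frag <- seq_len(min(L, max_frag))
  p <- dnorm(frag, mean = fld_mean, sd = fld_sd)  # Gaussian FLD (assumption)
  p <- p / sum(p)                                 # renormalise over usable lengths
  sum(p * (L - frag + 1))
}

tx_len <- c(short = 300, medium = 1500, long = 6000)

# Two hypothetical samples whose libraries differ in mean fragment length.
eff_a <- sapply(tx_len, effective_length, fld_mean = 180, fld_sd = 30)
eff_b <- sapply(tx_len, effective_length, fld_mean = 260, fld_sd = 30)

# Ratio of effective lengths between the two samples, per transcript.
len_ratio <- eff_b / eff_a
print(round(len_ratio, 3))
```

In this toy setup the ratio is furthest from 1 for the short transcript and approaches 1 for the long one, which is the kind of sample-by-length interaction an effective length offset is meant to absorb.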

You could try it out: import with tximport using countsFromAbundance="lengthScaledTPM", then run CQN or EDASeq on the resulting estimated counts to see whether the biases have effectively been removed.
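For intuition about what lengthScaledTPM does, here is a small base R sketch on made-up numbers. It follows the documented description of that option (abundances scaled by the average transcript length over samples, then by library size); treat it as an illustration rather than tximport's code, and note that all matrices and names here are hypothetical.

```r
genes <- c("short_gene", "mid_gene", "long_gene")
samples <- c("sampleA", "sampleB")

# Same underlying abundance (TPM) in both samples (made-up values).
tpm <- matrix(c(4e5, 4e5,
                4e5, 4e5,
                2e5, 2e5),
              nrow = 3, byrow = TRUE, dimnames = list(genes, samples))

# Effective lengths differ between samples, mostly for the short gene.
eff_len <- matrix(c( 120,   50,
                    1320, 1240,
                    5820, 5740),
                  nrow = 3, byrow = TRUE, dimnames = list(genes, samples))

# Expected read counts are proportional to abundance x effective length;
# scale each sample to one million reads.
raw <- tpm * eff_len
raw <- sweep(raw, 2, colSums(raw), "/") * 1e6

# lengthScaledTPM-style counts: TPM times the across-sample mean length,
# then each column rescaled to that sample's original library size.
scaled <- tpm * rowMeans(eff_len)
scaled <- sweep(scaled, 2, colSums(raw) / colSums(scaled), "*")

print(round(raw))
print(round(scaled))
```

Here the raw counts differ between the two samples even though the abundances are identical, while the length-scaled counts agree, so any length trend that CQN or EDASeq still detects in such counts would be residual bias not captured by the effective lengths.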

If you see a residual bias, you can always use the offset from CQN or EDASeq as well. I suppose that if you want both methods to work together to eliminate the bias, you should provide the lengthScaledTPM counts to CQN / EDASeq, so that they do not over-adjust for biases already corrected by the effective length correction that tximport calculates.