I am currently analysing a set of time-course (~10 time points) stranded paired-end RNA-seq data with the particular objective of identifying time-dependent changes in alternative splicing.
However, I am still undecided whether alpine, Salmon, or another method would be better suited for estimating transcript abundance. Up to this point, I have used STAR to align the data to the reference genome for each individual time point. Is there an optimal workflow that you would recommend for read-depth normalization and isoform-specific transcript abundance estimation for time-course data?
Concerning read-depth normalisation, I have read the recommendation of "downsampling" reads in order to get comparable numbers for all samples, which assumes that the number of reads is already roughly equal across samples. However, the number of reads from my samples differs substantially (50-125 million reads) and I do not want to throw away so much data.
I would be very grateful to hear your thoughts on the matter.
alpine is mostly a tool for researching bias itself; for production-level quantification, please use Salmon instead. The two implement essentially the same GC bias model, but Salmon is higher-quality software and is massively faster and more efficient.
Salmon provides accurate estimates even if there is a dependence of coverage on fragment GC content, which is one of the most common biases in Illumina sequencing data. I therefore recommend always running Salmon with the --gcBias flag. See the manuscript and Salmon home page for more details.
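As a rough illustration of the recommendation above, here is a minimal Python sketch that assembles a Salmon command line with --gcBias for one paired-end sample. The helper name, index directory, FASTQ file names, and output path are all hypothetical placeholders; the sketch also assumes Salmon's mapping-based mode, which reads FASTQ files against a transcriptome index rather than the STAR genome alignments.

```python
import shlex

def salmon_quant_command(index_dir, reads_1, reads_2, out_dir, threads=8):
    """Build (but do not run) a 'salmon quant' command for one paired-end sample.

    All paths are placeholders supplied by the caller.
    """
    return [
        "salmon", "quant",
        "-i", index_dir,      # transcriptome index built beforehand with 'salmon index'
        "-l", "A",            # let Salmon infer the (stranded) library type
        "-1", reads_1,        # first mates
        "-2", reads_2,        # second mates
        "--gcBias",           # enable the fragment GC bias correction
        "-p", str(threads),   # number of threads
        "-o", out_dir,        # per-sample output directory
    ]

# Hypothetical file names for the first time point.
cmd = salmon_quant_command(
    "salmon_index", "t01_R1.fastq.gz", "t01_R2.fastq.gz", "quant/t01"
)
print(shlex.join(cmd))
```

To actually run it, the list could be passed to subprocess.run(cmd, check=True), once per time point, with each sample written to its own output directory.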
If you load the Salmon output into R using tximport and then follow the steps for downstream analysis, the downstream packages will normalize for read depth. Downsampling is not recommended: it leads to loss of information, and parametric models can handle differences in depth with an offset. edgeR, DESeq2, and limma-voom all go this route rather than throwing away some of the data.
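To make the offset idea concrete, here is a small Python sketch with made-up counts. It is a simplified version of the median-of-ratios idea, not the actual code of DESeq2, edgeR, or limma-voom: each sample gets a scaling factor, and the log of that factor enters the model as an offset, so no reads are discarded. The function name and the toy counts are assumptions for illustration only.

```python
import math
from statistics import median

def size_factors(counts):
    """Simplified median-of-ratios size factors.

    counts: list of rows (one per transcript or gene), each a list of
    per-sample counts. Rows containing a zero are skipped.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c <= 0 for c in row):
            continue  # the geometric mean is undefined with zeros
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Median log ratio per sample, back on the original scale.
    return [math.exp(median(r)) for r in log_ratios]

# Toy data: sample 2 was sequenced 2.5x deeper than sample 1
# (comparable to 50 vs 125 million reads).
counts = [
    [100, 250],
    [40, 100],
    [10, 25],
    [0, 3],   # skipped because of the zero
]
factors = size_factors(counts)
offsets = [math.log(f) for f in factors]  # what a log-link count model would use
print(factors, offsets)
```

The deeper sample simply receives a larger factor, so the model expects proportionally more counts from it; this is why all reads can be kept instead of downsampling to the smallest library.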
You can perform testing at the isoform level if you like. According to our latest preprint, many Bioconductor packages work well for "DTE" (differential transcript expression). See Figure 13 for a comparison.