Question: Should I use alpine and can I convert the output back to counts?
7 months ago by
harry.smith20

Hello,

I am working with some paired-end data, and based on the FastQC reports the R2 reads have pretty bad GC, fragment length, and base-pair results. I am worried that using these R2 reads will add bias related to those issues. First, I tried using just the R1 reads with RSEM, but got an extremely poor alignment rate (<1%). I then removed rRNA reads (as the data was total RNA) and ran the cleaned R1 reads through RSEM again. This did not improve the alignment rate. I am currently re-running RSEM using both R1 and R2, but I am concerned about the poor quality of the R2 reads.

These are preliminary data that came from barely sequenceable samples to begin with (RIN < 3 for most samples), and the results likely won't be used for anything but generating a candidate list for grant writing. I am aware that, with the biases above, there is a good chance that many of the genes in that candidate list will be false positives.

I am curious whether it is appropriate to use alpine in this instance to correct those biases. If so, it looks like alpine outputs FPKM estimates, which means I can no longer use DESeq2. Is the best course of action then to use limma? Or is there a way I can get back to expected counts from the alpine output?

Thank you

Harry

Answer: Should I use alpine and can I convert the output back to counts?
7 months ago by
Michael Love
United States

I would recommend using Salmon, which includes the GC bias correction from alpine when you pass the --gcBias argument. Salmon is much better software for quantification than alpine: alpine was designed for research into RNA-seq bias and for comparing various bias models. It does provide abundance estimates, but Salmon is better for a number of reasons. For more details see this presentation:

https://goo.gl/ftK55e
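
As a rough illustration of the suggestion above, here is a minimal Python sketch that assembles a paired-end `salmon quant` command with `--gcBias` and reads the estimated counts back from Salmon's `quant.sf` output. The index directory, FASTQ file names, output directory, and the two helper function names are placeholders made up for this example, and `-l A` (automatic library type detection) is just one reasonable choice, not something stated in the thread.

```python
import csv
import io
import shlex


def salmon_quant_command(index_dir, reads_1, reads_2, out_dir, threads=4):
    """Build the argument list for a paired-end `salmon quant` run.

    All paths are hypothetical placeholders; the index is assumed to
    have been built beforehand with `salmon index`.
    """
    return [
        "salmon", "quant",
        "-i", str(index_dir),  # transcriptome index directory
        "-l", "A",             # let Salmon infer the library type
        "-1", str(reads_1),    # R1 FASTQ
        "-2", str(reads_2),    # R2 FASTQ
        "--gcBias",            # enable fragment GC bias correction
        "-p", str(threads),    # number of threads
        "-o", str(out_dir),    # output directory (will contain quant.sf)
    ]


def read_num_reads(quant_sf_text):
    """Parse the text of a quant.sf file into {transcript name: NumReads}.

    quant.sf is tab-separated with the columns
    Name, Length, EffectiveLength, TPM, NumReads.
    """
    reader = csv.DictReader(io.StringIO(quant_sf_text), delimiter="\t")
    return {row["Name"]: float(row["NumReads"]) for row in reader}


cmd = salmon_quant_command(
    "salmon_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz", "quant/sample"
)
print(shlex.join(cmd))
# To actually run it (requires Salmon on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

The `NumReads` column holds estimated counts per transcript, so the output is not limited to FPKM/TPM-style abundances. In R, the usual route into DESeq2 is to import the `quant.sf` files with tximport and build the dataset with `DESeqDataSetFromTximport`, rather than parsing the files by hand as in this sketch.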