Question

RNA-seq read alignment best practices for low-input samples

0

Entering edit mode

goldberg.jm ▴ 10

@goldbergjm-9751

Last seen 5.8 years ago

RNA-seq read alignment best practices for low-input samples

Hi All,

There are many ways to align RNA-seq reads to transcripts and genomes, and I am confused about which is best for my experiment.

My goal is to do differential expression analysis on Plasmodium vivax data. The input is fastq files of paired 25nt reads from Smart-seq2.

The P.vivax genome is 29 Mb, 40% G/C, 6,642 genes. Introns are much smaller and fewer in number than human introns. P.vivax RNA is hard to get, so we used Smart-seq2 to generate sequence. Using bowtie and RSEM we find for our richer samples we get 4.2 million reads aligned to transcripts, and 86% transcripts with at least one read aligned. For our less rich samples we get 1.2 million reads aligned to transcripts, and 78% transcripts with at least one read aligned. rRNA accounts for < 10% of the aligned reads.

Questions:

Is my understanding correct that if I am primarily interested in differential expression of known genes, as opposed to finding new genes and fixing gene models, that I may align reads to transcripts as opposed to the whole genome?
There are so many different aligners: bwa, bowtie, bowtie2, STAR, HISAT2. How much does it really matter which one I use? Does using a different alignment program have major effects on the outcome, or is it more just playing around the edges? How do I tell which is best for my data?
What metrics ought I use to assess my data?
I have read articles reviewing best practices for evaluation RNA-seq data, for example PMID26813401. But I’m not sure how this translates to my data, which I think falls in a nether region between single-cell and bulk.
It is wonderfully convenient to run RSEM and have RSEM do the alignment itself. Is it better practice to do the alignment step outside of RSEM?

Thank you! Jon

limma differential gene expression rnaseq analysis alignment RSEM • 1.7k views

ADD COMMENT • link updated 6.3 years ago by Aaron Lun ★ 28k • written 6.3 years ago by goldberg.jm ▴ 10

score 1 · Answer 1 · 2017-12-20

1

Entering edit mode

Aaron Lun ★ 28k

@alun

Last seen 3 hours ago

The city by the bay

You've tagged this question with limma, but I don't actually see a question about limma - or, indeed, any Bioconductor packages. I suspect you're better off posting this at a forum such as SEQanswers or Biostars, given that many of your concerns seem to be about read alignment prior to the DE analysis.