Figuring out what long unaligned reads with decent Phred scores are
USA, Los Angeles, USC


My question is about reads that don't align to the genome yet are long and have very good Phred scores. Currently, my workflow is FastQC > Cutadapt > Trimmomatic > RNA-STAR > HTSeq-count > edgeR (RUVSeq). I use GENCODE genomes with Ensembl IDs, and even with the cleanest isolation of cells and excellent library production I still get only about 80% alignment to the genome.

I use the entire GENCODE genome and GTF files for the alignment, and I collect the unaligned reads. Sometimes there are a large number of long reads with good Phred scores, and I suspect that with a perfect reference genome they would align. The reference genome is not perfect by any means, and there are certainly some cell type and strain differences between the reference genome and the source of the total RNA.

Is there a way to assemble the unaligned reads into contigs, extract them, and then BLAST them to see what they have homology to? Or to see whether they are genetic rearrangements, or simply unannotated ORFs? Does anyone have experience with this? What software would you recommend? Any response is greatly appreciated. TIA.
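To make "long with good Phred scores" concrete before any assembly or BLAST step, here is a minimal sketch of the filtering I have in mind. It assumes the unaligned reads are in a standard four-line FASTQ file with Phred+33 qualities; the function names and the thresholds (`min_len`, `min_mean_q`) are hypothetical placeholders, not values from my pipeline, and the plain arithmetic mean of Phred scores is a simplification.

```python
import io


def parse_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops parsing
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Arithmetic mean of per-base Phred scores (assumes Phred+33 encoding)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(handle, min_len=75, min_mean_q=30.0):
    """Keep reads that are at least min_len long with mean Phred >= min_mean_q.

    Both thresholds are illustrative assumptions; adjust to the library.
    """
    return [
        (h, s, q)
        for h, s, q in parse_fastq(handle)
        if len(s) >= min_len and mean_phred(q) >= min_mean_q
    ]


def to_fasta(records):
    """Format the kept reads as FASTA text, dropping the leading '@'."""
    return "".join(f">{h[1:]}\n{s}\n" for h, s, _ in records)


# Tiny in-memory example: r1 is long and high quality, r2 is too short,
# r3 is long but low quality ('#' is Phred 2 in Phred+33).
demo = (
    "@r1\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nACGTACGTACGT\n+\n############\n"
)
kept = filter_reads(io.StringIO(demo), min_len=10, min_mean_q=30.0)
fasta_text = to_fasta(kept)
```

The resulting FASTA text (or the matching FASTQ records) is what I would hand to an assembler or a BLAST search; for a real file, pass an open file handle instead of the `io.StringIO` demo.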

rnaseq alignment • 562 views
