Hi. I have been working with mouse RNA-Seq data and want to use the Rsubread package to align, annotate and quantify the reads. My issue is that whenever I use the inbuilt annotation for the mouse genome, practically none of the reads are mapped:
align(index = "GRCm38.p4", readfile1 = all.fastq1, readfile2 = all.fastq2, input_format = "FASTQ", output_format = "BAM", nthreads = 8, output_file = all.BAM)
featuresC <- featureCounts(all.BAM, annot.inbuilt = "mm10")
//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file mm10_RefSeq_exon.txt ...                              ||
||    Features : 222996                                                       ||
||    Meta-features : 27179                                                   ||
||    Chromosomes/contigs : 43                                                ||
||                                                                            ||
|| Process BAM file 1A.bam...                                                 ||
||    Paired-end reads are included.                                          ||
||    Assign alignments to features...                                        ||
||    Total alignments : 93836764                                             ||
||    Successfully assigned alignments : 29766 (0,0%)                         ||
||    Running time : 3,96 minutes                                             ||
||                                                                            ||
I also tried the latest patch (p6) and obtained the same result. It only works when I explicitly supply an external annotation file (.gtf). But in that case genes are annotated by name and not by Entrez ID, which I require for gene ontology analysis and so on.
Could somebody help me with this? Thank you very much in advance.
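One thing that may be worth checking, although it is only an assumption and not confirmed anywhere in this thread, is whether the chromosome names in the BAM file (for example "1", "MT") differ from those in the inbuilt mm10 annotation (for example "chr1", "chrM"). A mismatch like that would give a near-zero assignment rate. The sketch below uses a hypothetical helper, make_chr_aliases, which is not part of Rsubread, to build the two-column alias table that the chrAliases argument of featureCounts expects. The toy vectors and the file names "all.bam" and "chr_aliases.csv" are placeholders.

```r
# Hypothetical helper (not part of Rsubread): pair annotation chromosome names
# with the corresponding BAM/index names when they differ only by a "chr" prefix.
make_chr_aliases <- function(annot_chr, bam_chr) {
  annot_chr <- unique(as.character(annot_chr))
  stripped <- sub("^chr", "", annot_chr)
  # Assumption: the mitochondrial chromosome is "chrM" in the annotation
  # and "MT" in an Ensembl-style index.
  stripped[stripped == "M"] <- "MT"
  # Keep only names that need an alias and have a counterpart in the BAM.
  keep <- stripped %in% bam_chr & !(annot_chr %in% bam_chr)
  data.frame(annot = annot_chr[keep], bam = stripped[keep],
             stringsAsFactors = FALSE)
}

# Toy example with made-up name vectors.
annot_chr <- c("chr1", "chr2", "chrX", "chrM")
bam_chr <- c("1", "2", "X", "MT")
aliases <- make_chr_aliases(annot_chr, bam_chr)
print(aliases)

# With real data (requires Rsubread and Rsamtools; file names are placeholders):
# ann <- Rsubread::getInBuiltAnnotation("mm10")
# bam_chr <- names(Rsamtools::scanBamHeader("all.bam")[[1]]$targets)
# aliases <- make_chr_aliases(ann$Chr, bam_chr)
# write.table(aliases, "chr_aliases.csv", sep = ",", quote = FALSE,
#             row.names = FALSE, col.names = FALSE)
# fc <- Rsubread::featureCounts("all.bam", annot.inbuilt = "mm10",
#                               chrAliases = "chr_aliases.csv")
```

If the alias table comes back empty, the names already agree and the cause lies elsewhere. If it does not, counting against the inbuilt annotation with the alias file would keep Entrez IDs as the gene identifiers.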
So far I could not find an NCBI annotation file as desired, so I decided to stick with the Gencode and Ensembl assemblies and the inbuilt annotation. I tried both out of curiosity and they work as expected, except for one detail: the percentage of successfully assigned alignments using the Gencode primary assembly (release M18) is slightly better than with Ensembl (79,6% vs 77,6%, for example). The numbers of detected features (after removal of zeros) are almost the same: 22392 using the Gencode assembly and 22399 using Ensembl. Thank you all for your help.