I tried to use coverageByTranscript function from GenomicAlignments package to calculate the RNA-seq coverage along transcripts. This function requires the exons be sorted in ascending rank (i.e. the first Exxon in the transcript should be listed in the GRanges object first, and so forth, see reference below
My understanding is that if a gene is in the reference strand, exons should be sorted in ascending order genomic coordinate. If on the minus strand, the first exon in the transcript, but with highest genomic coordinate should be listed first. The documentation above said that "If transcripts was obtained with exonsBy, then the exons are guaranteed to be ordered by ascending rank. See ?exonsBy for more information". This statement appears to be incorrect. I have a gif file named genes.gtf. I used the following command to build the Granges list for all genes as follows:
txdb <- makeTxDbFromGFF(gtf_file) all_genes <- exonsBy(txdb, by="gene")
However the returned GRanges list does not order exons in the ascending rank for minus-strand genes. For example:
>all_genes[] GRanges object with 16 ranges and 2 metadata columns: seqnames ranges strand | exon_id exon_name <Rle> <IRanges> <Rle> | <integer> <character>  chr10 52559169-52566640 - | 126742 <NA>  chr10 52569654-52569802 - | 126743 <NA>  chr10 52570800-52570936 - | 126744 <NA>  chr10 52573617-52573798 - | 126745 <NA>
This does cause problems for coverageByTranscript in calculating the coverage scores. It's messed up. Is my understanding about exonsBy function incorrect or the documentation of coverageByTranscript is outdate or incorrect? Thanks.