Hi everyone! I've been using the Tuxedo pipeline for RNA-seq analysis and I'm now stuck with cummeRbund. I used a reference annotation (annotation.gtf) in my TopHat and Cufflinks runs. In this annotation my gene of interest has only two isoforms, but when I try to visualize expression bar plots for the isoforms using cummeRbund, I get three variants. This is weird, because when I load my annotation.gtf into IGV I can clearly see that there are only two transcripts for my gene.
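In case it helps anyone reproduce this, here is a minimal sketch of how the transcript count in the GTF could be double-checked outside IGV. It assumes a standard GTF whose ninth column carries `gene_id` and `transcript_id` attributes. The function name `transcripts_per_gene` and the example records are made up for illustration and are not taken from my data.

```python
import re
from collections import defaultdict


def transcripts_per_gene(gtf_lines):
    """Map each gene_id to the set of transcript_ids found in GTF lines."""
    genes = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        # Column 9 holds attributes of the form: key "value";
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        if "gene_id" in attrs and "transcript_id" in attrs:
            genes[attrs["gene_id"]].add(attrs["transcript_id"])
    return genes


# Hypothetical records: one gene with two transcripts, three exon lines.
example = [
    'chr6\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_1";',
    'chr6\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_1";',
    'chr6\tsrc\texon\t100\t250\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_2";',
]
counts = transcripts_per_gene(example)
print(len(counts["GENE_A"]), sorted(counts["GENE_A"]))

# On a real file (path is an assumption):
# with open("annotation.gtf") as fh:
#     counts = transcripts_per_gene(fh)
```

The transcripts are collected in a set per gene, so multiple exon lines belonging to the same transcript are counted only once.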
To understand a little more, I looked at the isoforms CuffFeature of my gene using cummeRbund's annotation function and saw that one of the isoforms appears twice, but with different lengths:
  isoform_id     gene_id     CDS_id gene_short_name TSS_group_id class_code nearest_ref_id locus                  length
1 TCONS_00047384 XLOC_021135 NA     NR_026816       TSS26199     =          NR_026816      chr6:31132113-31154094 593
2 TCONS_00047385 XLOC_021135 NA     LC050987        TSS26200     j          LC050987       chr6:31132113-31154094 885
3 TCONS_00047386 XLOC_021135 NA     LC050987        TSS26200     =          LC050987       chr6:31132113-31154094 1369
  coverage gene_id     class_code nearest_ref_id gene_short_name    locus                  length coverage
1 NA       XLOC_021135 <NA>       <NA>           LC050987,NR_026816 chr6:31132113-31154094 NA     NA
2 NA       XLOC_021135 <NA>       <NA>           LC050987,NR_026816 chr6:31132113-31154094 NA     NA
3 NA       XLOC_021135 <NA>       <NA>           LC050987,NR_026816 chr6:31132113-31154094 NA     NA
Any help on why this is happening, or how to fix it, would be appreciated.