DEXseq: How to do differential exon usage of genes with overlapping exons
0
0
Entering edit mode
@matmu
Last seen 6 weeks ago
Germany

I followed the DEXSeq tutorial described here for performing differential exon usage analysis. When looking at the DEU plot for my gene of interest many of the exons were not shown. It turned out that a lncRNA having overlapping exons with my gene of interest and that many exons were removed during exon binning when I set the parameter linked.to.single.gene.only to TRUE.

R code:

# Create transcript data base
txdb = GenomicFeatures::makeTxDbFromGFF(gtf_file)

# Collapse the gene models into counting bins
exonic_parts = GenomicFeatures::exonicParts(txdb, linked.to.single.gene.only = TRUE)

# Create references to BAM files
bam_filelist = Rsamtools::BamFileList(bam_files, index=character(), yieldSize=100000, obeyQname=TRUE)

# Count the reads overlapping the bins
register(MulticoreParam(workers=4))
ese = GenomicAlignments::summarizeOverlaps(exonic_parts, bam_filelist, mode="Union", singleEnd=isSingleEnd, ignore.strand=TRUE, inter.feature=FALSE, fragments=!isSingleEnd)

# Convert SummarizedExperiment to DEXSeqDataSet
dxd = DEXSeq::DEXSeqDataSetFromSE(ese, design = design)


When I set the parameter to FALSE an error occurs during the build of a DEXSeq object with DEXSeqDataSetFromSE:

Error in (function (classes, fdef, mtable)  :
unable to find an inherited method for function ‘NSBS’ for signature ‘"CompressedIntegerList"’


I suppose it's because there is no exon_part column (reflects the exon bin order in a gene) included in rowRanges(ese) because a sequence can map to multiple genes. I could find a workaround by splitting exon bins with multiple comma-seperated genes in to multiple rows.

However, the problem that I see now is that DEXSeq can't differentiate between read counts that belong to lncRNA and the read counts that belong to my gene of interest. I.e. for my gene of interest DEXseq would always include the read counts of lncRNA for the overlapping exons and the same for the lncRNA.

The only solution I can think of is to use a transcript quantification tool like Salmon instead of a genome mapper and to build a DEXSeq object from it. Is there something I'm missing? I am happy to hear your opinion.

GenomicFeatures DEXSeq • 74 views