Selecting pairs of exons from differentially spliced genes
Entering edit mode
Last seen 16 hours ago


I am interested in finding differentially spliced genes using the diffSplice function from the limma package.

For example, in the code below, Gene1 shows the most evidence of differential splicing, with Exon1 being significantly more highly expressed than the other Gene1 exons, when comparing group 2 to group 1. However, for such a differentially spliced gene, I would then like to select a representative pair of exons that capture this splicing difference as much as possible. In this case, Exon1 would be one of the pair, but which other exon would be best as a reference?

If another exon were significantly more lowly expressed, then that would probably be the obvious choice. Instead, one could naïvely look at the exon _t_-statistics and choose the exon with largest negative value, which would be Exon2 for Gene1. For Gene21, which also happens to show some differential splicing, the pair with the highest and lowest _t_-statistic values are Exon1 and Exon3 respectively.

Does this approach make sense or is there perhaps a better way?




## Genes
Gene <- paste("Gene", 1:100, sep="")
Gene <- rep(Gene, each=10)
Exon <- paste("Exon", 1:10, sep="")
Gene.Exon <- paste(Gene, Exon, sep=".")
genes <- data.frame(GeneID=Gene, Gene.Exon=Gene.Exon)

## Design
group <- factor(rep(1:2, each=3))
design0 <- model.matrix(~0 + group)
design <- model.matrix(~group)

## Expression
sd <- 0.3*sqrt(4/rchisq(1000,df=4))
E <- matrix(rnorm(1000*6,sd=sd),1000,6)

## Increase expression of Exon1 of Gene1 in group 2 only 
E[1,4:6] <- E[1,4:6] + 2

## EList
y <- new("EList", list(E = E, genes = genes, design = design))

## lmFit
fit <- lmFit(y,design)

## Splice
ds <- diffSplice(fit, geneid="GeneID")
# Total number of exons:  1000 
# Total number of genes:  100 
# Number of genes with 1 exon:  0 
# Mean number of exons in a gene:  10 
# Max number of exons in a gene:  10 

topSplice(ds, number = 3)
#       GeneID NExons      P.Value          FDR
# 10     Gene1     10 8.034967e-07 8.034967e-05
# 210   Gene21     10 1.124678e-04 5.623389e-03
# 1000 Gene100     10 2.015570e-04 6.718565e-03

## t-tests for exons
tt <- topSplice(ds, test = "t", n = Inf)
tt <- tt[order(-tt$t), ] <- split(tt, tt$GeneID)

## Exon pairs by gene
exon.pairs <- lapply(, function(x) x[c(1, nrow(x)),])

#   GeneID   Gene.Exon     logFC         t      P.Value          FDR
# 1  Gene1 Gene1.Exon1  3.347710  6.251525 8.927741e-08 8.927741e-05
# 2  Gene1 Gene1.Exon2 -1.414739 -2.641889 1.097982e-02 3.786144e-01

#     GeneID    Gene.Exon      logFC         t      P.Value        FDR
# 201 Gene21 Gene21.Exon1  0.9521223  2.593235 1.244185e-02 0.40134996
# 203 Gene21 Gene21.Exon3 -1.7805751 -4.849639 1.249642e-05 0.00624821

# R version 3.6.2 (2019-12-12)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS Mojave 10.14.6
# Matrix products: default
# BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
# locale:
# [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# other attached packages:
# [1] limma_3.42.0
# loaded via a namespace (and not attached):
# [1] compiler_3.6.2
diffSplice limma • 193 views
Entering edit mode
Last seen 8 hours ago
WEHI, Melbourne, Australia

Choosing a representative pair of exons is not part of our method, so we won't make any recommendations.

You should choose the exons with largest and smallest t-statistics. Why not? You're just trying to extract something informal, so a sensible informal rule is the best that can be done.

Entering edit mode

That is fair enough. Sensible is always reassuring to hear! Thanks very much for your input Gordon.
Best regards, Jamie.


Login before adding your answer.

Traffic: 311 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6