Hi,
I wondered if there is a way to compare a reduced model matrix against the full model matrix with limma
.
Let's say for example, I have a full model: ~ cell + dex
and want to compare this against the reduced model ~ dex
.
Can I directly compare the two models, or do I have to specify which columns in the original design matrix come from cell
and set them as coef
?
Here is a small example to illustrate what I mean:
library(limma)
suppressMessages(library(airway))
data(airway)
logCPM <- edgeR::cpm(airway, log = TRUE)
design <- model.matrix(~ cell + dex, data = colData(airway))
design
#> (Intercept) cellN061011 cellN080611 cellN61311 dexuntrt
#> SRR1039508 1 0 0 1 1
#> SRR1039509 1 0 0 1 0
#> SRR1039512 1 0 0 0 1
#> SRR1039513 1 0 0 0 0
#> SRR1039516 1 0 1 0 1
#> SRR1039517 1 0 1 0 0
#> SRR1039520 1 1 0 0 1
#> SRR1039521 1 1 0 0 0
#> attr(,"assign")
#> [1] 0 1 1 1 2
#> attr(,"contrasts")
#> attr(,"contrasts")$cell
#> [1] "contr.treatment"
#>
#> attr(,"contrasts")$dex
#> [1] "contr.treatment"
fit <- lmFit(logCPM, design)
fit <- eBayes(fit, trend = TRUE)
topTable(fit, coef = 2:4, number = 3)
#> cellN061011 cellN080611 cellN61311 AveExpr F
#> ENSG00000213058 -5.997784 -5.9977840 0.3935414 -0.3581851 28310.347
#> ENSG00000164308 3.780246 3.1534606 3.5315346 4.8003548 13606.080
#> ENSG00000012817 -8.330419 0.1749454 0.2208583 3.5195512 3285.961
#> P.Value adj.P.Val
#> ENSG00000213058 6.767049e-08 0.004337814
#> ENSG00000164308 2.325664e-07 0.007453984
#> ENSG00000012817 2.547302e-06 0.017566474
Created on 2021-09-10 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)
If there is no other way to compare a full and reduced model matrix, what is the recommended way to conduct a likelihood ratio test for two unrelated models? Would you say that it is simply not a good idea to do something like that? I could for example imagine a scenario where I want to find out if the model using ~ cell + dex
or a hypothetical patient age ~ age
explains the observed data better.
model.matrix(~ age, tmp)
#> (Intercept) age
#> SRR1039508 1 10
#> SRR1039509 1 7
#> SRR1039512 1 41
#> SRR1039513 1 25
#> SRR1039516 1 9
#> SRR1039517 1 33
#> SRR1039520 1 13
#> SRR1039521 1 19
#> attr(,"assign")
#> [1] 0 1
Given how long limma
has been around, I am sure this question must have been discussed before. I searched for similar questions in the Bioconductor forum, and this was the closest hit, but it did not directly answer my question. But I am very sorry if I missed some obvious resource or forum post.
Best, Constantin
Hi James,
thanks for your thorough answer :)
I agree that this question is not about the most common use case in high-throughput testing. I should have probably given more detail on the background of my question.
I am currently writing a small wrapper package for
limma
and wanted to support testing for differential expression using either using contrasts or a reduced model. This is inspired by thenbinomLRT
andnbinomWaldTest
functions inDESeq2
.For this purpose, I was looking for the best way to make
limma
work on the reduced design matrix. I guess if I posit that the full model and reduced model are nested, I could calculate the correspondingcoef
parameter like thiswhich(colnames(full_design) %in% setdiff(colnames(full_design), colnames(red_design)))
, but that feels very hacky so I was hoping for a more robust solution.Ah. I think it's easier for Mike to do that sort of stuff because
DESeq2
generates the design matrix internally unlikeedgeR
orlimma
, where you have to specify it separately. Probably the best way to get around that is to write a wrapper that takes a model formula likeDESeq2
does, so you will have better control.Although you can compute an LRT for conventional linear models, it's not really a common thing to do is it? It seems to me the preferred thing to do is to test the coefficient (or contrast) directly. Isn't that test universally most powerful in the context of normal-distribution based linear modeling? Also, an LRT is usually performed to compare nested models. Is it OK to compare non-nested models using LRT? That seems not copacetic, but I don't know for sure.