Likelihood ratio test (lrTest) in MAST comparing full to reduced model; and another question on controlling for CDR
1
0
Entering edit mode
Ken C • 0
@ba661187
Last seen 9 months ago
Singapore

My Question #1

In the MAST documentation there are examples for doing the lrTest dropping one of the coefficients from the model. When it comes to dropping multiple coefficients, i.e. for the general case of comparing a full model to a nested reduced model, it seems that the documentation is a bit unclear. I would like to confirm that this can be done in the following way, with a toy example:

Say I have a crossed design and would like to analyze the data in the classical two-way ANOVA-like manner, for which the model expressed in R formula is ~x*z, where x is a factor of two levels "a" and "b", and z is a factor of three levels "a", "b", and "c". And I would like to do a LR test for all the interaction terms at once. The design matrix, say from model.matrix would have columns named "(Intercept)", "xb", "zb", "zc", "xb:zb" and "xb:zc", the interaction terms would correspond to the last two columns. Given this, I tried the code below:

# zlmFit is a returned object from running zlm() with the above design
# do LR test for the intended interaction effects
lrTest(zlmFit, hypothesis=as.matrix(c(0,0,0,0,1,1)))

This did run and returned some results. But I would like to confirm that this is the correct way and the results are indeed for the intended test as described above.

My Question #2

It was mentioned in the MAST publication that the cellular detection rate (CDR) can be controlled for by adding the variable as a covariate in the discrete and continuous models. Somehow I had always thought that MAST::zlm does the CDR correction in this way internally, but I recently realized I was probably wrong... So again just confirming, zlm is actually not doing CDR correction internally by default, and if desired the user should manually add CDR as a covariate to his/her model, is this correct?

Thanks in advance for your help.

MAST • 1.3k views
ADD COMMENT
1
Entering edit mode
@andrew_mcdavid-11488
Last seen 13 months ago
United States

Question 1: The contrasts specified looks to be backwards from what you wrote as the tests in question. Set the ones you want to test to 0, and the rest to 1. The contrasts test that the sum of specified combination of coefficients is equal to zero, or not. c(0,0,0,0, 1, 1) would test that xb:zb + xb:zc = 0, or in other words, xb:zb = - xb:zc, which is probably not what you want. You likely want the two-degree of freedom test, which would be a matrix with two columns:

cbind(c(0,0,0,0, 1, 0),
c(0,0,0,0, 0, 1))

Or use CoefficientHypothesis. The examples ?ZlmFit-class include some that are analogous to what you are trying to do, so you might poke around there.

Question 2: Correct -- you would need to include that as a covariate in the model, if you want to adjust for it.

ADD COMMENT
0
Entering edit mode

Hi Andrew,

Thanks a lot for your quick reply. For Question 1, thanks for directing me to the examples in ZlmFit-class. I'm still a bit unsure after reading those. Back to my above toy example, if sticking with the convenient user interface the package provides and avoid using a contrast matrix, would passing multiple coefficients in a character vector to CoefficientHypothesis be the correct way? e.g.:

lrTest(zlmFit, CoefficientHypothesis(c("xb:zb","xb:zc")))

Also, I have to say that I'm not formally a statistician, and if the 0's and 1' in the contrast matrix I wrote in my original post should be the opposite, then I must have been seriously missing something... To double check, isn't it that the CoefficientHypothesis (in the single coefficient case) will be internally converted to a contrast vector of zeros except for the position of the specified model coefficient, which will be filled with 1? Thus the coefficients to literally drop from the full model should get value 1 in the contrast (or am I mistaken)? And when passing a plain matrix to the hypothesis argument of lrTest it will be treated as a contrast matrix? I also did a few quick tests and found inconsisent results even if I pass a matrix and reverse the 0's and 1's... Could you please kindly help looking through these below and help explain a bit further?

library(MAST)

# I use an arbitrary subset of the shipped vbetaFA datasets only for testing purpose:
data(vbetaFA)
dat=subset(vbetaFA, ncells==1 & Population %in% c("VbetaResponsive","VbetaUnresponsive"))[1:10,]

# check the data:
table(colData(dat)$Stim.Condition, colData(dat)$Population)
#            VbetaResponsive VbetaUnresponsive
#  Stim(SEB)              43                43
#  Unstim                 42                43

# check the design matrix for the names of model coefficients:
colnames(model.matrix(~Stim.Condition*Population, colData(dat)))
# [1] "(Intercept)"       "Stim.ConditionUnstim"       "PopulationVbetaUnresponsive"
# [4] "Stim.ConditionUnstim:PopulationVbetaUnresponsive"

# fit model
fit=zlm(~Stim.Condition*Population, dat)

# test the interaction effect; here corresponding to just one coefficient,
# so I'm expecting equivalent results to using `CoefficientHpothesis`:
# 1. my previous way:
lrt1=lrTest(fit, as.matrix(c(0,0,0,1)))
head(lrt1[,,3])
#         test.type
# primerid      cont      disc    hurdle
#   B3GAT1 1.0000000 0.1349874 0.1349874
#   BAX    0.9235851 0.8054454 0.9656696
#   BCL2   1.0000000 0.4718240 0.4718240
#   CCL2   1.0000000 0.3383300 0.3383300
#   CCL3   1.0000000 1.0000000 1.0000000
#   CCL4   1.0000000 1.0000000 1.0000000

# 2. reversing my 0's and 1's:
lrt2=lrTest(fit, as.matrix(c(1,1,1,0)))
head(lrt2[,,3])
#         test.type
# primerid         cont         disc       hurdle
#   B3GAT1 1.000000e+00 1.288748e-01 1.288748e-01
#   BAX    2.254593e-45 3.713067e-01 2.691915e-44
#   BCL2   1.000000e+00 1.263200e-06 1.263200e-06
#   CCL2   1.000000e+00 2.181552e-01 2.181552e-01
#   CCL3   1.000000e+00 4.726529e-05 4.726529e-05
#   CCL4   1.000000e+00 1.000000e+00 1.000000e+00

# 3. using `CoefficientHypothesis`:
lrt3=lrTest(fit, CoefficientHypothesis("Stim.ConditionUnstim:PopulationVbetaUnresponsive"))
head(lrt3[,,3])
#         test.type
# primerid      cont      disc    hurdle
#  B3GAT1 1.0000000 0.5045526 0.5045526
#  BAX    0.9235851 0.8058111 0.9657819
#  BCL2   1.0000000 0.4259263 0.4259263
#  CCL2   1.0000000 0.7140869 0.7140869
#  CCL3   1.0000000 1.0000000 1.0000000
#  CCL4   1.0000000 1.0000000 1.0000000

However, as seen above, all 3 tests have different resulting P values. But I saw that the df's from lrt1 and lrt3 are the same, but different from that of `lrt2' (not shown). What did I miss?

Thanks again!

ADD REPLY
0
Entering edit mode

I wonder if other members of the development team or other experienced users may answer my question above... Thanks!

ADD REPLY
0
Entering edit mode

Err, you are right -- I did have it inverted. I edited my answer above. Sorry for the confusion!

ADD REPLY
0
Entering edit mode

Thanks a lot for your reply Andrew, this is very helpful! I tried and indeed the df=2 test by passing a two-column matrix gave the same results as passing a vector of two coefficient names to CoefficientHypothesis.

The remaining question is why for the df=1 case, the lrt1 (results with a single-column matrix/column vector) and lrt3 (results with CoefficientHypothesis) are different, as asked in my previous post. Could you please help shed some light on this when you have another few free minutes? Again, much appreciated.

ADD REPLY
0
Entering edit mode

I'm suspicious there may be a bug in the matrix method, and would use the CoefficientHypothesis method instead.

When passed a matrix, lrTest calls a method .rotateMM, which I suppose is trying to fit the model subject to the constraint imposed by the contrast, but it's doing something odd that doesn't really make sense to 2023 me. It'd be great if you could open a an issue on github so that I can look into this more when I have more time.

ADD REPLY
0
Entering edit mode

OK sounds good. I will open an issue on GitHub. Thanks again.

Edited: the issue is now opened here.

ADD REPLY

Login before adding your answer.

Traffic: 939 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6