Question

The basis for the full rank design matrix requirement in DESeq2

Nik Tuzov ▴ 90

@nik-tuzov-8783

United States

Dear Prof. Love:

There have been numerous questions about the “not full rank” error in DESeq2. The vignette addresses the issue, and I understand when the error is generated. What I would like to find out is why you decided to include that feature in the first place. The point is that both SAS (PROC GLM, MIXED, GENMOD with a Negative Binomial response) and R (lm) are quite comfortable with rank-deficient designs.

When one factor is perfectly confounded with another (called “Linear combinations” in the vignette) I suppose it was a good idea to generate an error, but, strictly speaking, SAS and R will produce an answer even in that case (R will produce some missing values and a warning “not defined because of singularities”).
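
To make the confounding case concrete, here is a minimal sketch in plain Python (not DESeq2 or SAS code). The 4-sample design and the coefficient values are hypothetical, chosen only for illustration. When batch is identical to condition, two different coefficient vectors give exactly the same fitted values, so the data cannot separate the two effects; that is why lm reports one of the coefficients as not defined.

```python
# Hypothetical 4-sample design: columns are intercept, condition, batch.
# Batch is perfectly confounded with condition (the two columns are identical).
X = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 1],
]

def fitted(design, beta):
    """Return the linear predictor (design times beta), one value per sample."""
    return [sum(x * b for x, b in zip(row, beta)) for row in design]

# Two different coefficient vectors (illustrative values only):
# the whole effect assigned to condition, or the whole effect assigned to batch.
beta_condition = [5.0, 2.0, 0.0]
beta_batch = [5.0, 0.0, 2.0]

fit_a = fitted(X, beta_condition)
fit_b = fitted(X, beta_batch)
same_fit = fit_a == fit_b

print(fit_a)
print(fit_b)
print("identical fitted values:", same_fit)
```

Any split of the effect between the two columns fits equally well, so software either has to drop one column (as lm does) or refuse the design.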

Was your intention just to force the user to “consult a statistician”, or were there some estimation difficulties when fitting the model with a rank-deficient matrix?

Regards, Nik Tuzov

deseq2 • 626 views

ADD COMMENT • link updated 5.1 years ago by Michael Love 43k • written 5.1 years ago by Nik Tuzov ▴ 90

score 2 · Accepted Answer · 2020-02-28

Michael Love 43k

@mikelove

United States

In 100% of cases, plus or minus epsilon, users who encounter the full rank error have accidentally included covariates that have a linear dependency. Typically this is batch confounded with condition, or an attempt to control for within-subject effects as well as subject traits such as sex or age.
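
The subject-plus-sex case can be sketched the same way. This is an illustrative check in plain Python, not the check DESeq2 itself performs; the three-subject paired design and the helper `matrix_rank` are hypothetical. Because sex is constant within a subject, the sex column equals a sum of subject indicator columns, and the rank falls below the number of columns.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by Gaussian elimination, using exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot at or below the current rank position.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on the columns already processed
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical paired design: 3 subjects, each measured untreated and treated.
# Subject 1 is female, subjects 2 and 3 are male.
# Columns: intercept, subject2, subject3, sexM, treatment.
design_with_sex = [
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 1, 1],
]
# Same design with the sexM column (index 3) dropped.
design_without_sex = [row[:3] + row[4:] for row in design_with_sex]

rank_with = matrix_rank(design_with_sex)
rank_without = matrix_rank(design_without_sex)

print("with sex:   ", rank_with, "of", len(design_with_sex[0]), "columns")
print("without sex:", rank_without, "of", len(design_without_sex[0]), "columns")
```

Here sexM = subject2 + subject3, so the subject terms already absorb the sex effect, and dropping the sex column restores full rank without changing the fit.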

ADD COMMENT • link 5.1 years ago Michael Love 43k

I was almost sure that estimation difficulties were the reason, because catching statistical design errors is well outside DESeq2's mandate. It would be better if that feature were optional, not hard-coded. However, if the user were allowed to use any design matrix (including those in GLM coding, which you call EMM), it might have some side effects in lfcShrink().

ADD REPLY • link 5.1 years ago Nik Tuzov ▴ 90

Catching design errors is as important as estimating the parameters, in my opinion. Adding support for a non-full-rank X would involve additional complexity for little to no gain.

ADD REPLY • link 5.1 years ago Michael Love 43k