Question

Typo in the help of lmFit()

Balazs • 0

I would like to report two issues: the first is a typo, and the second is a suggestion to change the default behavior.

1) The _Details_ in the documentation of lmFit() still says that the default correlation is 0.75, see here: https://code.bioconductor.org/browse/limma/blob/RELEASE_3_21/man/lmFit.Rd#L60 However, that has changed since this commit: https://code.bioconductor.org/browse/limma/commit/f5ccbb95fc8f96ec5ad7f7be9ff582e321f717c1 and it throws an error if correlation is not specified, see https://code.bioconductor.org/browse/limma/blob/RELEASE_3_21/R/lmfit.R#L72. I believe this should be corrected by removing the 0.75 value and its caveats from the help.

2) Currently, lmFit() has no default value for correlation: https://code.bioconductor.org/browse/limma/blob/RELEASE_3_21/R/lmfit.R#L2. However, in gls.series() the default is correlation = NULL, which causes the correlation to be estimated with duplicateCorrelation(), see https://code.bioconductor.org/browse/limma/blob/RELEASE_3_21/R/lmfit.R#L240. This is very convenient, and my suggestion would be to adopt the same default behavior in lmFit() too.
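For illustration, the error mentioned in point 1 can be reproduced with a small sketch along these lines. The simulated matrix `y`, the `group` and `block` vectors, and the variable name `res` are made up for this example; only `lmFit()` and its `block` argument come from limma.

```r
library(limma)

set.seed(1)
# Hypothetical data: 50 genes, 8 samples from 4 subjects (2 samples each)
y <- matrix(rnorm(50 * 8), nrow = 50, ncol = 8)
group <- factor(rep(c("A", "B"), times = 4))
block <- rep(1:4, each = 2)
design <- model.matrix(~ group)

# Supplying 'block' without 'correlation' is expected to stop with an error
# rather than silently falling back to a default value.
res <- tryCatch(
  lmFit(y, design, block = block),
  error = function(e) e
)

# Print the message carried by the error condition, if one was raised
if (inherits(res, "error")) message(conditionMessage(res))
```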

Thank you in advance for your consideration.

limma • 536 views

updated 9 months ago by Gordon Smyth 53k • written 9 months ago by Balazs • 0

score 0 · Answer 1 · 2025-05-20

Thanks for the heads-up.

> The _Details_ in the documentation of lmFit() still says that the default correlation is 0.75

Yes, you're right. That hasn't been so for 20 years but I forgot to edit the help page appropriately. Now fixed.

> Currently in lmFit() there is no default value for correlation. However in gls.series() the default is correlation = NULL and that causes it to estimate it with duplicateCorrelation(). This is very convenient and my suggestion would be to change to this default behavior in lmFit() too.

I didn't want to run duplicateCorrelation() automatically because it can be a time-consuming computation, so I would rather highlight to users that they need to run it separately. I don't want users in an interactive session to call lmFit() expecting a fast response and then unexpectedly have the function hang on a computation that could take an hour for a large dataset.
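As a minimal sketch of the two-step workflow described above, the correlation is estimated once with duplicateCorrelation() and then passed explicitly to lmFit(). The simulated matrix `y`, the `group` and `block` vectors, and the object names are assumptions made for illustration, not part of any real analysis.

```r
library(limma)

set.seed(1)
# Hypothetical data: 100 genes, 8 samples from 4 subjects (2 samples each)
y <- matrix(rnorm(100 * 8), nrow = 100, ncol = 8)
group <- factor(rep(c("A", "B"), times = 4))
block <- rep(1:4, each = 2)
design <- model.matrix(~ group)

# Step 1: estimate the consensus within-block correlation.
# This is the potentially slow step on a large dataset.
corfit <- duplicateCorrelation(y, design, block = block)

# Step 2: pass the estimate explicitly to lmFit()
fit <- lmFit(y, design, block = block,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

# Top-ranked genes for the second coefficient (group B vs group A)
topTable(fit, coef = 2)
```

Keeping the two steps separate makes the expensive computation visible and lets the estimated correlation be inspected or reused across several fits.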

I do note that the newer functions voomLmFit() and voomaLmFit() run duplicateCorrelation() automatically, so perhaps this is a decision that could be revisited. Those functions run duplicateCorrelation() twice, so there is a major convenience benefit to doing it automatically in those contexts. As far as lmFit() is concerned, I am very cautious about altering a user interface that has been unchanged for 20 years.

The gls.series() function isn't intended to be run directly by users, so I assume it won't be run directly from an interactive session.