Including covariates with fitFeatureModel
dkorandla • 0

I am working on a project comparing how different statistical methods for detecting differential abundance perform. Over the course of the project I have found that the zero-inflated, feature-specific log-normal model (Feature model, fitFeatureModel function) generally performs better than the zero-inflated Gaussian model (ZIG model, fitZig function). Unlike the ZIG model, however, the Feature model does not seem able to handle confounding covariates. Both the manual and the vignette clearly indicate that confounders can be incorporated into the design matrix for the ZIG model, but there is no similar mention for the Feature model (see the example code from the vignette below).

## Feature model
lungData = lungData[, -which(is.na(pData(lungData)$SmokingStatus))]
lungData = filterData(lungData, present = 30, depth = 1)
lungData <- cumNorm(lungData, p = 0.5)
pd <- pData(lungData)
## No covariates in the model matrix
mod <- model.matrix(~1 + SmokingStatus, data = pd)
lungres1 = fitFeatureModel(lungData, mod)
## ZIG model
controls = grep("Extraction.Control", pData(lungData)$SampleType)
lungTrim = lungData[, -controls]
rareFeatures = which(rowSums(MRcounts(lungTrim) > 0) < 10)
lungTrim = lungTrim[-rareFeatures, ]
lungp = cumNormStat(lungTrim, pFlag = TRUE, main = "Trimmed lung data")
lungTrim = cumNorm(lungTrim, p = lungp)
smokingStatus = pData(lungTrim)$SmokingStatus
bodySite = pData(lungTrim)$SampleType
normFactor = normFactors(lungTrim)
normFactor = log2(normFactor/median(normFactor) + 1)
## bodySite and normFactor are the covariates
mod = model.matrix(~smokingStatus + bodySite + normFactor)
settings = zigControl(maxit = 10, verbose = TRUE)
fit = fitZig(obj = lungTrim, mod = mod, useCSSoffset = FALSE, control = settings)

Whenever I try to use covariates with the Feature model, with either the example data or my own data, I get the error "Can't analyze currently". I have also tried fitting the zero-inflated log-normal model directly using code from the package, to no avail. Is there a mathematical or statistical reason why the Feature model cannot handle covariates that I am missing?
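
For reference, here is a minimal sketch of the kind of call that fails. It assumes metagenomeSeq is installed and reuses the lungData preprocessing from the Feature model example; SampleType is only a stand-in covariate chosen for illustration, and the NA-filtering line reflects my reading of the vignette. The tryCatch wrapper just captures whatever message comes back instead of halting the script.

```r
# Minimal sketch: add one covariate to the Feature model design.
# Assumption: SampleType is used purely as an illustrative covariate.
library(metagenomeSeq)
data(lungData)

# Same preprocessing as the Feature model example
lungData <- lungData[, -which(is.na(pData(lungData)$SmokingStatus))]
lungData <- filterData(lungData, present = 30, depth = 1)
lungData <- cumNorm(lungData, p = 0.5)
pd <- pData(lungData)

# Design matrix with the group variable plus one extra covariate
mod_cov <- model.matrix(~1 + SmokingStatus + SampleType, data = pd)

# Capture the outcome rather than stopping, so the message can be inspected
res_cov <- tryCatch(
  fitFeatureModel(lungData, mod_cov),
  error = function(e) conditionMessage(e)
)
print(res_cov)
```

With the two-column design from the vignette (intercept plus SmokingStatus) the same call runs for me; the problem only appears once the design matrix gains extra columns.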

Let me know if you need any additional information from me. Thanks in advance!

metagenomeSeq • 225 views
