I analyzed a biosynthetic gene cluster dataset using metagenomeSeq but I am confused about the resulting table(s). Perhaps someone can help me understanding what is going on here. When performing the fitZig function (i.e. Zero-inflated Gaussian Model), using a model matrix with an intercept, as such:
# model matrix mod <- model.matrix(~Treatment) colnames(mod) = levels(as.factor(Treatment))
# fitZig function settings <- zigControl(tol=1e-6, maxit = 18, verbose = TRUE) fit <- fitZig(obj = object, mod = mod, useCSSoffset = TRUE, control = settings) zigFit = fit@fit finalMod = zigFit$design contrast.matrix = makeContrasts(paste(make.names(lvls.contrast), "-", make.names(lvls.contrast)), levels = finalMod) fit2 = contrasts.fit(zigFit, contrast.matrix) fit3 = eBayes(fit2) topTable(fit3, coef=paste(lvls.contrast, "-", lvls.contrast)) res <- topTable(fit3,coef=1,adjust="BH",n=Inf)
my resulting topTable shows very high logFolds (30+) and relatively low number of significant hits (range 1 - 20).
However, when I perform the same but without intercept, as such:
# model matrix without intercept mod <- model.matrix(~ 0 + Treatment) colnames(mod) = levels(as.factor(Treatment))
I get realistic logFolds (max. 4) but the number of significant hits are in the range of 400-500. Also, the two resulting tables from 'with intercept' and 'without intercept', rarely have the same logFold direction (positive or negative).
THEN, I tried the fitFeatureModel (see below) and I get huge amounts of significant hits (range of 500-600) with a 'realistic' logFold (max. 4)
fit <- fitFeatureModel(object, mod)
So, my confusion is about which models to use. I also don't understand how it's possible that the resulting table from the different model.matrix gives opposite directions (logFolds) for same features.
If someone could explain this, I would be very grateful.