I am analysing an proteome dataset, derived from DIA mass spec with quantified values from Biognosys Spectronaut 11, and I would like to use LIMMA for differential expression analysis. Im just a beginner and not very experienced yet so please forgive me if my question isn't perfectly stated.
According to the LIMMA manual, one should test for heteroscedasticity and correct for it if present.
I tested my data using a linear model testing I found online. The test, output below, showed heteroscedasticity to be present in the raw data. The strange thing is that LIMMA did not change that as these after transformation still shows heteroscedasticity (though with a higher but still very low Breusch-Pagan test p-value)
I started playing with the data, testing also log2 transformation and VSN stabilisation. All matrixes continue to show heteroscedasticity if tested but after VSN transformation the meanSdPlot(vsnMatrix) shows the "straightest" line. (compared with the meanSdPlot of raw, voom or log2 matrix)
After reviewing the results below, what would you recommend me to do? Simply log2 transformation, VOOM or SVN before LIMMA? How badly does it affect results if only log2 transformed data are used?
Thanks for your suggestions! Sebastian
PS: if you know how to resize pictures postet here let me know :)
Analysis: - Data: raw expression matrix with optionally: voom, log2, SVN transformation. - Analysis (shown for rawMatrix, others simply with matrix after transformation)
variance_rawMatrix <- apply(raw_matrix, 1, var)
mean_rawMatrix <- apply(raw_matrix, 1, mean)
var_and_mean_rawMatrix <- data.frame("variance" = variance_noVoom,
"mean" = mean_noVoom,
row.names = names(variance_noVoom))
lmMod_rawMatrix <- lm(variance ~ mean, data=var_and_mean_rawMatrix)
par(mfrow=c(2,2))
plot(lmMod_rawMatrix)
mtext("Linear model quality testing of raw data", side = 3, line = -1.5, outer = TRUE)
lmtest::bptest(lmMod_rawMatrix)
Transformation
voomMatrix <- voom(rawMatrix, design = design, plot = T)
log2Matrix <- log2(rawMatrix)
vsnMatrix <- normalizeVSN(rawMatrix)
Results:
Raw matrix: Breusch-Pagan: BP = 2264.5, df = 1, p-value < 2.2e-16
VOOM matrix BP = 20.044, df = 1, p-value = 7.567e-06
log2 matrix BP = 25.474, df = 1, p-value = 4.483e-07
SVN matrix BP = 42.718, df = 1, p-value = 6.323e-11
Is it maybe an option to use the log2 matrix and then use either trend = T and/or robust = T ?