I have a large RNA-seq data set (146 samples). I ran the voom transformation with a design matrix that contains the factor for my 4 groups of interest plus 4 more factors that could also affect expression. I am not interested in those extra factors; I only added them to capture the variation they may introduce.
I ran a PCA on the weights that voom returns and saw that my samples cluster into the 4 groups I want to test for DE. So I wondered what this means: is it something we should expect, does it indicate some bias, or is it not important?
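To make the check concrete, here is a minimal sketch of the PCA step in Python with NumPy. It is only an illustration, not my exact code: the weights matrix is random placeholder data standing in for the genes x samples weights that voom returns, and the names `pca_scores`, `weights` and `groups` are hypothetical. With random weights no grouping should appear; with my real weights the first components separate the 4 groups.

```python
import numpy as np


def pca_scores(weights, n_components=2):
    """Return per-sample scores on the leading principal components.

    weights: genes x samples array (placeholder for the voom weights).
    """
    x = np.asarray(weights, dtype=float).T  # samples as rows, genes as columns
    x = x - x.mean(axis=0)                  # center each gene across samples
    # SVD of the centered matrix: columns of u * s are the PC scores
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
n_genes, n_samples = 500, 146
groups = np.arange(n_samples) % 4  # hypothetical labels for the 4 groups

# Placeholder: positive random values, NOT real voom output
weights = rng.gamma(shape=5.0, scale=1.0, size=(n_genes, n_samples))

scores = pca_scores(weights)

# Compare the mean PC1 score per group to see whether groups separate
for g in range(4):
    print(g, scores[groups == g, 0].mean())
```

Plotting `scores[:, 0]` against `scores[:, 1]` colored by `groups` gives the picture I am describing.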
I tried to think it through. I expected that the weights would not be correlated with anything, but some genes are separating the samples through their weight values. Since the weights are used in the glm and they are correlated with my groups, I don't know whether the results will be correct.
Thanks in advance.