Hello,
I'm facing a serious problem trying to reproduce the results of a given Series from GEO. For better understanding, I will briefly describe the experiment.
There are 22 samples, arranged in pairs of replicates, and the first two belong to the control group. I managed to calculate the mean value of each pair of replicates, the logFC (the log ratio in comparison to control) and also the fold change. In order to find differentially expressed genes (twofold up or down compared to control) that are common to at least 9 of the 10 sample groups, I used LIMMA in R.
My code:
samples <- as.factor(samples)
design <- model.matrix(~0 + samples)
fit <- lmFit(exprSet, design)
contrast.matrix <- makeContrasts(RF_control= control-RF, LNIT_control= control-LNIT,
                                 REC_control= control-REC, LIP_control=control-LIP,
                                 BUR_control= control-BUR, BRI_control=control-BRI,
                                 UL_control=control-UL, FF_control=control-FF,
                                 LT_control=control-LT, LTMAS_control= control-LTMAS,
                                 levels=design)
fits <- contrasts.fit(fit, contrast.matrix)
eFit <- eBayes(fits)
topTable(eFit, number=10, coef=1)
nrow(topTable(eFit, coef=1, number=10000, lfc=2))
probeset.list <- topTable(eFit, coef=1, number=10000, lfc=2)
gene.symbols <- getSYMBOL(rownames(probeset.list), "hgu133plus2")
results <- cbind(probeset.list, gene.symbols)
write.table(results, "results1.txt", sep="\t", quote=FALSE)
I compared the logFC values generated by LIMMA for some genes with the logFC values that I had calculated before (which are definitely right), and they are different. Am I using LIMMA incorrectly?
And how do I combine the data from the different coefficients, since they result in different numbers of rows?
Thank you very much!
Thank you very much for helping! I found the error: R somehow confused the targets, so the samples have different targets, which means that when I compared them to the control group, the comparison was actually made against another group.
All targets are being changed in 'design'. I don't get it, because my phenodata looks like this:
You seem to have incorrectly renamed your columns for design. I'm guessing that you named your columns based on unique values in samples, which is wrong. You should be doing:
Regarding collating DE results
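The code from the reply above did not survive in this thread. As a minimal sketch of the design-matrix naming issue it describes, assuming hypothetical sample labels, the columns of the design matrix follow the factor levels, so naming them from levels(samples) keeps labels and columns in step, whereas unique(samples) returns the labels in order of first appearance:

```r
# Hypothetical sample labels in the order they appear in the phenodata
samples <- factor(c("control", "control", "RF", "RF", "LNIT", "LNIT"))

design <- model.matrix(~0 + samples)

# The columns of the design matrix are ordered by levels(samples),
# not by the order in which the labels first appear in the data
colnames(design) <- levels(samples)

# By contrast, this would attach the wrong label to at least one column here:
# colnames(design) <- unique(as.character(samples))

# Sanity check: each sample has a 1 in the column that carries its own label
idx <- cbind(seq_along(samples), match(as.character(samples), colnames(design)))
stopifnot(all(design[idx] == 1))
```

Running a check like the last two lines on the real design matrix is a quick way to confirm that no group has been swapped before building the contrasts.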
Did you mean it like this:
topTable(eFit, number=10, coef=1, sort.by="none")
topTable(eFit, number=10, coef=2, sort.by="none")
...
Is there a way to tell R to go through all coefficients at once?
It's called a loop:
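The loop from the original reply is missing from this thread. One possible version, as a sketch on simulated data (the expression matrix, the three groups and the two contrasts below are stand-ins, not the real experiment), calls topTable once per contrast with sort.by="none" and number=Inf so that every table keeps all probes in the original row order:

```r
library(limma)

# Simulated stand-in for the objects in the question (all values hypothetical)
set.seed(1)
samples <- factor(rep(c("control", "RF", "LNIT"), each = 2))
exprSet <- matrix(rnorm(100 * 6), nrow = 100,
                  dimnames = list(paste0("probe", 1:100), NULL))

design <- model.matrix(~0 + samples)
colnames(design) <- levels(samples)
contrast.matrix <- makeContrasts(RF_control = control - RF,
                                 LNIT_control = control - LNIT,
                                 levels = design)
eFit <- eBayes(contrasts.fit(lmFit(exprSet, design), contrast.matrix))

# One table per contrast; no sorting and no row limit, so rows line up
tables <- list()
for (i in seq_len(ncol(contrast.matrix))) {
  tables[[colnames(contrast.matrix)[i]]] <-
    topTable(eFit, coef = i, number = Inf, sort.by = "none")
}

# Collect the logFC columns side by side, one column per contrast
logFC <- sapply(tables, function(tab) tab$logFC)
rownames(logFC) <- rownames(tables[[1]])
```

Because no table is filtered or reordered, the per-contrast columns can be bound together directly; any fold-change filtering is then applied to the combined table rather than inside topTable.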
I'm so sorry for disturbing you again ...
I used your for loop. It works as long as I don't define lfc, which I need to define because I'm only interested in genes with a twofold change. But since I then get a different number of rows for each coefficient, I cannot combine them into one table. I tried to do it with this method
But then it reorders the transcripts, which makes the whole table useless. Do you know a solution?
Well, for starters, don't use lfc; use treat instead. See the "Note" in ?topTable.
I get this error:
Error in topTable(eFit, number = 10, coef = 1, sort.by = "none", n = Inf) :
unused argument (n = Inf)
n and number refer to the same argument, which can't be specified twice. Look at ?topTable for more details.
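Putting the last two replies together, a sketch of the treat route might look like the following. The data are simulated, min.groups is a hypothetical name, and the 0.05 cutoff is an assumption rather than something stated in the thread. Note that lfc is on the log2 scale, so a twofold change corresponds to lfc = 1, and that treat tests whether the change is significantly larger than the threshold, which is stricter than a plain cutoff on the estimated logFC.

```r
library(limma)

# Simulated stand-in for the objects in the question (all values hypothetical)
set.seed(1)
samples <- factor(rep(c("control", "RF", "LNIT"), each = 2))
exprSet <- matrix(rnorm(100 * 6), nrow = 100,
                  dimnames = list(paste0("probe", 1:100), NULL))
design <- model.matrix(~0 + samples)
colnames(design) <- levels(samples)
contrast.matrix <- makeContrasts(RF_control = control - RF,
                                 LNIT_control = control - LNIT,
                                 levels = design)
fits <- contrasts.fit(lmFit(exprSet, design), contrast.matrix)

# log2(2) = 1, i.e. test against a twofold change
tFit <- treat(fits, lfc = log2(2))

# Pass number OR n, never both; sort.by = "none" keeps the original row order
tab1 <- topTreat(tFit, coef = 1, number = Inf, sort.by = "none")

# Per-contrast BH adjustment, then count in how many contrasts each probe passes
adj.p <- apply(tFit$p.value, 2, p.adjust, method = "BH")
n.sig <- rowSums(adj.p < 0.05)

min.groups <- 2  # assumption for this toy example; 9 of 10 in the question
common <- rownames(exprSet)[n.sig >= min.groups]
```

Since every step works on full, unsorted matrices, the row order never changes and the "at least 9 of 10 groups" filter reduces to a single comparison on n.sig.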