Question: Why my limma DEG result in top Table all without significant meaning,showing logFC <1.5 or logFC>-1.5 ?
0
13 months ago by
Biomed0
Biomed0 wrote:

Hi, all

I'm dealing with Agilent single channel microarray when I filtering DEGs using limma, I find all my results showing logFC>-1.5 or logFC<1.5, without significant expression. I have been searching reason for a long time, but I still don't know what my faults are, I really need some help, thanks! My code as following:

treatment<-targets$Treatment Level<-c("neurologically healthy control","Parkinson disease") Group<-factor(treatment,levels = Level) Design<-model.matrix(~0+Group) colnames(Design) <- c("Control","PD") fita<-lmFit(exprs(eSet),Design) cont.matrix <- makeContrasts(Diff=PD-Control, levels=Design) fitb<- contrasts.fit(fita, cont.matrix) fitc<- eBayes(fitb) ##result Result<-topTable(fitc,adjust="BH",number = Inf,p.value = 0.05) dt_result<-decideTests(fitc,lfc = log2(1.5),p.value=0.05,method="separate") summary(dt_result)range (Result$logFC)
[1] -1.471461  1.192352

limma toptable logfc deg • 451 views
modified 13 months ago by Gordon Smyth39k • written 13 months ago by Biomed0

Cross-posted on Biostars: https://www.biostars.org/p/336903/

Answer: Why my limma DEG result in top Table all without significant meaning,showing log
2
13 months ago by
Gordon Smyth39k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth39k wrote:

Not every dataset gives DE results

If I understand your question correctly, the situation seems to have that you done an DE analysis between two groups and don't get any significantly DE genes. You don't show the output from summary(decideTests()) but I am guessing that it shows no up or down genes.

Why is that a problem? If your data is noisy, there may be no clear differential expression between the groups. The whole point of statistically rigorous DE methods is that they don't give significant results for every dataset.

If you are worried about not getting any DE genes, why are you imposing a fold-change cutoff as well as a FDR cutoff? That's almost uncertainly an unnecessary and counter-productive thing to do and it will naturally decrease your changes of getting any significant results. The limma documentation (help("decideTests")) advises against doing that.

Use up-to-date software

It is clear from your comments on Biostars that you are using the Agi4x44PreProcess package to do the pre-processing, even though that package was removed from Bioconductor many years ago. See Agi4x44PreProcess for Bioconductor version 3.0 . You must be using very old versions of R and Bioconductor. You can't really expect us to answer questions about obsolete software. Please use up-to-date software and please analyse the Agilent data in the simpler and better way that is recommended in the limma User's Guide.

BTW, cross-posting the same question to both Bioconductor and Biostars at the same time is considered impolite, especially when you supply different information to the different threads.

Fold-change vs logFC

Finally, there is a confusion in your question about log-fold-change cutoffs. Your call to decideTests() has imposed a fold-change cutoff of 1.5 (meaning that logFC > log2(1.5) = 0.58) but the title of your question refers to a 1.5 cutoff for logFC itself. These are not the same thing, so the use of 1.5 in the title of your question is incorrect.

This particular dataset

You mention on Biostars (but not on Bioconductor!) that the dataset you are analysing is from this paper: https://doi.org/10.1371/journal.pgen.1002794 . I analysed this data myself and had no difficulting finding DE genes. The use of fancy filtering or array weights would have yielded more differential expression, but I've kept it simple. Here is my complete analysis. The data was downloaded from https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-812/

> files
[1] "C_1.txt"  "C_10.txt" "C_11.txt" "C_12.txt" "C_13.txt" "C_14.txt" "C_16.txt" "C_17.txt"
[9] "C_18.txt" "C_19.txt" "C_2.txt"  "C_20.txt" "C_21.txt" "C_22.txt" "C_23.txt" "C_24.txt"
[17] "C_25.txt" "C_26.txt" "C_27.txt" "C_29.txt" "C_3.txt"  "C_4.txt"  "C_5.txt"  "C_6.txt"
[25] "C_8.txt"  "C_9.txt"  "P_1.txt"  "P_11.txt" "P_12.txt" "P_13.txt" "P_14.txt" "P_15.txt"
[33] "P_16.txt" "P_18.txt" "P_19.txt" "P_20.txt" "P_21.txt" "P_22.txt" "P_24.txt" "P_26.txt"
[41] "P_27.txt" "P_28.txt" "P_29.txt" "P_3.txt"  "P_30.txt" "P_31.txt" "P_33.txt" "P_4.txt"
[49] "P_5.txt"  "P_6.txt"  "P_7.txt"  "P_8.txt"  "P_9.txt"
> xreg <- x[x$genes$ControlType==0,]
> xreg <- backgroundCorrect(xreg,method="normexp",offset=0)
> y <- normalizeBetweenArrays(xreg,method="quantile")
> Park <- factor(substring(colnames(y),1,1))
> design <- model.matrix(~Park)
> fit <- lmFit(y,design)
> fit <- eBayes(fit,robust=TRUE,trend=TRUE)
> summary(decideTests(fit))
(Intercept) ParkP
Down             0   362
NotSig           0 42137
Up           43376   877

Thanks, Gordon, and very sorry for my rude action.

It's true that I didn't get any result. Maybe there are some problems with my data before limma. I will check it first.

Thanks again!