Question

Why my limma DEG result in top Table all without significant meaning,showing logFC <1.5 or logFC>-1.5 ?

0

Entering edit mode

Biomed • 0

@biomed-17196

Last seen 5.7 years ago

Hi, all

I'm dealing with Agilent single channel microarray when I filtering DEGs using limma, I find all my results showing logFC>-1.5 or logFC<1.5, without significant expression. I have been searching reason for a long time, but I still don't know what my faults are, I really need some help, thanks! My code as following:

treatment<-targets$Treatment
Level<-c("neurologically healthy control","Parkinson disease")
Group<-factor(treatment,levels = Level)
Design<-model.matrix(~0+Group)
colnames(Design) <- c("Control","PD")
fita<-lmFit(exprs(eSet),Design)
cont.matrix <- makeContrasts(Diff=PD-Control, levels=Design)
fitb<- contrasts.fit(fita, cont.matrix)
fitc<- eBayes(fitb)

##result
Result<-topTable(fitc,adjust="BH",number = Inf,p.value = 0.05)
dt_result<-decideTests(fitc,lfc = log2(1.5),p.value=0.05,method="separate")
summary(dt_result)range

(Result$logFC)
[1] -1.471461 1.192352

limma toptable deg logfc • 2.3k views

ADD COMMENT • link written 5.7 years ago by Biomed • 0

0

Entering edit mode

Cross-posted on Biostars: https://www.biostars.org/p/336903/

ADD REPLY • link 5.7 years ago Kevin Blighe ★ 3.9k

score 2 · Answer 1 · 2018-09-09

Not every dataset gives DE results

If I understand your question correctly, the situation seems to have that you done an DE analysis between two groups and don't get any significantly DE genes. You don't show the output from summary(decideTests()) but I am guessing that it shows no up or down genes.

Why is that a problem? If your data is noisy, there may be no clear differential expression between the groups. The whole point of statistically rigorous DE methods is that they don't give significant results for every dataset.

If you are worried about not getting any DE genes, why are you imposing a fold-change cutoff as well as a FDR cutoff? That's almost uncertainly an unnecessary and counter-productive thing to do and it will naturally decrease your changes of getting any significant results. The limma documentation (help("decideTests")) advises against doing that.

Use up-to-date software

It is clear from your comments on Biostars that you are using the Agi4x44PreProcess package to do the pre-processing, even though that package was removed from Bioconductor many years ago. See Agi4x44PreProcess for Bioconductor version 3.0 . You must be using very old versions of R and Bioconductor. You can't really expect us to answer questions about obsolete software. Please use up-to-date software and please analyse the Agilent data in the simpler and better way that is recommended in the limma User's Guide.

BTW, cross-posting the same question to both Bioconductor and Biostars at the same time is considered impolite, especially when you supply different information to the different threads.

Fold-change vs logFC

Finally, there is a confusion in your question about log-fold-change cutoffs. Your call to decideTests() has imposed a fold-change cutoff of 1.5 (meaning that logFC > log2(1.5) = 0.58) but the title of your question refers to a 1.5 cutoff for logFC itself. These are not the same thing, so the use of 1.5 in the title of your question is incorrect.

This particular dataset

You mention on Biostars (but not on Bioconductor!) that the dataset you are analysing is from this paper: https://doi.org/10.1371/journal.pgen.1002794 . I analysed this data myself and had no difficulting finding DE genes. The use of fancy filtering or array weights would have yielded more differential expression, but I've kept it simple. Here is my complete analysis. The data was downloaded from https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-812/

> files
 [1] "C_1.txt"  "C_10.txt" "C_11.txt" "C_12.txt" "C_13.txt" "C_14.txt" "C_16.txt" "C_17.txt"
 [9] "C_18.txt" "C_19.txt" "C_2.txt"  "C_20.txt" "C_21.txt" "C_22.txt" "C_23.txt" "C_24.txt"
[17] "C_25.txt" "C_26.txt" "C_27.txt" "C_29.txt" "C_3.txt"  "C_4.txt"  "C_5.txt"  "C_6.txt" 
[25] "C_8.txt"  "C_9.txt"  "P_1.txt"  "P_11.txt" "P_12.txt" "P_13.txt" "P_14.txt" "P_15.txt"
[33] "P_16.txt" "P_18.txt" "P_19.txt" "P_20.txt" "P_21.txt" "P_22.txt" "P_24.txt" "P_26.txt"
[41] "P_27.txt" "P_28.txt" "P_29.txt" "P_3.txt"  "P_30.txt" "P_31.txt" "P_33.txt" "P_4.txt" 
[49] "P_5.txt"  "P_6.txt"  "P_7.txt"  "P_8.txt"  "P_9.txt" 
> x <- read.maimages(files,source="agilent",green.only=TRUE)
> xreg <- x[x$genes$ControlType==0,]
> xreg <- backgroundCorrect(xreg,method="normexp",offset=0)
> y <- normalizeBetweenArrays(xreg,method="quantile")
> Park <- factor(substring(colnames(y),1,1))
> design <- model.matrix(~Park)
> fit <- lmFit(y,design)
> fit <- eBayes(fit,robust=TRUE,trend=TRUE)
> summary(decideTests(fit))
       (Intercept) ParkP
Down             0   362
NotSig           0 42137
Up           43376   877