While trying to remove non differentially expressed genes through limma, I am getting a warning
1
0
Entering edit mode
Rishav ▴ 20
@rishav-25031
Last seen 15 months ago
Patna

While performing co-expression analysis of my RNA seq data, I tried to remove non differentially expressed genes using limma, but I am getting a warning. Please help me out. My RNA seq count data is "hep"


>log_counts <- log2(data.matrix(hep[,-1]))
> sample <- factor(rep(c("Non.neoplastic","Tumor"), each=21))
> design.mat <- model.matrix(~0+sample)
> colnames(design.mat) <- levels(sample)
> contrast.mat <- makeContrasts(diff = "Non.neoplastic-Tumor", levels = design.mat)
> fit <- lmFit(log_counts, design.mat)
> fit2 <- contrasts.fit(fit, contrast.mat)
> fit3 <- eBayes(fit2)
# Warning:
Zero sample variances detected, have been offset away from zero

RNASeqData • 769 views
1
Entering edit mode

Since you have a question ongoing on biostars https://www.biostars.org/p/9463542/ where one can see a shapshot of the data...I would assume that this is raw counts that you transform here and feed to limma? Did you normalize this? It is strongly recommended to exactly follow code from the manual if you are new to the topic. This would include the prefiltering step that is mentioned below. Something like https://f1000research.com/articles/5-1408/v3

0
Entering edit mode

After performing log transformation, I tried to remove any genes with _zero_ variance through this code

> log_counts <- log_counts[apply(log_counts, 1, var) > 0,]
> sample <- factor(rep(c("Non.neoplastic","Tumor"), each=21))
> design.mat <- model.matrix(~0+sample)
> colnames(design.mat) <- levels(sample)
> contrast.mat <- makeContrasts(diff = "Non.neoplastic-Tumor", levels = design.mat)
> fit <- lmFit(log_counts, design.mat)
> fit2 <- contrasts.fit(fit, contrast.mat)
> fit3 <- eBayes(fit2, robust=TRUE)
> deg <- topTable(fit3, coef = 'diff', p.value=0.05, adjust.method = 'fdr',number= nrow(log_counts))
> dim(deg)
[1] 0 0


Even then, I am getting Zero variation in genes

1
Entering edit mode

Not good, please follow the linked workflow. It covers normalization, filtering and data transformation via voom, so all the topics that are currently lacking or not done properly.

0
Entering edit mode
@gordon-smyth
Last seen 6 hours ago
WEHI, Melbourne, Australia

The warning seems self-explanatory, is it not? Some of your genes have identical expression values for all samples so the variance is zero. This will make limma's empirical Bayes procedure less effective.

There are three ways you can respond:

1. Filter out unexpressed genes. The warning is quite probably caused by the fact that you have not filtered out unexpressed genes, so you may be including genes for which expression is at the minimum value for all samples.

2. Set robust=TRUE in eBayes, which will make the warning less important.

3. It is just a warning so you can just ignore it.

Later

With the help of ATpoint's comments, it now appears that you are simply inputing RNA-seq counts to limma as if they were expression values without any normalization for library size. This makes your whole analysis incorrect. Probably may of the rows are entirely zero, leading to the warning you see. You instead should follow an appropriate RNA-seq workflow such as voom.

0
Entering edit mode

On performing this, I am getting this warning, what should I do now

> fit3 <- eBayes(fit2, robust=TRUE)


Warning message: 11030 very small variances detected, have been offset away from zero

0
Entering edit mode

0
Entering edit mode

After ignoring this warning, I proceed further and I got this

> deg <- topTable(fit3, coef = 'diff', p.value=0.05, adjust.method = 'fdr',number= nrow(log_counts))
> dim(deg)


[1] 0 0

0
Entering edit mode

I suspect that your whole analysis is probably incorrect because you are probably inputing and preprocessing the data incorrectly. Please follow one of our workflows for RNA-seq data.