Dear all,
I have 6 samples which are three High marbling Hanwoo(Korean cattle) and three
Low marbling and each sample have 4 different stages(6mon, 12mon, 24mon, 30mon).
I'd like to find DEG from each month and intersected genes which are located in center of Venn-diagram
of all months.
Initially, I have analyzed with cufflinks and cuffdiff package to detect DEGs and now, try to do with limma.
Here is my question
I find more than 238 DEGs with Cuffdiff(24mon, q-value < 0.05), However, when I used limma package,
it gave me only 1 DEG(adj.P.Val < 0.05). I know these two tools have different protocol, and way to find DEGs
such as statistical test, normalization, expression value etc.
I'm just wondering there are some wrong codes in my scripts, if not, why only 1 gene came from.
My R scripts are below, any help should be appreciated. thanks.
library(edgeR)
library(limma)
data <- read.table("24mon_H_L.txt", sep = "\t",header = T, row.names = 1)
data <- data[,-1]
head(data)
dim(data)
#filtering
isexpr <- rowSums(cpm(data) > 1) >= 3
table(isexpr)
data <- data[isexpr,]
dim(data)
genes <- rownames(data)
colnames(data)
#makes model matrix
group <- factor(c(0,0,0,1,1,1))
group
design <- model.matrix(~0 + group)
design
colnames(design) <- c("High", "Low")
# calculate normalization factors
nf <- calcNormFactors(data, method = "TMM")
nf
# normalize the read count with voom
y <- voom(data, design, lib.size = colSums(data) * nf)
y
# extract normalized read counts
counts.voom <- y$E
head(counts.voom)
dim(counts.voom)
# save normalized expression
write.table(counts.voom, "counts.voom.txt", row.names = T, quote = F, sep = "\t")
# linear model
fit <- lmFit(y, design)
fit
# contrast matrix
cont.matrix <- makeContrasts(High-Low, levels = design)
cont.matrix
fit <- contrasts.fit(fit, cont.matrix)
fit
# Bayes
fit <- eBayes(fit)
dim(fit)
fit
# topTable
colnames(fit$coefficients)
mycoef = "High - Low"
topTable(fit, coef = mycoef)
limma.res <- topTable(fit, coef = mycoef, n = dim(fit)[1])
deg <- limma.res[which(limma.res$adj.P.Val <= 0.05),]
deg
thanks for your response James, so you mean i'd better use kallisto and sleuth to find DEG instead of Cuffdiff, right?
what do you think about this, detect DEG from limma or cuffdiff by p-value and log2(FC) even if there are some not corrected type 1 error by multiple test. Because I read some papers which have used this cutoff values as well.
I'm sorry but I have one more question, you said you have experience with using cuffdiff.
when I used cuffdiff and saw p-value, There are limited p-values for genes and they post the links why this happened. links: http://cole-trapnell-lab.github.io/cufflinks/releases/v2.1.0/
How do you deal with this problem? In my opinion, this is not reasonable, somewhat weird.
Any ideas?
thanks.