Dear All,
I have a matrix "U2" with 44 samples (41 Disease + 3 Normal) as columns and approx. 15k genes as rows. "coldata" is where samples as rows and columns "Subtypes" and "Type". Column "Type" is with Disease and Normal.
library(DESeq2) dds <- DESeqDataSetFromMatrix(countData = U2, colData = coldata, design = ~ Type) dds class: DESeqDataSet dim: 15754 44 metadata(1): version assays(1): counts rownames(15754): A1BG-AS1 A2M-AS1 ... ZSWIM8-AS1 hsa-mir-1253 rowData names(0): colnames(44): AT PT ... CT OT colData names(2): Subtypes Type keep <- rowSums(counts(dds)) >= 10 dds <- dds[keep,] nrow(dds) [1] 11755 dds$Type <- factor(dds$Type, levels = c("Normal","Disease")) dds <- DESeq(dds) estimating size factors estimating dispersions gene-wise dispersion estimates mean-dispersion relationship final dispersion estimates fitting model and testing -- replacing outliers and refitting for 1206 genes -- DESeq argument 'minReplicatesForReplace' = 7 -- original counts are preserved in counts(dds) estimating dispersions fitting model and testing res <- results(dds) res res <- results(dds, name="Type_Disease_vs_Normal") res <- results(dds, contrast=c("Type","Disease","Normal")) resultsNames(dds) [1] "Intercept" "Type_Disease_vs_Normal" Differential analysis: res05 <- results(dds, lfcThreshold = 1.2) table(res05$padj < 0.05) summary(res05) out of 11755 with nonzero total read count adjusted p-value < 0.1 LFC > 1.20 (up) : 2, 0.016% LFC < -1.20 (down) : 0, 0% outliers [1] : 72, 0.57% low counts [2] : 33, 0.26% (mean count < 0) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
With DeSeq2 I see only two upregulated and no downregulated genes using lofFC 1.2 and FDR < 0.05.
I also used same data with edgeR.
library(edgeR) group <- factor(paste0(samples2$Type)) y <- DGEList(U2,group = group) design2 <- model.matrix(~ 0 + group) colnames(design2) <- c("Normal","Disease") design2 keep <- filterByExpr(y, design2) keep <- rowSums(cpm(y) > 0.5) >= 1 table(keep) keep FALSE TRUE 1573 14181 summary(keep) y <- y[keep, , keep.lib.sizes=FALSE] y <- calcNormFactors(y,method = "TMM") y <- estimateDisp(y, design2, robust=TRUE) y$common.dispersion [1] 1.428964 fit <- glmQLFit(y, design2, robust=TRUE) head(fit$coefficients) contrast.matrix <- makeContrasts(Disease-Normal, levels=design2) contrast.matrix qlf <- glmQLFTest(fit, contrast=contrast.matrix) topTags(qlf) summary(decideTestsDGE(qlf)) tr <- glmTreat(fit, contrast=contrast.matrix, lfc=log2(1.2)) topTags(tr) tab2 <- topTags(tr,n=Inf) keep <- tab2$table$FDR <= 0.05 summary(decideTestsDGE(tr)) -1*Normal 1*Disease Down 203 NotSig 13974 Up 4
Questions:
1) Is my analysis with DeSeq2 right? I feel so because it doesn't show any down regulated genes.
2) with deseq2 there are two upregulated and with edgeR there are four. And none of them are same.
3) My analysis is with mainly lncRNAs. These have very very low expression values. So, Is there a better way to do differential analysis?
Thank you
Thank you very much
Hi Michael,
I have a very small cohort 28 cancer vs 3 Normal only. For this type of cohort for differential analysis do I need keep any special arguments while using deseq2?