I have data like the following: 56318 genes and two cell lines with count data.
head(counts)[1:5,]
             Name Description Cell-line1 Cell-line2
1 ENSG00000223972     DDX11L1          1          2
2 ENSG00000227232      WASH7P       1639       1138
3 ENSG00000243485  MIR1302-11          7          1
4 ENSG00000237613     FAM138A          0          2
5 ENSG00000268020      OR4G4P          0          0
library(edgeR)
y <- DGEList(counts = counts[,3:4], genes = counts[,2])
o <- order(rowSums(y$counts), decreasing=TRUE)
y <- y[o,]
d <- duplicated(y$genes$genes)
y <- y[!d,]
nrow(y)
[1] 54354
y$samples$lib.size <- colSums(y$counts)
y <- calcNormFactors(y)
y$samples
           group  lib.size norm.factors
Cell-line1     1 153195968     0.969847
Cell-line2     1  96981415     1.031090
Patient <- factor(c("Cell-line1", "Cell-line2"))
Tissue <- factor(c("BREAST1","BREAST2"))
data.frame(Sample=colnames(y),Patient,Tissue)
      Sample    Patient  Tissue
1 Cell-line1 Cell-line1 BREAST1
2 Cell-line2 Cell-line2 BREAST2
design <- model.matrix(~Patient+Tissue)
rownames(design) <- colnames(y)
design
y <- estimateDisp(y, design)
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group, :
  No residual df: setting dispersion to NA
Can anyone please help me work out what's wrong with the data or the code?
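The "No residual df" warning can be reproduced without edgeR at all. The snippet below is a minimal sketch in base R, using the same `Patient` and `Tissue` factors as in the post; the variable names `n_samples`, `design_rank` and `residual_df` are only for illustration. It shows that two samples and a design that tells them apart leave zero residual degrees of freedom, so there is nothing left to estimate a dispersion from.

```r
# Same two-sample layout as in the question
Patient <- factor(c("Cell-line1", "Cell-line2"))
Tissue  <- factor(c("BREAST1", "BREAST2"))
design  <- model.matrix(~ Patient + Tissue)

n_samples   <- nrow(design)     # one row per sample
n_columns   <- ncol(design)     # intercept + Patient + Tissue
design_rank <- qr(design)$rank  # Patient and Tissue columns are identical here

# Residual degrees of freedom = samples minus rank of the design
residual_df <- n_samples - design_rank
residual_df
```

Note that the Patient and Tissue columns are confounded as well, but dropping one of them would not help: even `~ Patient` alone uses up both samples, so the residual df stays at zero until replicates are added.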
This is not a DESeq2 question so I’ve removed the tag.
Hi Michael,
I would like to know whether I can do a differential expression analysis between the two cell lines with DESeq2?
DESeq2 needs replicates to perform a differential analysis. It will give you a warning or an error if you try to analyze data without replicates.
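For completeness, the edgeR User's Guide describes a fallback for data with no replication: supply a dispersion value yourself instead of estimating it. The following is only a sketch of that idea. The count matrix `toy` is simulated placeholder data, not the counts from this thread, and `bcv <- 0.4` is an assumed biological coefficient of variation, so any p-values it produces are exploratory at best and depend entirely on that assumption.

```r
library(edgeR)

# Toy counts standing in for the real table (hypothetical, simulated values)
set.seed(1)
toy <- matrix(rnbinom(2000, mu = 100, size = 5), ncol = 2,
              dimnames = list(paste0("gene", 1:1000),
                              c("Cell-line1", "Cell-line2")))

# One sample per group, so the group factor has two levels
y <- DGEList(counts = toy, group = factor(c("Cell-line1", "Cell-line2")))
y <- calcNormFactors(y)

# Assumed BCV; this is chosen by the analyst, NOT estimated from the data
bcv <- 0.4

# Exact test between the two groups using the fixed, assumed dispersion
et <- exactTest(y, dispersion = bcv^2)
topTags(et)
```

The results change with the chosen `bcv`, so this is no substitute for sequencing real replicates.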