Question: edgeR for differential analysis between two cell lines without replication
10 weeks ago, Biologist wrote:

I have data like the following: 56318 genes, with count data for two cell lines.


             Name Description       Cell-line1     Cell-line2
1 ENSG00000223972     DDX11L1            1               2
2 ENSG00000227232      WASH7P         1639            1138
3 ENSG00000243485  MIR1302-11            7               1
4 ENSG00000237613     FAM138A            0               2
5 ENSG00000268020      OR4G4P            0               0

y <- DGEList(counts = counts[,3:4], genes = counts[,2])

# Keep the highest-count row for each duplicated gene symbol
o <- order(rowSums(y$counts), decreasing=TRUE)
y <- y[o,]
d <- duplicated(y$genes$genes)
y <- y[!d,]
nrow(y)
[1] 54354

# Recompute library sizes after filtering, then TMM-normalize
y$samples$lib.size <- colSums(y$counts)
y <- calcNormFactors(y)
y$samples
           group  lib.size norm.factors
Cell-line1     1 153195968     0.969847
Cell-line2     1  96981415     1.031090

Patient <- factor(c("Cell-line1", "Cell-line2"))
Tissue <- factor(c("BREAST1","BREAST2"))

       Sample     Patient   Tissue
1    Cell-line1  Cell-line1 BREAST1
2    Cell-line2  Cell-line2 BREAST2

design <- model.matrix(~Patient+Tissue)

rownames(design) <- colnames(y)

y <- estimateDisp(y, design)
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group,  :
  No residual df: setting dispersion to NA

Can anyone please help me work out what is wrong with the data or the code?


modified 10 weeks ago by Gordon Smyth • written 10 weeks ago by Biologist

This is not a DESeq2 question so I’ve removed the tag.

written 10 weeks ago by Michael Love

Hi Michael,

I would like to know whether I can do a differential analysis between two cell lines with DESeq2.

written 10 weeks ago by Biologist

DESeq2 needs replicates for performing differential analysis. It will give you a warning/error if you try to analyze data without replicates.

written 10 weeks ago by Michael Love

10 weeks ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

See Section 2.11 of the edgeR User's Guide "What to do if you have no replicates".

You asked the same question a few months ago and got the same answer: Differential analysis between single sample vs single sample (control vs treatment) with no replicates
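
For reference, the warning in the question can be reproduced without edgeR. The sketch below uses base R only: it rebuilds the design matrix from the question and counts the residual degrees of freedom, taken here as the number of samples minus the rank of the design matrix. With two samples and a design of rank two, nothing is left over to estimate a dispersion from, which is what "No residual df" reports. The names n_samples, design_rank and residual_df are introduced here for illustration only.

```r
# Factors exactly as defined in the question: one sample per level
Patient <- factor(c("Cell-line1", "Cell-line2"))
Tissue  <- factor(c("BREAST1", "BREAST2"))

# Same design formula as in the question
design <- model.matrix(~ Patient + Tissue)

# Illustrative names (not from the original post)
n_samples   <- nrow(design)        # number of libraries
design_rank <- qr(design)$rank     # number of independent coefficients
residual_df <- n_samples - design_rank

dim(design)    # 2 rows (samples) by 3 columns (coefficients)
residual_df    # no degrees of freedom remain for estimating dispersion
```

Any design fitted to two unreplicated samples runs into the same limit, so the dispersion has to come from outside the data, as the User's Guide section describes.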

modified 10 weeks ago • written 10 weeks ago by Gordon Smyth

Hi Gordon,

Thank you. I followed the tutorial and did the analysis.

I have raw count data for 72 genes in two cell lines, stored in a data frame "df" with three columns. The first column holds the gene names and the other two columns hold the counts for each cell line.

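# Move the gene names into the row names; df now holds only the two count columns
df <- data.frame(df[,-1], row.names=df[,1])

# Use all remaining columns as counts and the row names as gene annotation
y <- DGEList(counts=df, genes=rownames(df), group = 1:2)
y <- calcNormFactors(y, method = "TMM")
y$samples

                group lib.size norm.factors
AU565_BREAST        1     5101    0.8226359
MDAMB468_BREAST     2     6144    1.2156047

# No replicates, so a BCV value is assumed rather than estimated
bcv <- 0.1

et <- exactTest(y, dispersion=bcv^2)
tab <- topTags(et,n=Inf)

Down     8
NotSig  54
Up      10

keep <- tab$table$FDR <= 0.05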

The summary column is headed "1+2". How can I tell in which cell line the 10 genes are upregulated?

written 10 weeks ago by Biologist

The column heading is supposed to be "2-1" meaning group2 vs group1. So the 10 DE genes are up in the MDAMB468 cell line.

You can find out what exactTest() does by reading the help page ?exactTest. By default it compares the 2nd group to the 1st.
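
A small sketch of that default, assuming edgeR is installed. The count values and gene names are invented for illustration; only the sample names and the BCV of 0.1 come from the post above. A positive logFC means higher expression in the second group, and a negative one means higher expression in the first.

```r
library(edgeR)

# Hypothetical counts for three genes (values invented for illustration)
counts <- matrix(c( 10, 200,
                    50,   5,
                   100, 100),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("AU565_BREAST", "MDAMB468_BREAST")))

# group 1 = AU565_BREAST, group 2 = MDAMB468_BREAST
y <- DGEList(counts = counts, group = 1:2)

# No replicates, so a dispersion value is supplied (BCV = 0.1, as in the post)
bcv <- 0.1
et <- exactTest(y, dispersion = bcv^2)

et$comparison     # the two groups compared: first group, then second group
et$table$logFC    # sign is relative to the first group (group 2 vs group 1)
```

To reverse the direction, the pair argument of exactTest() can be given explicitly, for example pair = c(2, 1).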

modified 10 weeks ago • written 10 weeks ago by Gordon Smyth

Thank you very much for the reply.

written 9 weeks ago by Biologist