edgeR for differential analysis between two cell lines without replication
Biologist ▴ 120
@biologist-9801

I have count data for 56318 genes in two cell lines, like the following:

head(counts)[1:5,]

             Name Description       Cell-line1     Cell-line2
1 ENSG00000223972     DDX11L1            1               2
2 ENSG00000227232      WASH7P         1639            1138
3 ENSG00000243485  MIR1302-11            7               1
4 ENSG00000237613     FAM138A            0               2
5 ENSG00000268020      OR4G4P            0               0

library(edgeR)
y <- DGEList(counts = counts[,3:4], genes = counts[,2])

o <- order(rowSums(y$counts), decreasing=TRUE)
y <- y[o,]
d <- duplicated(y$genes$genes)
y <- y[!d,]
nrow(y)
[1] 54354

y$samples$lib.size <- colSums(y$counts)
y <- calcNormFactors(y)
y$samples
           group  lib.size norm.factors
Cell-line1     1 153195968     0.969847
Cell-line2     1  96981415     1.031090

Patient <- factor(c("Cell-line1", "Cell-line2"))
Tissue <- factor(c("BREAST1","BREAST2"))
data.frame(Sample=colnames(y),Patient,Tissue)

       Sample     Patient   Tissue
1    Cell-line1  Cell-line1 BREAST1
2    Cell-line2  Cell-line2 BREAST2

design <- model.matrix(~Patient+Tissue)

rownames(design) <- colnames(y)
design

y <- estimateDisp(y, design)
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group,  :
  No residual df: setting dispersion to NA

Can anyone please help me work out what is wrong with the data or the code?

 


This is not a DESeq2 question so I’ve removed the tag.


Hi Michael,

I would like to know whether I can do a differential analysis between two cell lines with DESeq2.


DESeq2 needs replicates for performing differential analysis. It will give you a warning/error if you try to analyze data without replicates.

@gordon-smyth
WEHI, Melbourne, Australia

See Section 2.11 of the edgeR User's Guide "What to do if you have no replicates".

You asked the same question a few months ago and got the same answer: Differential analysis between single sample vs single sample (control vs treatment) with no replicates
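
To make the "No residual df" warning concrete, here is a minimal base R sketch. It assumes the same two-sample layout as in the question and uses a small helper, residual_df(), which is a name made up for this illustration and not part of edgeR. The residual degrees of freedom are the number of samples minus the rank of the design matrix, and with one sample per cell line nothing is left over to estimate a dispersion from.

```r
# Two unreplicated samples, as in the question.
Patient <- factor(c("Cell-line1", "Cell-line2"))
Tissue  <- factor(c("BREAST1", "BREAST2"))

# Design from the question: Patient and Tissue are completely confounded,
# so the third column adds no information.
design_full <- model.matrix(~ Patient + Tissue)

# The simplest two-group design.
design_simple <- model.matrix(~ Patient)

# Hypothetical helper: residual df = number of samples - rank of the design.
residual_df <- function(design) nrow(design) - qr(design)$rank

residual_df(design_full)    # 2 samples, rank 2
residual_df(design_simple)  # 2 samples, rank 2
```

Both designs leave zero residual degrees of freedom, so dropping the Tissue factor does not help. With one sample per group, estimateDisp() has nothing to estimate the dispersion from, which is why the User's Guide section deals with the no-replicate case separately.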


Hi Gordon,

Thank you. I followed the tutorial and did the analysis.

I have raw count data for 72 genes in two cell lines in a data frame "df". It has three columns: the first column holds the gene names and the other two hold the counts for each cell line.

df <- data.frame(df[,-1], row.names=df[,1])

library(edgeR)
# After the line above, df holds only the two count columns, with genes as row names
y <- DGEList(counts=df, genes=rownames(df), group = 1:2)
y <- calcNormFactors(y,method = "TMM")

y$samples

                group lib.size norm.factors
AU565_BREAST        1     5101    0.8226359
MDAMB468_BREAST     2     6144    1.2156047

bcv <- 0.1

et <- exactTest(y, dispersion=bcv^2)
topTags(et,n=100)
tab <- topTags(et,n=Inf)
summary(decideTestsDGE(et))

       1+2
Down     8
NotSig  54
Up      10

keep <- tab$table$FDR <= 0.05
tab$table[keep,]

The summary shows the heading "1+2". How can I tell in which cell line the 10 genes are upregulated?


The column heading is supposed to be "2-1" meaning group2 vs group1. So the 10 DE genes are up in the MDAMB468 cell line.

You can find out what exactTest() does by reading the help page ?exactTest. By default it compares the 2nd group to the 1st.
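
One way to take the guesswork out of the direction is to name the groups and pass the pair argument explicitly. The following is only a sketch: the four-gene count matrix is invented for illustration, the group labels are shortened versions of the sample names in the post, and bcv = 0.1 is simply carried over from the code above and is an assumption, not an estimate.

```r
library(edgeR)

# Hypothetical counts for four genes in two unreplicated samples.
counts <- matrix(c( 100,  100,
                     10, 1000,
                   1000,   10,
                    500,  500),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), c("AU565", "MDAMB468")))

# Named groups make the comparison readable in the output.
y <- DGEList(counts = counts, group = factor(c("AU565", "MDAMB468")))

bcv <- 0.1  # assumed value, carried over from the post

# pair = c(baseline, comparison): logFC is MDAMB468 relative to AU565.
et <- exactTest(y, pair = c("AU565", "MDAMB468"), dispersion = bcv^2)

et$comparison    # the two groups, baseline first
et$table$logFC   # positive means higher in MDAMB468
```

Here gene2 has far more reads in MDAMB468 and gets a positive logFC, while gene3 gets a negative one. Swapping the order in pair flips the signs.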


Thank you very much for the reply.
