Question

Identify the normalized count from differential expression

0

Entering edit mode

shuklashweta33 • 0

@shuklashweta33-16334

Last seen 5.9 years ago

Dear All,

I am using EdgeR package to extract the differential expression . I do not have replicates in my samples.

Now I want to know the normalized read count of my datasets.

I have tried with rpkm method but edgeR follow the TMM method for normalization so there is a variation at normalized read.

Could anyone suggest how can I get the normalized read count for all my reads.

Here is my code-

countdata <- read.table("CD.txt",header = T,sep = "\t",row.names = 1)
attach(countdata)
group <- factor(c(1,2))
y <- DGEList(counts = countdata, lib.size = c(126724,347238), group = group)
y$samples
keep <- rowSums(rpkm(y)>1) >= 2
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y,method="TMM")
design <- model.matrix(~group)
y$samples
bcv <- 0.2
y <- DGEList(counts = countdata, lib.size = c(126724,347238), group = group,genes=data.frame(Length=GeneLength))
y <- calcNormFactors(y)
RPKM <- rpkm(y)
et <- exactTest(y, dispersion=bcv^2)
y1 <- y
y1$samples$group <- 1
y0 <- estimateCommonDisp(y1[c(CHI_ReadCnt,DEH_ReadCnt),])
y$common.dispersion <- y0$common.dispersion
fit <- glmFit(y, design)
lrt <- glmLRT(fit)
resSig <- res[ which(res$PValue < 0.05 ), ]

I shall be very thankful.

edger • 918 views

ADD COMMENT • link updated 5.9 years ago by Gordon Smyth 50k • written 5.9 years ago by shuklashweta33 • 0

score 0 · Answer 1 · 2018-07-03

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

What you may be misunderstanding is that the TMM normalization factors are automatically taken into account when you compute the RPKMs.

I very much dislike the term "normalized read count" because I don't think there is any such thing. What you probably need is just the RPKM. And it is clear from your code that edgeR has already computed these for you.

ADD COMMENT • link 5.9 years ago Gordon Smyth 50k

0

Entering edit mode

Thank you for your response.

yes I want RPKM values of my differential datasets. Can I get that from edgeR?

ADD REPLY • link 5.9 years ago shuklashweta33 • 0

0

Entering edit mode

Could you please provide me a codes for complete analysis ??

I shall be very thankful.

ADD REPLY • link 5.9 years ago shuklashweta33 • 0

0

Entering edit mode

The code you gave in your question already computes RPKM. You even called it "RPKM".

ADD REPLY • link 5.9 years ago Gordon Smyth 50k

0

Entering edit mode

I agree but when I calculate manually to cross check my final results from edger using log values it gives different values seems it is not normalized by RPKM method it taking TMM method in background. Even if it follow TMM method can I get per sample normalized read count.

ADD REPLY • link 5.9 years ago shuklashweta33 • 0

0

Entering edit mode

If I were you, I'd be explicit about the value of the gene.length parameter in my call to rpkm, but other than that, you can rest assured that it is doing the right thing.

If you really want to know what these functions are doing in more detail so that you can compare them with what you are doing manually, you should look at the source code for rpkm.DGEList and rpkm.default.

You'll see that this delegates to the cpm function, which has been rewritten in C/C++ for efficiency ... I went back to find the last "pure R" version of cpm, which seems to be edgeR version 3.14.0

You can download the source version of that package and look at the code in the R/cpm.R function if you really want to know what's going on.

ADD REPLY • link 5.9 years ago Steve Lianoglou ★ 13k