Identify the normalized count from differential expression
1
0
Entering edit mode
@shuklashweta33-16334
Last seen 5.9 years ago

Dear All,

I am using EdgeR package to extract the differential expression . I do not have replicates in my samples.

Now I want to know the normalized read count of my datasets.

I have tried with rpkm method but edgeR follow the TMM method for normalization so there is a variation at normalized read.

Could anyone suggest how can I get the normalized read count for all my reads.

Here is my code-

countdata <- read.table("CD.txt",header = T,sep = "\t",row.names = 1)
attach(countdata)
group <- factor(c(1,2))
y <- DGEList(counts = countdata, lib.size = c(126724,347238), group = group)
y$samples
keep <- rowSums(rpkm(y)>1) >= 2
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y,method="TMM")
design <- model.matrix(~group)
y$samples
bcv <- 0.2
y <- DGEList(counts = countdata, lib.size = c(126724,347238), group = group,genes=data.frame(Length=GeneLength))
y <- calcNormFactors(y)
RPKM <- rpkm(y)
et <- exactTest(y, dispersion=bcv^2)
y1 <- y
y1$samples$group <- 1
y0 <- estimateCommonDisp(y1[c(CHI_ReadCnt,DEH_ReadCnt),])
y$common.dispersion <- y0$common.dispersion
fit <- glmFit(y, design)
lrt <- glmLRT(fit)
resSig <- res[ which(res$PValue < 0.05 ), ]

 

I shall be very thankful.

 

edger • 918 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

What you may be misunderstanding is that the TMM normalization factors are automatically taken into account when you compute the RPKMs.

I very much dislike the term "normalized read count" because I don't think there is any such thing. What you probably need is just the RPKM. And it is clear from your code that edgeR has already computed these for you.

 

ADD COMMENT
0
Entering edit mode

Thank you for your response.

yes I want RPKM values of my differential datasets. Can I get that from edgeR?

ADD REPLY
0
Entering edit mode

Could you please provide me a codes for complete analysis ??

I shall be very thankful.

ADD REPLY
0
Entering edit mode

The code you gave in your question already computes RPKM. You even called it "RPKM".

ADD REPLY
0
Entering edit mode

I agree but when I calculate manually to cross check my final results from edger using log values it gives different values seems it is not normalized by RPKM method it taking TMM method in background. Even if it follow TMM method can I get per sample normalized read count.

ADD REPLY
0
Entering edit mode

If I were you, I'd be explicit about the value of the gene.length parameter in my call to rpkm, but other than that, you can rest assured that it is doing the right thing.

If you really want to know what these functions are doing in more detail so that you can compare them with what you are doing manually, you should look at the source code for rpkm.DGEList and rpkm.default.

You'll see that this delegates to the cpm function, which has been rewritten in C/C++ for efficiency ... I went back to find the last "pure R" version of cpm, which seems to be edgeR version 3.14.0

You can download the source version of that package and look at the code in the R/cpm.R function if you really want to know what's going on.

ADD REPLY

Login before adding your answer.

Traffic: 381 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6