EdgeR: getting CPM values after batch effect correction
3
0
Entering edit mode
@gilgi-friedlander-6113
Last seen 9.7 years ago
Dear list, In edgeR, it possible to get CPM values after batch effect correction (and after TMM normalization)? Thanks a lot! [[alternative HTML version deleted]]
edgeR edgeR • 3.7k views
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 9 months ago
Scripps Research, La Jolla, CA
You would have to define batch correction. Are you talking about fitting a model of the form "~ experimetalVar + batchEffect" and then subtracting out the batch effect coefficient? On Sun Aug 25 11:11:55 2013, Gilgi Friedlander wrote: > Dear list, > > In edgeR, it possible to get CPM values after batch effect correction (and after TMM normalization)? > > Thanks a lot! > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 9 months ago
Scripps Research, La Jolla, CA
Hi Gilgi, If you're just using edgeR for the CPM calculation, I believe there's no need to estimate dispersions; they won't be used in calculating the CPM values. You may wish to use normalized.lib.sizes=TRUE in the call to cpm to get CPM values that take into account the normalization factors computed by calcNormFactors (otherwise there's no point in doing that either). Second, according to the help text for removeBatchEffect, you're not supposed to include the batch effect in the design matrix. The design matrix should include all the experimental variables, while the batch variable should indicate the technical batching. Finally, instead of doing a simple logCPM transform, you might also try the variance-stabilizing transformation provided by the DESeq package, which is intended for clustering and machine learning types of analyses. -Ryan On 08/27/2013 10:52 PM, Gilgi Friedlander wrote: > Hi Ryan, > > Thanks a lot for the reply. > > I followed EdgeR user's manual, and defined a model: > y1<-DGEList(counts=countdata1,group=batch) > y1<- calcNormFactors( y1 ) > design1 <- model.matrix(~batch+Treat1) > > batch has values 1 or 2, according to the batch of the experiment that was done. > > Treat has 10 different samples. > > In order to define a contrast I did the following: > y1 <- estimateGLMCommonDisp(y1, design1, verbose=TRUE)#Now we are ready to construct an edgeR specific > y1 <- estimateGLMTrendedDisp(y1, design1) > y1 <- estimateGLMTagwiseDisp(y1, design1) > lrt <- glmLRT(fit,contrast=c(0,0,1,-1,0,0,0,0,0,0,0,0,0,0)) > > I want now to get also the log counts after removal of the batch effect (for the purpose of clustering of the genes). > > Is it correct to obtain the batch removed log counts in the following way: > > logCPM <- cpm(y1, log=TRUE, prior.count=3) > logCPM_batchRemoved<-removeBatchEffect(logCPM,batch=batch,design=des ign1) > > Many thanks, > Gilgi > > -----Original Message----- > From: Ryan [mailto:rct at thompsonclan.org] > Sent: Wednesday, August 28, 2013 2:46 AM > To: Gilgi Friedlander > Cc: bioconductor at r-project.org > Subject: Re: [BioC] EdgeR: getting CPM values after batch effect correction > > You would have to define batch correction. Are you talking about fitting a model of the form "~ experimetalVar + batchEffect" and then subtracting out the batch effect coefficient? > > On Sun Aug 25 11:11:55 2013, Gilgi Friedlander wrote: >> Dear list, >> >> In edgeR, it possible to get CPM values after batch effect correction (and after TMM normalization)? >> >> Thanks a lot! >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor at r-project.org >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 9 months ago
Scripps Research, La Jolla, CA
Hi Gilgi, All you have to do is remove "batch" from the formula in your call to "model.matrix", and using the resulting matrix in your call to removeBatchEffect. Note, however, that for differential expression testing, you want to use the design matrix that you're already using and avoid removeBatchEffect. For combining batch effect removal with the variance stabilizing transformation, I believe you would first apply the VST to get normalized, variance-stabilized logCPM, and then use removeBatchEffect on the result. -Ryan P.S. Please use "Reply All" when you reply to posts in order to ensure that the whole mailing list sees your reply. On 08/28/2013 11:08 PM, Gilgi Friedlander wrote: > Hi Ryan, > > Thanks a lot for your reply. > > The reason I used estimate dispersions is that I wanted to do also differential expression analysis. > > Regarding the removeBatchEffect - I am not sure I understand how to define the design and then how to define the batch. Do you happen to have an example how to use it? > > Thanks for the variance-stabilizing transformation tip. Would it be correct to pass it as the matrix given to removeBatchEffect? > > Thanks a lot, > Gilgi > > -----Original Message----- > From: Ryan C. Thompson [mailto:rct at thompsonclan.org] > Sent: Wednesday, August 28, 2013 11:02 AM > To: Gilgi Friedlander > Cc: bioconductor > Subject: Re: [BioC] EdgeR: getting CPM values after batch effect correction > > Hi Gilgi, > > If you're just using edgeR for the CPM calculation, I believe there's no need to estimate dispersions; they won't be used in calculating the CPM values. You may wish to use normalized.lib.sizes=TRUE in the call to cpm to get CPM values that take into account the normalization factors computed by calcNormFactors (otherwise there's no point in doing that either). > > Second, according to the help text for removeBatchEffect, you're not supposed to include the batch effect in the design matrix. The design matrix should include all the experimental variables, while the batch variable should indicate the technical batching. > > Finally, instead of doing a simple logCPM transform, you might also try the variance-stabilizing transformation provided by the DESeq package, which is intended for clustering and machine learning types of analyses. > > -Ryan > > On 08/27/2013 10:52 PM, Gilgi Friedlander wrote: >> Hi Ryan, >> >> Thanks a lot for the reply. >> >> I followed EdgeR user's manual, and defined a model: >> y1<-DGEList(counts=countdata1,group=batch) >> y1<- calcNormFactors( y1 ) >> design1 <- model.matrix(~batch+Treat1) >> >> batch has values 1 or 2, according to the batch of the experiment that was done. >> >> Treat has 10 different samples. >> >> In order to define a contrast I did the following: >> y1 <- estimateGLMCommonDisp(y1, design1, verbose=TRUE)#Now we are >> ready to construct an edgeR specific >> y1 <- estimateGLMTrendedDisp(y1, design1) >> y1 <- estimateGLMTagwiseDisp(y1, design1) lrt <- >> glmLRT(fit,contrast=c(0,0,1,-1,0,0,0,0,0,0,0,0,0,0)) >> >> I want now to get also the log counts after removal of the batch effect (for the purpose of clustering of the genes). >> >> Is it correct to obtain the batch removed log counts in the following way: >> >> logCPM <- cpm(y1, log=TRUE, prior.count=3) >> logCPM_batchRemoved<-removeBatchEffect(logCPM,batch=batch,design=desig >> n1) >> >> Many thanks, >> Gilgi >> >> -----Original Message----- >> From: Ryan [mailto:rct at thompsonclan.org] >> Sent: Wednesday, August 28, 2013 2:46 AM >> To: Gilgi Friedlander >> Cc: bioconductor at r-project.org >> Subject: Re: [BioC] EdgeR: getting CPM values after batch effect >> correction >> >> You would have to define batch correction. Are you talking about fitting a model of the form "~ experimetalVar + batchEffect" and then subtracting out the batch effect coefficient? >> >> On Sun Aug 25 11:11:55 2013, Gilgi Friedlander wrote: >>> Dear list, >>> >>> In edgeR, it possible to get CPM values after batch effect correction (and after TMM normalization)? >>> >>> Thanks a lot! >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT

Login before adding your answer.

Traffic: 811 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6