predFC doesn't use norm factors?
Jenny Drnevich ★ 2.0k
@jenny-drnevich-2812
United States
Hi Gordon,

I used to use predFC() to get modified log count-per-million values per sample but now I'm switching to cpm(). I just realized that predFC() doesn't use the normalization factors in the DGEList object when design=NULL, but it does appear to use them when the design is specified (see example below) and there is no argument to specify them, unlike cpm(). Is this a bug or the intended behavior of predFC()?

Thanks,
Jenny

> library(edgeR)
Loading required package: limma
>
> # generate counts for a two group experiment with n=2 in each group and 100 genes
> dispersion <- 0.1
> y1 <- matrix(rnbinom(400,size=1/dispersion,mu=4),nrow=100)
> y1 <- DGEList(y1,group=c(1,1,2,2))
> design <- model.matrix(~group, data=y1$samples)
>
> y2 <- y1
> y2$samples$norm.factors <- c(0.9,0.9,1.1,1.1)
>
> #estimate the predictive log fold changes
>
> predlfc1 <- predFC(y1,design,dispersion=dispersion,prior.count=1)
> predlfc2 <- predFC(y2,design,dispersion=dispersion,prior.count=1)
>
> all.equal(predlfc1,predlfc2)
[1] "Mean relative difference: 0.04869379"
>
> predlfc3 <- predFC(y1,dispersion=dispersion,prior.count=1)
> predlfc4 <- predFC(y2,dispersion=dispersion,prior.count=1)
>
> all.equal(predlfc3,predlfc4)
[1] TRUE
>
> cpm1 <- cpm(y1,log=T,prior.count=1)
> cpm2 <- cpm(y2,log=T,prior.count=1)
>
> all.equal(cpm1,cpm2)
[1] "Mean relative difference: 0.007799144"
>
> all.equal(cpm1,predlfc3)
[1] TRUE
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] edgeR_3.2.4 limma_3.16.7

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
High Performance Biological Computing Program
and The Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

NOTE NEW OFFICE LOCATION
2112 IGB
1206 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-300-6543
fax: 217-265-5066
e-mail: drnevich@illinois.edu
@gordon-smyth
WEHI, Melbourne, Australia
Dear Jenny,

On Fri, 30 Aug 2013, Zadeh, Jenny Drnevich wrote:

> Hi Gordon,
>
> I used to use predFC() to get modified log count-per-million values per
> sample but now I'm switching to cpm().

That's good. When we introduced cpm() a couple of years ago, we intended it to take over this role.

> I just realized that predFC() doesn't use the normalization factors in
> the DGEList object when design=NULL, but it does appear to use them when
> the design is specified (see example below) and there is no argument to
> specify them, unlike cpm(). Is this a bug or the intended behavior of
> predFC()?

It is not how I want it to work. Really it is a carry-over from old behavior that has been kept for historical reasons and backward compatibility. Computing cpm was never the main purpose of predFC(), although it was used to do that before cpm() existed as a separate function. Our first implementation of cpm() did not use normalization factors.

I am going to deprecate this behaviour entirely and return predFC() exclusively to its main purpose. For the next release cycle, predFC() will give a warning message when it gets a NULL design, asking users to switch to cpm(). In the longer-term future, predFC() will treat NULL design matrices in the same way that glmFit() does.

Best wishes
Gordon
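
For readers following the thread, here is a minimal base R sketch of why log-CPM values change once normalization factors are taken into account. It does not call edgeR. The helper `log_cpm()` and the variable names are hypothetical, and the prior-count scaling is an assumption based on my understanding of how cpm() treated `prior.count` around this edgeR release, so treat it as an illustration and not as the package's implementation.

```r
# Base R only; mirrors the simulated counts from the question.
set.seed(1)
dispersion <- 0.1
counts <- matrix(rnbinom(400, size = 1/dispersion, mu = 4), nrow = 100)

# Hypothetical helper (not an edgeR function): log2 counts-per-million
# computed on effective library sizes, i.e. lib.size * norm.factors.
log_cpm <- function(counts, norm.factors = rep(1, ncol(counts)), prior.count = 1) {
  lib.size <- colSums(counts) * norm.factors
  # Assumption: the prior count is scaled in proportion to library size
  prior.scaled <- prior.count * lib.size / mean(lib.size)
  lib.size <- lib.size + 2 * prior.scaled
  # t() lets the per-sample vectors recycle along samples, then restores genes x samples
  log2(t((t(counts) + prior.scaled) / lib.size) * 1e6)
}

lcpm_raw  <- log_cpm(counts)                                        # norm factors all 1
lcpm_norm <- log_cpm(counts, norm.factors = c(0.9, 0.9, 1.1, 1.1))  # as in y2

# Reports a mean relative difference instead of TRUE when the values differ
all.equal(lcpm_raw, lcpm_norm)
```

With edgeR loaded, the corresponding call from the thread is `cpm(y, log=TRUE, prior.count=1)` on the DGEList, which picks up `y$samples$norm.factors` itself.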