Question: TMM and calcNormFactors: Normalization in baySeq to match edgeR and DESeq
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 7.9 years ago:
Dear Hilary and Thomas,

The calcNormFactors() argument formerly called "quantile" was renamed to "p" in the edgeR package in Bioc-devel on 10 July, because the quantity is a probability and not a quantile. At the same time, the option method="quantile" was renamed to method="upperquartile", to better match the original terminology of the Bullard et al. (2010) paper and to distinguish it from full quantile normalization now being proposed by a number of authors.

Best wishes
Gordon

> Date: Thu, 17 Nov 2011 10:07:31 -0500 (EST)
> From: "Smith, Hilary A" <hilary.smith at gatech.edu>
> To: bioconductor at r-project.org
> Subject: [BioC] TMM and calcNormFactors: Normalization in baySeq to
>     match edgeR and DESeq
>
> Hello,
> I'm working on a couple analyses (currently pairwise) for 3'-DGE. Using
> baySeq, edgeR, and DESeq are yielding different answers; specifically
> DESeq and baySeq find different subsets of the genes found by edgeR. In
> trying to isolate the discrepancy, I've been trying to make items like
> normalization procedures similar to see if that improves congruency, or
> if the differences merely stem from how the pairwise tests are run and
> use of Bayesian vs. exact-type statistics. I saw that baySeq's function
> "getLibsizes" can use the edgeR implementation of TMM, but when I try to
> do this I get an error message about a quantile argument not being used.
> This error appears whether or not I specify a quantile, and I'm further
> confused because the edgeR program itself does not require specifying
> quantiles for its TMM-based calcNormFactors. EdgeR seems to run fine so
> I think the problem is in the implementation of baySeq; perhaps I'm
> misunderstanding/coding something? Any help is greatly appreciated;
> commands excerpted from an R session are below.
>
>
>> library(baySeq)
>
> Attaching package: 'baySeq'
>
> The following object(s) are masked from 'package:base':
>
> rbind
>
>> library(snow)
>> cl = makeCluster(4, "SOCK")
>> library(edgeR)
>> simData = read.delim(file="2011.11.03counts.txt", header=TRUE)
>> rownames(simData)=simData$CompID
>> simData=simData[,-1]
>> simData=as.matrix(simData)
>> head(simData)
>           X1E_F X1E_R X2E_F X2E_R X3E_F X3E_R X1P_F X1P_R X2P_F X2P_R X3P_F
> comp0      1065  1159  1207  1572  1477  1817  1841   605  1915  1113  1645
> comp1       544   534   341   675   333   739   690   236   502   451   571
> comp10    30423 37677 28044 54466 23961 58271 53852 34712 59300 40312 44575
> comp100    1060  1065   999  1332   918  1620  1697   658  1117   861  1336
> comp1000    130   157   229   266   141   247   263   135   182   188   168
> comp10000    35    14    15    37    10    47    28    17    22    21    12
>           X3P_R
> comp0      1732
> comp1       799
> comp10    51243
> comp100    1370
> comp1000    244
> comp10000    64
>> replicates = c("F", "R", "F", "R", "F", "R", "F", "R", "F", "R", "F", "R")
>> groups = list(NDE = c(1,1,1,1,1,1,1,1,1,1,1,1), DE = c(1,2,1,2,1,2,1,2,1,2,1,2))
>> cD = new("countData", data = simData, replicates = replicates, groups=groups)
>> cD@libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="edgeR")
> Calculating library sizes from column totals.
> Error in calcNormFactors(d, quantile = quantile, ...) :
> unused argument(s) (quantile = quantile)
>> cD@libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="TMM")
> Error in match.arg(estimationType) :
> 'arg' should be one of "quantile", "total", "edgeR"
>> cD@libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="edgeR", quantile=0.75)
> Calculating library sizes from column totals.
> Error in calcNormFactors(d, quantile = quantile, ...) :
> unused argument(s) (quantile = quantile)
>> cD@libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, quantile=0.75, estimationType="edgeR")
> Calculating library sizes from column totals.
> Error in calcNormFactors(d, quantile = quantile, ...) :
> unused argument(s) (quantile = quantile)
>> cD@libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType=c("edgeR", quantile=0.75))
> Error in match.arg(estimationType) : 'arg' must be of length 1
>> calcNormFactors(cD)
> Error in calcNormFactors(cD) :
> calcNormFactors() only operates on 'matrix' and 'DGEList' objects
>> calcNormFactors(simData)
>     X1E_F     X1E_R     X2E_F     X2E_R     X3E_F     X3E_R     X1P_F     X1P_R
> 1.0353157 0.9529524 0.9868063 1.1068479 1.0054938 1.0218195 0.9600905 0.8287707
>     X2P_F     X2P_R     X3P_F     X3P_R
> 1.0550414 0.8955669 1.0869486 1.1052472
>> cD@libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="edgeR")
> Calculating library sizes from column totals.
> Error in calcNormFactors(d, quantile = quantile, ...) :
> unused argument(s) (quantile = quantile)
>> cD@libsizes = getLibsizes(data=simData, replicates=replicates, subset=NULL, estimationType="edgeR")
> Calculating library sizes from column totals.
> Error in calcNormFactors(d, quantile = quantile, ...) :
> unused argument(s) (quantile = quantile)
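For readers hitting the same error, the renaming described in the reply can be illustrated with a minimal sketch. It assumes a version of edgeR from after the change, and it uses a small simulated matrix (`counts`, a hypothetical stand-in for the poster's `simData`). The last step, multiplying column totals by the TMM factors to get effective library sizes, is one plausible manual workaround and is not necessarily the code that was posted for baySeq.

```r
library(edgeR)

# Hypothetical toy counts standing in for the poster's count matrix
set.seed(1)
counts <- matrix(rnbinom(200 * 4, mu = 100, size = 5), ncol = 4,
                 dimnames = list(paste0("comp", 1:200), paste0("S", 1:4)))

# TMM is the default method and needs no quantile-style argument
tmm <- calcNormFactors(counts, method = "TMM")

# Upper-quartile scaling: previously method = "quantile", quantile = 0.75;
# after the renaming it is method = "upperquartile", p = 0.75
uq <- calcNormFactors(counts, method = "upperquartile", p = 0.75)

# Assumption: effective library size = column total * normalization factor.
# A vector like this could be assigned to the libsizes slot by hand
# instead of going through getLibsizes(..., estimationType = "edgeR").
eff_lib <- colSums(counts) * tmm
```

Passing `quantile = 0.75` to a post-renaming `calcNormFactors()` produces the "unused argument" error shown in the session, which is why the call made internally by `getLibsizes` failed regardless of how it was invoked.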
Answer: TMM and calcNormFactors: Normalization in baySeq to match edgeR and DESeq
Smith, Hilary A wrote, 7.9 years ago:
Thank you very much, and thank you for posting the code to allow baySeq to use the calcNormFactors/TMM normalization.

Best,
Hilary

----- Original Message -----
From: "Gordon K Smyth" <smyth@wehi.edu.au>
To: "Hilary A Smith" <hilary.smith at gatech.edu>
Cc: "Bioconductor mailing list" <bioconductor at r-project.org>, "Thomas J Hardcastle" <tjh48 at cam.ac.uk>
Sent: Friday, November 18, 2011 10:10:23 PM
Subject: TMM and calcNormFactors: Normalization in baySeq to match edgeR and DESeq