Bugs in HTqPCR implementation of data normalization
0
0
Entering edit mode
@peter-langfelder-4469
Last seen 12 weeks ago
United States
Hello, I was hoping to use HTqPCR to analyze my data, but found some bugs that make the package hard to use (or giving wrong results...). Looking at the source code of normalizeCtData, it appears that there is a bug in the geometric.mean and scale.rankinvariant methods - that data seem to be multiplied by scaling factors instead of being divided by them. This magnifies inter-sample differences instead of reducing them! Here's an example script that illustrates the problem. I generate 4 samples with a small number of measurements, with all measurements the same in each sample. library(HTqPCR) n = 10; m = 4; htData = new( "qPCRset", exprs = t(matrix(20 + 2*(1:m), m, n))); featureNames(htData) = paste0("mir.", 1:n); sampleNames(htData) = paste0("sample", 1:m); featureType(htData) = rep("Target", n); exprs(htData) # sample1 sample2 sample3 sample4 #mir.1 22 24 26 28 #mir.2 22 24 26 28 #mir.3 22 24 26 28 #mir.4 22 24 26 28 #mir.5 22 24 26 28 #mir.6 22 24 26 28 #mir.7 22 24 26 28 #mir.8 22 24 26 28 #mir.9 22 24 26 28 #mir.10 22 24 26 28 htNorm.gm = normalizeCtData(htData, norm = "geometric.mean") exprshtNorm.gm); #Scaling Ct values # Using geometric mean within each sample # Scaling factors: 1.00 1.09 1.18 1.27 > exprshtNorm.gm); # sample1 sample2 sample3 sample4 #mir.1 22 26.18182 30.72727 35.63636 #mir.2 22 26.18182 30.72727 35.63636 #mir.3 22 26.18182 30.72727 35.63636 #mir.4 22 26.18182 30.72727 35.63636 #mir.5 22 26.18182 30.72727 35.63636 #mir.6 22 26.18182 30.72727 35.63636 #mir.7 22 26.18182 30.72727 35.63636 #mir.8 22 26.18182 30.72727 35.63636 #mir.9 22 26.18182 30.72727 35.63636 #mir.10 22 26.18182 30.72727 35.63636 Note that the sample differences were maginfied, not reduced. Same thing happens for the scale.rankinvariant method: n = 100; m = 4; htData.2 = new( "qPCRset", exprs = t(matrix(20 + 2*(1:m), m, n))); set.seed(1); exprs(htData.2) = exprs(htData.2) + runif(m) htNorm.sri = normalizeCtData(htData.2, norm = "scale.rankinvariant") head(exprs(htNorm.sri)) Furthermore, the geometric.mean normalization has been implemented differently from the article referenced in connection with it (Mestdagh et al., 2009). While the article is written so it lacks exact details and formulas, I have confirmed with the authors that "geometric mean" applies to the linearized expression data, that is, to exponentiated Ct-values; this is equivalent to averaging Ct using arithmetic mean, not geometric mean as implemented in normalizeCtData. Further, the article seems to suggest dividing the linearized values by the geometric mean, which means subtracting, not dividing, the Ct values by the arithmetic mean. This distinction should be pointed out in the vignette, or the method should be re-implemented according to the article. I also keep getting errors when trying to print objects. In the above example, typing htData produces the following output: An object of class "qPCRset" Size: 10 features, 4 samples Feature types: Feature names: mir.1 mir.2 mir.3 ... Feature classes: Error in `rownames<-`(`*tmp*`, value = c("mir.1", "mir.2", "mir.3", "mir.4", : attempt to set 'rownames' on an object with no dimensions Best, Peter
Normalization HTqPCR Normalization HTqPCR • 1.1k views
ADD COMMENT

Login before adding your answer.

Traffic: 775 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6