I am trying to run the qpcrImpute function from the nondetects package on my qPCR data, and I am running into some issues. After some experimentation, I have formatted my data with one row per gene and each technical replicate in a separate column, which seems to match the format of the included oncogene2013 data set. I have also normalized my data, so now I just need to run qpcrImpute.
The data set is 8 biological samples, with 3 endogenous control genes and 9 target genes. Each biological sample has 4 technical replicates. That makes a total of 384 observations. These were all done by hand by a collaborator on a single 384-well plate. 89 out of 384 observations are non-detections, which is around 23%. I wish to use qpcrImpute to impute these missing values.
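As a quick sanity check of those counts, here is a minimal sketch in base R. The variable names are only illustrative; the numbers are the ones given above.

```r
# Counts taken from the description above; names are illustrative only
n_samples   <- 8       # biological samples
n_genes     <- 3 + 9   # endogenous controls + target genes
n_tech_reps <- 4       # technical replicates per sample

n_obs <- n_samples * n_genes * n_tech_reps
n_nondetect <- 89
pct_nondetect <- round(100 * n_nondetect / n_obs, 1)

print(n_obs)          # 384
print(pct_nondetect)  # 23.2
```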
However, when I run it, I get the following error:
> imp <- qpcrImpute(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb4d6c0>
Error in Name.s2[ind2] <- names(tst$sigma[i]) :
  replacement has length zero
In addition: Warning message:
Partial NA coefficients for 2 probe(s)
I was able to fix that error by adding names(tst$sigma) <- colnames(DesLM) after the first call to lmFit (this is not a proper fix, just a workaround). However, I then ran into another error (qpcrImputeCustom is my copy of the function with the above fix added):
> imp <- qpcrImputeCustom(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb2b0d0>
[1] "1 / 100"
Error in integrate(f, lower = -Inf, upper = Inf) :
  non-finite function value
In addition: Warning message:
Partial NA coefficients for 2 probe(s)
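For reference, base R's integrate() raises this error whenever the integrand evaluates to NA, NaN, or Inf at a quadrature point. The following is a minimal reproduction in base R only. It is unrelated to the nondetects internals, and the integrand here is a made-up stand-in, not the function that qpcrImpute integrates.

```r
# Hypothetical integrand: a normal density whose mean is NA,
# so every evaluation returns NA
f <- function(x) dnorm(x, mean = NA_real_, sd = 1)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    integrate(f, lower = -Inf, upper = Inf)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With a finite mean the same integral is well behaved (close to 1)
ok <- integrate(function(x) dnorm(x, mean = 0, sd = 1),
                lower = -Inf, upper = Inf)
print(ok$value)
```

So one plausible reading is that an NA somewhere in the inputs or in the estimated parameters is reaching the integrand, although I have not confirmed where.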
At this point, I'm not sure what the problem is or how to fix it, so I am stuck unless someone else knows what to do.
You can access the qPCRset object that I'm using at this URL: https://www.dropbox.com/s/85wp915os8e1xgt/qPCRset.RDS?dl=0
Are you loading the file with readRDS?
I was able to load the data. We have not worked with data sets that have more than 15% missing values. I suspect there is an estimation problem in limma. I will look at it closely and get back to you.
In case it matters, the design formula for this experiment is ~condition + animal.
Hello Ryan,
I checked exprs(yourdata) and found that some of the values are NA. We use the maximum possible Ct value as the starting point for the EM algorithm, so non-detects should hold that value rather than NA. I have included a code snippet for creating a qPCRset object that I used in a simulation study; I hope this helps.
I hope this solves the problem.
Valeriia
Ok, so I should simply replace my NA values with 40 (the maximum value), and then run the imputation algorithm?
Edit: Doing so seems to have worked.
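For anyone finding this later, the replacement can be sketched as follows. This uses a small made-up matrix standing in for exprs() of the qPCRset, so it runs in base R. With a real object I would expect the same indexing to be applied to the matrix returned by exprs(), but treat that as an assumption rather than a tested recipe.

```r
# Toy Ct matrix standing in for exprs(qPCRset); the values are made up
ct <- matrix(c(21.3,   NA,
               30.1, 28.4,
                 NA, 25.0),
             nrow = 2,
             dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

max_ct <- 40  # maximum possible Ct value, as suggested above

n_na <- sum(is.na(ct))          # how many non-detects are stored as NA
ct_filled <- ct
ct_filled[is.na(ct_filled)] <- max_ct

print(n_na)
print(ct_filled)
```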
You said that the data you are using is normalized. I would replace NA with 40, normalize, and then perform the imputation. It should speed up convergence, as the majority of the values in your data matrix are in single digits.
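The order of operations (fill, then normalize) can be sketched in base R as below. The normalization shown is a simple delta-Ct against the mean of the control genes, which is an assumption for illustration; the thread does not say which normalization was actually used. The matrix and gene names are made up.

```r
# Toy raw Ct matrix (rows = genes, columns = samples); values are made up
ct <- matrix(c(20, 21, 30, NA,
               19, 20, NA, 33),
             nrow = 4,
             dimnames = list(c("ctrl1", "ctrl2", "target1", "target2"),
                             c("s1", "s2")))
control_genes <- c("ctrl1", "ctrl2")  # hypothetical control gene names

# Step 1: replace non-detects with the maximum Ct on the raw scale
ct[is.na(ct)] <- 40

# Step 2: normalize (assumed delta-Ct: subtract the per-sample control mean)
ref <- colMeans(ct[control_genes, , drop = FALSE])
dct <- sweep(ct, 2, ref, "-")

print(dct)
# Step 3 would be the imputation call on the normalized object (not shown)
```

Filling first means the starting value for each non-detect ends up on the same normalized scale as the observed values, instead of sitting at a raw 40 next to single-digit numbers.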