Question: Help running the nondetects::qpcrImpute on my data?
0
4 months ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson6.7k wrote:

I am trying to run the qpcrImpute function from the nondetects package on my qPCR data, and I am running into some issues. After some experimentation, I have formatted my data with one row per gene, with each technical replicate in a separate column, which seems to match the format of the data in the included oncogene2013 data set, and I have normalized my data, so now I just need to run qpcrImpute.

The data set is 8 biological samples, with 3 endogenous control genes and 9 target genes. Each biological sample has 4 technical replicates. That makes a total of 384 observations. These were all done by hand by a collaborator on a single 384-well plate. 89 out of 384 observations are non-detections, which is around 23%. I wish to use qpcrImpute to impute these missing values.
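As a quick sanity check of those counts, the arithmetic can be written out in R. The variable names are only illustrative and are not part of the nondetects package.

```r
# Experimental layout as described above
n_samples   <- 8       # biological samples
n_genes     <- 3 + 9   # endogenous controls + target genes
n_tech_reps <- 4       # technical replicates per sample

# Total wells: should fill one 384-well plate
n_obs <- n_samples * n_genes * n_tech_reps

# Fraction of wells that are non-detects
n_nondetect    <- 89
frac_nondetect <- n_nondetect / n_obs

print(n_obs)
print(round(100 * frac_nondetect, 1))
```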

However, when I run it, I get the following error:

> imp <- qpcrImpute(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb4d6c0>
Error in Name.s2[ind2] <- names(tst$sigma[i]) :
  replacement has length zero
In addition: Warning message:
Partial NA coefficients for 2 probe(s)

I was able to fix that error by adding names(tst$sigma) <- colnames(DesLM) after the first call to lmFit (this is not a proper fix, just a workaround). However, I then ran into another error (qpcrImputeCustom is my copy of the function with the above fix added):

> imp <- qpcrImputeCustom(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb2b0d0>
[1] "1 / 100"
Error in integrate(f, lower = -Inf, upper = Inf) :
non-finite function value
Partial NA coefficients for 2 probe(s)


At this point, I'm not sure what the problem is or how to fix it, so I am stuck unless someone else knows what to do.

You can access the qPCRset object that I'm using at this URL: https://www.dropbox.com/s/85wp915os8e1xgt/qPCRset.RDS?dl=0

modified 11 weeks ago by corocla0 • written 4 months ago by Ryan C. Thompson6.7k
1
4 months ago by
United States
valery.sherina10 wrote:

Dear Ryan,

Thank you for your interest in nondetects, and for pointing out the problem.

Could you send me the original data matrix? I am having trouble loading the qPCRset object you created into R.

Thank you,

Valeriia

Are you loading the file with readRDS?

I was able to load the data. We have not worked with data sets in which more than 15% of the values are missing. I suspect there is an estimation problem in limma. I will look at it closely and get back to you.

In case it matters, the design formula for this experiment is ~condition + animal.

Hello Ryan,

I checked exprs(yourdata) and found that some of the values are NA. We use the maximum possible Ct value as the starting point for the EM algorithm. Below is the code snippet I used to create a qPCRset object in a simulation study; I hope it helps.

library(HTqPCR)  # provides the qPCRset class

# Assumed to be defined beforehand:
#   DataName    - matrix of Ct values (genes in rows, samples in columns)
#   ControlGene - name of the control gene, or NULL
#   Ct_nd       - the maximum Ct value
#   nsampt      - the number of sample types (trt vs. normal)
#   nrepl       - the number of replicates per sample type

object <- DataName  # data matrix
ft <- rep("Target", nrow(object))
# Control gene
if (!is.null(ControlGene)) {  # ControlGene is the name of the gene
  ind <- which(rownames(object) == ControlGene)
  ft[ind] <- "Endogenous Control"
}

# Feature categories: values at the maximum Ct are marked as non-detects
fc <- matrix("OK", nrow = nrow(object), ncol = ncol(object))
fc[which(object > (Ct_nd - 0.0001), arr.ind = TRUE)] <- "Undetermined"
colnames(fc) <- colnames(object)
rownames(fc) <- rownames(object)

# Flags: the same wells are marked as "Flagged"
fl <- matrix("Passed", nrow = nrow(object), ncol = ncol(object))
fl[which(object > (Ct_nd - 0.0001), arr.ind = TRUE)] <- "Flagged"
colnames(fl) <- colnames(object)
rownames(fl) <- rownames(object)

myData <- new("qPCRset", exprs = object, flag = fl)
featureNames(myData) <- rownames(object)
featureType(myData) <- ft
featureCategory(myData) <- as.data.frame(fc)

# Sample annotation: one letter per sample type, repeated for each replicate
sType <- rep(LETTERS[1:nsampt], each = nrepl)
tab <- data.frame(sampleName = colnames(object), sampleType = sType)
phenoData(myData) <- AnnotatedDataFrame(data = tab)

I hope this solves the problem.

Valeriia

Ok, so I should simply replace my NA values with 40 (the maximum value), and then run the imputation algorithm?

Edit: Doing so seems to have worked.
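For anyone following along, the replacement step can be sketched on a plain matrix as below. The Ct values and the names ct, ct_max, and ct_filled are made up for illustration; on a qPCRset the same assignment would presumably be applied to the matrix returned by exprs().

```r
ct_max <- 40  # maximum Ct value, as used in this thread

# Toy Ct matrix (made-up values): genes in rows, wells in columns
ct <- matrix(c(25.1, NA, 31.4,
               28.9, NA, 30.2),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

# Remember which wells were non-detects before overwriting them
nd <- is.na(ct)

# Replace the NA values with the maximum Ct
ct_filled <- ct
ct_filled[nd] <- ct_max

print(ct_filled)
```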

You said that the data you are using are normalized. I would replace the NA values with 40, normalize, and then perform the imputation. That should speed up convergence, since the majority of the values in your data matrix are in the single digits.
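A minimal sketch of that order of operations, assuming delta-Ct normalization against a single control gene (the thread does not say which normalization was actually used) and made-up Ct values in which the non-detect has already been set to 40:

```r
# Toy raw Ct values (made-up); the target gene's s2 well was a non-detect
# and already holds the maximum Ct of 40
ct <- matrix(c(20.0, 20.5, 21.0,    # control gene
               27.0, 40.0, 29.0),   # target gene
             nrow = 2, byrow = TRUE,
             dimnames = list(c("ctrl", "target"), c("s1", "s2", "s3")))

# Assumed normalization: subtract the control gene's Ct within each sample
dct <- sweep(ct, 2, ct["ctrl", ], "-")

print(dct)
# The imputation step would then be run on the normalized data.
```

After normalization the non-detect sits on the same scale as the detected values, rather than remaining at a raw 40 next to single-digit normalized values.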

0
11 weeks ago by
corocla0
corocla0 wrote:

Hi, I have a similar problem, but with the example dataset. How can I fix it?

> data(oncogene2013)
> qpcrImpute(oncogene2013)
~0 + nrep
<environment: 0x7fd304e3db40>
Error in Name.s2[ind2] <- names(tst$sigma[i]) :
replacement has length zero
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred

Please take a look at the example in the qpcrImpute man page:

data(oncogene2013)
tst <- qpcrImpute(oncogene2013, groupVars=c("sampleType","treatment"), outform="Param")

You need to specify which samples are replicates using the groupVars argument.
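To illustrate the idea only (this is not the package's internal code), samples that share the same combination of grouping variables can be thought of as replicates of one another. The sketch below uses a made-up sample table with the two column names from the example above and base R's interaction():

```r
# Hypothetical sample annotation; the values are invented for illustration
pheno <- data.frame(
  sampleType = rep(c("A", "B"), each = 4),
  treatment  = rep(rep(c("ctl", "trt"), each = 2), times = 2)
)

# Each unique sampleType/treatment combination forms one replicate group
grp <- interaction(pheno$sampleType, pheno$treatment, drop = TRUE)

print(table(grp))
```

In this toy table there are four groups with two samples each.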

Matt