Question: Help running the nondetects::qpcrImpute on my data?
Ryan C. Thompson (The Scripps Research Institute, La Jolla, CA) wrote 6 months ago:

I am trying to run the qpcrImpute function from the nondetects package on my qPCR data, and I am running into some issues. After some experimentation, I have formatted my data with one row per gene and one column per technical replicate, which seems to match the format of the included oncogene2013 data set. I have also normalized my data, so now I just need to run qpcrImpute.

The data set is 8 biological samples, with 3 endogenous control genes and 9 target genes. Each biological sample has 4 technical replicates. That makes a total of 384 observations. These were all done by hand by a collaborator on a single 384-well plate. 89 out of 384 observations are non-detections, which is around 23%. I wish to use qpcrImpute to impute these missing values.
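For reference, the counts above can be checked with a few lines of base R. This is only a sketch: the matrix `ct` is simulated, and the positions of the non-detects are random placeholders, not the real data.

```r
# Experimental layout described above.
n_samples <- 8
n_genes   <- 3 + 9      # endogenous controls + targets
n_reps    <- 4
n_obs     <- n_samples * n_genes * n_reps

# Hypothetical Ct matrix: one row per gene, one column per well.
set.seed(1)
ct <- matrix(runif(n_obs, min = 20, max = 35), nrow = n_genes)
ct[sample(n_obs, 89)] <- NA   # 89 non-detects at made-up positions

n_nd        <- sum(is.na(ct))      # total non-detects
frac_nd     <- n_nd / n_obs        # overall fraction of non-detects
nd_per_gene <- rowSums(is.na(ct))  # non-detects per gene
```

Tabulating non-detects per gene, as in `nd_per_gene`, can show whether the missing values are concentrated in a few genes.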

However, when I run it, I get the following error:

> imp <- qpcrImpute(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb4d6c0>
Error in Name.s2[ind2] <- names(tst$sigma[i]) : 
  replacement has length zero
In addition: Warning message:
Partial NA coefficients for 2 probe(s)

I was able to fix that error by adding names(tst$sigma) <- colnames(DesLM) after the first call to lmFit (this is not a proper fix, just a workaround). However, I then ran into another error (qpcrImputeCustom is my copy of the function with the above fix added):

> imp <- qpcrImputeCustom(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb2b0d0>
[1] "1 / 100"
Error in integrate(f, lower = -Inf, upper = Inf) : 
  non-finite function value
In addition: Warning message:
Partial NA coefficients for 2 probe(s) 

At this point, I'm not sure what the problem is or how to fix it, so I am stuck unless someone else knows what to do.
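Both error messages can be reproduced in base R, which may help narrow down where they come from. The sketch below uses made-up objects (`sigma`, `name_s2`, `f_bad`) and is not the code inside qpcrImpute; it only shows conditions under which R emits the same messages.

```r
# Error 1: names() of an element of an unnamed vector is NULL,
# so the assignment has nothing to put into the target slot.
sigma   <- c(0.5, 0.8)          # unnamed, illustrative values
name_s2 <- character(2)
err1 <- tryCatch({
  name_s2[1] <- names(sigma[1])
  "no error"
}, error = function(e) conditionMessage(e))

# Naming the vector first (the workaround described above) avoids it.
names(sigma) <- c("grpA", "grpB")
name_s2[1] <- names(sigma[1])

# Error 2: integrate() stops as soon as the integrand returns a
# non-finite value, for example when one of its parameters is NA.
f_bad <- function(x) dnorm(x, mean = NA_real_)
err2 <- tryCatch({
  integrate(f_bad, lower = -Inf, upper = Inf)
  "no error"
}, error = function(e) conditionMessage(e))
```

If the second error has a similar origin here, checking the input matrix for NA values would be a reasonable first step.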

You can access the qPCRset object that I'm using at this URL: https://www.dropbox.com/s/85wp915os8e1xgt/qPCRset.RDS?dl=0

modified 5 months ago by corocla • written 6 months ago by Ryan C. Thompson
valery.sherina (United States) wrote 6 months ago:

Dear Ryan,

Thank you for your interest in nondetects, and for pointing out the problem.

Could you please send me the original data matrix? I am having trouble loading the qPCRset object you created into R.

Thank you,

Valeriia 

written 6 months ago by valery.sherina

Are you loading the file with readRDS?

written 6 months ago by Ryan C. Thompson

I was able to load the data. We have not worked with data sets that have more than 15% missing values. I suspect there is an estimation problem in limma. I will look into this closely and get back to you.

written 6 months ago by valery.sherina

In case it matters, the design formula for this experiment is ~condition + animal.

written 6 months ago by Ryan C. Thompson

Hello Ryan,


I checked exprs(yourdata) and found that some of the values are NA. We use the maximum possible Ct value as the starting point for the EM algorithm, so non-detects should be stored as that value rather than as NA. Here is the code snippet I used to create a qPCRset object in our simulation study; I hope it helps.

  library(HTqPCR) # provides the qPCRset class

  # Ct_nd is the maximum possible Ct value
  # nsampt is the number of sample types (trt vs. normal)
  # nrepl is the number of replicates per sample type

  object <- DataName # data matrix, genes in rows, samples in columns
  ft <- rep("Target", nrow(object))
  # Control gene
  if (!is.null(ControlGene)) { # ControlGene is the name of the control gene
    ind <- which(rownames(object) == ControlGene)
    ft[ind] <- "Endogenous Control"
  }

  # Feature categories: values at the maximum Ct are non-detects
  fc <- matrix("OK", nrow=nrow(object), ncol=ncol(object))
  fc[which(object > (Ct_nd - 0.0001), arr.ind=TRUE)] <- "Undetermined"
  colnames(fc) <- colnames(object)
  rownames(fc) <- rownames(object)

  # Flags: the same non-detects are marked as "Flagged"
  fl <- matrix("Passed", nrow=nrow(object), ncol=ncol(object))
  fl[which(object > (Ct_nd - 0.0001), arr.ind=TRUE)] <- "Flagged"
  colnames(fl) <- colnames(object)
  rownames(fl) <- rownames(object)

  myData <- new("qPCRset", exprs=object, flag=fl)
  featureNames(myData) <- rownames(object)
  featureType(myData) <- ft
  featureCategory(myData) <- as.data.frame(fc)

  # Sample annotation: one sample type label per replicate
  sType <- c(rep(LETTERS[1:nsampt], each=nrepl))
  tab <- data.frame(sampleName=colnames(object), sampleType=sType)
  phenoData(myData) <- AnnotatedDataFrame(data=tab)

I hope this solves the problem.

Valeriia

modified 6 months ago • written 6 months ago by valery.sherina

Ok, so I should simply replace my NA values with 40 (the maximum value), and then run the imputation algorithm?

Edit: Doing so seems to have worked.
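For anyone following along, the replacement can be sketched on a plain matrix as below. The matrix `ct` and its gene and well names are made up; the value 40 is the maximum Ct mentioned above, and the "Undetermined" labels follow the convention in Valeriia's snippet.

```r
ct_max <- 40  # maximum Ct value, as stated in this thread

# Hypothetical Ct matrix with NA for non-detects.
ct <- matrix(c(24.1, NA, 30.2, 27.5, NA, 22.8), nrow = 2,
             dimnames = list(c("geneA", "geneB"), paste0("well", 1:3)))

# Replace every NA with the maximum Ct.
ct_filled <- ct
ct_filled[is.na(ct_filled)] <- ct_max

# Mark the same positions as non-detects in a category matrix.
fc <- matrix("OK", nrow = nrow(ct), ncol = ncol(ct), dimnames = dimnames(ct))
fc[ct_filled > (ct_max - 0.0001)] <- "Undetermined"

# With a real qPCRset, the same replacement would presumably go through
# exprs(), e.g. (not run here, assumes HTqPCR is loaded):
#   e <- exprs(qpcr.wide); e[is.na(e)] <- 40; exprs(qpcr.wide) <- e
```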

modified 6 months ago • written 6 months ago by Ryan C. Thompson

You said that the data you are using are normalized. I would replace NA with 40, normalize, and then perform the imputation. This should speed up convergence, as the majority of the values in your data matrix are in single digits.
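The order of operations can be sketched as follows. The normalization shown here (subtracting the mean of the control genes in each column) is only an assumed stand-in, since the thread does not say which normalization was used; the matrix and gene names are invented for illustration.

```r
ct_max <- 40

# Hypothetical raw Ct values: one control gene and two targets, four wells.
ct <- rbind(ctrl1   = c(18, 19, 18.5, 19.5),
            target1 = c(25, NA, 26,   NA),
            target2 = c(30, 31, NA,   29))
is_control <- c(TRUE, FALSE, FALSE)

# Step 1: replace non-detects with the maximum Ct on the raw scale.
ct[is.na(ct)] <- ct_max

# Step 2: normalize (assumed here: subtract the per-well control mean).
ctrl_mean <- colMeans(ct[is_control, , drop = FALSE])
ct_norm   <- sweep(ct, 2, ctrl_mean, "-")

# Step 3 would be the call to qpcrImpute on a qPCRset holding ct_norm
# (not run here).
```

Filling first means the non-detects end up on the same normalized scale as the detected values, rather than sitting at 40 next to single-digit numbers.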

written 6 months ago by valery.sherina
corocla wrote 5 months ago:

Hi, I have a similar problem, but with the example data set. How can I fix it?

> data(oncogene2013)
> qpcrImpute(oncogene2013)
~0 + nrep
<environment: 0x7fd304e3db40>
Error in Name.s2[ind2] <- names(tst$sigma[i]) : 
  replacement has length zero
In addition: Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

written 5 months ago by corocla

Please take a look at the example in the qpcrImpute man page:

data(oncogene2013)
tst <- qpcrImpute(oncogene2013, groupVars=c("sampleType","treatment"), outform="Param")

You need to specify which samples are replicates using the groupVars argument. 
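To illustrate what "which samples are replicates" means, the base R sketch below builds replicate groups from two annotation columns. The data frame `pd` is invented, and the code is not taken from the package; the printed `~0 + nrep` formula merely suggests that a one-column-per-group design of this kind is involved.

```r
# Hypothetical sample annotation: 8 samples, 2 sample types x 2 treatments.
pd <- data.frame(
  sampleType = rep(c("A", "B"), each = 4),
  treatment  = rep(rep(c("ctl", "trt"), each = 2), times = 2)
)
group_vars <- c("sampleType", "treatment")

# Samples sharing the same combination of grouping variables are replicates.
nrep <- interaction(pd[, group_vars], drop = TRUE)
reps_per_group <- table(nrep)

# A design matrix with one indicator column per replicate group.
design <- model.matrix(~ 0 + nrep)
```

Checking `reps_per_group` before imputing is a quick way to confirm that the chosen grouping variables give the replicate structure you expect.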

Matt

written 5 months ago by Matthew McCall