Question

Help running nondetects::qpcrImpute on my data?

Ryan C. Thompson ★ 7.9k

@ryan-c-thompson-5618

Icahn School of Medicine at Mount Sinai…

I am trying to run the qpcrImpute function from the nondetects package on my qPCR data, and I am running into some issues. After some experimentation, I have formatted my data with one row per gene, with each technical replicate in a separate column, which seems to match the format of the data in the included oncogene2013 data set, and I have normalized my data, so now I just need to run qpcrImpute.

The data set is 8 biological samples, with 3 endogenous control genes and 9 target genes. Each biological sample has 4 technical replicates. That makes a total of 384 observations. These were all done by hand by a collaborator on a single 384-well plate. 89 out of 384 observations are non-detections, which is around 23%. I wish to use qpcrImpute to impute these missing values.
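
For reference, the counts above can be checked with a few lines of base R. This is only a minimal sketch: the matrix `ct` is a made-up stand-in with the same shape and number of missing values as my data, not the real Ct values.

```r
# Dimensions taken from the description above
n_samples <- 8        # biological samples
n_genes   <- 3 + 9    # endogenous controls + targets
n_reps    <- 4        # technical replicates per sample

n_obs <- n_samples * n_genes * n_reps   # wells on the plate

# Hypothetical stand-in for the Ct matrix: one row per gene,
# one column per (sample, technical replicate), non-detects as NA
ct <- matrix(25, nrow = n_genes, ncol = n_samples * n_reps)
ct[1:89] <- NA        # 89 placeholder non-detects (positions are arbitrary)

n_nd   <- sum(is.na(ct))          # number of non-detects
pct_nd <- 100 * mean(is.na(ct))   # percentage of non-detects

cat(n_obs, "observations,", n_nd, "non-detects,", round(pct_nd, 1), "%\n")
```

On a real Ct matrix, `mean(is.na(...))` gives the same proportion directly.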

However, when I run it, I get the following error:

> imp <- qpcrImpute(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb4d6c0>
Error in Name.s2[ind2] <- names(tst$sigma[i]) : 
  replacement has length zero
In addition: Warning message:
Partial NA coefficients for 2 probe(s)

I was able to fix that error by adding names(tst$sigma) <- colnames(DesLM) after the first call to lmFit (this is not a proper fix, just a workaround). However, I then ran into another error (qpcrImputeCustom is my copy of the function with the above fix added):

> imp <- qpcrImputeCustom(qpcr.wide, dj=rep(0, ncol(qpcr.wide)), groupVars = c("animal", "condition"))
~0 + nrep
<environment: 0xcb2b0d0>
[1] "1 / 100"
Error in integrate(f, lower = -Inf, upper = Inf) : 
  non-finite function value
In addition: Warning message:
Partial NA coefficients for 2 probe(s)

At this point, I'm not sure what the problem is or how to fix it, so I am stuck unless someone else knows what to do.

You can access the qPCRset object that I'm using at this URL: https://www.dropbox.com/s/85wp915os8e1xgt/qPCRset.RDS?dl=0

nondetects software error • 2.0k views

ADD COMMENT • link updated 6.8 years ago by corocla • 0 • written 6.9 years ago by Ryan C. Thompson ★ 7.9k

corocla • 0

@corocla-15131

Hi, I have a similar problem, but with the example dataset. How can I fix it?

> data(oncogene2013)
> qpcrImpute(oncogene2013)
~0 + nrep
<environment: 0x7fd304e3db40>
Error in Name.s2[ind2] <- names(tst$sigma[i]) :
replacement has length zero
In addition: Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred
>

ADD COMMENT • link 6.8 years ago corocla • 0

Please take a look at the example in the qpcrImpute man page:

data(oncogene2013)
tst <- qpcrImpute(oncogene2013, groupVars=c("sampleType","treatment"), outform="Param")

You need to specify which samples are replicates using the groupVars argument.
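
To illustrate what "replicates" means here, the following base R sketch groups samples by the combination of two annotation columns. It assumes that groupVars names columns of pData(object) and that samples sharing the same combination of values are treated as replicates. The data frame `pd` and its values are hypothetical, not the actual oncogene2013 annotation.

```r
# Hypothetical sample annotation, standing in for pData(object)
pd <- data.frame(
  sampleName = paste0("s", 1:8),
  sampleType = rep(c("A", "B"), each = 4),
  treatment  = rep(c("ctrl", "trt"), times = 4)
)

# Samples with the same (sampleType, treatment) combination form one group
grp <- interaction(pd$sampleType, pd$treatment, drop = TRUE)

# Number of replicates in each group
print(table(grp))

# Which samples fall in which group
print(split(pd$sampleName, grp))
```

Checking a table like this before calling qpcrImpute is a quick way to confirm that the chosen columns really do define groups with more than one sample each.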

Matt

ADD REPLY • link 6.8 years ago Matthew McCall ▴ 830

score 1 · Accepted Answer · 2018-01-19

valery.sherina ▴ 10

@valerysherina-8940

United States

Dear Ryan,

Thank you for your interest in nondetects, and for pointing out the problem.

May I ask you to send me the original data matrix? I am having trouble loading the qPCRset object you created into R.

Thank you,

Valeriia

ADD COMMENT • link 6.9 years ago valery.sherina ▴ 10

Are you loading the file with readRDS?

ADD REPLY • link 6.9 years ago Ryan C. Thompson ★ 7.9k

I was able to load the data. We have not worked with data sets in which more than 15% of the values are missing. I suspect there is an estimation problem in limma. I will look into it closely and get back to you.

ADD REPLY • link 6.9 years ago valery.sherina ▴ 10

In case it matters, the design formula for this experiment is ~condition + animal.

ADD REPLY • link 6.9 years ago Ryan C. Thompson ★ 7.9k

Hello Ryan,

I checked exprs(yourdata) and found that some of the values are NA. We use the maximum possible Ct value as the starting point for the EM algorithm, so non-detects should be entered as that value rather than as NA. Below is the code I used to create a qPCRset object in a simulation study; I hope it helps.

  library(HTqPCR)  # provides the qPCRset class (also loads Biobase)

  # Inputs assumed to be defined beforehand:
  #   DataName    - matrix of Ct values (genes in rows, samples in columns)
  #   ControlGene - name of the control gene, or NULL
  #   Ct_nd       - the maximum Ct value
  #   nsampt      - number of sample types (e.g. trt vs. normal)
  #   nrepl       - number of replicates per sample type

  object <- DataName # data matrix

  # Feature type: every gene is a target unless it is the control gene
  ft <- rep("Target", nrow(object))
  if (!is.null(ControlGene)) {
    ind <- which(rownames(object) == ControlGene)
    ft[ind] <- "Endogenous Control"
  }

  # Feature category: values at the maximum Ct are marked "Undetermined"
  fc <- matrix("OK", nrow=nrow(object), ncol=ncol(object))
  fc[which(object > (Ct_nd-0.0001), arr.ind=TRUE)] <- "Undetermined"
  colnames(fc) <- colnames(object)
  rownames(fc) <- rownames(object)

  # Flags: the same wells are marked "Flagged"
  fl <- matrix("Passed", nrow=nrow(object), ncol=ncol(object))
  fl[which(object > (Ct_nd-0.0001), arr.ind=TRUE)] <- "Flagged"
  colnames(fl) <- colnames(object)
  rownames(fl) <- rownames(object)

  # Assemble the qPCRset object
  myData <- new("qPCRset", exprs=object, flag=fl)
  featureNames(myData) <- rownames(object)
  featureType(myData) <- ft
  featureCategory(myData) <- as.data.frame(fc)

  # Sample annotation: one letter per sample type, repeated for each replicate
  sType <- c(rep(LETTERS[1:nsampt], each=nrepl))
  tab <- data.frame(sampleName=colnames(object), sampleType=sType)
  phenoData(myData) <- AnnotatedDataFrame(data=tab)

I hope this would solve the problem.

Valeriia

ADD REPLY • link 6.9 years ago valery.sherina ▴ 10

Ok, so I should simply replace my NA values with 40 (the maximum value), and then run the imputation algorithm?

Edit: Doing so seems to have worked.
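
For anyone else hitting this, the replacement step looks roughly like the sketch below. It uses a small made-up matrix so that it runs in plain R. The commented lines show what I believe the equivalent step is on a qPCRset, assuming the object is called `q` and that HTqPCR's exprs() accessor and replacement method are used.

```r
max_ct <- 40   # maximum Ct value, used for non-detects

# Made-up Ct matrix with non-detects recorded as NA
ct <- matrix(c(22.1, NA, 30.5, 28.7, NA, 35.2), nrow = 2,
             dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

nd <- is.na(ct)     # remember which wells were non-detects
ct[nd] <- max_ct    # replace NA with the maximum Ct

print(ct)

# Assumed equivalent on a qPCRset object `q` (requires HTqPCR):
#   e <- exprs(q)
#   e[is.na(e)] <- max_ct
#   exprs(q) <- e
```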

ADD REPLY • link 6.9 years ago Ryan C. Thompson ★ 7.9k

You said that the data you are using are normalized. I would replace NA with 40, normalize, and then perform the imputation. It should speed up convergence, as the majority of the values in your data matrix are in the single digits.
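
The order of operations can be sketched in base R as follows. This is only an illustration under assumptions: the matrix is made up, and the normalization shown (subtracting the mean Ct of the control genes in each sample) is a stand-in for whichever normalization you actually use.

```r
max_ct <- 40

# Made-up Ct matrix: two control genes, two targets, two samples
ct <- matrix(c(20, 21, 30, NA,
               20, 19, NA, 33), nrow = 4,
             dimnames = list(c("ctrl1", "ctrl2", "tgt1", "tgt2"),
                             c("s1", "s2")))

# Step 1: replace non-detects (NA) with the maximum Ct
nd <- is.na(ct)
ct[nd] <- max_ct

# Step 2: normalize (here: subtract the per-sample mean of the control genes)
ref <- colMeans(ct[c("ctrl1", "ctrl2"), , drop = FALSE])
dct <- sweep(ct, 2, ref, "-")

print(dct)

# Step 3: run the imputation on the normalized data,
# e.g. with qpcrImpute() and the appropriate groupVars
```

Keeping the logical matrix `nd` around makes it easy to see which normalized values came from non-detects.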

ADD REPLY • link 6.9 years ago valery.sherina ▴ 10