Is mean adjusted for purity?
twtoal ▴ 10

You previously indicated that the C values in the PureCN output have been adjusted for purity.  Does this also hold true for the "mean" values for mean log ratios?  It appears to me that they are adjusted for purity (but are still log mean ratios, not actual mean ratios and not twice the ratio to give a copy number value).



PureCN mean ratio • 1.7k views

All the log-ratios are standard log2 tumor vs. normal coverage ratios (after normalization for total sequencing coverage, of course). This is exactly what you would get from any other copy number tool that does not do any purity/ploidy adjustment, such as CNVkit or GATK4. So no, there is no purity adjustment.

If you need purity adjustment of log-ratios for some reason, for example when downstream tools like GISTIC expect log2 ratios, you can follow (section Impurity-corrected GISTIC).

Feel free to add a GitHub issue if you think that some of the output is not clearly documented in the main vignette (mainly Tables 1-5).


Thanks, that was a helpful reference.  However, I see that it has a mistake in its equation for R'(x).  I believe the correct equation should be:

R'(x) = q(x)/T = [aTR(x) + 2(1-a)R(x) - 2(1-a)] / aT

T = tau, a = alpha, q(x) = integer CN in cancer cells, R(x) = observed CN ratio, R'(x) = CN ratio in tumor cells

His derivation:

  R(x) = (aq(x)+2(1-a))/D
  D = aT + 2(1-a)
  q(x) = DR(x)/a - 2(1-a)/a
  R'(x) = q(x)/T = R(x)/a - 2(1-a)/aT


  R'(x) = adjusted coverage ratio
  R(x) = raw coverage ratio
  q(x) = integer copy number in cancer cells
  D = average ploidy across all cells of tumor (of sample)
  a = purity
  T = tumor ploidy

However, in the last step where he substituted q(x) in q(x)/T, he did the algebra wrong.  The correct algebra is:

R'(x) = q(x)/T = DR(x)/aT - 2(1-a)/aT = (aT + 2(1-a))R(x)/aT - 2(1-a)/aT
      = R(x) + 2(1-a)R(x)/aT - 2(1-a)/aT
      = [aTR(x) + 2(1-a)R(x) - 2(1-a)]/aT

As a test, say that purity = a = 0.5, tumor ploidy = T = 2, and the raw coverage ratio is 1.5.  Then we expect the adjusted coverage ratio to be 2: the tumor segment is 2X amplified (4 copies), and this becomes a raw ratio of 1.5 when purity is 1/2:   [0.5*4 + 0.5*2] / 2 = 1.5.

His: R'(x) = 1.5/0.5 - 2(0.5)/(0.5 * 2) = 3 - 2(0.5) = 2 (correct)
Mine: R'(x) = [0.5*2*1.5 + 2(0.5)1.5 - 2(0.5)] / (0.5*2) = 1.5 + 1.5 - 1 = 2 (correct)

But now suppose that tumor ploidy = T = 4, and we still have purity = a = 0.5.  Say the raw coverage ratio = 1.0, which means there is no tumor amplification: the number of copies at this locus is the same as the mean number of copies, in both the 2X normal and the 4X tumor tissue.  Then we expect the adjusted coverage ratio to also be 1.

His: R'(x) = 1/0.5 - 2(0.5)/(0.5 * 4) = 2 - 2(0.5)/2 = 2 - 1/2 = 1.5 (wrong)
Mine: R'(x) = [0.5*4*1 + 2(0.5)1 - 2(0.5)] / (0.5 * 4) = [2 + 1 - 1] / 2 = 2 / 2 = 1 (correct)
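
The two checks above can also be reproduced numerically. Here is a minimal R sketch; the function names (adjust_paper, adjust_corrected, raw_ratio) are made up for illustration and are not part of PureCN, and all ratios are on the linear scale, not log2:

```r
# a = purity, tau = tumor ploidy (T above), R = raw linear coverage ratio.
# All function names here are hypothetical, not PureCN API.

# Formula as printed in the paper: R'(x) = R(x)/a - 2(1-a)/(aT)
adjust_paper <- function(R, a, tau) {
  R / a - 2 * (1 - a) / (a * tau)
}

# Formula from the derivation above:
# R'(x) = [aTR(x) + 2(1-a)R(x) - 2(1-a)] / (aT)
adjust_corrected <- function(R, a, tau) {
  (a * tau * R + 2 * (1 - a) * R - 2 * (1 - a)) / (a * tau)
}

# Forward model: R(x) = (a*q(x) + 2(1-a)) / D, with D = aT + 2(1-a)
raw_ratio <- function(q, a, tau) {
  (a * q + 2 * (1 - a)) / (a * tau + 2 * (1 - a))
}

# Case 1: a = 0.5, T = 2, R = 1.5
case1 <- c(paper     = adjust_paper(1.5, 0.5, 2),
           corrected = adjust_corrected(1.5, 0.5, 2))

# Case 2: a = 0.5, T = 4, R = 1.0
case2 <- c(paper     = adjust_paper(1.0, 0.5, 4),
           corrected = adjust_corrected(1.0, 0.5, 4))

print(case1)
print(case2)

# Round trip for case 2: 4 tumor copies at T = 4 should map to R = 1 and back.
roundtrip <- adjust_corrected(raw_ratio(4, 0.5, 4), 0.5, 4)
print(roundtrip)
```

The two formulas agree whenever T = 2, because then D = aT + 2(1-a) = 2 = T. They only diverge when the tumor ploidy differs from 2, which is why the first test case cannot tell them apart.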

Not sure, I looked into this more than 2 years ago. I used the following and believe it's correct:

rds <- readRDS("Sampleid.rds")

r <- rds$results[[1]]

r$seg$seg.mean.adjusted <- r$seg$seg.mean/r$purity - 2*(1-r$purity)/(r$purity*r$ploidy)

I haven't used it much though because I found little benefit in GISTIC and for everything else you usually want the absolute copy numbers.



Your equation above matches the one in the paper you cited, which is incorrect.  Your seg.mean is his R(x), your purity is his a, your ploidy is his T.
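
For comparison, here is a sketch of how the snippet could look with the corrected formula. It assumes that seg.mean is on the log2 scale (as stated earlier in this thread for PureCN log-ratios), so it converts to a linear ratio first and back to log2 at the end. The helper name adjust_log2_ratio and the toy segment values are made up for illustration; this is not PureCN code.

```r
# Hypothetical helper, not part of PureCN.
# Assumes: log2_ratio is a log2 tumor/normal ratio, purity is in (0, 1],
# and ploidy is the tumor ploidy (T in the derivation).
adjust_log2_ratio <- function(log2_ratio, purity, ploidy) {
  ratio <- 2^log2_ratio                                  # linear R(x)
  sample_ploidy <- purity * ploidy + 2 * (1 - purity)    # D
  tumor_copies <- (sample_ploidy * ratio - 2 * (1 - purity)) / purity  # q(x)
  adjusted <- tumor_copies / ploidy                      # R'(x) = q(x)/T
  # Noisy segments can give zero or negative values, where log2 is undefined.
  adjusted[adjusted <= 0] <- NA
  log2(adjusted)
}

# Toy segments standing in for r$seg; the values are made up.
seg <- data.frame(seg.mean = log2(c(1.0, 1.5, 0.75)))
seg$seg.mean.adjusted <- adjust_log2_ratio(seg$seg.mean,
                                           purity = 0.5, ploidy = 2)
print(seg)
```

With a real result object, the same call would take r$seg$seg.mean, r$purity and r$ploidy from the snippet above, assuming those fields hold the log2 ratio, the purity and the tumor ploidy respectively.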

I found that PureCN:::.calcExpectedRatio() is doing it correctly (it is doing the inverse operation, computing R(x) from R'(x)).

However, in runAbsoluteCN(), I find this line:

    opt.C <- (2^(seg$seg.mean + log.ratio.offset) *  total.ploidy)/p - ((2 * (1 - p))/p)

and since C = ratio * ploidy, the above equation is the paper's (incorrect) R'(x) * ploidy.  It seems to be wrong.  Please check it.  Maybe I'm missing something, but to me it looks like a definite algebra mistake.



I think I'll go ahead and open an issue on the PureCN github project for this.


