Question

general CGH threshold question

Wolfgang RAFFELSBERGER ▴ 140

@wolfgang-raffelsberger-2876

Last seen 7 weeks ago

France

Dear Bioconductors,

For segmenting CGH data there are several packages available, and most of them tend to give me similar overall results. However, when it comes to drawing conclusions about a collection of cancer specimens (= patients), I have to decide how to combine all the nicely segmented individual profiles. At some point I am forced to make an arbitrary decision on a threshold that determines whether a given position/segment from a specimen (patient) should be counted as aberrant or not.

Of course one could say that, in theory, a given segment should be present either as a single copy or doubled, tripled (etc.), or lost, and that the expected ratios should follow this. In my view, however, reality is quite different. Surgeons tend to remove (a bit) more tissue than the tumor itself, so there is reason to assume some normal tissue is present, and tumors may also be heterogeneous. All of this contributes to the fact that I see log-ratios smaller in magnitude than +/- 1 (which would describe this ideal case), and I wonder how many of them could still represent "true" alterations.

I have seen people make fairly arbitrary decisions about such thresholds, like 0.5 (which corresponds to ~40% of the molecules tested carrying doubled DNA while the rest are normal) or other values in that range. Unfortunately the biologists/clinicians can't help me with the question of which fraction of cells has to be altered for an aberration to still be considered.

Now another part of the story enters the scene. From some (preliminary) comparisons I have seen that the Agilent software may give quite different results for the frequency of lost/amplified zones of the genome (while at least CBS, GLAD, aCGH and snapCGH were in major agreement for penetration counts at a given threshold; I apologize for not mentioning all the other BioC packages available). Non-bioinformatics people keep asking me why this might be so, and I wonder whether it has something to do with the choice of the threshold mentioned above.
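The arithmetic behind a threshold such as 0.5 can be sketched as follows. This is only an illustration under simplifying assumptions that are not part of the post: a diploid reference (2 copies), a single aberrant clone with a fixed copy number, and a signal that is proportional to the average copy number in the sample. The function names `expected_log2_ratio` and `aberrant_fraction_for` are hypothetical helpers, not part of any BioC package.

```python
import math

def expected_log2_ratio(aberrant_fraction, tumor_copies, normal_copies=2):
    """Expected log2 ratio for a mixture of aberrant and normal cells.

    Assumption: the measured signal is proportional to the mean copy
    number over all cells in the specimen.
    """
    mean_copies = (aberrant_fraction * tumor_copies
                   + (1.0 - aberrant_fraction) * normal_copies)
    if mean_copies == 0:
        # Homozygous deletion in a pure sample: ratio is 0, log2 undefined.
        return float("-inf")
    return math.log2(mean_copies / normal_copies)

def aberrant_fraction_for(log2_ratio, tumor_copies, normal_copies=2):
    """Fraction of aberrant cells needed to reach a given log2 ratio."""
    ratio = 2.0 ** log2_ratio
    return (ratio - 1.0) * normal_copies / (tumor_copies - normal_copies)

if __name__ == "__main__":
    # Doubled DNA (4 copies instead of 2) at a threshold of 0.5.
    print(aberrant_fraction_for(0.5, tumor_copies=4))
    # Single-copy gain (3 copies) and single-copy loss (1 copy).
    print(aberrant_fraction_for(0.5, tumor_copies=3))
    print(aberrant_fraction_for(-0.5, tumor_copies=1))
```

Under these assumptions a log-ratio of 0.5 corresponds to about 41% of cells carrying doubled DNA, in line with the ~40% quoted above, whereas a single-copy gain would need about 83% aberrant cells to reach the same threshold.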
Of course, if you choose a threshold closer to 0 (like 0.1 or 0.2) you will find more aberrations above threshold. But not just more: to my surprise, entire chromosome arms suddenly show up as enriched for gains or losses, making the results look (a bit) more like the Agilent results.

When looking at the distribution of all log2-ratios (say, for some 100 patients) I see a rather bell-shaped (slightly asymmetric) distribution. A Q-Q plot has a slightly sigmoid character, and the 99.9% (t-distribution) confidence interval with that many df is way too close to 0.

So my question: what do you suggest as a procedure to define a threshold for deciding whether a given position/segment may be considered altered when pooling all the biopsies/patients in a study? Besides statistical ideas, I also wonder whether anybody has data from comparisons with other experimental techniques that would help to understand the "true" status and the discrepancy with the Agilent software.

Thanks in advance,
Wolfgang Raffelsberger

Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
wolfgang.raffelsberger (at) igbmc.fr
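The threshold effect described in the post (low-amplitude regions only showing up once the cut-off approaches 0) can be illustrated with a small sketch. Everything here is hypothetical: the `profiles` values are made-up segment means for four patients over three shared regions, and `aberration_frequency` is an illustrative helper, not a function from CBS, GLAD, aCGH, snapCGH or the Agilent software.

```python
def aberration_frequency(profiles, threshold):
    """Per-region fraction of patients called as gain or loss.

    profiles: list of per-patient lists of segment mean log2 ratios,
              all covering the same regions in the same order.
    threshold: positive cut-off; gain if value > threshold,
               loss if value < -threshold.
    """
    n_patients = len(profiles)
    n_regions = len(profiles[0])
    gains = [0] * n_regions
    losses = [0] * n_regions
    for profile in profiles:
        for i, value in enumerate(profile):
            if value > threshold:
                gains[i] += 1
            elif value < -threshold:
                losses[i] += 1
    gain_freq = [g / n_patients for g in gains]
    loss_freq = [l / n_patients for l in losses]
    return gain_freq, loss_freq

# Hypothetical segment means (rows: patients, columns: regions).
# Region 0: strong gain in most patients; region 1: weak, broad shift;
# region 2: loss of varying amplitude.
profiles = [
    [0.60, 0.15, -0.05],
    [0.55, 0.18, 0.02],
    [0.05, 0.12, -0.55],
    [0.70, -0.03, -0.20],
]

if __name__ == "__main__":
    for threshold in (0.5, 0.2, 0.1):
        gain_freq, loss_freq = aberration_frequency(profiles, threshold)
        print(threshold, gain_freq, loss_freq)
```

In this toy data the weakly shifted region 1 is called as a gain in no patient at a threshold of 0.5 but in three of four patients at 0.1, which is the kind of jump in frequency that a broad, low-amplitude shift would produce.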

aCGH CGH GLAD snapCGH • 1.0k views

14.7 years ago Wolfgang RAFFELSBERGER ▴ 140