Question

CGH microarrays significance test


João Fadista ▴ 500

@joao-fadista-1942

Last seen 9.7 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070321/3911af12/attachment.pl

• 672 views

updated 17.1 years ago by Ramon Diaz ★ 1.1k • written 17.1 years ago by João Fadista ▴ 500

score 0 · Answer 1 · 2007-03-21

Dear Joao,

On Wednesday 21 March 2007 16:33, João Fadista wrote:
> Dear list,
>
> I have a CGH microarray experiment where I compare male vs. female in each
> sample (3 technical replicates with dye swaps = 6 samples). So in theory I
> would expect to see a difference in log2ratios of the X chromosome compared
> to the autosomes. This experiment is made mainly to assess/optimize the
> reliability of the protocol and the in-house microarray platform for CGH
> microarray experiments.
>
> I already used packages in Bioconductor that deal with CGH microarrays but
> I would also like to have a statistical test to see if there is a
> significant difference between the mean values of log2ratios from the X
> chromosome compared to the autosomes. I already did a two-sample t-test and
> a wilcox.test where the log2ratios for autosomal clones represent the first
> sample and log2ratios for clones from chromosome X represent the second
> sample.

I get confused here. It is not clear to me whether you want to compare males and females (as you say in the first paragraph) or the autosomes and the X. I think the latter, so here are some thoughts:

1. First, you have something like a paired design: for each subject you measure both the autosomes and the X. Since these are all arrayed on the same glass, etc., you definitely want to account for this, more or less like the logic behind a paired t-test.

2. You do not have just one value for the X and one value for the autosomes, but a collection of each. And the autosomes come in 22 packages.

3. My first thought would be to use a mixed-effects model (with package nlme) including terms for subject and, possibly, chromosome (within the autosomes); the chromosome random effect might be crossed with subject or nested within subject. I'd be inclined to nest it within subject.

4. By using the mixed-effects model you can also include your technical replicates as technical replicates by adding a term for biological sample.

5. A simpler, more direct approach would be to take the average of all autosomal values and of all X values within subject, average these over technical replicates, and do a paired t-test. But I would not recommend it.

6. With nlme, and mixed-effects models in general, there is a battery of diagnostics; in addition, you have very large sample sizes relative to the number of (random and fixed) effects you are modeling.

7. You can also use heteroscedastic mixed-effects models to account for the differences in variance between samples, thus performing the weighting you refer to.

8. (You have gene information; technically, you might want to incorporate a crossed gene effect. But you will then probably have difficulties fitting the model, and you'll end up with a huge number of terms.)

These are some half-cooked ideas. I do not think the above will be a simple, 10-minute walk in the woods, but I think it might be a worthwhile modelling exercise.

A different approach: since you have used some of the CGH packages, you probably have estimates of regions of gain and loss. Thus, a different type of analysis would be not to use the log2ratios, but to use instead the inferences about gains and losses, arguing that the latter are actually denoised versions of the former (and, thus, "better things" to base your downstream inferences upon).

Best,

R.

> 1 - Should I have done another, more robust test? Is there any other kind of
> statistical test that I can perform to assess the reliability of my
> experiment (assuming that the pre-processing and normalization are already
> optimized)?
>
> 2 - Is it statistically acceptable to average my technical replicates (the
> average is a weighted average where the arrays with "more quality" have a
> higher weight) in order to reduce the variance?
>
> Regards
>
> João Fadista
> Ph.D. student
> University of Aarhus, Faculty of Agricultural Sciences
> Research Centre Foulum, Dept. of Genetics and Biotechnology

--
Ramón Díaz-Uriarte
Statistical Computing Team
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
http://ligarto.org/rdiaz
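For readers who want to see what the simpler approach in point 5 of the answer amounts to (average the autosomal and the X log2 ratios within each subject, average over technical replicates, then a paired t-test), here is a minimal Python sketch using only the standard library. It is not code from the thread, and the answer itself prefers a mixed-effects model over this shortcut. The data layout, the function names and all the numbers are made up for illustration. The sketch also assumes that the log2 ratios are already oriented the same way on every array (dye-swap arrays sign-flipped) and that more than one subject is available.

```python
import math
import statistics


def chromosome_means(array):
    """Return (mean autosomal log2 ratio, mean X log2 ratio) for one array.

    `array` is a list of (chromosome, log2ratio) pairs. Any chromosome label
    other than "X" is treated as autosomal (an assumption of this sketch).
    """
    x_vals = [ratio for chrom, ratio in array if chrom == "X"]
    auto_vals = [ratio for chrom, ratio in array if chrom != "X"]
    return statistics.mean(auto_vals), statistics.mean(x_vals)


def subject_means(arrays):
    """Average the per-array means over one subject's technical replicates."""
    per_array = [chromosome_means(a) for a in arrays]
    auto = statistics.mean(m[0] for m in per_array)
    x = statistics.mean(m[1] for m in per_array)
    return auto, x


def paired_t(pairs):
    """Paired t statistic and degrees of freedom for (autosomal, X) pairs."""
    diffs = [x - auto for auto, x in pairs]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two subjects for a paired t-test")
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Made-up data: three hypothetical subjects, two technical replicates each.
data = {
    "s1": [[("1", 0.02), ("2", -0.04), ("X", -0.90), ("X", -0.80)],
           [("1", 0.00), ("2", 0.06), ("X", -0.70), ("X", -0.80)]],
    "s2": [[("1", 0.05), ("3", -0.01), ("X", -1.00), ("X", -0.90)],
           [("1", -0.03), ("3", 0.03), ("X", -0.95), ("X", -1.05)]],
    "s3": [[("2", 0.01), ("3", 0.03), ("X", -0.60), ("X", -0.70)],
           [("2", -0.02), ("3", 0.00), ("X", -0.75), ("X", -0.65)]],
}

pairs = [subject_means(arrays) for arrays in data.values()]
t_stat, df = paired_t(pairs)
print(f"t = {t_stat:.2f}, df = {df}")
```

Turning the t statistic into a p-value needs the t distribution, which the Python standard library does not provide; a statistics package would be used for that step. Note also that this collapses every array to two numbers, so it ignores the chromosome-level and clone-level structure that the mixed-effects suggestions in points 3, 4 and 7 are meant to model.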