Re: "validity" of p-values
@gordon-smyth
WEHI, Melbourne, Australia
At 06:27 AM 29/09/2003, Jenny Drnevich wrote:

>See below...
>
>>>However, have you seen: Chu, Weir, & Wolfinger. A systematic
>>>statistical linear modeling approach to oligonucleotide array
>>>experiments. MATH BIOSCI 176 (1): 35-51, Sp. Iss. SI, MAR 2002.
>>>They advocate using the probe-level data in a linear mixed model.
>>>Assuming that each probe is an independent measure (which I know is not
>>>true because many of them overlap, but I'm ignoring this for now),
>>>using probe-level data gives 14-20 "replicates" per chip. We've based
>>>our analysis methods on this, and with two biological replicates per
>>>genetic line, and three genetic lines per phenotypic group, we've been
>>>able to detect as little as a 15% difference in gene expression at
>>>p=0.0001 (we only expect 2 FP and get 60 genes with p=0.0001).
>>
>>Mmmm. Getting very low p-values from just two biological replicates
>>doesn't lead you to question the validity of the p-values?? :)
>
>But we don't just have two biological replicates. We're interested in
>consistent gene expression differences between phenotype 1 and phenotype
>2. We looked at three different genetic lines showing phenotype 1 and
>three other lines that had phenotype 2.

If I understand correctly, you have 6 arrays for each phenotype, all biologically independent.

>We made two biological replicates
>of each line, and the expression level of each gene was estimated by 14
>probes. By running a mixed-model ANOVA separately for each gene with
>phenotype, line (nested within phenotype), probe, and all second-order
>interactions, the phenotype comparison has around 120 df (or so, off the
>top of my head).

There are only 2 phenotypes, so the phenotype comparison has 1 df. I think what you mean is that you have something like 120 df for estimating the variability of repeated measurements at the probe level. But this isn't the most important variance component for comparing phenotypes.
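The degrees-of-freedom accounting can be made concrete with a small sketch. The design numbers below are taken from the thread; the df arithmetic is standard for a nested design, not a description of the poster's actual software:

```python
# Hypothetical design from the thread: 2 phenotypes, 3 genetic lines per
# phenotype, 2 biological replicates (arrays) per line, 14 probes per gene.
phenotypes, lines_per_phen, reps_per_line, probes = 2, 3, 2, 14

# Total probe-level observations per gene.
n_obs = phenotypes * lines_per_phen * reps_per_line * probes   # 168
# Total independent arrays.
n_arrays = phenotypes * lines_per_phen * reps_per_line         # 12

# The "around 120 df" quoted in the thread sit at the probe level
# (residual df after fitting phenotype, line-within-phenotype, probe,
# and their interactions). The df actually relevant for comparing
# phenotypes come from replication of the biological units:
df_lines = phenotypes * lines_per_phen - phenotypes            # 6 - 2 = 4

print(n_obs, n_arrays, df_lines)
```

The point of the arithmetic: however many probe-level residual df the ANOVA reports, the phenotype contrast is only replicated across 6 lines (12 arrays), so the honest comparison has on the order of 4-10 df, not 120.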
Your model, if I understand it, neglects any variance component at the array level even though your treatments (the phenotypes) are applied at the array level. You are in effect treating the probes as if they were separate arrays, and one doesn't have to be a mathematical statistician to question the validity of that.

>That's how we can detect a 15% difference in gene
>expression. As long as the statistical model is set up correctly, I never
>"question" the validity of p-values, although I might question the
>biological significance... :)

You should! A famous and true saying goes "All statistical models are wrong, but some are useful." It is incumbent on you to understand how the assumptions of your statistical model relate to reality and how sensitive your conclusions are to those assumptions.

There are actually deep reasons why, in my opinion, none of the statistical methods for small numbers of arrays can produce p-values which are believable in an absolute sense (and this includes my own methods in the limma package). The real test would be to try out your method on some data sets where the answers are known, for example to apply the method to some replicate arrays hybridized with RNA from the same source. My guess is that the method would detect a lot of spurious differential expression.

Gordon
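The null-data check Gordon suggests can be mimicked in a simulation. The sketch below is illustrative only (all parameter values are invented, and a plain t-test stands in for the mixed-model ANOVA): it generates probe-level data with no phenotype effect but a real array-level variance component, then compares a test that treats every probe value as an independent replicate against one that summarises each array first.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed design, loosely mirroring the thread: 6 arrays per phenotype
# (3 lines x 2 replicates), 14 probes per gene. No true phenotype effect.
n_arrays, n_probes, n_sims = 6, 14, 2000
sd_array, sd_probe = 0.3, 0.3   # array-level and probe-level noise (assumed)

def one_group():
    """Probe-level data for one phenotype: shared array effect + probe noise."""
    arr = rng.normal(0.0, sd_array, n_arrays)
    return arr[:, None] + rng.normal(0.0, sd_probe, (n_arrays, n_probes))

naive_hits = array_hits = 0
for _ in range(n_sims):
    g1, g2 = one_group(), one_group()
    # Naive: treat all 6 x 14 probe values as independent replicates.
    if stats.ttest_ind(g1.ravel(), g2.ravel()).pvalue < 0.05:
        naive_hits += 1
    # Array-level: summarise each array to one value, then compare.
    if stats.ttest_ind(g1.mean(axis=1), g2.mean(axis=1)).pvalue < 0.05:
        array_hits += 1

print(f"probe-level type I error: {naive_hits / n_sims:.3f}")  # well above 0.05
print(f"array-level type I error: {array_hits / n_sims:.3f}")  # near 0.05
```

Because the probe values within an array share the same array effect, the probe-level test drastically understates the standard error and rejects the true null far more often than the nominal 5%, which is exactly the "spurious differential expression" Gordon predicts replicate-RNA data would reveal.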