Drosophila GeneChip analysis


Paul Mack ▴ 30

@paul-mack-766


I am in the midst of analyzing Affymetrix Drosophila GeneChip data using RMA such that separate regression lines are estimated for each gene. It was recommended to me that I use a p-value of 0.0001 as a cutoff for the effect estimates rather than try to apply Bonferroni or other multiple test corrections. Lately, however, I have begun to wonder if others doing this sort of analysis use similar cutoffs and, in general, what others think about statistical stringency in this situation.

Any help will be most appreciated; I will summarize any replies that I get that are not sent directly to the list. Thank you.

Paul Mack, Ph.D
Department of Genetics
University of Georgia
Athens, GA
USA

706-542-1578 (w)
706-542-3910 (fax)
paulmack@arches.uga.edu
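To make the trade-off in the question concrete, here is a minimal sketch in R that compares a fixed cutoff of 0.0001 with Bonferroni and Benjamini-Hochberg adjustment using base R's `p.adjust`. The p-values are simulated, and the gene counts (14,000 genes, 500 of them with a real effect) are assumptions chosen for illustration only; they are not taken from the Drosophila data.

```r
# Simulated p-values for illustration only (hypothetical gene counts):
# 13,500 null genes with uniform p-values, 500 genes with a real effect.
set.seed(1)
n_null <- 13500
n_alt  <- 500
pvalue <- c(runif(n_null),
            rbeta(n_alt, shape1 = 0.05, shape2 = 1))  # skewed toward 0

# Fixed per-gene cutoff, as described in the question.
fixed_hits <- sum(pvalue < 1e-4)

# Bonferroni: controls the family-wise error rate at 0.05.
bonf_hits <- sum(p.adjust(pvalue, method = "bonferroni") < 0.05)

# Benjamini-Hochberg: controls the false discovery rate at 0.05.
bh_hits <- sum(p.adjust(pvalue, method = "BH") < 0.05)

# Expected number of null genes passing the fixed cutoff by chance.
expected_false <- n_null * 1e-4

print(c(fixed = fixed_hits, bonferroni = bonf_hits, BH = bh_hits))
print(expected_false)
```

With this many tests, the fixed cutoff sits between the two corrections in stringency: it is looser than Bonferroni (0.05 / 14,000 is about 3.6e-6), and it bounds the expected number of chance hits among null genes rather than an error rate.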

Regression Regression • 1.4k views

ADD COMMENT • link updated 21.7 years ago by Arne.Muller@aventis.com ▴ 620 • written 21.7 years ago by Paul Mack ▴ 30


Arne.Muller@aventis.com ▴ 620

@arnemulleraventiscom-466


Do you have a factorial design in which you run one linear model for each gene and then look at the p-values for the coefficients? Could you give some more information about what you're doing? I'm not sure I understand.

Regards,

Arne

--
Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com

> -----Original Message-----
> From: bioconductor-bounces@stat.math.ethz.ch
> [mailto:bioconductor-bounces@stat.math.ethz.ch] On Behalf Of Paul Mack
> Sent: 14 May 2004 16:20
> Subject: [BioC] Drosophila GeneChip analysis
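The per-gene approach asked about here (one linear model per gene, then inspecting the coefficient p-values) could look roughly like the sketch below. It uses only base R's `lm`; the expression matrix, the 8 arrays, and the names `expr`, `category`, and `fit_gene` are hypothetical and simulated for illustration, not Paul's actual data or model.

```r
# Hypothetical data: 200 genes on 8 arrays (4 control, 4 treated).
set.seed(2)
n_genes  <- 200
category <- factor(rep(c("control", "treated"), each = 4))
expr <- matrix(rnorm(n_genes * length(category)), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Give the first 10 genes a real treatment effect.
expr[1:10, category == "treated"] <- expr[1:10, category == "treated"] + 3

# Fit one linear model per gene; keep the treatment estimate and p-value.
fit_gene <- function(y) {
  fit <- lm(y ~ category)
  coef(summary(fit))["categorytreated", c("Estimate", "Pr(>|t|)")]
}

res <- t(apply(expr, 1, fit_gene))
colnames(res) <- c("estimate", "pvalue")

# Genes ranked by p-value for the treatment coefficient.
print(head(res[order(res[, "pvalue"]), ]))
```

The resulting `pvalue` column is what a fixed cutoff or a multiple-testing adjustment would then be applied to.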

ADD COMMENT • link 21.7 years ago Arne.Muller@aventis.com ▴ 620


Arne.Muller@aventis.com ▴ 620

@arnemulleraventiscom-466


Hi Paul,

this makes things clearer, but I still have some questions. Please see my comments below.

> -----Original Message-----
> From: Paul Mack [mailto:paulmack@arches.uga.edu]
> Sent: 14 May 2004 18:07
> To: Muller, Arne PH/FR
> Subject: RE: [BioC] Drosophila GeneChip analysis
>
> Hi, Arne:
>
> Thanks for your response. Hopefully I can clarify. I have 4 classification
> variables in the model I use: gene; category (meaning treated or control; I
> have only one treatment and it is qualitative); array (designated as
> random); and probe (there are 14 probes per gene on each chip).

This means you're running a single model, i.e. if you have 10,000 genes on the chip, the gene factor contains 10,000 levels, right? How are you running this in R? I tried once, and it quickly ran out of memory because I have >12k genes on the chip ... :-(

I'm also running linear models at the probe level (I think it gives a good kind of pseudo-replication). What is your model call? Something like this:

    lme(intensity ~ gene*cat*probe, random = ~ 1 | array)

Or do you also include array in the fixed effects? I'm not sure about this call (I have only just received the mixed-model book by Pinheiro and Bates).

> It also includes a category x array interaction term. The model predicts gene
> expression as a function of array, category, array, probe and the
> interaction term.

You list array twice here; do you mean gene for one of them?

> I then look at the estimated category coefficients gene by gene. Hope this
> makes more sense.
>
> Paul

I'm currently doing a similar thing. Apart from the trouble of deciding on a method to correct for multiple testing (I'm using p.adjust(pvalue, 'fdr')) and on the actual p-value cutoff, I found that the residuals of the models are not normally distributed (see one of my last postings to the Bioconductor list). I think one really needs to check the model quality; otherwise the p-values don't mean much anyway.

Kerr and Churchill (2002) have reported this problem and argue that one actually needs to use bootstrapping to calculate confidence intervals, since the distribution of residuals has extreme tails. This is rather discouraging, since bootstrapping will take too long for my analysis (MG-U74Av2 chip with >12k genes). Did you try the mmanova package from Kerr & Churchill (http://www.jax.org/staff/churchill/labsite/software/anova/)? I'm not sure it works for Affy chips.

Regards,

Arne
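As a rough illustration of the two points raised in this reply (checking residual normality, and bootstrapping confidence intervals when the residuals are heavy-tailed), here is a minimal base-R sketch for a single gene. It is a generic residual bootstrap with a percentile interval, not necessarily the procedure of Kerr and Churchill or of the mmanova package; the data are simulated, and the names `y`, `category`, and `n_boot` are assumptions made for the example.

```r
# Hypothetical log-intensities for one gene on 20 arrays, with
# heavy-tailed noise (t distribution, 3 df) and a true effect of 1.5.
set.seed(3)
category <- factor(rep(c("control", "treated"), each = 10))
y <- 8 + 1.5 * (category == "treated") + rt(length(category), df = 3)

fit <- lm(y ~ category)

# Check residual normality before trusting the t-based p-value.
shapiro_p <- shapiro.test(residuals(fit))$p.value
# qqnorm(residuals(fit)); qqline(residuals(fit))  # optional visual check

# Residual bootstrap: resample residuals, rebuild responses, refit.
n_boot <- 2000
boot_est <- replicate(n_boot, {
  y_star <- fitted(fit) + sample(residuals(fit), replace = TRUE)
  coef(lm(y_star ~ category))["categorytreated"]
})

# Percentile 95% confidence interval for the treatment effect.
ci <- quantile(boot_est, c(0.025, 0.975))
print(shapiro_p)
print(ci)
```

Repeating 2,000 refits for each of more than 12,000 genes is what makes this expensive, which matches the runtime concern raised above.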

ADD COMMENT • link 21.7 years ago Arne.Muller@aventis.com ▴ 620
