Unequally spaced replicates in limma
@michael-watson-iah-c-378
Hi

As I have varying numbers of replicates, and they are not regularly spaced on the array, and given that I would like a list of differentially expressed genes which is averaged over replicates, I assume the best thing to do is normalise my data, then average over replicates in the MAList object, and then pass the averaged data to lmFit() etc?

Is that right?

Cheers
Mick
@gordon-smyth
WEHI, Melbourne, Australia
> Hi
>
> As I have varying numbers of replicates, and they are not regularly
> spaced on the array, and given that I would like a list of
> differentially expressed genes which is averaged over replicates,

I assume that these are within-array replicates.

> I assume the best thing to do is normalise my data, and then average over
> replicates in the MAList object, and then pass the averaged data to
> lmFit() etc?

Yes, you could do that. It does raise subtle issues though concerning how the variance of the averages depends on the number of replicates. You might like to compute weights based on the number of replicates for each probe and pass that to lmFit also.

Gordon

> Is that right?
>
> Cheers
> Mick
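A minimal sketch of the averaging-plus-weights idea in this reply; it is not code from the thread. It assumes a plain matrix of normalised log-ratios with no missing values, a made-up vector of gene IDs (`gene_id`), and a one-column intercept design. All variable names and the toy data are hypothetical. Because the variance of a mean of n spots is proportional to 1/n, each averaged row is given a weight equal to its replicate count. The limma calls run only if limma is installed.

```r
# Toy data (made up for illustration): 20 genes spotted 1, 2 or 3 times, 4 arrays
set.seed(1)
n_arrays <- 4
reps_per_gene <- rep(c(1, 2, 3, 2), times = 5)
gene_id <- rep(sprintf("g%02d", 1:20), times = reps_per_gene)
M <- matrix(rnorm(length(gene_id) * n_arrays), ncol = n_arrays)

# Average within-array replicates: sum the rows of each gene, divide by its count
S <- rowsum(M, group = gene_id)                    # one row per gene
n_reps <- as.vector(table(gene_id)[rownames(S)])   # counts matched by gene name
M_avg <- S / n_reps                                # row i is divided by n_reps[i]

# Weight each averaged value by the number of spots behind it
W <- matrix(n_reps, nrow = nrow(M_avg), ncol = ncol(M_avg),
            dimnames = dimnames(M_avg))

# Optional: pass the averages and weights to limma (assumed design: intercept only)
if (requireNamespace("limma", quietly = TRUE)) {
  design <- matrix(1, nrow = n_arrays, ncol = 1)
  fit <- limma::lmFit(M_avg, design, weights = W)
  fit <- limma::eBayes(fit)
  print(limma::topTable(fit, coef = 1, number = 5))
}
```

Current limma releases also provide avereps(), which performs the averaging step for irregularly replicated probes; the replicate counts for the weights would still need to be computed separately, as above.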
@michael-watson-iah-c-378
Thanks Gordon

Actually when I did this, I got some odd results.

If I ran lmFit(), eBayes() and topTable() on my data set on a per-spot basis, I found ~800 SPOTS with a p-value <= 0.05. Now most of my genes are replicated in duplicate on the arrays (within-array replicates) and when I averaged over those replicates, and used that data to feed into lmFit(), eBayes() and topTable(), I got ~1100 GENES with a p-value <= 0.05.

Does this suggest that after averaging over replicate spots, the measurements for my genes are more tightly distributed than the individual spots were?

Cheers
Mick

-----Original Message-----
From: Gordon K Smyth [mailto:smyth@wehi.EDU.AU]
Sent: 01 September 2004 23:12
To: michael watson (IAH-C)
Cc: bioconductor@stat.math.ethz.ch
Subject: Re: [BioC] Unequally spaced replicates in limma
@gordon-smyth
WEHI, Melbourne, Australia
At 07:23 PM 2/09/2004, michael watson (IAH-C) wrote:
> Thanks Gordon
>
> Actually when I did this, I got some odd results.

The results look to me as you would hope for and expect.

> If I ran lmFit(), eBayes() and topTable() on my data set on a per-spot
> basis, I found ~800 SPOTS with a p-value <= 0.05. Now most of my genes
> are replicated in duplicate on the arrays (within-array replicates) and
> when I averaged over those replicates, and used that data to feed into
> lmFit(), eBayes() and topTable() I got ~1100 GENES with a p-value
> <= 0.05.
>
> Does this suggest that after averaging over replicate spots, the
> measurements for my genes are more tightly distributed than the
> individual spots were..?

1. You've reduced the number of genes by half, hence you do only half the adjustment for multiple testing, hence you end up with lower p-values.

2. You'd certainly hope that averages are more tightly distributed than the individual spots, that's why averaging is a good thing.

If your genes are virtually all in duplicate, and the others have an even number of reps, you could sort your MA object by gene ID and then use duplicateCorrelation() with ndups=2 and spacing=1.

Gordon

> Cheers
> Mick
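A minimal sketch of the sort-then-duplicateCorrelation() suggestion; it is not code from the thread. It assumes every gene is spotted exactly twice, uses a plain matrix of simulated log-ratios rather than a real MAList, and uses a one-column intercept design. The toy data and all variable names are hypothetical. Sorting by gene ID puts the two spots of each gene on adjacent rows, which is the layout that ndups=2 and spacing=1 describe. The limma calls run only if limma and statmod (which duplicateCorrelation() needs) are installed.

```r
# Toy data (made up): 30 genes, each spotted twice at scattered positions, 4 arrays
set.seed(42)
n_genes <- 30
n_arrays <- 4
gene_names <- sprintf("g%02d", 1:n_genes)
gene_id <- sample(rep(gene_names, each = 2))       # duplicates in random order

# Duplicate spots share a gene-level signal plus independent spot noise
gene_signal <- matrix(rnorm(n_genes * n_arrays), n_genes, n_arrays,
                      dimnames = list(gene_names, NULL))
noise <- matrix(rnorm(2 * n_genes * n_arrays, sd = 0.5), ncol = n_arrays)
M <- unname(gene_signal[gene_id, ]) + noise

# Sort by gene ID so the two spots of each gene sit on consecutive rows
o <- order(gene_id)
M_sorted <- M[o, ]
ids_sorted <- gene_id[o]

# Check the layout: rows 1-2 are one gene, rows 3-4 the next, and so on
stopifnot(all(ids_sorted[c(TRUE, FALSE)] == ids_sorted[c(FALSE, TRUE)]))

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  design <- matrix(1, nrow = n_arrays, ncol = 1)   # assumed design: intercept only
  corfit <- limma::duplicateCorrelation(M_sorted, design, ndups = 2, spacing = 1)
  fit <- limma::lmFit(M_sorted, design, ndups = 2, spacing = 1,
                      correlation = corfit$consensus.correlation)
  fit <- limma::eBayes(fit)
  print(corfit$consensus.correlation)
}
```

With an MAList, the same row ordering would also need to be applied to the gene annotation so that IDs stay aligned with the data.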
@elizabeth-brooke-powell-838
Hi Gordon,

Is the solution of sorting the table available in LimmaGUI? Should I re-sort the input files so that the replicates are taken into account using ndups=2 and spacing=1? What happens to the replicates if you have no spot weighting, are they just averaged?

Thank you for your help,
Liz

------------------------------
Date: Thu, 02 Sep 2004 19:44:34 +1000
From: Gordon Smyth <smyth@wehi.edu.au>
Subject: RE: [BioC] Unequally spaced replicates in limma
To: "michael watson (IAH-C)" <michael.watson@bbsrc.ac.uk>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <6.0.1.1.1.20040902193610.02984088@imaphost.wehi.edu.au>