averaging replicates within arrays


alex lam RI ▴ 310

@alex-lam-ri-1491

Last seen 11.5 years ago

Dear Colleagues,

Hi, I am a first-year PhD student who recently started on a project involving microarray data analysis at the Roslin Institute in Scotland. I have managed to follow the limma vignette in loading the data and have performed the default normalization within arrays. On each array, probes for the same gene have been placed in more than one spot. What I would like to do is group spots by gene name in MA$genes and calculate the average log-ratio as the expression level (better still, ignoring the spots with zero weight).

I guess I could dump the data and process it in Perl, but I would like to know how to do this a bit more elegantly in R. Your help is greatly appreciated.

Many thanks,
Alex

Microarray Normalization limma PROcess • 1.8k views

ADD COMMENT • link updated 20.3 years ago by Naomi Altman ★ 6.0k • written 20.3 years ago by alex lam RI ▴ 310


Adaikalavan Ramasamy ★ 1.8k

@adaikalavan-ramasamy-675

Last seen 11.5 years ago

As Dr. Roger Gray (of Heriot-Watt) is fond of saying, "If you put one foot in a bucket of boiling hot water and another foot in a bucket of ice, then you should be comfortable on average."

But if you still want to do this in R, then tapply() might help.

Regards,
Adai

On Tue, 2005-11-08 at 22:49 +0000, alex lam (RI) wrote:
> What I would like is to do is to group spots by gene names in MA$genes and calculate the average logratio as the expression level (better still, ignore the spots with zero weight).
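The tapply() suggestion could be sketched roughly as follows. This only illustrates the mechanics of the averaging and uses base R alone. The objects `genes`, `M` and `w` are made-up stand-ins for the gene names, log-ratios and spot weights of an MAList, and the values are invented for illustration.

```r
# Toy stand-ins for gene names, log-ratios and spot weights (all values are made up)
genes <- rep(c("g1", "g2", "g3"), each = 3)
M <- cbind(array1 = c(1.0, 1.2, 5.0, -0.5, -0.7, -0.6, 0.1, 0.3, 0.2),
           array2 = c(0.8, 1.0, 0.9, -0.4, 9.0, -0.6, 0.0, 0.2, 0.1))
w <- matrix(1, nrow = nrow(M), ncol = ncol(M))
w[3, 1] <- 0   # flag one bad spot on array1
w[5, 2] <- 0   # flag one bad spot on array2

# Mask zero-weight spots so that they drop out of the average
M_masked <- M
M_masked[w == 0] <- NA

# Within each array (column), average the remaining log-ratios per gene
avg <- apply(M_masked, 2, function(col) tapply(col, genes, mean, na.rm = TRUE))
print(avg)
```

The result has one row per gene and one column per array. A gene whose spots all have zero weight on an array would come out as NaN, which may or may not be what is wanted.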

ADD COMMENT • link 20.3 years ago Adaikalavan Ramasamy ★ 1.8k


Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

I am facing a similar problem, and here is what I plan. (I am putting out this suggestion for general discussion.)

Step 1: Use duplicateCorrelation in limma. The genes with 6 spots will be treated as 2 groups of 3. We need equal groups for the eBayes step.

Step 2: Adjust the eBayes estimate of variance for 6 spots instead of 3. Compute all the contrasts using all 6 spots, and write a bit of code to redo the tests on the 6-spot data.

I think this is better than using the extra spots to check the consistency of results, as has been suggested previously on this list. We have more spots for some genes because, given the space on the array, we did extra duplication of genes that were of primary interest. We ought to use the increased power this gives us.

--Naomi

p.s. In our case, we have either 1 or 2 spots per gene. In designing an array, I probably would use 4 spots for every gene rather than 3 for some and 6 for others.

At 11:03 AM 11/9/2005, alex lam (RI) wrote:
> The probes are identical and every gene is replicated but not in the same number. Some are replicated 3 times and some 6 times. Is that going to be a problem?

Naomi S. Altman
Associate Professor, Dept. of Statistics
Penn State University
University Park, PA 16802-2111
814-865-3791 (voice)
814-863-7114 (fax)
814-865-1348 (Statistics)
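For readers unfamiliar with Step 1, here is a minimal sketch of how duplicateCorrelation is typically combined with lmFit and eBayes when every gene has the same number of adjacent replicate spots. It assumes the limma and statmod packages are installed, uses simulated log-ratios rather than real data, and takes 3 spots per gene with spacing 1 purely for illustration. It does not implement the Step 2 variance adjustment, and splitting a 6-spot gene into two blocks of three rows is left to the reader.

```r
library(limma)

set.seed(1)
ngenes  <- 100   # number of genes (illustrative)
ndups   <- 3     # replicate spots per gene, stored in adjacent rows
narrays <- 6     # number of two-colour arrays

# Simulated log-ratios: a gene-by-array effect shared by the replicate spots
# (this induces the within-array duplicate correlation) plus spot-level noise
shared <- matrix(rnorm(ngenes * narrays, sd = 0.4), ngenes, narrays)
shared <- shared[rep(seq_len(ngenes), each = ndups), ]
M <- shared + matrix(rnorm(ngenes * ndups * narrays, sd = 0.3), ncol = narrays)

# Simplest design: estimate the average log-ratio for each gene
design <- matrix(1, nrow = narrays, ncol = 1)

# Estimate the consensus correlation between replicate spots
corfit <- duplicateCorrelation(M, design, ndups = ndups, spacing = 1)

# Fit one model per gene using all replicate spots, then moderate the variances
fit <- lmFit(M, design, ndups = ndups, spacing = 1,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
print(corfit$consensus.correlation)
```

The fitted object has one row per gene rather than one per spot, so the replicate spots are combined inside the model instead of being averaged beforehand.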

ADD COMMENT • link 20.3 years ago Naomi Altman ★ 6.0k


On Thu, November 10, 2005 4:33 am, Naomi Altman wrote:
> Step 1: Use duplicateCorrelation in limma. The genes with 6 spots will be treated as 2 groups of 3. We need equal groups for the eBayes step.
>
> Step 2: Adjust the eBayes estimate of variance for 6 spots instead of 3. Compute all the contrasts using all 6 spots, and write a bit of code to redo the tests on the 6 spots data.

That seems a well-motivated approach to me.

Cheers
Gordon

ADD REPLY • link 20.3 years ago Gordon Smyth 53k


alex lam RI ▴ 310

@alex-lam-ri-1491

Last seen 11.5 years ago

Dear Gordon,

Thanks for your reply. The probes are identical and every gene is replicated, but not the same number of times: some are replicated 3 times and some 6 times. Is that going to be a problem?

I should rephrase the comment on the spots with zero weights. I knew that they were ignored in normalization. Are they also ignored in other limma methods? I had to explicitly exclude them in boxplot, but I guess boxplot is just a generic method.

Regards,
Alex

------------------------------------
Alex Lam
PhD student
Department of Genetics and Genomics
Roslin Institute (Edinburgh)
Roslin
Midlothian EH25 9PS

Phone +44 131 5274471
Web http://www.roslin.ac.uk

-----Original Message-----
From: Gordon Smyth [mailto:smyth@wehi.edu.au]
Sent: 08 November 2005 23:17
To: alex lam (RI)
Cc: BioC Mailing List
Subject: [BioC] averaging replicates within arrays

Dear Alex,

Do you have the same number of replicate spots for every gene of interest, and are the replicate probes identical? If so, see the case study in the limma User's Guide on "Within array replicate spots".

If only some of your genes are replicated, or if the probes are not identical, I would strongly advise you not to attempt to pre-emptively average the spots. There is little to be gained and much to be lost.

I don't understand your comment about ignoring spots with zero weight. limma already does this.

Best wishes
Gordon

ADD COMMENT • link 20.3 years ago alex lam RI ▴ 310


On Thu, November 10, 2005 3:03 am, alex lam (RI) wrote:
> I should rephrase the comment on the spots with zero weights. I knew that they were ignored in normalization. Are they also ignored in other limma methods? I had to explicitly exclude them in boxplot, but I guess boxplot is just a generic method.

All differential expression methods in limma take weights into account. In particular, normalizeWithinArrays(), lmFit(), contrasts.fit() and plotMA() do. Other functions that take results from these functions inherit their treatment of weights. Functions like boxplot(), which are just R functions, can't be expected to handle weights.

When in doubt, the help pages will tell you. If a function has a 'weights' argument or similar, then it uses weights.

Best wishes
Gordon
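Since boxplot() knows nothing about spot weights, one way to exclude zero-weight spots by hand is to set them to NA before plotting. The sketch below uses base R only. `M` and `w` are made-up stand-ins for the log-ratio and weight matrices of an MAList, and `plot = FALSE` is used here only so that the summary statistics can be inspected without opening a graphics device.

```r
# Toy log-ratios and spot weights for two arrays (all values are made up)
M <- cbind(array1 = c(0.1, -0.2, 4.0, 0.3),
           array2 = c(0.0, 0.2, -0.1, -5.0))
w <- cbind(array1 = c(1, 1, 0, 1),
           array2 = c(1, 1, 1, 0))

# boxplot() ignores NA values, so mask the zero-weight spots first
M_masked <- M
M_masked[w == 0] <- NA

# One box per array; drop plot = FALSE to draw the figure
bp <- boxplot(as.data.frame(M_masked), plot = FALSE)
print(bp$n)   # number of spots actually used for each array
```

The same masking step works for other base R summaries, such as summary() or apply() with na.rm = TRUE, that have no 'weights' argument of their own.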

ADD REPLY • link 20.3 years ago Gordon Smyth 53k
