averaging replicates within arrays
alex lam RI ▴ 310
@alex-lam-ri-1491
Last seen 7.7 years ago
Dear Colleagues,

Hi, I am a first-year PhD student who recently started a project involving microarray data analysis at the Roslin Institute in Scotland. I have managed to follow the limma vignette in loading the data and performing the default normalization within arrays. On each array, probes for the same gene have been placed in more than one spot. What I would like to do is group spots by gene name in MA$genes and calculate the average log-ratio as the expression level (better still, ignoring the spots with zero weight).

I guess I can dump the data and process it in Perl, but I would like to know how to do this a bit more elegantly in R. Your help is greatly appreciated.

Many thanks,
Alex

@adaikalavan-ramasamy-675
Last seen 7.7 years ago

As Dr. Roger Gray (of Heriot-Watt) is fond of saying, "If you put one foot in a bucket of boiling hot water and another foot in a bucket of ice, then you should be comfortable on average".

But if you still want to do this in R, then tapply() might help.

Regards,
Adai

On Tue, 2005-11-08 at 22:49 +0000, alex lam (RI) wrote:
> What I would like to do is group spots by gene name in MA$genes and calculate the average log-ratio as the expression level (better still, ignoring the spots with zero weight).
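For readers who still want per-gene averages, the tapply() suggestion might look roughly like the sketch below in base R. The small list standing in for an MAList, its values, and the `Name` column are invented for illustration; the assumption is that `MA$M` holds the log-ratios, `MA$weights` the spot weights, and `MA$genes$Name` the gene name of each spot. Gordon's caution in this thread about pre-emptive averaging still applies.

```r
# Toy stand-in for an MAList (all values and names are made up):
# M = log-ratios (spots x arrays), weights = spot weights,
# genes$Name = gene name for each spot (column name is an assumption).
MA <- list(
  M = matrix(c(1.0, 1.2, 0.8, 2.0, 2.4, 9.9,
               0.5, 0.7, 0.6, 1.0, 1.4, 1.2),
             nrow = 6, dimnames = list(NULL, c("array1", "array2"))),
  weights = matrix(c(1, 1, 1, 1, 1, 0,
                     1, 1, 1, 1, 1, 1), nrow = 6),
  genes = data.frame(Name = c("geneA", "geneA", "geneA",
                              "geneB", "geneB", "geneB"))
)

# Mask zero-weight spots so they drop out of the averages.
M <- MA$M
M[MA$weights == 0] <- NA

# Average the log-ratios per gene, one array (column) at a time.
avgM <- apply(M, 2, function(m) tapply(m, MA$genes$Name, mean, na.rm = TRUE))
avgM
```

The result has one row per gene and one column per array. Here the flagged spot (9.9 on array1) is excluded from the geneB average.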
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 13 months ago
United States
I am facing a similar problem, and here is what I plan. (I am putting out this suggestion for general discussion.)

Step 1: Use duplicateCorrelation in limma. The genes with 6 spots will be treated as 2 groups of 3. We need equal groups for the eBayes step.

Step 2: Adjust the eBayes estimate of variance for 6 spots instead of 3. Compute all the contrasts using all 6 spots, and write a bit of code to redo the tests on the 6-spot data.

I think this is better than using the extra spots to check the consistency of results, as has been suggested previously on this list. We have more spots for some genes because, given the space on the array, we did extra duplication of genes that were of primary interest. We ought to use the increased power this gives us.

--Naomi

p.s. In our case, we have either 1 or 2 spots per gene. In designing an array, I probably would use 4 spots for every gene rather than 3 for some and 6 for others.

At 11:03 AM 11/9/2005, alex lam (RI) wrote:
>The probes are identical and every gene is replicated, but not the same number of times. Some are replicated 3 times and some 6 times. Is that going to be a problem?

Naomi S. Altman
Associate Professor
Dept. of Statistics
Penn State University
University Park, PA 16802-2111

On Thu, November 10, 2005 4:33 am, Naomi Altman wrote:
> I am facing a similar problem, and here is what I plan. (I am putting out this suggestion for general discussion.)

That seems a well motivated approach to me.

Cheers
Gordon

alex lam RI ▴ 310
@alex-lam-ri-1491
Last seen 7.7 years ago

Dear Gordon,

Thanks for your reply. The probes are identical and every gene is replicated, but not the same number of times. Some are replicated 3 times and some 6 times. Is that going to be a problem?

I should rephrase my comment on the spots with zero weights. I knew that they were ignored in normalization. Are they also ignored in other limma methods? I had to explicitly exclude them in boxplot, but I guess boxplot is just a generic method.

Regards,
Alex

------------------------------------
Alex Lam
PhD student
Department of Genetics and Genomics
Roslin Institute (Edinburgh)
Roslin
Midlothian EH25 9PS
Web http://www.roslin.ac.uk

-----Original Message-----
From: Gordon Smyth
Sent: 08 November 2005 23:17
To: alex lam (RI)
Cc: BioC Mailing List
Subject: [BioC] averaging replicates within arrays

Dear Alex,

Do you have the same number of replicate spots for every gene of interest, and are the replicate probes identical? If so, see the case study in the limma User's Guide on "Within array replicate spots".

If only some of your genes are replicated, or if the probes are not identical, I would strongly advise you not to attempt to pre-emptively average the spots. There is little to be gained and much to be lost.

I don't understand your comment about ignoring spots with zero weight. limma already does this.

Best wishes
Gordon
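For the equal-replication case that Gordon points to, a rough sketch of the duplicateCorrelation route is shown below. It is not taken from the User's Guide. The data are simulated, and the layout (3 replicate spots per gene in consecutive rows, 4 arrays, a single-column design) is an assumption made purely for illustration. It needs the limma package, and duplicateCorrelation also relies on the statmod package being installed.

```r
library(limma)  # Bioconductor package; duplicateCorrelation also needs statmod

set.seed(1)
n_genes  <- 100  # hypothetical number of genes
ndups    <- 3    # replicate spots per gene, assumed to sit in consecutive rows
n_arrays <- 4    # hypothetical number of arrays

# Simulated log-ratios: a per-gene, per-array effect shared by the replicate
# spots, plus independent spot-level noise.
gene_effect <- matrix(rnorm(n_genes * n_arrays, sd = 0.5), n_genes, n_arrays)
M <- gene_effect[rep(seq_len(n_genes), each = ndups), ] +
  matrix(rnorm(n_genes * ndups * n_arrays, sd = 0.3),
         n_genes * ndups, n_arrays)

# One-column design: estimate the average log-ratio for each gene.
design <- matrix(1, n_arrays, 1, dimnames = list(NULL, "logratio"))

# Estimate the correlation between replicate spots of the same gene.
corfit <- duplicateCorrelation(M, design, ndups = ndups, spacing = 1)

# Fit one model per gene using all replicate spots, then moderate the variances.
fit <- lmFit(M, design, ndups = ndups, spacing = 1,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
topTable(fit, number = 5)
```

This layout assumes the same number of replicates for every gene at a regular spacing, which is why unequal replication (3 spots for some genes, 6 for others) needs the kind of workaround Naomi describes.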
On Thu, November 10, 2005 3:03 am, alex lam (RI) wrote:
> I should rephrase the comment on the spots with zero weights. I knew that they were ignored in normalization. Are they also ignored in other limma methods? I had to explicitly exclude them in boxplot, but I guess boxplot is just a generic method.

All differential expression methods in limma take weights into account. Basically, normalizeWithinArrays(), lmFit(), contrasts.fit() and plotMA() do. Other functions that take results from these functions inherit the treatment of weights. Functions like boxplot(), which are just ordinary R functions, can't be expected to handle weights.

When in doubt, the help pages will tell you. If a function has a 'weights' argument or similar, then it uses weights.

Best wishes
Gordon
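Since boxplot() does not know about weights, one way to exclude flagged spots is to mask them before plotting. The sketch below uses base R only, and the matrices `M` and `w` are invented for illustration. With a real MAList, the assumption is that `MA$M` and `MA$weights` would take their place.

```r
# Toy log-ratios and spot weights (spots x arrays); values are made up.
M <- matrix(c(0.1, -0.2,  5.0,  0.3,
              0.0,  0.4, -0.1, -6.0), nrow = 4,
            dimnames = list(NULL, c("array1", "array2")))
w <- matrix(c(1, 1, 0, 1,
              1, 1, 1, 0), nrow = 4)

# Set zero-weight spots to NA; boxplot() ignores missing values.
M_ok <- M
M_ok[w == 0] <- NA

# plot = FALSE returns the box statistics without drawing anything;
# drop that argument to get the actual plot.
stats <- boxplot(as.data.frame(M_ok), plot = FALSE)
stats$n  # number of spots actually used per array
```

In this toy example each array contributes 3 of its 4 spots, because one spot per array has zero weight.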
