Question

Data analysis

Jason Skelton

Hi,

I have a question about data analysis after normalisation. I have normalised in limma and applied the generalized least squares linear models to my data with very nice results!

My experiment is a three-sample experiment (three different treatments) compared to a common reference. Each of the three samples has six slides, 3 in one dye orientation and 3 dye-swapped, giving 18 slides in total. I have approximately 3500 genes in duplicate on my array at present.

Currently I have normalised all three sets of data separately but would like to be able to compare the three data sets. I was thinking of using the mva functions like dist/hclust etc.

My questions:

Is this the best way of comparing this data, or are there other/better methods that anyone has had experience with? For example, something similar to the two-sample experiment example in the limma user guide, where results from the linear model and ebayes are displayed with a heatmap? (Sorry, I'm presuming that this is the kind of thing I should be doing?)

Also, I'm presuming the data I want to use for these methods are the normalised $M values? Or do I want to use the results from gls.series/lm.series and ebayes for a 3-sample comparison?

Please could someone give me an example of the best method they recommend, with some commands that I could try using...

Thanks very much to anyone who can help

Jason

--
Jason Skelton
Pathogen Microarrays
Wellcome Trust Sanger Institute
Hinxton
Cambridge
CB10 1SA
Tel +44(0)1223 834244 Ext 7123
Fax +44(0)1223 494919

score 0 · Answer 1 · 2003-10-14

Gordon Smyth

WEHI, Melbourne, Australia

At 02:20 AM 14/10/2003, Jason Skelton wrote:
>I have a question about data analysis after normalisation
>I have normalised in limma and applied the generalized least squares
>linear models to my data with very nice results!
>
>My experiment is a Three sample experiment
>(Three different treatments) compared to a common reference.
>The Three samples have six slides per experiment, 3 in one dye orientation
>and 3 dye swapped to give 18 slides in total.
>I have approx 3500 genes in duplicate on my array at present.
>
>Currently I have normalised all three sets of data separately but would
>like to be able to compare the three data sets.
>I was thinking of using the mva functions like dist/hclust etc.
>
>My questions:
>Is this the best way of comparing this data or are there other/better
>methods that could be used that anyone has had experience with.

I would use the limma commands lmFit (or lm.series or gls.series) followed by makeContrasts, eBayes and classifyTests. See the earlier posts:

https://stat.ethz.ch/pipermail/bioconductor/2003-September/002406.html
https://stat.ethz.ch/pipermail/bioconductor/2003-September/002405.html

This would allow you to answer the following questions:

1. Which genes are differentially expressed between each pair of treatments?
2. Which genes are differentially expressed between each treatment and the reference?
3. Which genes show _any_ differences between the treatments?

Your problem doesn't sound like a cluster analysis problem to me.

Cheers
Gordon

>e.g. similar to the two-sample experiment example in limma user guide
>where results from the linear model and ebayes are displayed with a heatmap ?
>(sorry I'm presuming that this is the kind of thing I should be doing ?)
>
>Also I'm presuming the data I want to use for these methods are the
>normalised $M values ? OR do I want to use the results from
>gls.series/lm.series and ebayes for a 3 sample comparison ?
>
>please could someone give me an example of the best method they recommend
>with some commands that I could try using...
>
>Thanks very much to anyone who can help
>
>Jason
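
For readers who want commands to try, the workflow named in the reply above (lmFit, makeContrasts, eBayes, then a classification step) might look roughly like the sketch below. It is an illustration only, not the code from the linked posts. The data are simulated, the treatment labels A, B and C and the object names are hypothetical, duplicate spots are ignored (one row per gene is assumed), and decideTests from current limma releases stands in for the classifyTests step mentioned in the reply.

```r
library(limma)

set.seed(1)
n_genes <- 200  # small simulated example; the real arrays have about 3500 genes

# Hypothetical layout: treatments A, B, C, each hybridised against the common
# reference on 6 slides (3 with the treatment in Cy5, 3 dye-swapped).
treatment <- factor(rep(c("A", "B", "C"), each = 6))
dye <- rep(c(1, 1, 1, -1, -1, -1), times = 3)  # -1 marks a dye-swap slide

# Design matrix: each coefficient is the log-ratio of a treatment vs reference.
# Multiplying by dye flips the sign of the rows for dye-swapped slides.
design <- model.matrix(~ 0 + treatment) * dye
colnames(design) <- levels(treatment)

# Simulated normalised M-values: the first 10 genes are up 4-fold in A only.
effects <- matrix(0, n_genes, 3, dimnames = list(NULL, colnames(design)))
effects[1:10, "A"] <- 2
M <- effects %*% t(design) + matrix(rnorm(n_genes * 18, sd = 0.3), n_genes, 18)

# Question 2: each treatment vs the reference.
fit <- lmFit(M, design)
fit_ref <- eBayes(fit)

# Question 1: each pair of treatments.
cont <- makeContrasts(AvsB = A - B, AvsC = A - C, BvsC = B - C, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
results <- decideTests(fit2)  # -1, 0 or 1 per gene and contrast
print(summary(results))

# Question 3: genes showing any difference between treatments
# (with several contrasts and no coef given, topTable ranks by F-statistic).
print(topTable(fit2, number = 10))
```

With real data, M would be the normalised M-values with all 18 slides in one matrix, rather than three separately analysed sets, so that the contrasts between treatments can be estimated from a single fit.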

Gordon Smyth wrote:
>
> I would use the limma commands lmFit (or lm.series or gls.series)
> followed by makeContrasts, eBayes and classifyTests. See the earlier
> posts:
>

Thanks for this information Gordon, I'll try this and see what results I get...

On a different note:

The arrays I have tested LIMMA on have 2 duplicates of each gene, spaced evenly throughout the array, and so I have had no problems running your functions.

Someone else at the Sanger Institute would like to be able to use LIMMA, but the number of duplicates for each gene differs on their array. For example, for some genes there are two copies and for others there would be four copies or more, which in turn obviously affects the spacing between replicates. I'm not sure why they would want differing numbers of copies of genes, but they would like to be able to estimate the correlation between these duplicates anyway, and obviously see the results as one data point per merged gene.

I've tried to think of how this can be done, but it seems overly complex and I'm not sure if it is at all possible in R or limma.

I'm guessing there is no way of carrying out the correlation, series model fits etc. based simply on the "Name" specified in the GAL files? Or of somehow specifying the duplicate number for each gene separately and merging this information for use as a parameter?

I'm doubting very much that this can be done at all, but it's worth asking ;-)

thanks

Jason

At 11:53 PM 16/10/2003, Jason Skelton wrote:
>Gordon Smyth wrote:
>>
>>I would use the limma commands lmFit (or lm.series or gls.series)
>>followed by makeContrasts, eBayes and classifyTests. See the earlier posts:
>Thanks for this information Gordon I'll try this and see what results I
>get.........
>
>On a different note
>The arrays I have tested LIMMA on have 2 duplicates and are spaced evenly
>throughout the array and so have no problems running your functions.
>
>Someone else at the Sanger Institute would like to be able to use LIMMA but
>the number of duplicates for each gene differs on their array e.g for some
>genes there are two copies and for others there would be four copies or
>more which in turn obviously affects spacing etc between replicates.
>I'm not sure why they would want differing numbers of copies of genes but
>they would like to be able to estimate the correlation between these genes
>anyway and obviously see the results as one data point per merged gene.

I haven't implemented this in limma because it seems to me that it might invalidate the assumptions behind the duplicate correlation approach. See the earlier post:

https://stat.ethz.ch/pipermail/bioconductor/2003-August/002224.html

>I've tried to think of how this can be done but it seems overly complex
>and I'm not sure if it is at all possible in R or Limma.
>
>I'm guessing there is no way of carrying out the correlation, series model
>fits etc based simply on the "Name" specified in the GAL files ?

No.

Cheers
Gordon

>or somehow specifying the duplicate number for each gene separately
>and somehow merging this information for use as a parameter ?
>
>I'm doubting very much that this can be done at all but it's worth asking ;-)
>
>thanks
>
>Jason
