Advice on experimental setup


David Westergaard ▴ 280

@david-westergaard-5119

Last seen 9.6 years ago

Hello,

I am assisting in the setup of an experiment, in which 3 groups, each consisting of 8 subjects, will be fed 3 diets:

Group 1 - Diet A
Group 2 - Diet B
Group 3 - Diet C

We plan on using limma to identify the differentially expressed genes. Reading the limma users guide, a factorial design matrix seems to be appropriate. I am, however, wondering if we, by using this setup, can elucidate the differentially expressed genes for each diet, and not just the ones between groups, e.g. when comparing Group 1 - Group 2.

What is your advice on this?

Thanks in advance!

Best regards,
David Westergaard

limma limma • 1.3k views

ADD COMMENT • link updated 11.6 years ago by Alex Gutteridge ▴ 650 • written 11.6 years ago by David Westergaard ▴ 280


Sean Davis 21k

@sean-davis-490

Last seen 3 months ago

United States

On Wed, Sep 5, 2012 at 4:55 AM, David Westergaard <david@harsk.dk> wrote:

> Hello,
>
> I am assisting in the setup of an experiment, in which 3 groups, each
> consisting of 8 subjects, will be fed 3 diets:
> Group 1 - Diet A
> Group 2 - Diet B
> Group 3 - Diet C
>
> We plan on using limma to identify the differentially expressed genes.
> Reading the limma users guide, a factorial design matrix seems to be
> appropriate. I am, however, wondering if we, by using this setup, can
> elucidate the differentially expressed genes for each diet, and not
> just the ones between groups, e.g. when comparing Group 1 - Group 2.
>
> What is your advice on this?

Hi, David.

Are groups 1, 2, and 3 different, or do they differ only in the diet being fed?

Sean

ADD COMMENT • link 11.6 years ago Sean Davis 21k


Alex Gutteridge ▴ 650

@alex-gutteridge-2935

Last seen 9.6 years ago

United States

On 05.09.2012 09:55, David Westergaard wrote:

> Hello,
>
> I am assisting in the setup of an experiment, in which 3 groups, each
> consisting of 8 subjects, will be fed 3 diets:
> Group 1 - Diet A
> Group 2 - Diet B
> Group 3 - Diet C
>
> We plan on using limma to identify the differentially expressed genes.
> Reading the limma users guide, a factorial design matrix seems to be
> appropriate. I am, however, wondering if we, by using this setup, can
> elucidate the differentially expressed genes for each diet, and not
> just the ones between groups, e.g. when comparing Group 1 - Group 2.

From your reply to Sean it's not clear what you mean by this last sentence. What are the 'differentially expressed genes for each diet'? Any differential expression analysis must compare groups of samples by definition, no?

You could compare, say, Diet A with the average of Diet B and Diet C (or even the average of all three). Is that what you mean? Whether that makes any sense depends on your experimental design. Most obviously, is one of the three diets a 'control' diet? If not, then would it be appropriate to consider an average of the three diets a kind of meta-control (probably not a word, but hopefully you know what I mean!)?

--
Alex Gutteridge
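The two comparisons mentioned in this reply can be written as contrast weights over the three diet group means. The following is a minimal Python sketch of that arithmetic, not limma code; the function name `contrast_value` and the group means are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical mean log2 expression of one gene in each diet group
# (illustrative numbers, not data from this thread).
group_means = {"A": Fraction(9), "B": Fraction(7), "C": Fraction(8)}

def contrast_value(weights, means):
    """Return the weighted sum of group means defined by a contrast."""
    return sum(weights[g] * means[g] for g in means)

# Diet A versus the average of Diet B and Diet C.
a_vs_mean_bc = {"A": Fraction(1), "B": Fraction(-1, 2), "C": Fraction(-1, 2)}

# Diet A versus the average of all three diets (the 'meta-control'):
# A - (A + B + C)/3 = (2/3)A - (1/3)B - (1/3)C
a_vs_mean_all = {"A": Fraction(2, 3), "B": Fraction(-1, 3), "C": Fraction(-1, 3)}

d_bc = contrast_value(a_vs_mean_bc, group_means)
d_all = contrast_value(a_vs_mean_all, group_means)
print(d_bc, d_all)
```

The second set of weights is exactly 2/3 of the first, so in a linear model both contrasts should give the same test statistic for a gene and differ only in the scale of the estimated effect. Which scale is easier to interpret is a judgment call.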

ADD COMMENT • link 11.6 years ago Alex Gutteridge ▴ 650


Hi Alex,

There is no control group as such. One of the diets is somewhat of a control group, but not quite, because it is still a diet that has some 'special' properties. I am used to working with experiments that have at least one control group, so this setup is a bit out of my domain, which is the reason I'm asking this list for advice.

I guess what I meant by 'differentially expressed genes for each diet' was a list of genes that can be attributed to this exact diet. Now that I think about it, it may be more appropriate to collect mRNA at the start, middle and end of the experiment, and measure the change in each group, instead of comparing the groups directly. The experiment is set to run for 4 months. I have not previously dealt with experiments that have run for so long.

Would the data collected be suited for microarray analysis? And if so, when should the microarray analysis be performed? When each sample is collected, or all together at the end?

Best,
David

ADD REPLY • link 11.6 years ago David Westergaard ▴ 280


On 05.09.2012 13:53, David Westergaard wrote:

> I guess what I meant by 'differentially expressed genes for each
> diet' was a list of genes that can be attributed to this exact diet.
> Now that I think about it, it may be more appropriate to collect mRNA
> at the start, middle and end of the experiment, and measure the change
> in each group, instead of comparing the groups directly. The experiment
> is set to run for 4 months. I have not previously dealt with
> experiments that have run for so long.

Collecting a baseline measure sounds sensible. If these are human subjects you should expect a lot of variation (more than in an inbred animal model); the baseline measure can help correct for that.

Your question is still quite hard though. It's often useful for me to think through some scenarios for patterns of expression that might appear and plot them out before deciding which ones will be interesting and then how to design the experiment to find them. E.g.: say Gene X goes up two-fold after 4 months on Diet A and eight-fold after 4 months on Diet B. Do you consider that a Diet B 'specific' gene or not? It goes up in both A and B, but much more in B, so either interpretation is possible. If you do consider that gene Diet B specific, then you could do a contrast like

(DietBEnd - DietBStart) - (DietAEnd - DietAStart)

which shows you genes where the effect was greater in Diet B than in Diet A, without excluding genes that still showed an effect in A.

In experiments like these I am always quite wary of the temptation to get differentially expressed gene sets and then do set subtraction, i.e. Diet A 'specific' genes = Diet A DE genes - Diet B DE genes - Diet C DE genes. I always find that approach is very sensitive to the cutoff used to define DE, but it can be easier to interpret, I suppose.

Again, if there is really no control diet then creating a mean 'meta-diet' might simplify the analysis (at the cost of the interpretation being more abstract). So something like:

(DietAEnd - DietAStart) - ((DietAEnd - DietAStart)+(DietBEnd - DietBStart)+(DietCEnd - DietCStart))/3

> Would the data collected be suited for microarray analysis? And if so,
> when should the microarray analysis be performed? When each sample is
> collected, or all together at the end?

I would go for all together at the end. RNA is very prone to degradation though, so you need to take all necessary steps to preserve the samples (remove RNases and get to -80C) as soon as possible after collection.

--
Alex Gutteridge
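As an illustration of the two contrasts described in this reply, the sketch below expands them into weights over the six condition means and evaluates them for the Gene X scenario. It is plain Python rather than limma code. The helper names (`change`, `combine`, `evaluate`) and the log2 values are assumptions made for illustration, and Diet C's change is arbitrarily set to zero.

```python
from fractions import Fraction

CONDITIONS = ["DietAStart", "DietAEnd", "DietBStart", "DietBEnd",
              "DietCStart", "DietCEnd"]

def change(diet):
    """Weights for (Diet<diet>End - Diet<diet>Start)."""
    w = {c: Fraction(0) for c in CONDITIONS}
    w[f"Diet{diet}End"] = Fraction(1)
    w[f"Diet{diet}Start"] = Fraction(-1)
    return w

def combine(terms):
    """Linear combination of contrasts; terms is a list of (coefficient, weights)."""
    return {c: sum(Fraction(k) * w[c] for k, w in terms) for c in CONDITIONS}

def evaluate(weights, means):
    """Value of a contrast for one gene, given its mean in each condition."""
    return sum(weights[c] * means[c] for c in CONDITIONS)

# (DietBEnd - DietBStart) - (DietAEnd - DietAStart)
b_vs_a = combine([(1, change("B")), (-1, change("A"))])

# (DietAEnd - DietAStart) minus the mean change across all three diets
third = Fraction(1, 3)
a_vs_meta = combine([(1, change("A")),
                     (-third, change("A")),
                     (-third, change("B")),
                     (-third, change("C"))])

# Hypothetical log2 means for Gene X: two-fold up on A (+1), eight-fold up
# on B (+3), and no change on C (an assumption made only for this example).
gene_x = {"DietAStart": 5, "DietAEnd": 6, "DietBStart": 5, "DietBEnd": 8,
          "DietCStart": 5, "DietCEnd": 5}

print(evaluate(b_vs_a, gene_x))     # difference between the B and A changes
print(evaluate(a_vs_meta, gene_x))  # A's change relative to the mean change
```

With these made-up values the first contrast is 2 on the log2 scale (a four-fold larger increase on Diet B than on Diet A) and the second is -1/3, meaning Diet A's change is smaller than the average change across the three diets. Neither number says anything about significance; that still depends on the variation between subjects.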

ADD REPLY • link 11.6 years ago Alex Gutteridge ▴ 650


Hi Alex,

That was a very informative reply on a difficult question; I thank you very much for your advice. I might revive this mail at some point, when we get closer to the actual statistical analysis.

Best,
David

ADD REPLY • link 11.6 years ago David Westergaard ▴ 280
