limma help - choosing an approach

0

Entering edit mode

john seers IFR ▴ 810

@john-seers-ifr-1605

Last seen 11.4 years ago

An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060912/ 1644e8ef/attachment.pl

• 1.6k views

ADD COMMENT • link updated 19.4 years ago by Gordon Smyth 53k • written 19.4 years ago by john seers IFR ▴ 810

0

Entering edit mode

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

The only text I know that covers the two-factor factorial design with added control is Experimental Design by W. Federer. It is out of print, but should be available in a university library. In the limma context, I would use separate analysis of 2 channel data and use the "factor effects" model that considers each treatment combination to be a treatment. Limma in any case uses contrasts rather than computing factor effects, and you probably know which treatments you want to compare, so the additional controls should not create any problems. --Naomi At 04:57 AM 9/12/2006, john seers $IFR$ wrote: > >Hello > >I am trying to analyse some slides but I am a bit stumped. The arrays I >have been given vary with time and a concentration of a substance. i.e >two experimental dimensions (See targets data below). What is throwing >me is that the zero concentration is only in the control and I cannot >work out what model is suitable for this. Can anybody help with some >advice or tips on the approach to use? > >I have looked at the factorial design and the timecourse but neither >seem to offer a comparison/contrast against the control. Is there a way >to do this? Do I have to use the "Separate Analysis of Two Channel data" >in Chapter 9? > >Any advice gratefully appreciated. > >Regards > > >John Seers > > > > > My targets file looks like this: > > > > >SlideNumber FileName Cy3 Cy5 Time conc >598 598new.gpr f100t1 Control t1 c100 >599 599new.gpr f20t4 Control t4 c20 >600 600new.gpr f100t4 Control t1 c100 >617 617new.gpr f20t1 Control t1 c20 >621 621new.gpr f20t1 Control t1 c20 >637 637new.gpr f20t4 Control t4 c20 >638 638new.gpr f20t1 Control t1 c20 >639 639new.gpr f100t1 Control t1 c100 >748 748new.gpr f20t4 Control t4 c20 >751 751new.gpr f20t4 Control t4 c20 >833 833new.gpr f100t4 Control t4 c100 >835 835new.gpr f100t1 Control t1 c100 >836 836new.gpr f100t4 Control t4 c100 >957 957new.gpr f100t1 Control t1 c100 >958 958new.gpr f20t1 Control t1 c20 > > > > > > >--- > >John Seers >Institute of Food Research >Norwich Research Park >Colney >Norwich >NR4 7UA > > >tel +44 (0)1603 251497 >fax +44 (0)1603 507723 >e-mail john.seers at bbsrc.ac.uk >e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ ><http: www.ifr.ac.uk="" edisclaimer=""/> > >Web sites: > >www.ifr.ac.uk <http: www.ifr.ac.uk=""/> >www.foodandhealthnetwork.com <http: www.foodandhealthnetwork.com=""/> > > > [[alternative HTML version deleted]] > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Naomi S. Altman 814-865-3791 (voice) Associate Professor Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111

ADD COMMENT • link 19.4 years ago Naomi Altman ★ 6.0k

0

Entering edit mode

Thank you very much for the reply Naomi. So, am I right in thinking that this is not a usual approach? I have been told that this is a standard design and that Genespring et al handle it routinely. Everything I look at in limma is tantalisingly close to what I want but not quite. I am wondering if I am making it more difficult than it is? Can you tell me what you mean by the "factor effects" model? i.e. is it in the limma userguide? Regards John Seers --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail john.seers at bbsrc.ac.uk e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com -----Original Message----- From: Naomi Altman [mailto:naomi@stat.psu.edu] Sent: 12 September 2006 13:16 To: john seers (IFR); bioconductor at stat.math.ethz.ch Subject: Re: [BioC] limma help - choosing an approach The only text I know that covers the two-factor factorial design with added control is Experimental Design by W. Federer. It is out of print, but should be available in a university library. In the limma context, I would use separate analysis of 2 channel data and use the "factor effects" model that considers each treatment combination to be a treatment. Limma in any case uses contrasts rather than computing factor effects, and you probably know which treatments you want to compare, so the additional controls should not create any problems. --Naomi At 04:57 AM 9/12/2006, john seers $IFR$ wrote: > >Hello > >I am trying to analyse some slides but I am a bit stumped. The arrays I >have been given vary with time and a concentration of a substance. i.e >two experimental dimensions (See targets data below). What is throwing >me is that the zero concentration is only in the control and I cannot >work out what model is suitable for this. Can anybody help with some >advice or tips on the approach to use? > >I have looked at the factorial design and the timecourse but neither >seem to offer a comparison/contrast against the control. Is there a way >to do this? Do I have to use the "Separate Analysis of Two Channel data" >in Chapter 9? > >Any advice gratefully appreciated. > >Regards > > >John Seers > > > > > My targets file looks like this: > > > > >SlideNumber FileName Cy3 Cy5 Time conc >598 598new.gpr f100t1 Control t1 c100 >599 599new.gpr f20t4 Control t4 c20 >600 600new.gpr f100t4 Control t1 c100 >617 617new.gpr f20t1 Control t1 c20 >621 621new.gpr f20t1 Control t1 c20 >637 637new.gpr f20t4 Control t4 c20 >638 638new.gpr f20t1 Control t1 c20 >639 639new.gpr f100t1 Control t1 c100 >748 748new.gpr f20t4 Control t4 c20 >751 751new.gpr f20t4 Control t4 c20 >833 833new.gpr f100t4 Control t4 c100 >835 835new.gpr f100t1 Control t1 c100 >836 836new.gpr f100t4 Control t4 c100 >957 957new.gpr f100t1 Control t1 c100 >958 958new.gpr f20t1 Control t1 c20 > > > > > > >--- > >John Seers >Institute of Food Research >Norwich Research Park >Colney >Norwich >NR4 7UA > > >tel +44 (0)1603 251497 >fax +44 (0)1603 507723 >e-mail john.seers at bbsrc.ac.uk >e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ ><http: www.ifr.ac.uk="" edisclaimer=""/> > >Web sites: > >www.ifr.ac.uk <http: www.ifr.ac.uk=""/> >www.foodandhealthnetwork.com <http: www.foodandhealthnetwork.com=""/> > > > [[alternative HTML version deleted]] > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Naomi S. Altman 814-865-3791 (voice) Associate Professor Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111

ADD REPLY • link 19.4 years ago john seers IFR ▴ 810

0

Entering edit mode

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

Dear John, You have 2 factors - time and concentration. In a factorial design, you would have each concentration measured at each time. I am not quite sure I understand your design. It looks like the control might be concentration 0, and it is measured at each time. If so, you actually do have a factorial design, and the model statement for this design y~time + conc + time:conc can be used to create a design matrix for the single channel analysis (and you do not need the material from Federer). This is the factor effects model. You do need to include duplicatecorrelation to handle the 2 samples on the same array. I am not sure how Genespring would handle this, but I would be quite surprised if it is sufficiently flexible to handle your set-up absolutely correctly. --Naomi At 08:40 AM 9/12/2006, john seers $IFR$ wrote: >Thank you very much for the reply Naomi. > >So, am I right in thinking that this is not a usual approach? I have >been told that this is a standard design and that Genespring et al >handle it routinely. Everything I look at in limma is tantalisingly >close to what I want but not quite. I am wondering if I am making it >more difficult than it is? > >Can you tell me what you mean by the "factor effects" model? i.e. is it >in the limma userguide? > >Regards > > >John Seers > > > > >--- > >John Seers >Institute of Food Research >Norwich Research Park >Colney >Norwich >NR4 7UA > > >tel +44 (0)1603 251497 >fax +44 (0)1603 507723 >e-mail john.seers at bbsrc.ac.uk >e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ > >Web sites: > >www.ifr.ac.uk >www.foodandhealthnetwork.com > > >-----Original Message----- >From: Naomi Altman [mailto:naomi at stat.psu.edu] >Sent: 12 September 2006 13:16 >To: john seers (IFR); bioconductor at stat.math.ethz.ch >Subject: Re: [BioC] limma help - choosing an approach > > >The only text I know that covers the two-factor factorial design with >added control is Experimental Design by W. Federer. It is out of >print, but should be available in a university library. > >In the limma context, I would use separate analysis of 2 channel data >and use the "factor effects" model that considers each treatment >combination to be a treatment. Limma in any case uses contrasts >rather than computing factor effects, and you probably know which >treatments you want to compare, so the additional controls should not >create any problems. > >--Naomi > >At 04:57 AM 9/12/2006, john seers $IFR$ wrote: > > > >Hello > > > >I am trying to analyse some slides but I am a bit stumped. The arrays I > >have been given vary with time and a concentration of a substance. i.e > >two experimental dimensions (See targets data below). What is throwing > >me is that the zero concentration is only in the control and I cannot > >work out what model is suitable for this. Can anybody help with some > >advice or tips on the approach to use? > > > >I have looked at the factorial design and the timecourse but neither > >seem to offer a comparison/contrast against the control. Is there a way > >to do this? Do I have to use the "Separate Analysis of Two Channel >data" > >in Chapter 9? > > > >Any advice gratefully appreciated. > > > >Regards > > > > > >John Seers > > > > > > > > > > My targets file looks like this: > > > > > > > > > >SlideNumber FileName Cy3 Cy5 Time conc > >598 598new.gpr f100t1 Control t1 c100 > >599 599new.gpr f20t4 Control t4 c20 > >600 600new.gpr f100t4 Control t1 c100 > >617 617new.gpr f20t1 Control t1 c20 > >621 621new.gpr f20t1 Control t1 c20 > >637 637new.gpr f20t4 Control t4 c20 > >638 638new.gpr f20t1 Control t1 c20 > >639 639new.gpr f100t1 Control t1 c100 > >748 748new.gpr f20t4 Control t4 c20 > >751 751new.gpr f20t4 Control t4 c20 > >833 833new.gpr f100t4 Control t4 c100 > >835 835new.gpr f100t1 Control t1 c100 > >836 836new.gpr f100t4 Control t4 c100 > >957 957new.gpr f100t1 Control t1 c100 > >958 958new.gpr f20t1 Control t1 c20 > > > > > > > > > > > > > >--- > > > >John Seers > >Institute of Food Research > >Norwich Research Park > >Colney > >Norwich > >NR4 7UA > > > > > >tel +44 (0)1603 251497 > >fax +44 (0)1603 507723 > >e-mail john.seers at bbsrc.ac.uk > >e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ > ><http: www.ifr.ac.uk="" edisclaimer=""/> > > > >Web sites: > > > >www.ifr.ac.uk <http: www.ifr.ac.uk=""/> > >www.foodandhealthnetwork.com <http: www.foodandhealthnetwork.com=""/> > > > > > > [[alternative HTML version deleted]] > > > >_______________________________________________ > >Bioconductor mailing list > >Bioconductor at stat.math.ethz.ch > >https://stat.ethz.ch/mailman/listinfo/bioconductor > >Search the archives: > >http://news.gmane.org/gmane.science.biology.informatics.conductor > >Naomi S. Altman 814-865-3791 (voice) >Associate Professor >Dept. of Statistics 814-863-7114 (fax) >Penn State University 814-865-1348 (Statistics) >University Park, PA 16802-2111 > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Naomi S. Altman 814-865-3791 (voice) Associate Professor Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111

ADD COMMENT • link 19.4 years ago Naomi Altman ★ 6.0k

0

Entering edit mode

Hi Naomi Thanks again for taking the time to reply. Yes, your appreciation of the design is correct. (Not actually my design or I would understand it better!). The control is of concentration zero. By "single channel analysis" I think you mean the "Separate Channel Analysis" described in Chapter 9 of the limma user guide? So, this is what I have done. Does it make any sense? ...... Read in data etc and normalise into MA.nba # Read in the targets file and convert it to separate channel format targets<-readTargets(targetsfile) targetsC<-targetsA2C(targets) # Make a design file Time<-targetsC$Time Conc<-targetsC$Conc Time<-factor(Time, levels=unique(Time)) Conc<-factor(Conc, levels=unique(Conc)) design<-model.matrix(~Time + Conc + Time:Conc) colnames(design)<-c("Intercept", "Time", "Conc", "TimeConc") # corfit<-intraspotCorrelation(MA.nba, design) fit<-lmscFit(MA.nba, design, correlation=corfit$consensus) # contrast matrix contrast.matrix<-makeContrasts(Time, Conc, TimeConc, levels=design) # contrast timepoints and controls fit2<- contrasts.fit(fit, contrast.matrix) # eBayes eb<- eBayes(fit2) ngenes<-20 topa1<-topTable(eb, coef=1, number=ngenes, adjust="none", sort.by="M") topa2<-topTable(eb, coef=2, number=ngenes, adjust="none", sort.by="M") topa3<-topTable(eb, coef=3, number=ngenes, adjust="none", sort.by="M") Thanks. John Seers Target file: SlideNumber FileName Cy3 Cy5 Time Conc 598 598new.gpr f100t1 Control t1 c100 599 599new.gpr f20t4 Control t4 c20 600 600new.gpr f100t4 Control t1 c100 617 617new.gpr f20t1 Control t1 c20 621 621new.gpr f20t1 Control t1 c20 637 637new.gpr f20t4 Control t4 c20 638 638new.gpr f20t1 Control t1 c20 639 639new.gpr f100t1 Control t1 c100 748 748new.gpr f20t4 Control t4 c20 751 751new.gpr f20t4 Control t4 c20 833 833new.gpr f100t4 Control t4 c100 835 835new.gpr f100t1 Control t1 c100 836 836new.gpr f100t4 Control t4 c100 957 957new.gpr f100t1 Control t1 c100 958 958new.gpr f20t1 Control t1 c20 --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail john.seers at bbsrc.ac.uk e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

ADD REPLY • link 19.4 years ago john seers IFR ▴ 810

0

Entering edit mode

Quoting "john seers (IFR)" <john.seers at="" bbsrc.ac.uk="">: > 600 600new.gpr f100t4 Control t1 c100 Hi John, I'm afraid I have no comments to make about your design, but I noticed this and I wondered if you have an error in your targets file: for slide 600 the "time" says t1, but there's a t4 on the Cy3 sample name... just wondering. Jose -- Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374 Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360 Swann Building, Mayfield Road University of Edinburgh Edinburgh EH9 3JR UK

ADD REPLY • link 19.4 years ago J.delasHeras@ed.ac.uk ★ 1.9k

0

Entering edit mode

Yes, quite likely to have a mistake in it as I am changing it a lot at the moment experimenting with approaches. Thanks for pointing this out. Regards J --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail john.seers at bbsrc.ac.uk e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com -----Original Message----- From: J.delasHeras@ed.ac.uk [mailto:J.delasHeras@ed.ac.uk] Sent: 13 September 2006 13:08 To: john seers (IFR) Cc: bioconductor at stat.math.ethz.ch Subject: Re: [BioC] limma help - choosing an approach Quoting "john seers (IFR)" <john.seers at="" bbsrc.ac.uk="">: > 600 600new.gpr f100t4 Control t1 c100 Hi John, I'm afraid I have no comments to make about your design, but I noticed this and I wondered if you have an error in your targets file: for slide 600 the "time" says t1, but there's a t4 on the Cy3 sample name... just wondering. Jose -- Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374 Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360 Swann Building, Mayfield Road University of Edinburgh Edinburgh EH9 3JR UK

ADD REPLY • link 19.4 years ago john seers IFR ▴ 810

0

Entering edit mode

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

It looks OK to me except that a) I needed to build my targets from a list of treatments for red and gree. b) The design matrix will have more than 4 columns, as you have 3 levels of concentration. So there are 2 columns for concentration and 2 for time:concentration. --Naomi At 05:28 AM 9/13/2006, john seers $IFR$ wrote: >Hi Naomi > >Thanks again for taking the time to reply. > >Yes, your appreciation of the design is correct. (Not actually my design >or I would understand it better!). The control is of concentration zero. > >By "single channel analysis" I think you mean the "Separate Channel >Analysis" described in Chapter 9 of the limma user guide? > >So, this is what I have done. Does it make any sense? > > >...... Read in data etc and normalise into MA.nba > > ># Read in the targets file and convert it to separate channel format >targets<-readTargets(targetsfile) >targetsC<-targetsA2C(targets) > ># Make a design file >Time<-targetsC$Time >Conc<-targetsC$Conc >Time<-factor(Time, levels=unique(Time)) >Conc<-factor(Conc, levels=unique(Conc)) >design<-model.matrix(~Time + Conc + Time:Conc) >colnames(design)<-c("Intercept", "Time", "Conc", "TimeConc") > ># >corfit<-intraspotCorrelation(MA.nba, design) >fit<-lmscFit(MA.nba, design, correlation=corfit$consensus) > ># contrast matrix >contrast.matrix<-makeContrasts(Time, Conc, TimeConc, levels=design) > ># contrast timepoints and controls >fit2<- contrasts.fit(fit, contrast.matrix) > > ># eBayes >eb<- eBayes(fit2) > > >ngenes<-20 >topa1<-topTable(eb, coef=1, number=ngenes, adjust="none", sort.by="M") >topa2<-topTable(eb, coef=2, number=ngenes, adjust="none", sort.by="M") >topa3<-topTable(eb, coef=3, number=ngenes, adjust="none", sort.by="M") > >Thanks. > > >John Seers > > >Target file: > >SlideNumber FileName Cy3 Cy5 Time Conc >598 598new.gpr f100t1 Control t1 c100 >599 599new.gpr f20t4 Control t4 c20 >600 600new.gpr f100t4 Control t1 c100 >617 617new.gpr f20t1 Control t1 c20 >621 621new.gpr f20t1 Control t1 c20 >637 637new.gpr f20t4 Control t4 c20 >638 638new.gpr f20t1 Control t1 c20 >639 639new.gpr f100t1 Control t1 c100 >748 748new.gpr f20t4 Control t4 c20 >751 751new.gpr f20t4 Control t4 c20 >833 833new.gpr f100t4 Control t4 c100 >835 835new.gpr f100t1 Control t1 c100 >836 836new.gpr f100t4 Control t4 c100 >957 957new.gpr f100t1 Control t1 c100 >958 958new.gpr f20t1 Control t1 c20 > > > >--- > >John Seers >Institute of Food Research >Norwich Research Park >Colney >Norwich >NR4 7UA > > >tel +44 (0)1603 251497 >fax +44 (0)1603 507723 >e-mail john.seers at bbsrc.ac.uk >e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ > >Web sites: > >www.ifr.ac.uk >www.foodandhealthnetwork.com > >_______________________________________________ >Bioconductor mailing list >Bioconductor at stat.math.ethz.ch >https://stat.ethz.ch/mailman/listinfo/bioconductor >Search the archives: >http://news.gmane.org/gmane.science.biology.informatics.conductor Naomi S. Altman 814-865-3791 (voice) Associate Professor Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111

ADD COMMENT • link 19.4 years ago Naomi Altman ★ 6.0k

0

Entering edit mode

Hello Naomi (and anyone else) Thanks again for your help. It has been extremely helpful. Almost there I think, though I am still just doing it on autopilot and not quite sure how meaningful this is. I am slightly worried by the limma documentation which describes the single channel analysis as "experimental" so combined with my lack of understanding makes me feel quite uncertain. Anyway I can run it now and pull out lists of genes - though what they mean I am not sure yet. So, a bit of sanity checking if that is OK. > a) I needed to build my targets from a list of treatments for red and gree. I guess these are equivalent to Cy5 and Cy3 with Genepix. The function "targetsC<-targetsA2C(targets)" converts it to single channel format. (See targets files below, showing before and after). >b) The design matrix will have more than 4 columns, as you have 3 >levels of concentration. So there are 2 columns for concentration >and 2 for time:concentration. I have altered the targets file to reflect this better. So now my code looks like: Time<-targetsC$Time Conc<-targetsC$Target Time<-factor(Time, levels=c("t1", "t4")) Conc<-factor(Conc, levels=c("c0", "c20", "c100")) Giving: > Time [1] t1 t1 t4 t4 t4 t4 t1 t1 t1 t1 t4 t4 t1 t1 t1 t1 t4 t4 t4 t4 t4 t4 t1 t1 t4 t4 t1 t1 t1 t1 Levels: t1 t4 > Conc [1] c100 c0 c20 c0 c100 c0 c20 c0 c20 c0 c20 c0 c20 c0 c100 c0 c20 c0 c20 c0 c100 c0 c100 c0 c100 c0 [27] c100 c0 c20 c0 Levels: c0 c20 c100 > Set up the design file: design<-model.matrix(~Time + Conc + Time:Conc) Giving (first three lines): (Intercept) Timet4 Concc20 Concc100 Timet4:Concc20 Timet4:Concc100 1 1 0 0 1 0 0 2 1 0 0 0 0 0 colnames(design)<-c("Intercept", "t4", "c20", "c100", "t4c20", "t4c100") corfit<-intraspotCorrelation(MA.nba, design) fit<-lmscFit(MA.nba, design, correlation=corfit$consensus) # contrast matrix contrast.matrix<-makeContrasts(t4, c20, c100, t4c20, t4c100, levels=design) (What is possible here? What sort of contrasts are valid/meaningful? Presumably I could do c20 + c100 to compare against c0?). # contrast timepoints and controls fit2<- contrasts.fit(fit, contrast.matrix) # eBayes eb<- eBayes(fit2) ngenes<-20 topa1<-topTable(eb, coef=1, number=ngenes, adjust="none", sort.by="M") ..... Regards John Seers Targets data before transforming using targetsA2C SlideNumber FileName Cy3 Cy5 Time 598 598new.gpr c100 c0 t1 599 599new.gpr c20 c0 t4 600 600new.gpr c100 c0 t4 617 617new.gpr c20 c0 t1 621 621new.gpr c20 c0 t1 637 637new.gpr c20 c0 t4 638 638new.gpr c20 c0 t1 639 639new.gpr c100 c0 t1 748 748new.gpr c20 c0 t4 751 751new.gpr c20 c0 t4 833 833new.gpr c100 c0 t4 835 835new.gpr c100 c0 t1 836 836new.gpr c100 c0 t4 957 957new.gpr c100 c0 t1 958 958new.gpr c20 c0 t1 Targets data after transforming using targetsA2C > targetsC channel.col SlideNumber FileName Time Target 598new.1 1 598 598new.gpr t1 c100 598new.2 2 598 598new.gpr t1 c0 599new.1 1 599 599new.gpr t4 c20 599new.2 2 599 599new.gpr t4 c0 600new.1 1 600 600new.gpr t4 c100 600new.2 2 600 600new.gpr t4 c0 617new.1 1 617 617new.gpr t1 c20 617new.2 2 617 617new.gpr t1 c0 621new.1 1 621 621new.gpr t1 c20 621new.2 2 621 621new.gpr t1 c0 637new.1 1 637 637new.gpr t4 c20 637new.2 2 637 637new.gpr t4 c0 638new.1 1 638 638new.gpr t1 c20 638new.2 2 638 638new.gpr t1 c0 639new.1 1 639 639new.gpr t1 c100 639new.2 2 639 639new.gpr t1 c0 748new.1 1 748 748new.gpr t4 c20 748new.2 2 748 748new.gpr t4 c0 751new.1 1 751 751new.gpr t4 c20 751new.2 2 751 751new.gpr t4 c0 833new.1 1 833 833new.gpr t4 c100 833new.2 2 833 833new.gpr t4 c0 835new.1 1 835 835new.gpr t1 c100 835new.2 2 835 835new.gpr t1 c0 836new.1 1 836 836new.gpr t4 c100 836new.2 2 836 836new.gpr t4 c0 957new.1 1 957 957new.gpr t1 c100 957new.2 2 957 957new.gpr t1 c0 958new.1 1 958 958new.gpr t1 c20 958new.2 2 958 958new.gpr t1 c0 > --- John Seers Institute of Food Research Norwich Research Park Colney Norwich NR4 7UA tel +44 (0)1603 251497 fax +44 (0)1603 507723 e-mail john.seers at bbsrc.ac.uk e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

ADD REPLY • link 19.4 years ago john seers IFR ▴ 810

0

Entering edit mode

Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

I don't think that the single channel analysis is experimental. It follows the rules of mixed model analysis which has been around for a very long time. Any set of contrasts is statistically valid. The question is which ones may be biologically meaningful. And that depends in part on the biological process. E.g. often contrasts among time points with the same concentration and concentrations with the same time point may be of most interest. But if the chemical slows the cell metabolism by a factor of 4, you might want to do contrasts of concentration 1 time 1 with concentration 2 time 4 etc. --Naomi At 06:19 AM 9/14/2006, you wrote: >Hello Naomi (and anyone else) > >Thanks again for your help. It has been extremely helpful. Almost there >I think, though I am still just doing it on autopilot and not quite sure >how meaningful this is. I am slightly worried by the limma documentation >which describes the single channel analysis as "experimental" so >combined with my lack of understanding makes me feel quite uncertain. >Anyway I can run it now and pull out lists of genes - though what they >mean I am not sure yet. > >So, a bit of sanity checking if that is OK. > > > a) I needed to build my targets from a list of treatments for red and >gree. > >I guess these are equivalent to Cy5 and Cy3 with Genepix. The function >"targetsC<-targetsA2C(targets)" converts it to single channel format. >(See targets files below, showing before and after). > > > >b) The design matrix will have more than 4 columns, as you have 3 > >levels of concentration. So there are 2 columns for concentration > >and 2 for time:concentration. > >I have altered the targets file to reflect this better. > >So now my code looks like: > >Time<-targetsC$Time >Conc<-targetsC$Target >Time<-factor(Time, levels=c("t1", "t4")) >Conc<-factor(Conc, levels=c("c0", "c20", "c100")) > > >Giving: > > > Time > [1] t1 t1 t4 t4 t4 t4 t1 t1 t1 t1 t4 t4 t1 t1 t1 t1 t4 t4 t4 t4 t4 t4 >t1 t1 t4 t4 t1 t1 t1 t1 >Levels: t1 t4 > > Conc > [1] c100 c0 c20 c0 c100 c0 c20 c0 c20 c0 c20 c0 c20 c0 >c100 c0 c20 c0 c20 c0 c100 c0 c100 c0 c100 c0 >[27] c100 c0 c20 c0 >Levels: c0 c20 c100 > > > >Set up the design file: > >design<-model.matrix(~Time + Conc + Time:Conc) > >Giving (first three lines): > > (Intercept) Timet4 Concc20 Concc100 Timet4:Concc20 Timet4:Concc100 >1 1 0 0 1 0 0 >2 1 0 0 0 0 0 > > >colnames(design)<-c("Intercept", "t4", "c20", "c100", "t4c20", "t4c100") > > >corfit<-intraspotCorrelation(MA.nba, design) >fit<-lmscFit(MA.nba, design, correlation=corfit$consensus) > > ># contrast matrix >contrast.matrix<-makeContrasts(t4, c20, c100, t4c20, t4c100, >levels=design) > >(What is possible here? What sort of contrasts are valid/meaningful? >Presumably I could do c20 + c100 to compare against c0?). > > > ># contrast timepoints and controls >fit2<- contrasts.fit(fit, contrast.matrix) > > ># eBayes >eb<- eBayes(fit2) > > >ngenes<-20 >topa1<-topTable(eb, coef=1, number=ngenes, adjust="none", sort.by="M") > >..... > > > > >Regards > > >John Seers > > > >Targets data before transforming using targetsA2C > > >SlideNumber FileName Cy3 Cy5 Time >598 598new.gpr c100 c0 t1 >599 599new.gpr c20 c0 t4 >600 600new.gpr c100 c0 t4 >617 617new.gpr c20 c0 t1 >621 621new.gpr c20 c0 t1 >637 637new.gpr c20 c0 t4 >638 638new.gpr c20 c0 t1 >639 639new.gpr c100 c0 t1 >748 748new.gpr c20 c0 t4 >751 751new.gpr c20 c0 t4 >833 833new.gpr c100 c0 t4 >835 835new.gpr c100 c0 t1 >836 836new.gpr c100 c0 t4 >957 957new.gpr c100 c0 t1 >958 958new.gpr c20 c0 t1 > >Targets data after transforming using targetsA2C > > > > targetsC > channel.col SlideNumber FileName Time Target >598new.1 1 598 598new.gpr t1 c100 >598new.2 2 598 598new.gpr t1 c0 >599new.1 1 599 599new.gpr t4 c20 >599new.2 2 599 599new.gpr t4 c0 >600new.1 1 600 600new.gpr t4 c100 >600new.2 2 600 600new.gpr t4 c0 >617new.1 1 617 617new.gpr t1 c20 >617new.2 2 617 617new.gpr t1 c0 >621new.1 1 621 621new.gpr t1 c20 >621new.2 2 621 621new.gpr t1 c0 >637new.1 1 637 637new.gpr t4 c20 >637new.2 2 637 637new.gpr t4 c0 >638new.1 1 638 638new.gpr t1 c20 >638new.2 2 638 638new.gpr t1 c0 >639new.1 1 639 639new.gpr t1 c100 >639new.2 2 639 639new.gpr t1 c0 >748new.1 1 748 748new.gpr t4 c20 >748new.2 2 748 748new.gpr t4 c0 >751new.1 1 751 751new.gpr t4 c20 >751new.2 2 751 751new.gpr t4 c0 >833new.1 1 833 833new.gpr t4 c100 >833new.2 2 833 833new.gpr t4 c0 >835new.1 1 835 835new.gpr t1 c100 >835new.2 2 835 835new.gpr t1 c0 >836new.1 1 836 836new.gpr t4 c100 >836new.2 2 836 836new.gpr t4 c0 >957new.1 1 957 957new.gpr t1 c100 >957new.2 2 957 957new.gpr t1 c0 >958new.1 1 958 958new.gpr t1 c20 >958new.2 2 958 958new.gpr t1 c0 > > > > > > >--- > >John Seers >Institute of Food Research >Norwich Research Park >Colney >Norwich >NR4 7UA > > >tel +44 (0)1603 251497 >fax +44 (0)1603 507723 >e-mail john.seers at bbsrc.ac.uk >e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ > >Web sites: > >www.ifr.ac.uk >www.foodandhealthnetwork.com Naomi S. Altman 814-865-3791 (voice) Associate Professor Dept. of Statistics 814-863-7114 (fax) Penn State University 814-865-1348 (Statistics) University Park, PA 16802-2111

ADD COMMENT • link 19.4 years ago Naomi Altman ★ 6.0k

0

Entering edit mode

Gordon Smyth 53k

@gordon-smyth

Last seen 4 hours ago

WEHI, Melbourne, Australia

Dear Stan, The biological study for which I developed this methodology is about to be submitted for publication in the next few weeks. As far as a methods paper, so far I have only published a proceedings paper Smyth, G. K. (2005). Individual channel analysis of two-colour microarray data. Invited Session IPM 11: Computational Tools For Microarray Analysis, 55th Session of the International Statistics Institute, 5-12 April 2005, Sydney Convention & Exhibition Centre, Sydney, Australia (CD Paper 116). http://www.statsci.org/smyth/pubs/ISI2005-116.pdf Best wishes Gordon At 05:17 PM 19/09/2006, Gaj Stan (BIGCAT) wrote: >Dear Gordon and rest, > >I've been following the outcome of this thread carefully and am persuing >a single-channel analysis for a couple of Agilent arrays as well. Do you >(or anyone else for that matter) happen to know if there are any >publications available that implemented or have tested this approach? > >Best wishes, > > Stan > > >PhD Student >Dept. of Human Biology & BiGCaT Bioinformatics >PO BOX 616 >UNS 40 Room 5.578 >6200 MD Maastricht >Tel: +31 (0) 433882913 > > > >-----Original Message----- >From: bioconductor-bounces at stat.math.ethz.ch >[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Gordon >Smyth >Sent: 15 September 2006 06:09 >To: BioC Mailing List >Subject: [BioC] limma help - choosing an approach > >Dear John, > >I wrote the warning in the ?lmscFit documentation about it being >somewhat experimental more than two and a half years ago, when the >function were first introduced into the limma package. I've done much >more testing since then, and the function has stood the test of time, >so I'll remove the warning. > >Best wishes >Gordon

ADD COMMENT • link 19.4 years ago Gordon Smyth 53k

0

Entering edit mode

Dear list, I am analyzing some affymetrix chicken data using limma and have a question on the best approach regarding random and fixed effects. The target matrix is as follows: samplenames sex tissue date individual PA009_kyckling_11_H16_060630.CEL male heart 1 16 PA009_kyckling_11_H19_060705.CEL male heart 3 19 PA009_kyckling_11_H21_060630.CEL male heart 1 21 PA009_kyckling_11_H9_060704.CEL male heart 2 9 PA009_kyckling_12_B16_060704.CEL male brain 2 16 PA009_kyckling_12_B19_060704.CEL male brain 2 19 PA009_kyckling_12_B21_060705.CEL male brain 3 21 PA009_kyckling_12_B9_060630.CEL male brain 1 9 PA009_kyckling_13_G16_060705.CEL male gonad 3 16 PA009_kyckling_13_G19_060630.CEL male gonad 1 19 PA009_kyckling_13_G21_060704.CEL male gonad 2 21 PA009_kyckling_13_G9_060705.CEL male gonad 3 9 PA009_kyckling_21_H10_060705.CEL female heart 3 10 PA009_kyckling_21_H12_060705.CEL female heart 3 12 PA009_kyckling_21_H20_060630.CEL female heart 1 20 PA009_kyckling_21_H2_060704.CEL female heart 2 2 PA009_kyckling_22_B10_060704.CEL female brain 2 10 PA009_kyckling_22_B12_060630.CEL female brain 1 12 PA009_kyckling_22_B20_060705.CEL female brain 3 20 PA009_kyckling_22_B2_060630.CEL female brain 1 2 PA009_kyckling_23_G10_060704.CEL female gonad 2 10 PA009_kyckling_23_G12_060630.CEL female gonad 1 12 PA009_kyckling_23_G20_060704.CEL female gonad 2 20 PA009_kyckling_23_G2_060705.CEL female gonad 3 2 The question of interest is what genes that differ between male and female in the different tissues and as well in general. My concern is if I have to block for the date/batch and individual effect. In a PCA plot (and other quality control plots) there isn't sign of any obvious batch or individual effect. I also used duplicateCorrelation to calculate the correlations for the batch and individual effects and the results were 0.1 for individual and -0.03 for batch. Would it be ok to exclude the batch effect from the model and treat the individual as a random effect or is there a way in limma to include two random effects? I also have a more general question regarding lmFit and eBayes. I fitted a model to the gonad samples only and then compared that to fitting a model to all samples and extracting the gonad contrast only (see design matrices below). Obviously the resulting p-values etc differ between the two approaches but I don't really understand the difference and know which is the preferred/correct approach. Only gonad samples: m f PA009_kyckling_13_G16_060705.CEL 1 0 PA009_kyckling_13_G19_060630.CEL 1 0 PA009_kyckling_13_G21_060704.CEL 1 0 PA009_kyckling_13_G9_060705.CEL 1 0 PA009_kyckling_23_G10_060704.CEL 0 1 PA009_kyckling_23_G12_060630.CEL 0 1 PA009_kyckling_23_G20_060704.CEL 0 1 PA009_kyckling_23_G2_060705.CEL 0 1 All samples: mh mb mg fh fb fg PA009_kyckling_11_H16_060630.CEL 1 0 0 0 0 0 PA009_kyckling_11_H19_060705.CEL 1 0 0 0 0 0 PA009_kyckling_11_H21_060630.CEL 1 0 0 0 0 0 PA009_kyckling_11_H9_060704.CEL 1 0 0 0 0 0 PA009_kyckling_12_B16_060704.CEL 0 1 0 0 0 0 PA009_kyckling_12_B19_060704.CEL 0 1 0 0 0 0 PA009_kyckling_12_B21_060705.CEL 0 1 0 0 0 0 PA009_kyckling_12_B9_060630.CEL 0 1 0 0 0 0 PA009_kyckling_13_G16_060705.CEL 0 0 1 0 0 0 PA009_kyckling_13_G19_060630.CEL 0 0 1 0 0 0 PA009_kyckling_13_G21_060704.CEL 0 0 1 0 0 0 PA009_kyckling_13_G9_060705.CEL 0 0 1 0 0 0 PA009_kyckling_21_H10_060705.CEL 0 0 0 1 0 0 PA009_kyckling_21_H12_060705.CEL 0 0 0 1 0 0 PA009_kyckling_21_H20_060630.CEL 0 0 0 1 0 0 PA009_kyckling_21_H2_060704.CEL 0 0 0 1 0 0 PA009_kyckling_22_B10_060704.CEL 0 0 0 0 1 0 PA009_kyckling_22_B12_060630.CEL 0 0 0 0 1 0 PA009_kyckling_22_B20_060705.CEL 0 0 0 0 1 0 PA009_kyckling_22_B2_060630.CEL 0 0 0 0 1 0 PA009_kyckling_23_G10_060704.CEL 0 0 0 0 0 1 PA009_kyckling_23_G12_060630.CEL 0 0 0 0 0 1 PA009_kyckling_23_G20_060704.CEL 0 0 0 0 0 1 PA009_kyckling_23_G2_060705.CEL 0 0 0 0 0 1 Any comments or suggestions would be greatly appreciated. Thank you! Best regards, Lina Rosenberg

ADD REPLY • link 19.4 years ago Lina Hultin-Rosenberg ▴ 180

0

Entering edit mode

Hi Lina, Lina Hultin-Rosenberg wrote: > Dear list, > > I am analyzing some affymetrix chicken data using limma and have a question > on the best approach regarding random and fixed effects. The target matrix > is as follows: > > samplenames sex tissue date > individual > PA009_kyckling_11_H16_060630.CEL male heart 1 > 16 > PA009_kyckling_11_H19_060705.CEL male heart 3 > 19 > PA009_kyckling_11_H21_060630.CEL male heart 1 > 21 > PA009_kyckling_11_H9_060704.CEL male heart 2 9 > PA009_kyckling_12_B16_060704.CEL male brain 2 > 16 > PA009_kyckling_12_B19_060704.CEL male brain 2 > 19 > PA009_kyckling_12_B21_060705.CEL male brain 3 > 21 > PA009_kyckling_12_B9_060630.CEL male brain 1 9 > PA009_kyckling_13_G16_060705.CEL male gonad 3 > 16 > PA009_kyckling_13_G19_060630.CEL male gonad 1 > 19 > PA009_kyckling_13_G21_060704.CEL male gonad 2 > 21 > PA009_kyckling_13_G9_060705.CEL male gonad 3 9 > PA009_kyckling_21_H10_060705.CEL female heart 3 10 > PA009_kyckling_21_H12_060705.CEL female heart 3 12 > PA009_kyckling_21_H20_060630.CEL female heart 1 20 > PA009_kyckling_21_H2_060704.CEL female heart 2 2 > PA009_kyckling_22_B10_060704.CEL female brain 2 10 > PA009_kyckling_22_B12_060630.CEL female brain 1 12 > PA009_kyckling_22_B20_060705.CEL female brain 3 20 > PA009_kyckling_22_B2_060630.CEL female brain 1 2 > PA009_kyckling_23_G10_060704.CEL female gonad 2 10 > PA009_kyckling_23_G12_060630.CEL female gonad 1 12 > PA009_kyckling_23_G20_060704.CEL female gonad 2 20 > PA009_kyckling_23_G2_060705.CEL female gonad 3 2 > > The question of interest is what genes that differ between male and female > in the different tissues and as well in general. My concern is if I have to > block for the date/batch and individual effect. In a PCA plot (and other > quality control plots) there isn't sign of any obvious batch or individual > effect. I also used duplicateCorrelation to calculate the correlations for > the batch and individual effects and the results were 0.1 for individual and > -0.03 for batch. Would it be ok to exclude the batch effect from the model > and treat the individual as a random effect or is there a way in limma to > include two random effects? > > I also have a more general question regarding lmFit and eBayes. I fitted a > model to the gonad samples only and then compared that to fitting a model to > all samples and extracting the gonad contrast only (see design matrices > below). Obviously the resulting p-values etc differ between the two > approaches but I don't really understand the difference and know which is > the preferred/correct approach. There are two differences between these approaches. First, when you restrict the samples to just the gonad samples, you are reducing the amount of data that can be used in the empirical Bayes step to estimate the prior variability. I haven't looked at this aspect very closely, but I do know that reducing the number of probesets can have a pretty large effect on the number of samples called significant. I'm sure Gordon will have a better idea of how important this is. Second, you have to remember that the denominator of your contrast is the sums of squares of error for the model (plus the eBayes prior). When you only use the gonad samples, this is the same as fitting a t-statistic. However, when you use all samples, the SSE measures how well the model fits all the data. This is often more powerful because you have more degrees of freedom, which implies that your estimates are more accurate (hence more powerful). As for which is preferred, I usually follow the second approach. With most microarray analyses the statistics we use are often woefully underpowered, so I try to use whatever method will tend to increase power. HTH, Jim > > Only gonad samples: > m f > PA009_kyckling_13_G16_060705.CEL 1 0 > PA009_kyckling_13_G19_060630.CEL 1 0 > PA009_kyckling_13_G21_060704.CEL 1 0 > PA009_kyckling_13_G9_060705.CEL 1 0 > PA009_kyckling_23_G10_060704.CEL 0 1 > PA009_kyckling_23_G12_060630.CEL 0 1 > PA009_kyckling_23_G20_060704.CEL 0 1 > PA009_kyckling_23_G2_060705.CEL 0 1 > > All samples: > mh mb mg fh > fb fg > PA009_kyckling_11_H16_060630.CEL 1 0 0 0 0 > 0 > PA009_kyckling_11_H19_060705.CEL 1 0 0 0 0 > 0 > PA009_kyckling_11_H21_060630.CEL 1 0 0 0 0 > 0 > PA009_kyckling_11_H9_060704.CEL 1 0 0 0 0 0 > PA009_kyckling_12_B16_060704.CEL 0 1 0 0 0 > 0 > PA009_kyckling_12_B19_060704.CEL 0 1 0 0 0 > 0 > PA009_kyckling_12_B21_060705.CEL 0 1 0 0 0 > 0 > PA009_kyckling_12_B9_060630.CEL 0 1 0 0 0 0 > PA009_kyckling_13_G16_060705.CEL 0 0 1 0 0 > 0 > PA009_kyckling_13_G19_060630.CEL 0 0 1 0 0 > 0 > PA009_kyckling_13_G21_060704.CEL 0 0 1 0 0 > 0 > PA009_kyckling_13_G9_060705.CEL 0 0 1 0 0 0 > PA009_kyckling_21_H10_060705.CEL 0 0 0 1 0 > 0 > PA009_kyckling_21_H12_060705.CEL 0 0 0 1 0 > 0 > PA009_kyckling_21_H20_060630.CEL 0 0 0 1 0 > 0 > PA009_kyckling_21_H2_060704.CEL 0 0 0 1 0 0 > PA009_kyckling_22_B10_060704.CEL 0 0 0 0 1 > 0 > PA009_kyckling_22_B12_060630.CEL 0 0 0 0 1 > 0 > PA009_kyckling_22_B20_060705.CEL 0 0 0 0 1 > 0 > PA009_kyckling_22_B2_060630.CEL 0 0 0 0 1 0 > PA009_kyckling_23_G10_060704.CEL 0 0 0 0 0 > 1 > PA009_kyckling_23_G12_060630.CEL 0 0 0 0 0 > 1 > PA009_kyckling_23_G20_060704.CEL 0 0 0 0 0 > 1 > PA009_kyckling_23_G2_060705.CEL 0 0 0 0 0 1 > > > Any comments or suggestions would be greatly appreciated. Thank you! > > Best regards, > Lina Rosenberg > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- James W. MacDonald, M.S. Biostatistician Affymetrix and cDNA Microarray Core University of Michigan Cancer Center 1500 E. Medical Center Drive 7410 CCGC Ann Arbor MI 48109 734-647-5623 ********************************************************** Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.

ADD REPLY • link 19.4 years ago James W. MacDonald 68k

Login before adding your answer.