RE: proper pooling design

Stephen Moore ▴ 70

@stephen-moore-492

Last seen 11.4 years ago

Hi Fai,

I must point out that I am not a statistician; however, I think your reasoning is correct. If you pool samples and then hybridize that one pool to three different chips, you would be assessing the variation between the Affy chips rather than allowing for variation between the different biological samples. Since the variation between Affy chips (as I understand it) tends to be small, it may not be worthwhile doing this. However, if you make three separate pooled samples, hybridize each to its own chip, and then use the average of the three as your reference, I think you will get a better representation of biological variability.

Cheers,
Steve

-----Original Message-----
From: bioconductor-request@stat.math.ethz.ch [mailto:bioconductor-request@stat.math.ethz.ch]
Sent: 20 January 2004 22:49
To: bioconductor@stat.math.ethz.ch
Subject: [SPAM] - Bioconductor Digest, Vol 11, Issue 27 - Email found in subject

Send Bioconductor mailing list submissions to
    bioconductor@stat.math.ethz.ch

To subscribe or unsubscribe via the World Wide Web, visit
    https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
or, via email, send a message with subject or body 'help' to
    bioconductor-request@stat.math.ethz.ch

You can reach the person managing the list at
    bioconductor-owner@stat.math.ethz.ch

When replying, please edit your Subject line so it is more specific than "Re: Contents of Bioconductor digest..."

Today's Topics:

   1. Re: [maNorm] Normalization a complex experiment... (Marcelo Luiz de Laia)
   2. limma and makeContrasts (Stephen Henderson)
   3. Re: Blurriness assesment of scanner TIFF files (Simon Lin)
   4. RE: [maImage: Draw multiple spatial plots on the SAME graph] (Naomi Altman)
   5. limma: out of bounds error (Straubhaar, Juerg)
   6. Trouble installing 'makecdfenv' (Jim Breaux)
   7. Proper pooling design (YUK FAI LEUNG)

----------------------------------------------------------------------

Message: 1
Date: Tue, 20 Jan 2004 09:20:34 -0300
From: Marcelo Luiz de Laia <mlaia@fcav.unesp.br>
Subject: Re: [BioC] [maNorm] Normalization a complex experiment...
To: bioconductor@stat.math.ethz.ch
Message-ID: <20040120092034.00002c10@lbmsala4>
Content-Type: text/plain; charset=ISO-8859-15

Dear Gordon and All,

At the moment it is impossible to have a statistician involved in the analysis in my laboratory. My doubt about normalization with marray arose because I have to use limma afterwards. So I thought: maybe there is a way to enter the data in marray so that the analyses in limma are easier.

My questions, basically, are:

- Which genes are up-regulated at all three times?
- Which genes are down-regulated at all three times?
- Which are up-regulated at time 1 and then decrease at times 2 and 3?
- Which are up-regulated at times 1 and 2 and then decrease at time 3?
- Which are down-regulated at time 1 and up-regulated at times 2 and 3?
- Which are down-regulated at times 1 and 2 and up-regulated at time 3?

I believe that these are the main questions. Would you have suggestions? I hope to analyze our data with the help of the members of the Bioconductor list. Thanks a lot for your interest in helping me.

The experimental design is:

                      Time
               1day   2day   3day

               Rep1   Rep1   Rep1
Un Treated     Rep2   Rep2   Rep2
               Rep3   Rep3   Rep3

               Rep1   Rep1   Rep1
Treated        Rep2   Rep2   Rep2
               Rep3   Rep3   Rep3

Marcelo

On Tue, 20 Jan 2004 10:42:40 +1100, Gordon Smyth <smyth@wehi.edu.au> wrote:

GS> At 12:52 AM 20/01/2004, Marcelo Luiz de Laia wrote:
GS> >Hi All,
GS> >
GS> >I have a complex experiment (for me) and I do not know how to
GS> >normalize it.
GS>
GS> Why not normalize it exactly as you've normalized data in earlier studies?
GS>
GS> >More specifically, I don't know how to build the samples (targets) file for
GS> >marray.
GS> >
GS> >The design is:
GS> >
GS> >                      Time
GS> >               1day   2day   3day
GS> >
GS> >               Rep1   Rep1   Rep1
GS> >Un Treated     Rep2   Rep2   Rep2
GS> >               Rep3   Rep3   Rep3
GS> >
GS> >               Rep1   Rep1   Rep1
GS> >Treated        Rep2   Rep2   Rep2
GS> >               Rep3   Rep3   Rep3
GS> >
GS> >If I have one time, my targets file for marrayinput is like this:
GS> >
GS> ># of slide   Names   experiment Cy3   experiment Cy5   date
GS> >1            File1   Un Treated       Treated          19/01/2004
GS> >
GS> >It is a time series with three different times and three repetitions
GS> >at each of the times.
GS> >
GS> >I have already analysed some simpler experiments. For example, I know how to
GS> >analyse within each time, individually. However, I could not find
GS> >an example similar to mine in the marray vignettes.
GS> >
GS> >After the normalization, I am thinking about using limma.
GS> >
GS> >I would like to know which genes were differentially expressed at each
GS> >time. Besides, I would like to verify the behavior of these genes over
GS> >time (for example, did they increase or decrease over time?).
GS> >I have already looked at the limma user's guide and I saw that
GS> >there is the function heatdiagram.
GS>
GS> Heatdiagram may help you visualize your results, but what you really need
GS> is the F-statistic computed by the classifyTests() function. This is not
GS> yet explained in the User's Guide. Can you consult a local statistician for
GS> help who knows a little about linear models and contrasts?
GS>
GS> Gordon
GS>
GS> >I will need to analyze it in marray in a way that makes it easier to
GS> >analyze in limma. Another doubt that I already have about limma is the
GS> >design file.
GS> >
GS> >All help will be very welcome.
GS> >
GS> >Best wishes.
GS> >
GS> >--
GS> >Marcelo Luiz de Laia, M.Sc.
GS> >Dep. de Tecnologia, Lab. Bioquímica e de Biologia Molecular
GS> >Universidade Estadual Paulista - UNESP
GS> >Via de Acesso Prof. Paulo Donato Castelane, Km 05
GS> >14.884-900 - Jaboticabal, SP, Brazil
GS> >PhoneFax: 16 3209-2675/2676/2677 R. 202/208/203 (trab.)
GS> >HomePhone: 16 3203 2328 - www.lbm.fcav.unesp.br - mlaia@yahoo.com
GS>

--
Marcelo Luiz de Laia, M.Sc.
Dep. de Tecnologia, Lab. Bioquímica e de Biologia Molecular
Universidade Estadual Paulista - UNESP
Via de Acesso Prof. Paulo Donato Castelane, Km 05
14.884-900 - Jaboticabal, SP, Brazil
PhoneFax: 16 3209-2675/2676/2677 R. 202/208/203 (trab.)
HomePhone: 16 3203 2328 - www.lbm.fcav.unesp.br - mlaia@yahoo.com

------------------------------

Message: 2
Date: Tue, 20 Jan 2004 12:38:11 -0000
From: Stephen Henderson <s.henderson@ucl.ac.uk>
Subject: [BioC] limma and makeContrasts
To: "'Bioconductor@stat.math.ethz.ch'" <bioconductor@stat.math.ethz.ch>
Message-ID: <e7cf6bc2744cbe41ae4e87635c7c893c1c668e@exc.wibr.ucl.ac.uk>
Content-Type: text/plain

Hi,

I've been using limma and the eBayes function to look at a set of Affymetrix chips containing 8 or so different groups. Following the instructions on pages 11-12 of the PDF guide, this works fine, I think.

My question concerns different ways to specify the contrasts. If I wish to compare composite groups, e.g. (group1, group2) vs (group3, group4) and others, is there a syntax to specify this within the makeContrasts function, and where can I read about it? Alternatively, is there a better step at which I can achieve this (without changing my class vector repeatedly, that is)?

Stephen

------------------------------

Message: 3
Date: Tue, 20 Jan 2004 09:42:34 -0500
From: "Simon Lin" <simon.lin@duke.edu>
Subject: [BioC] Re: Blurriness assesment of scanner TIFF files
To: <bioconductor@stat.math.ethz.ch>
Message-ID: <002201c3df63$a46366c0$71761098@ccis1184>
Content-Type: text/plain; charset="iso-8859-1"

Hi, Edo:

I would like to share some opinions slightly outside of the image analysis context.

To calibrate the scanner, it is better to use some calibration slides. There should be some 'standard patterns' on this calibration slide, like the 'test paper' with lines and meshes at different intervals used by the Xerox copier repairman, or the checkerboard pattern used by the TV repairman.

To make this calibration slide, a high-end solution is photo-etching. A quick solution is to spot some fluorescent dyes at known positions.

Even for people with no obvious scanner problem, this should be checked at regular intervals to ensure the operating condition of the scanner. Ask the scanner maker; they may do the service.

An accurate physical measurement will solve many of our headaches down the road!

Good luck!

Simon

=================================================
Simon M. Lin, M.D.
Manager, Duke Bioinformatics Shared Resource
Assistant Research Professor, Biostatistics and Bioinformatics
Box 3958, Duke University Medical Center
Durham, NC 27710
Ph: (919) 681-9646 FAX: (919) 681-8028
Lin00025 (at) mc.duke.edu
http://dbsr.duke.edu
=================================================

Message: 3
Date: Mon, 19 Jan 2004 16:38:55 +0100
From: "Edo Plantinga" <a.e.d.plantinga@med.rug.nl>
Subject: [BioC] Blurriness assesment of scanner TIFF files
To: "Bioconductor" <bioconductor@stat.math.ethz.ch>
Message-ID: <002801c3dea2$59154420$a38f7d81@edo>
Content-Type: text/plain

Dear all,

At our department we have experienced some difficulties with our microarray scanner. I am looking for some software that can read in the raw TIFF files that come out of our scanner to assess how blurry the picture is (i.e. how sharp the edges are in the picture). I would also like to know which areas of the picture are the most blurry (we suspect a left-to-right effect).

Does anyone know of (R?) software that can do this?

Kind regards,
Edo Plantinga

------------------------------

Message: 4
Date: Tue, 20 Jan 2004 09:43:37 -0500
From: Naomi Altman <naomi@stat.psu.edu>
Subject: RE: [BioC] [maImage: Draw multiple spatial plots on the SAME graph]
To: "Foata,Francis,LAUSANNE,NRC/N&H" <francis.foata@rdls.nestle.com>
Cc: bioconductor@stat.math.ethz.ch
Message-ID: <6.0.0.22.2.20040120093920.01df5b40@stat.psu.edu>
Content-Type: text/plain; charset="us-ascii"

An HTML attachment was scrubbed...
URL: https://www.stat.math.ethz.ch/pipermail/bioconductor/attachments/20040120/8622c608/attachment-0001.html

------------------------------

Message: 5
Date: Tue, 20 Jan 2004 17:06:25 -0500
From: "Straubhaar, Juerg" <juerg.straubhaar@umassmed.edu>
Subject: [BioC] limma: out of bounds error
To: <bioconductor@stat.math.ethz.ch>
Message-ID: <1A42F1E1A1E73A4F8C6048789F34A32F2D8A99@edunivmail02.ad.umassmed.edu>
Content-Type: text/plain; charset="Windows-1252"

Dear Dr. Smyth,

I am analysing a series of two-colour microarray data sets with limma. The sets were downloaded from SMD (Stanford Microarray Database) and I read the data with the commands:

    targets <- readTargets('N20targets.txt')
    RG <- read.maimages(targets$FileName, source="smd", fill=TRUE,
                        wt.fun=function(x) {return(x$FLAG)})

After reading the gal file and layout I proceeded with the normalization:

    MA <- normalizeWithinArrays(RG)

This function terminated prematurely with an 'out of bounds' error. I found the error in the printtiploess code block of the normalizeWithinArrays function. The layout with 8 x 4 print grids, each containing 650 spots, provides for 20800 spots. The chips I am using have 20736 spots. I added a small amount of code to your normalizeWithinArrays() which eliminated the error. The code I added follows the "# modified" comment:

    printtiploess = {
        if(is.null(layout)) stop("Layout argument not specified")
        ngr <- layout$ngrid.r
        ngc <- layout$ngrid.c
        nspots <- layout$nspot.r * layout$nspot.c
        for (j in 1:narrays) {
            spots <- 1:nspots
            for (gridr in 1:ngr)
            for (gridc in 1:ngc) {
                # modified: SMD data files smaller than ngr * ngc * spots!
                if(spots[nspots] > nrow(object$M)) {
                    index <- spots[1]
                    spots <- index:nrow(object$M)
                }
                y <- object$M[spots,j]
                x <- object$A[spots,j]
                w <- weights[spots,j]
                object$M[spots,j] <- loessFit(y,x,w,span=span,iterations=iterations)$residuals
                spots <- spots + nspots
            }
        }
    },

I am using limma version limma_1.3.13.

Kind regards,
Juerg Straubhaar, PhD
Umass Med

------------------------------

Message: 6
Date: Tue, 20 Jan 2004 14:26:53 -0800
From: "Jim Breaux" <jim.breaux@vialogy.com>
Subject: [BioC] Trouble installing 'makecdfenv'
To: <bioconductor@stat.math.ethz.ch>
Message-ID: <000001c3dfa4$81f3b950$2612d240@ViaChange.com>
Content-Type: text/plain

I am having trouble installing the latest release of 'makecdfenv,' and I am hoping someone can help me out.

When I tried to install the source package, I got the following errors after calling "Rcmd INSTALL makecdfenv_1.4.1.tar.gz":

    --------- Making package makecdfenv ------------
      **********************************************
      WARNING: this package has a configure script
            It probably needs manual configuration
      **********************************************
      adding build stamp to DESCRIPTION
      making DLL ...
    making read_cdffile.d from read_cdffile.c
    read_cdffile.c:52:19: warning: zlib.h: No such file or directory
    gcc -DHAVE_ZLIB=1 -IM:/PROGRA~1/R/rw1081/src/include -Wall -O2 -c read_cdffile.c -o read_cdffile.o
    read_cdffile.c:52:19: zlib.h: No such file or directory
    read_cdffile.c: In function `openCELfile':
    read_cdffile.c:581: warning: implicit declaration of function `gzopen'
    read_cdffile.c:581: warning: assignment makes pointer from integer without a cast
    read_cdffile.c:586: warning: implicit declaration of function `gzgets'
    read_cdffile.c:589: warning: implicit declaration of function `gzrewind'
    read_cdffile.c: In function `openCDFfile':
    read_cdffile.c:623: warning: assignment makes pointer from integer without a cast
    read_cdffile.c: In function `close_affy_file':
    read_cdffile.c:663: warning: implicit declaration of function `gzclose'
    read_cdffile.c: In function `readline_affy_file':
    read_cdffile.c:679: warning: assignment makes pointer from integer without a cast
    read_cdffile.c: In function `readQC':
    read_cdffile.c:908: warning: unused variable `param_unit'
    make[2]: *** [read_cdffile.o] Error 1
    make[1]: *** [srcDynlib] Error 2
    make: *** [pkg-makecdfenv] Error 2
    *** Installation of makecdfenv failed ***

Where is the file 'zlib.h' supposed to come from (it doesn't appear to be in 'makecdfenv_1.4.1.tar.gz')?

Does anyone have any suggestions that might help me get this to work?

Thanks in advance for your help,
Jim

------------------------------

Message: 7
Date: Tue, 20 Jan 2004 17:47:04 -0500
From: YUK FAI LEUNG <yfleung@mcb.harvard.edu>
Subject: [BioC] Proper pooling design
To: bioconductor@stat.math.ethz.ch
Message-ID: <400DAFE8.5030709@mcb.harvard.edu>
Content-Type: text/plain; charset=us-ascii; format=flowed

Hi there,

I am designing a pilot microarray study on an embryonic developmental mutant using the Affy platform. The comparison itself is very simple: the mutant vs normal at one time point. Due to various reasons (mostly funding and the limited amount of tissue), I can't start with the "ideal" approach in which each sample is hybridized to an individual chip. Since I can easily rear a lot of animals, it seems that pooling is the only choice for the pilot study.

However, I am not sure what is the best way to allocate the pooled samples to each chip. For example, suppose I want to do 3 array replicates each for the mutant and the control. Is it better to pool enough samples for 3 arrays and then split the pooled sample into 3 portions for hybridization, or to pool different individual samples for the different replicates? It seems to me that the first way is like getting a group expression average with an assessment of technical variation, while the second approach can also provide some sort of evaluation of biological variation, albeit one averaged by the pooling. I suspect the latter approach is better, and would love to know your suggestions.

Thanks!

Fai

--
Yuk Fai Leung
Department of Molecular and Cellular Biology
Harvard University
BL 2079, 16 Divinity Avenue
Cambridge, MA 02138
Tel: 617-495-2599
Fax: 617-496-3321
email: yfleung@mcb.harvard.edu; yfleung@genomicshome.com
URL: http://genomicshome.com

------------------------------

_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

End of Bioconductor Digest, Vol 11, Issue 27
********************************************

This e-mail is from ArraGen Ltd.
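
To make the pooling argument at the top of this reply concrete, here is a minimal simulation sketch in base R for a single gene. It is only an illustration under assumed numbers: the standard deviations `sd_bio` and `sd_tech`, the pool size `n_pool`, and the helper names `design_a` and `design_b` are all hypothetical and do not come from the thread. Design A splits one pool across three chips; design B makes three independent pools, one per chip.

```r
# Minimal simulation of the two pooling designs for a single gene.
# All variance values below are assumed, purely for illustration.
set.seed(1)

sd_bio  <- 0.5   # assumed SD between individual animals (log2 scale)
sd_tech <- 0.1   # assumed SD between chips hybridized with the same RNA
n_pool  <- 10    # assumed number of animals per pool
n_chips <- 3     # arrays per group, as in the original question

# Design A: one pool, split across three chips.
design_a <- function() {
  pool <- mean(rnorm(n_pool, mean = 0, sd = sd_bio))
  pool + rnorm(n_chips, mean = 0, sd = sd_tech)
}

# Design B: three independent pools, one chip each.
design_b <- function() {
  pools <- replicate(n_chips, mean(rnorm(n_pool, mean = 0, sd = sd_bio)))
  pools + rnorm(n_chips, mean = 0, sd = sd_tech)
}

# Between-chip variance observed in many simulated experiments.
n_sim <- 2000
var_a <- replicate(n_sim, var(design_a()))
var_b <- replicate(n_sim, var(design_b()))

# Theoretical between-chip variance under each design.
expected_a <- sd_tech^2
expected_b <- sd_tech^2 + sd_bio^2 / n_pool

cat("Design A: simulated", mean(var_a), "expected", expected_a, "\n")
cat("Design B: simulated", mean(var_b), "expected", expected_b, "\n")
```

Under these assumptions the chips in design A only see technical variance, whereas design B also carries the biological variance divided by the pool size, which is the "averaged" biological variation mentioned in the original question.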

Microarray Normalization affy makecdfenv limma marray • 2.4k views

ADD COMMENT • link written 22.0 years ago by Stephen Moore ▴ 70

YUK FAI LEUNG ▴ 140

@yuk-fai-leung-605

Last seen 11.4 years ago

Hi,

There is another issue that came up from my discussion with some people about my earlier question on pooling design. It is very likely that I can't collect all the samples in one day, not even the pooled biological replicates. I have heard many people say that the time of collection (i.e. the day) has a great effect on the resulting data analysis, such as clustering. I guess this might be true in some cases, although I haven't looked into this factor seriously. I wonder if anyone has any experience with this.

If this effect is real in my case, then the variation among the biological replicates might be larger than it should be, which I want to avoid. Could someone suggest a sequence of sample collection, based on a proper statistical design, that can avoid or minimize the effect of such collection-time variation on the data analysis?

Thanks!

Best regards,
Fai

--
Yuk Fai Leung
Department of Molecular and Cellular Biology
Harvard University
BL 2079, 16 Divinity Avenue
Cambridge, MA 02138
Tel: 617-495-2599
Fax: 617-496-3321
email: yfleung@mcb.harvard.edu; yfleung@genomicshome.com
URL: http://genomicshome.com
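
The concern above, that a collection-day effect would inflate the apparent variation among biological replicates, can be sketched with a small base R simulation. The numbers are assumptions chosen for illustration only (`sd_bio` and `sd_day` are hypothetical, as are the names `same_day` and `diff_days`); the sketch compares the between-replicate variance when three replicates share one collection day with the variance when each replicate is collected on a different day.

```r
# Sketch: how a collection-day effect inflates between-replicate variance.
# The standard deviations below are assumed, not measured.
set.seed(2)

sd_bio <- 0.2    # assumed biological SD between replicate pools
sd_day <- 0.4    # assumed SD of the collection-day effect
n_rep  <- 3      # replicates per group
n_sim  <- 2000   # simulated experiments

# All replicates collected on the same day: one shared day offset,
# which cancels out of the between-replicate variance.
same_day <- replicate(n_sim, {
  day <- rnorm(1, mean = 0, sd = sd_day)
  var(day + rnorm(n_rep, mean = 0, sd = sd_bio))
})

# Each replicate collected on a different day: one offset per replicate,
# which adds to the between-replicate variance.
diff_days <- replicate(n_sim, {
  day <- rnorm(n_rep, mean = 0, sd = sd_day)
  var(day + rnorm(n_rep, mean = 0, sd = sd_bio))
})

cat("Same day:       ", mean(same_day),
    "(theory:", sd_bio^2, ")\n")
cat("Different days: ", mean(diff_days),
    "(theory:", sd_bio^2 + sd_day^2, ")\n")
```

Whether the day effect is actually this large in a given experiment is an empirical question; the sketch only shows the mechanism.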

ADD COMMENT • link 22.0 years ago YUK FAI LEUNG ▴ 140

What is the purpose of your experiment? If it is simply comparing several groups then, speaking in the language of experimental design, you should treat each sample collection period (i.e. each day) as a block and use a randomized block design. The blocking factor will also be included in your data analysis (one more factor in your ANOVA, making it a two-way ANOVA). Here is an introduction on the NIST website:

http://www.itl.nist.gov/div898/handbook/pri/section3/pri332.htm

If your experiment involves more than one treatment factor, you will need a more complicated design. In that case, please email me directly.

Kenny

Kenny Ye
Assistant Professor
Department of Applied Math and Statistics
SUNY at Stony Brook
Stony Brook, New York 11794-3600
Phone (631)632-9344
Fax (631)632-8490

On Fri, 23 Jan 2004, YUK FAI LEUNG wrote:

> Hi,
>
> There is another issue that comes up from my discussion with some people
> about my earlier question on pooling design. It is very likely that I
> can't collect all the samples in one day, not even the pooled biological
> replicates. I have heard many people say that the time of collection
> (i.e. day) has a great effect on resulting clustering etc. data
> analysis. I guess this might be true in some cases, although I haven't
> looked into this factor seriously. I wonder if anyone has any experience
> with this.
>
> If this effect is true in my case, then the variation among the
> biological replicates might be larger than it should be, which I want to
> avoid. Could someone suggest a sequence of sample collection based on a
> proper statistical design which can avoid/minimize such collection-time
> variation on data analysis?
>
> Thanks!
>
> Best regards,
> Fai
> --
> Yuk Fai Leung
> Department of Molecular and Cellular Biology
> Harvard University
> BL 2079, 16 Divinity Avenue
> Cambridge, MA 02138
> Tel: 617-495-2599
> Fax: 617-496-3321
> email: yfleung@mcb.harvard.edu; yfleung@genomicshome.com
> URL: http://genomicshome.com
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
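
The randomized block design and two-way ANOVA described at the top of this reply can be sketched in base R as follows. This is a hedged illustration for one gene, not a recipe for a full microarray analysis: the three collection days, the effect sizes in `day_effect` and `group_effect`, and the names `design_tab`, `fit_blocked` and `fit_unblocked` are all assumptions introduced for the example.

```r
# Sketch of a randomized block design: each collection day is a block
# that contains one control pool and one mutant pool.
set.seed(3)

days   <- paste0("day", 1:3)
groups <- c("control", "mutant")

# Build the layout; within each day the processing order is randomized.
design_tab <- do.call(rbind, lapply(days, function(d) {
  data.frame(day = d,
             group = sample(groups),
             run_order = seq_along(groups))
}))
design_tab$day   <- factor(design_tab$day, levels = days)
design_tab$group <- factor(design_tab$group, levels = groups)

# Simulated log2 expression for one gene (all effect sizes are assumed).
day_effect   <- c(day1 = 0.0, day2 = 0.6, day3 = -0.4)
group_effect <- c(control = 0.0, mutant = 1.0)
design_tab$expr <- 8 +
  unname(day_effect[as.character(design_tab$day)]) +
  unname(group_effect[as.character(design_tab$group)]) +
  rnorm(nrow(design_tab), mean = 0, sd = 0.1)

# Two-way ANOVA without interaction: day enters as the blocking factor.
fit_blocked <- aov(expr ~ day + group, data = design_tab)

# For comparison, the same data analysed while ignoring the blocks.
fit_unblocked <- aov(expr ~ group, data = design_tab)

print(table(design_tab$day, design_tab$group))
print(summary(fit_blocked))
print(summary(fit_unblocked))
```

Because every day contains both groups, the day-to-day differences are absorbed by the `day` term instead of being left in the residual variation against which the group difference is judged.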