Re: normalisation or analysis with batch effects


Darlene Goldstein ▴ 230


Hi,

I just wanted to mention that even if you do normalize all the chips together, you are still likely to see the 'batch' (or 'block') effects. To assess the extent of the problem, you might cluster the samples and see whether samples from the same batch cluster together.

Best regards,
Darlene

-------------------

Hello,

are the 11 tumour samples considered biological replicates, or are they split into different tumour classes?

We've had a similar problem. Our data were generated in three different laboratories, each with slightly different protocols, but within each lab we had the same factors (the same doses of a drug). I guess that if the tumours are considered replicates, one could include the batch as a factor (as you suggest below), but if they contain different tumour classes, one could not separate the DMSO effect from the tumour class effect. The tissue samples (normal and tumour) are probably from different subjects and will show strong differences per se.

Maybe one could get some estimate of the impact of the batch by using a mixed-effects model with each sample as a random effect and the batch as a fixed effect, something like

    lme(response ~ batch, data = d, random = ~ 1 | sample)

I'm not sure about this, it's just an idea ...

Anyway, I'd pre-process (normalize) all samples together, otherwise there'll certainly be a batch effect.

Kind regards,
Arne

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
> Adaikalavan Ramasamy
> Sent: 30 November 2004 23:51
> To: BioConductor mailing list
> Cc: Andrea Pellagatti
> Subject: [BioC] normalisation or analysis with batch effects
>
> Dear list,
>
> If the following question has been asked before, I do apologise in
> advance and hope someone can point to the relevant thread. Otherwise I
> would appreciate some thoughts and pointers to this problem. Thank you.
>
> Problem : My collaborator (cc-ed here) has performed hybridisation for
> 11 tumour and 40 normal samples on Affymetrix HGU-133Av2 (contains ~55k
> probesets) chips. He had hybridised about half of the samples when he
> realised he needed more Affymetrix chips.
>
> The second batch of chips arrived with the instruction to add DMSO in
> the hybridisation cocktail, which he followed. The first batch did not
> have such instruction. Therefore we believe that the two batches are not
> directly comparable. A posting to GeneArray mailing list had a reply
> (http://bfx.kribb.re.kr/gene-array/1255.html) supporting this view. A
> cross-table of batch and sample is given below :
>
>                         | normal  tumour  total
>  batch 1 (with DMSO)    |   17       6      23
>  batch 2 (without DMSO) |   23       5      28
>  -----------------------|----------------------
>  total                  |   40      11      51
>
> Therefore I have considered the following possible solutions :
>
> 1) Preprocess all arrays and compare tumour vs. normal
>
> 2) Preprocess the two batches separately and cbind() them. Then compare
>    tumour vs. normal
>
> 3) Preprocess all arrays but include a batch effect in analysis (I am
>    not sure how to do this - perhaps using LIMMA)
>
> 4) Preprocess separately and proceed as 3)
>
> Here, I use RMA to preprocess the arrays. I have done 1) and 2) and the
> correlation of the two gene lists, as assessed by correlation of gene
> ranks, is only 0.35. I think 4) is a bit of overkill.
>
> Any opinions or alternative suggestions are very welcome. Thank you.
>
> Regards,
> --
> Adaikalavan Ramasamy                  ramasamy at cancer.org.uk
> Centre for Statistics in Medicine     http://www.ihs.ox.ac.uk/csm/
> Cancer Research UK                    Tel : 01865 226 677
> Old Road Campus, Headington, Oxford   Fax : 01865 226 962
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor

--
Darlene Goldstein
Institute of Mathematics, EPFL   Tel: +41 21 693 2552
CH-1015 Lausanne                 Fax: +41 21 693 4303
SWITZERLAND
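
The clustering check suggested at the top of this message could be sketched roughly as follows. This is only an illustration on simulated data, not the poster's analysis: the matrix `expr`, the number of genes, and the size of the artificial batch shift are all made-up assumptions, and only the batch sizes (23 and 28) come from the cross-table in the quoted message. With real data, `expr` would be the matrix of preprocessed (e.g. RMA) expression values.

```r
set.seed(1)

# Batch labels; sizes taken from the cross-table in the quoted post
batch <- factor(rep(c("batch1", "batch2"), times = c(23, 28)))

# Simulated log-expression matrix (genes x samples); NOT real data
n_genes <- 500
expr <- matrix(rnorm(n_genes * length(batch)), nrow = n_genes)
colnames(expr) <- paste0("s", seq_along(batch))

# Artificial batch effect (an assumption for illustration):
# shift the first 200 genes upwards in batch 2
expr[1:200, batch == "batch2"] <- expr[1:200, batch == "batch2"] + 2

# Centre each gene so that sample-to-sample correlations reflect
# deviations from the gene-wise mean
expr_c <- expr - rowMeans(expr)

# Hierarchical clustering of samples on 1 - Pearson correlation
d  <- as.dist(1 - cor(expr_c))
hc <- hclust(d, method = "average")

# Cut the tree into two groups and compare them with the batch labels
clusters <- cutree(hc, k = 2)
tab <- table(batch, cluster = clusters)
print(tab)
```

If the two clusters line up with the batch labels rather than with tumour vs. normal, that suggests the batch effect dominates the biological signal. `plot(hc, labels = as.character(batch))` shows the same thing as a dendrogram.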


James W. MacDonald 66k


>The second batch of chips arrived with the instruction to add DMSO in
>the hybridisation cocktail, which he followed. The first batch did not
>have such instruction. Therefore we believe that the two batches are not
>directly comparable.

There is a much larger difference between these protocols than simply adding DMSO. If I am not mistaken, the difference here is that the old samples were processed using the Enzo IVT kit, and the new samples were processed using the Affy IVT kit.

We have found that these data cannot be processed together using e.g. RMA, because a large portion of the probesets have completely different patterns. In addition, the distribution of PM probes is quite different for the two protocols, so I don't think a quantile normalization is appropriate. You can check this by fitting the RMA model using rmaPLM() in the affyPLM package, and then checking the residual plots.

We have shied away from combining chips that were processed using the two IVT kits, but if you have to do so, I would recommend processing each group separately and then fitting a model with a batch effect.

Best,
Jim

--
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
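
As a rough illustration of "fitting a model with a batch effect", here is a minimal base-R sketch for a single hypothetical probeset. The expression values, the tumour effect of 1 and the batch shift of 2 are simulated assumptions; only the sample layout comes from the cross-table in the original post. For a full array one would more likely fit all probesets at once, for example with limma's `lmFit()` on a design built by `model.matrix(~ group + batch)`, but that is not shown here.

```r
set.seed(42)

# Sample layout from the cross-table in the original post:
# batch 1 = 17 normal + 6 tumour, batch 2 = 23 normal + 5 tumour
group <- factor(rep(c("normal", "tumour", "normal", "tumour"),
                    times = c(17, 6, 23, 5)),
                levels = c("normal", "tumour"))
batch <- factor(rep(c("batch1", "batch2"), times = c(23, 28)))

# Simulated log2 expression for one hypothetical probeset;
# tumour effect (1) and batch shift (2) are made-up values
y <- 8 + 1 * (group == "tumour") + 2 * (batch == "batch2") +
  rnorm(length(group), sd = 0.5)

fit_naive <- lm(y ~ group)          # ignores the batch
fit_batch <- lm(y ~ group + batch)  # batch as an additive fixed effect

# Tumour vs. normal estimate, standard error, t and p under each model
print(coef(summary(fit_naive))["grouptumour", ])
print(coef(summary(fit_batch))["grouptumour", ])
```

Because both batches contain tumour and normal samples, the batch term is estimable and absorbs the batch shift, which otherwise ends up in the residual variance of the naive model. An additive term only corrects a per-probeset shift; it would not address probesets that behave differently between the two protocols in other ways.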
