Question

comparing different batches of data directly


Sabine Reichelt ▴ 10

@sabine-reichelt-1968

Last seen 10.3 years ago

Hi!

What would be the most appropriate approach if I want to compare gene expression data from different laboratories (and different biological sources) directly? Assuming the data were profiled on the same chip, of course. What kind of normalization (in batches? all together?) and subsequent processing would be "least harmful"?

Thanks for any answers!

Sabine

Normalization • 1.0k views

updated 18.1 years ago by knaxerov@ix.urz.uni-heidelberg.de ▴ 50 • written 18.1 years ago by Sabine Reichelt ▴ 10

score 0 · Answer 1 · 2006-12-08

Hi Sabine,

This depends on what you mean by comparing things 'directly'. If you mean that you have some controls from lab 1 and some experimentals from lab 2 that you want to compare, then it doesn't really matter what you do, because you won't be able to control for the 'lab' effect: lab and sample type are completely confounded. In other words, you won't ever be able to determine whether a given change is due to biological differences or simply to technical variability from being run in different labs.

On the other hand, if you have microarray data for both sample types from both labs (i.e., control and experimental samples from lab 1, and control and experimental samples from lab 2), then you would want to normalize the data from each lab as a separate batch and then compare using a mixed model. The GeneMeta package in the devel repository is designed to do this sort of thing. Alternatively, you could use something like lme() in the nlme package on a row-wise basis (this would be slow, however).

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
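To make the confounding argument in the answer above concrete, here is a minimal sketch in plain Python. It is not GeneMeta and not an lme() fit: it only averages the within-lab experimental-minus-control difference for a single gene, which is a much simpler stand-in for the mixed model the answer recommends. The data values, the 1.5-unit lab shift, and all function names are invented for illustration.

```python
from statistics import mean

# Hypothetical log2 expression values for ONE gene: (lab, group, value).
# Assumption for illustration: lab 2 reads about 1.5 units higher than lab 1
# for purely technical reasons, and the true effect is 1.0 in both labs.
balanced = [
    ("lab1", "control", 7.0), ("lab1", "control", 7.2),
    ("lab1", "experimental", 8.0), ("lab1", "experimental", 8.2),
    ("lab2", "control", 8.5), ("lab2", "control", 8.7),
    ("lab2", "experimental", 9.5), ("lab2", "experimental", 9.7),
]

# Confounded design: controls only from lab 1, experimentals only from lab 2.
confounded = [s for s in balanced
              if (s[0], s[1]) in {("lab1", "control"), ("lab2", "experimental")}]


def values_by_lab_and_group(samples):
    """Nest the values as {lab: {group: [values]}}."""
    nested = {}
    for lab, group, value in samples:
        nested.setdefault(lab, {}).setdefault(group, []).append(value)
    return nested


def is_confounded(samples):
    """True if no single lab ran more than one group."""
    nested = values_by_lab_and_group(samples)
    return all(len(groups) < 2 for groups in nested.values())


def naive_difference(samples, case="experimental", control="control"):
    """Pool all labs and ignore the lab label entirely."""
    case_values = [v for _, g, v in samples if g == case]
    control_values = [v for _, g, v in samples if g == control]
    return mean(case_values) - mean(control_values)


def within_lab_effect(samples, case="experimental", control="control"):
    """Average the case-minus-control difference over labs that ran both groups.

    A constant per-lab shift cancels inside each lab, so it does not bias
    the estimate. Raises ValueError when no lab ran both groups.
    """
    nested = values_by_lab_and_group(samples)
    diffs = [mean(groups[case]) - mean(groups[control])
             for groups in nested.values()
             if case in groups and control in groups]
    if not diffs:
        raise ValueError("lab and group are completely confounded")
    return mean(diffs)


if __name__ == "__main__":
    print("confounded design, naive difference:", naive_difference(confounded))
    print("balanced design, within-lab effect:", within_lab_effect(balanced))
```

In the confounded subset the naive difference absorbs the lab shift on top of the biological effect, and nothing in the data can pull the two apart. In the balanced design the shift cancels within each lab. A real analysis would also need per-gene variance estimates and a proper model for the lab term, which is what the mixed-model tools named in the answer provide.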

score 0 · Answer 2 · 2006-12-08

I am struggling with a similar question. I would like to include cancer profiles from different studies in a principal components analysis. Jim, what would you suggest in this case, when I am not interested in differential gene expression but in a global comparison?

Thanks!

Kamila