Data sets from experiments conducted in different labs


Fangxin Hong ▴ 810

@fangxin-hong-912

Last seen 11.4 years ago

Hi there,

I am sorry if my question doesn't qualify for the BioC mailing list.

Have you encountered the situation where two labs carried out the same or a similar experiment but came up with quite different results in terms of the differentially expressed genes identified? Has anyone studied this problem? Any references or observations?

The usual way is to identify genes from each lab's data separately and then compare the results. What about fitting one model to the combined data from both labs, with lab as a potential factor? In that case, how should the pre-processing be done: normalize all the data together, or each lab's data separately? Any recommendations?

What I observed is a clear systematic difference between the data from the two labs. After I normalize all the data together (I used rma), you can still tell which lab each array came from, and the model test (limma) shows that the lab factor is significant for about 50% of genes. My question is: in this case (normalizing all data together), should I include lab as a factor? It seems the normalization procedure can't cancel lab effects.

But if I normalize the two labs' data separately, they will have different variation. Even with a lab factor, I then can't put the two labs' data into one model.

Any comments or suggestions will be appreciated.

Best,
Fangxin
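The observation that joint normalization does not cancel lab effects can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses plain quantile normalization (one step of RMA, not the full procedure), the data are simulated, and the lab effect (a global shift plus a gene-specific offset) is hypothetical rather than taken from the poster's arrays.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns (arrays) of a genes-by-arrays matrix."""
    order = np.argsort(x, axis=0)                # per-array ordering of genes
    ranks = np.argsort(order, axis=0)            # rank of each gene within its array
    reference = np.sort(x, axis=0).mean(axis=1)  # mean of the sorted columns
    return reference[ranks]                      # map ranks back to reference values

rng = np.random.default_rng(0)
n_genes, n_per_lab = 1000, 4
lab = np.array([0] * n_per_lab + [1] * n_per_lab)  # lab label for each array

# Simulated log-expression: one baseline per gene plus small array noise.
base = rng.normal(8.0, 1.0, size=(n_genes, 1))
data = base + rng.normal(0.0, 0.1, size=(n_genes, 2 * n_per_lab))

# Hypothetical lab effect: a global shift plus a gene-specific offset in the second lab.
gene_offset = rng.normal(0.0, 0.5, size=(n_genes, 1))
data[:, lab == 1] += 1.0 + gene_offset

normalized = quantile_normalize(data)

# After normalization every array has the same empirical distribution ...
sorted_cols = np.sort(normalized, axis=0)
same_distribution = np.allclose(sorted_cols, sorted_cols[:, [0]])

# ... but the per-gene difference between the labs is still present.
lab_diff = normalized[:, lab == 1].mean(axis=1) - normalized[:, lab == 0].mean(axis=1)
mean_abs_lab_diff = float(np.abs(lab_diff).mean())
print(same_distribution, mean_abs_lab_diff)
```

In this simulation the global shift is removed, because all arrays end up with identical distributions, while the gene-specific part of the lab effect survives. That is consistent with seeing a significant lab factor for many genes after normalizing everything together.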

Normalization • 1.1k views

updated 21.2 years ago by Naomi Altman ★ 6.0k • written 21.3 years ago by Fangxin Hong ▴ 810


donghu@itsa.ucsf.edu ▴ 70

@donghuitsaucsfedu-91

Last seen 11.4 years ago

Since there is a clear difference between the data from the two labs, I would explore these issues first: Did the two labs use the same type of scanner, the same reagents, and the same amount of starting RNA? Is the RNA quality from the two labs comparable?

Donglei Hu

(In reply to Fangxin Hong's message of Tue, 19 Oct 2004 13:55:46 -0700 (PDT).)

21.3 years ago • donghu@itsa.ucsf.edu ▴ 70


Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 4.8 years ago

United States

I would probably normalize together and include "lab" as a factor in the experiment. Alternatively, there is a nice result by Steve Marron at UNC about how to combine datasets, which would probably work quite well in this situation. You might have a look at his web page.

--Naomi

Naomi S. Altman
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics, Penn State University
University Park, PA 16802-2111
814-865-3791 (voice), 814-863-7114 (fax), 814-865-1348 (Statistics)
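As a rough illustration of including "lab" as a factor, the sketch below fits a per-gene linear model with an intercept, a treatment term, and a lab term. It rests on several assumptions: the data are simulated, the balanced design (three control and three treated arrays per lab) is hypothetical, and the fit is ordinary least squares via NumPy rather than limma's moderated statistics. In R, the corresponding design matrix would typically be built with `model.matrix(~ treatment + lab)` and passed to limma's `lmFit`.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes = 200

# Hypothetical balanced design: each lab runs 3 control and 3 treated arrays.
treatment = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1])
lab = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# Simulated log-expression: the first 20 genes respond to treatment,
# and every gene has its own offset in the second lab.
true_effect = np.zeros(n_genes)
true_effect[:20] = 1.0
lab_offset = rng.normal(0.0, 0.5, size=n_genes)
y = (8.0
     + np.outer(true_effect, treatment)
     + np.outer(lab_offset, lab)
     + rng.normal(0.0, 0.2, size=(n_genes, treatment.size)))

# Design matrix with columns: intercept, treatment, lab.
design = np.column_stack([np.ones(treatment.size), treatment, lab])

# Fit all genes at once; each column of y.T holds one gene's arrays.
coef, _, _, _ = np.linalg.lstsq(design, y.T, rcond=None)
treatment_hat = coef[1]  # estimated treatment effect per gene
lab_hat = coef[2]        # estimated lab effect per gene

print(treatment_hat[:20].mean(), np.corrcoef(lab_hat, lab_offset)[0, 1])
```

In a balanced design like this one, the lab term mainly keeps the lab offset out of the residual variance. In an unbalanced design it also protects the treatment estimate from being confounded with lab.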

21.2 years ago • Naomi Altman ★ 6.0k
