Data sets conducted in different labs
Fangxin Hong ▴ 810
@fangxin-hong-912
Last seen 10.2 years ago
Hi there,

I am sorry if my question doesn't qualify for the BioC mailing list.

Have you met the situation where two labs carried out the same or a similar experiment but came out with quite different results in terms of the differentially expressed genes identified? Has anyone studied this problem? Any references or observations?

The usual way is to identify genes from each lab's data separately and then compare the results. What about fitting one model to the combined data from the two labs, with lab as one potential factor? In that case, how should the pre-processing be done: normalize all the data together, or each lab's data separately? Any recommendations?

What I observed is a clear systematic difference between the data from the two labs. After I normalize all the data together (I used rma), you can still tell which lab each array came from, and the model test (limma) shows that the lab factor is significant for about 50% of the genes. My question is: in this case (normalizing all data together), should I include lab as a factor? It seems the normalization procedure can't cancel lab effects.

But if I normalize the two labs' data separately, they will have different variation. Even with a lab factor, I can't fit the two labs' data in one model.

Any comments or suggestions will be appreciated.

Bests,
Fangxin
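The observation that arrays stay distinguishable by lab after joint normalization can be reproduced on toy data. The sketch below is not rma itself; it implements only a plain quantile normalization (the normalization step rma uses) in standard-library Python, on simulated values. The gene count, the effect sizes, and the helper names `quantile_normalize` and `mean_abs_lab_gap` are all assumptions made for illustration.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def quantile_normalize(arrays):
    """Give every array the same empirical distribution (toy version)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda g: a[g])  # gene indices by rank
        out = [0.0] * n
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        result.append(out)
    return result

def mean_abs_lab_gap(data, labs):
    """Average over genes of |mean(lab B) - mean(lab A)|."""
    n = len(data[0])
    gaps = []
    for g in range(n):
        a = mean(data[i][g] for i in range(len(data)) if labs[i] == "A")
        b = mean(data[i][g] for i in range(len(data)) if labs[i] == "B")
        gaps.append(abs(b - a))
    return mean(gaps)

# Simulated data (assumption): lab B adds a global shift of 1.0 plus a
# gene-specific offset to every gene; both labs measure the same biology.
random.seed(0)
n_genes = 200
base = [random.gauss(8, 2) for _ in range(n_genes)]
gene_lab_offset = [random.gauss(0, 1) for _ in range(n_genes)]
labs = ["A", "A", "A", "B", "B", "B"]
arrays = [
    [base[g] + (1.0 + gene_lab_offset[g] if lab == "B" else 0.0)
     + random.gauss(0, 0.1) for g in range(n_genes)]
    for lab in labs
]

normalized = quantile_normalize(arrays)

# Every normalized array now has exactly the same set of values...
same_distribution = all(sorted(a) == sorted(normalized[0]) for a in normalized)
# ...but the per-gene gap between labs has not gone away.
gap_before = mean_abs_lab_gap(arrays, labs)
gap_after = mean_abs_lab_gap(normalized, labs)
print(same_distribution, round(gap_before, 3), round(gap_after, 3))
```

Normalization of this kind aligns the overall distribution of each array, so it can absorb a global shift or scale difference between labs, but a lab effect that differs from gene to gene survives it. Under these assumptions, a significant lab term for many genes after joint normalization is what one would expect.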
@donghuitsaucsfedu-91
Last seen 10.2 years ago
Since there is a clear difference between the data from the two labs, I would explore these issues first: Did the two labs use the same type of scanner, the same reagents, and the same amount of starting RNA? Is the RNA quality from the two labs comparable, etc.?

Donglei Hu

(In reply to the message from "Fangxin Hong" of Tue, 19 Oct 2004 13:55:46 -0700 (PDT).)
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 3.6 years ago
United States
I would probably normalize together and include "lab" as a factor in the experiment.

Alternatively, there is a nice result by Steve Marron at UNC about how to combine datasets which would probably work quite well in this situation. You might have a look at his web page.

--Naomi

Naomi S. Altman
Associate Professor, Dept. of Statistics
Bioinformatics Consulting Center
Penn State University
University Park, PA 16802-2111
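As a rough illustration of the suggestion to include "lab" as a factor, the sketch below fits an ordinary least-squares model to one simulated gene, once with and once without a lab term. It is plain standard-library Python rather than limma, and it does none of limma's variance moderation. The balanced 2 x 2 design, the effect sizes, and the helper names `solve`, `fit_ols`, and `rss` are assumptions for illustration only.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(design, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def rss(design, y, beta):
    """Residual sum of squares of the fitted model."""
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(design, y))

# Hypothetical balanced design: 2 labs x 2 treatments x 2 replicates.
lab = [0, 0, 0, 0, 1, 1, 1, 1]
trt = [0, 0, 1, 1, 0, 0, 1, 1]

# One simulated gene (assumption): lab offset 1.5, treatment effect 1.0.
random.seed(1)
y = [7.0 + 1.5 * l + 1.0 * t + random.gauss(0, 0.2) for l, t in zip(lab, trt)]

full = [[1.0, float(l), float(t)] for l, t in zip(lab, trt)]  # intercept, lab, treatment
reduced = [[1.0, float(t)] for t in trt]                      # intercept, treatment

beta_full = fit_ols(full, y)
beta_reduced = fit_ols(reduced, y)
rss_full = rss(full, y, beta_full)
rss_reduced = rss(reduced, y, beta_reduced)

print("treatment effect, lab in model:", round(beta_full[2], 3))
print("residual SS with lab term:", round(rss_full, 3))
print("residual SS without lab term:", round(rss_reduced, 3))
```

In this toy setting the lab term soaks up the between-lab offset, so the residual variance used to judge the treatment effect is not inflated by it. In limma the analogous step would be to put both lab and treatment into the design matrix before fitting, though the details depend on the actual experimental layout.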