correlation between M values of replicate arrays
João Fadista ▴ 500
@joao-fadista-1942
Last seen 10.2 years ago
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070206/1842fd71/attachment.pl
Claus Mayer ▴ 340
@claus-mayer-1179
Last seen 10.2 years ago
European Union
Dear João!

Most normalisation methods assume that the majority of genes are not differentially expressed, i.e. that their expected M value is 0. If this assumption is correct, properly normalised data will show only weak correlation between M values from different arrays, so observing this in your normalised data is not necessarily a reason to worry.

There are different reasons why the unnormalised arrays might show higher correlations. The most obvious situation that comes to my mind is a reference design, i.e. you always have the control on dye 1 and the treatment sample on dye 2. The intensity-dependent dye bias (which you try to remove with loess normalisation) will then automatically lead to correlated M values. It is an unwanted correlation though, caused by a systematic bias, so the normalised data with less correlation are "better" in this case. There will be other scenarios where something like that happens, but without knowing details about your experiment it makes little sense to speculate about them.

Hope that helps

Claus

--
Dr Claus-D. Mayer
Biomathematics & Statistics Scotland, Rowett Research Institute
Aberdeen AB21 9SB, Scotland, UK
http://www.bioss.ac.uk

João Fadista wrote:
> Dear all,
>
> I have some questions that I would like to pose to this list.
>
> When I normalize microarray data (usually with the methods in the normalizeWithinArrays function in the limma package) I decrease the correlation between the M values of my replicate arrays. This obviously has an explanation, because if we normalize "within" arrays, the differences between them tend to become larger. Therefore, if the correlation between replicates decreases, it seems as if normalizing our data gives us "worse" data.
>
> Is this true? Does the same happen to you? And how do you deal with that?
>
> Med venlig hilsen / Regards
>
> João Fadista
> Ph.D. student
> University of Aarhus, Faculty of Agricultural Sciences
> Research Centre Foulum, Dept. of Genetics and Biotechnology
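Claus's argument can be checked with a small simulation. The sketch below is only an illustration under stated assumptions, not the limma procedure: the dye-bias curve, the noise level, and the helper names (`simulate_array`, `normalise_within_array`) are all hypothetical, and a crude binned-median correction stands in for loess. No gene is differentially expressed, so any correlation between the two replicate arrays comes purely from the shared intensity-dependent bias.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_array(rng, intensities, noise_sd=0.3):
    """M values for one array: shared dye bias plus independent noise.

    The bias curve is a made-up function of the average intensity A;
    the true log ratio of every gene is 0 (no differential expression).
    """
    return [
        1.5 * math.exp(-0.3 * (a - 6.0)) - 0.5 + rng.gauss(0.0, noise_sd)
        for a in intensities
    ]


def normalise_within_array(m_values, intensities, n_bins=20):
    """Subtract the median M within each intensity bin.

    This is a simple stand-in for loess, chosen to avoid dependencies.
    """
    lo, hi = min(intensities), max(intensities)
    width = (hi - lo) / n_bins
    bins = [min(int((a - lo) / width), n_bins - 1) for a in intensities]
    grouped = {}
    for b, m in zip(bins, m_values):
        grouped.setdefault(b, []).append(m)
    centre = {b: statistics.median(ms) for b, ms in grouped.items()}
    return [m - centre[b] for b, m in zip(bins, m_values)]


rng = random.Random(1)
# Same probes on both arrays, hence the same average intensities A.
A = [rng.uniform(6.0, 16.0) for _ in range(2000)]
raw1 = simulate_array(rng, A)
raw2 = simulate_array(rng, A)
norm1 = normalise_within_array(raw1, A)
norm2 = normalise_within_array(raw2, A)

r_raw = pearson(raw1, raw2)
r_norm = pearson(norm1, norm2)
print(f"correlation before normalisation: {r_raw:.2f}")
print(f"correlation after normalisation:  {r_norm:.2f}")
```

Under these assumptions the raw arrays should show a clearly positive correlation, and the corrected arrays a correlation near zero. The drop reflects the removal of a shared artefact and says nothing bad about the normalised data.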
Dear João,

just to add to Claus' excellent explanation: high correlation between replicates is NOT the same as good data quality. Otherwise, a normalisation method that replaced each probe's M value by a fixed number that depends only on the gene name would be the best (*). What you want is an improvement in MSE = bias² + variance. See e.g. also http://en.wikipedia.org/wiki/Mean_squared_error

(*) Btw, this is no joke: on oligonucleotide arrays the dependence of the background (unspecific) signal on GC content and other sequence features can have exactly this effect.

Best wishes
Wolfgang
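Wolfgang's thought experiment can be made concrete with a toy comparison. The following sketch rests on assumptions: the simulated truth (90% of genes unchanged), the noise level, and the per-gene constants are all invented for illustration. It compares an unbiased but noisy measurement with a degenerate "normalisation" that returns a fixed number per gene regardless of the data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mse(estimate, truth):
    """Mean squared error of the estimates against the true log ratios."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


rng = random.Random(7)
n_genes = 1000

# Hypothetical truth: about 90% of genes unchanged, the rest at log ratio +/-1.
truth = [0.0 if rng.random() < 0.9 else rng.choice([-1.0, 1.0])
         for _ in range(n_genes)]

# Honest measurement: unbiased, with independent noise on each replicate.
rep1 = [t + rng.gauss(0.0, 0.3) for t in truth]
rep2 = [t + rng.gauss(0.0, 0.3) for t in truth]

# Degenerate "normalisation": a fixed number per gene, unrelated to the
# truth and identical on every array (zero variance, large bias).
gene_constant = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
deg1 = list(gene_constant)
deg2 = list(gene_constant)

r_honest = pearson(rep1, rep2)
r_degenerate = pearson(deg1, deg2)
mse_honest = mse(rep1, truth)
mse_degenerate = mse(deg1, truth)

print(f"honest:     correlation {r_honest:.2f}, MSE {mse_honest:.2f}")
print(f"degenerate: correlation {r_degenerate:.2f}, MSE {mse_degenerate:.2f}")
```

The degenerate method reproduces itself perfectly across replicates, so its replicate correlation is essentially 1, yet its error is all bias and its MSE is far larger than that of the honest measurement. Replicate correlation alone therefore cannot rank the two.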
Thanks for both explanations on this subject. You guessed that my experiment was with a reference design! OK, now I am a bit more relieved and aware of what should be expected.

Best regards
João Fadista

-----Original Message-----
From: Wolfgang Huber
Sent: Wednesday, February 07, 2007 12:27 AM
To: Claus Mayer
Cc: João Fadista; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] correlation between M values of replicate arrays