M vs A plot
@richard-friedman-513
Last seen 7.7 years ago
Dear Wolfgang and Everybody,

I tried a span as high as 0.8, but it didn't cure the dip in the curve. I believe that a span higher than 0.8 is not recommended (I saw this in either the microarray or the loess literature; I am not sure which).

Thanks and best wishes,
Rich

On Jan 30, 2004, at 2:48 PM, Wolfgang Huber wrote:

> Hi Richard
>
> loess normalization has a parameter "span", which determines the
> so-called bandwidth of the smoothing window. Apparently the default is
> too small for you (leading to a too flexible regression curve), so you
> have to make it larger.
>
> There is a lot of literature on choosing bandwidths (most of it seems
> to involve some kind of cross-validation), but I am actually not aware
> of recommendations for microarrays other than "by eye".
>
> Best wishes
> Wolfgang
>
> --
> Wolfgang Huber
> Division of Molecular Genome Analysis
> German Cancer Research Center
> Heidelberg, Germany
> Phone: +49 6221 424709
> Fax: +49 6221 42524709
> Http: www.dkfz.de/abt0840/whuber
>
> Richard Friedman wrote:
>> Mick,
>>
>> Thanks for the help. What concerns me, however, is not a single
>> point being an outlier, but the whole loess fit to all the points
>> leading the lowess curve for a few print-tips to deviate significantly
>> from being a straight line practically collinear with the x-axis
>> (abscissa). The two test cases on which I learned to use marray - the
>> apoE data that comes with spot, and the swirl data that comes with
>> marray - both had significantly expressed genes; however, they also
>> had flat normalized lowess curves. Significant curvature in the lowess
>> curve leads me to be concerned that the spots associated with that
>> region of the curve are improperly normalized.
>>
>> Can anyone out there give me:
>>
>> 1. Guidelines as to how flat the lowess curve should be for the
>>    data to be considered normalized.
>>
>> 2. Advice as to what to do if the print-tip normalization option
>>    in marray did not remove intensity dependence.
>>
>> If anyone is willing to look at the M vs A curve, I would be grateful.
>>
>> Thanks and best wishes,
>> Rich
Microarray Normalization Regression Cancer marray • 922 views
@richard-friedman-513
Last seen 7.7 years ago
Dear Sean (Wolfgang, Naomi, and Everybody),

The original command that I used was

    > ira.norm <- maNorm(ira.raw, norm = "p")

The command that I used with the altered span is

    > ira.f8.norm <- maNormMain(ira.raw, f.loc = list(maNormLoess(x = "maA",
    +     y = "maM", z = "maPrintTip", w = NULL, subset = TRUE, span = 0.8)),
    +     Mloc = TRUE, Mscale = TRUE, echo = FALSE)

This command still gave pronounced curvature in the middle of one of the print-tip blocks and at the ends of several print-tip blocks. I did not use a span greater than 0.8 because that was contraindicated in either the microarray or the loess literature.

Thank you all for your suggestion of going to vsn. However, as this program is new to me, I ask whether anyone knows a rule of thumb as to how flat the print-tip loess line should be in order to be acceptable. I would prefer not to change horses unless necessary.

Thanks and best wishes,
Rich

On Jan 30, 2004, at 2:48 PM, Sean Davis wrote:

> Richard,
>
> The print-tip-loess lines should (I think) be straight and on the
> x-axis (y=0) after print-tip-normalization. If that isn't the case,
> perhaps you could post exactly the commands you used to do your
> normalization. That may help people determine better what is going on.
>
> In reference to ridding you of intensity-dependent variability,
> loess-normalization is designed to locally center the data but does
> not, in itself, deal with the variability that may be
> intensity-dependent. For that problem, you may need to look into
> something like vsn or other scaling method.
>
> Sean
>
> On 1/30/04 2:35 PM, "Richard Friedman"
> <friedman@cancercenter.columbia.edu> wrote:
>
>> Mick,
>> [...]
>>
>> On Fri, 30 Jan 2004, michael watson (IAH-C) wrote:
>>
>>> Richard
>>>
>>> The nature of any normalisation means that we will always have
>>> outliers - those spots that deviate from all the rest. There could
>>> be two reasons - that spot represents a differentially expressed
>>> gene or the spot is unreliable and comes from a "bad" spot.
>>>
>>> I'd take the common sense approach to these outliers:
>>>
>>> i) Check any replicate spots - if all replicate spots are outliers
>>> then you have evidence that it's a differentially expressed gene.
>>> However, if the replicates disagree, this is evidence that the
>>> outlier comes from an unreliable / bad measurement
>>>
>>> ii) Go take a look at the spot on the original image. Does it look
>>> "good"?
>>>
>>> You are likely always to find outliers after normalisation. This
>>> is, after all, what we are looking for, isn't it? The key is to be
>>> able to say, when you see an outlier, if that spot is of reliable
>>> quality or not.
>>>
>>> Thanks
>>> Mick
>>>
>>> -----Original Message-----
>>> From: Richard Friedman [mailto:friedman@cancercenter.columbia.edu]
>>> Sent: 29 January 2004 22:26
>>> To: 'Bioconductor Mail List'
>>> Cc: IRA A TABAS
>>> Subject: [BioC] M vs A plot
>>>
>>> Dear Bioconductors,
>>>
>>> I have normalized a series of arrays using print-tip normalization.
>>> Whereas the systematic error in the unnormalized data was
>>> pronounced, the systematic error on the normalized array was reduced
>>> greatly. The M vs. A curve was flat for most of the 48 print-tips.
>>> However, for a few print-tips, for A>12 M deviates from close to
>>> zero. In one case, M rises as high as M=1/2 at A=15. This only
>>> involves a small fraction of the spots (it is hard to estimate what
>>> proportion).
>>>
>>> Does this sound serious?
>>>
>>> If so, what should I do about it?
>>>
>>> Is anyone willing to look at the JPEG file (I did not attach it
>>> because I don't know if I am allowed to do so).
>>>
>>> Thanks and best wishes,
>>> Rich
>>> ------------------------------------------------------------
>>> Richard A. Friedman, PhD
>>> Associate Research Scientist
>>> Herbert Irving Comprehensive Cancer Center
>>> Oncoinformatics Core
>>> Lecturer
>>> Department of Biomedical Informatics
>>> Box 95, Room 130BB or P&S 1-420C
>>> Columbia University Medical Center
>>> 630 W. 168th St.
>>> New York, NY 10032
>>> (212)305-6901 (5-6901) (voice)
>>> friedman@cancercenter.columbia.edu
>>> http://cancercenter.columbia.edu/~friedman/
>>>
>>> "Spring, Summer, and Winter.
>>> Then Fall came along,
>>> and that's the end of our song,
>>> and the pigeons never hibernate at all".
>>> -Rose Friedman, age 7
>>> (These are the correct lyrics and supersede
>>> the version previously at the end of my sig)
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor@stat.math.ethz.ch
>>> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
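For readers who want a number to go with "by eye": the thread gives no threshold for how flat is flat enough, but the height of a loess curve fitted within each print-tip group is easy to tabulate. What follows is only a minimal sketch in base R on simulated data. It does not call marray, and the helper name `tip_curvature`, the simulated bend in tip 3, and the settings `span = 0.4` and `degree = 1` are all assumptions made for illustration.

```r
# Simulated M-vs-A data for four hypothetical print-tip groups.
set.seed(1)
n_per_tip <- 300
tip <- rep(1:4, each = n_per_tip)
A <- runif(length(tip), min = 6, max = 16)
M <- rnorm(length(tip), sd = 0.1)

# Tip 3 gets a bend above A = 12 that reaches M = 0.5 at A = 15,
# loosely imitating the situation described in the original post.
bend <- tip == 3 & A > 12
M[bend] <- M[bend] + 0.5 * (A[bend] - 12) / 3

# Largest absolute height of a loess curve fitted to M ~ A within each tip.
# (Hypothetical helper; span and degree are illustrative choices.)
tip_curvature <- function(M, A, tip, span = 0.4) {
  sapply(split(data.frame(M = M, A = A), tip), function(d) {
    fit <- loess(M ~ A, data = d, span = span, degree = 1)
    max(abs(fitted(fit)))
  })
}

curv <- tip_curvature(M, A, tip)
print(round(curv, 2))

# Name of the print-tip group whose fitted curve strays furthest from M = 0.
worst_tip <- names(curv)[which.max(curv)]
print(worst_tip)
```

Applied to real data, the same table would be computed on the normalized M values, and what counts as an acceptable height remains a judgment call.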
Hi Richard,

I am not aware of reasons why you shouldn't use a span of more than 0.8. Afaik, even values>1 can be useful in some cases (although straight robust linear regression (rlm) may be more appropriate in these cases).

On another note, there is a trade-off between what normalization can do and data quality - are you sure your funny looking plots are curable by normalization, and are not a result of too bad a quality of the data?

Best wishes
Wolfgang

--
Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax: +49 6221 42524709
Http: www.dkfz.de/abt0840/whuber
Dear Wolfgang,

I thought that span was the fraction of points used in the smoothing and had a theoretical maximum of 1. I also thought that it was not a good idea to use all the points in the smoothing, lest the more extreme points dominate.

As for data quality, the slides are quite dirty with an uneven background. I used Spot to read the slides to compensate for background variability, spot irregularity, and stains as much as possible. However, the non-flattening of the curve may still be a data quality problem. Also, shouldn't the dependence on total intensity (A) be removed?

Anyway, I will try higher span values and try successive fits later today. I will also try the 2D option, unless you think that what I have described is such that flattening will not help. I can send you a copy of all of the print-tip curves and the worst print-tip curve if you are willing to look at them.

Thanks and best wishes,
Rich

On Feb 10, 2004, at 5:43 AM, Wolfgang Huber wrote:

> Hi Richard,
>
> I am not aware of reasons why you shouldn't use a span of more than
> 0.8. Afaik, even values>1 can be useful in some cases (although
> straight robust linear regression (rlm) may be more appropriate in
> these cases).
>
> On another note, there is a trade-off between what normalization can
> do and data quality - are you sure your funny looking plots are
> curable by normalization, and are not a result of too bad a quality of
> the data?
>
> Best wishes
> Wolfgang
Richard Friedman wrote:
> Dear Wolfgang,
> I thought that span was the fraction of points used in the smoothing
> and had a theoretical maximum of 1. I also thought that it was not a
> good idea to use all the points in the smoothing, lest the more extreme
> points dominate.

Hi Richard,

See the help page for "loess", section "Details". span>1 is quite legal and different from span=1. For a good explanation of local regression, have a look at the book by Clive/Catherine Loader.

(And if you think that bandwidth choice in local regression is quite confusing, you're at a point where I've been, too ...)

Best wishes
Wolfgang
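The point about span>1 can be checked directly in base R. According to the "Details" section of the `loess` help page, for a span below 1 each local fit uses that proportion of the points with tricubic weighting, while for a span above 1 all points are used and the "maximum distance" in the weight function is taken to be larger than the actual maximum distance. The sketch below is only an illustration on simulated data; the simulated curve, the chosen spans, and `degree = 1` are assumptions, not settings taken from the thread.

```r
# Simulated M-vs-A data with a bend above A = 12 (illustrative only).
set.seed(2)
n <- 500
A <- sort(runif(n, min = 6, max = 16))
M <- 0.5 * pmax(A - 12, 0) / 3 + rnorm(n, sd = 0.1)

# Fit local-linear loess curves with spans below, at, and above 1.
spans <- c(0.4, 1, 2)
fits <- sapply(spans, function(s) {
  fitted(loess(M ~ A, span = s, degree = 1))
})
colnames(fits) <- paste0("span_", spans)

# Residual sum of squares for each span: a larger value means the
# fitted curve is stiffer and follows the bend less closely.
rss <- colSums((M - fits)^2)
print(round(rss, 2))

# Largest pointwise difference between the span = 1 and span = 2 curves.
gap_1_vs_2 <- max(abs(fits[, "span_1"] - fits[, "span_2"]))
print(round(gap_1_vs_2, 3))
```

In this simulation the span = 2 curve is visibly stiffer than the span = 1 curve, which is consistent with the statement that the two settings are not equivalent.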
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 13 months ago
United States
Dear Richard and other participants in this discussion,

Loess uses a kernel weight to downweight the effects of more distant data values in the local regression. For bandwidth greater than 1, all of the data values are used, but the more distant values are still downweighted. As the bandwidth increases, there is less downweighting.

If you increase the bandwidth during the normalization process, the curve used for normalization gets flatter, until at very high bandwidth you are just doing ordinary linear regression. The normalized values are the residuals from this curve. As a result, if there is curvature on the MvA plots, large bandwidths lead to normalized data that still have curvature (since only the linear trend is removed). Small bandwidths lead to normalized data that are flatter.

My understanding is that Richard then visualized the curvature in the normalized data also using loess. On these plots we are looking at the curve fitted to the normalized data. So, if a large bandwidth is used to fit the curves, these curves should be flat. However, if a large bandwidth is used for normalization, and the default bandwidth is used to visualize the normalized data, there will be excess curvature in the normalized data.

--Naomi

At 05:03 PM 2/9/2004, Richard Friedman wrote:
> Dear Sean (Wolfgang, Naomi, and Everybody),
> [...]

Naomi S. Altman                    814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                814-863-7114 (fax)
Penn State University              814-865-1348 (Statistics)
University Park, PA 16802-2111
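Naomi's two claims, that a very large bandwidth reduces loess to ordinary linear regression and that normalizing with a large bandwidth leaves curvature behind, can be illustrated with a short base R sketch. It assumes a local-linear fit (`degree = 1`) with the default least-squares family, since with the default `degree = 2` the large-span limit would be a quadratic fit instead. The simulated data, the span values 0.4 and 50, and the helper names `normalize` and `leftover` are hypothetical.

```r
# Simulated M-vs-A data with a bend above A = 12 (illustrative only).
set.seed(3)
n <- 1000
A <- sort(runif(n, min = 6, max = 16))
M <- 0.5 * pmax(A - 12, 0) / 3 + rnorm(n, sd = 0.05)

# Evaluate the local fits exactly instead of by interpolation.
ctrl <- loess.control(surface = "direct")

# Claim 1: with a very large span, local-linear loess is close to lm().
fit_huge <- loess(M ~ A, span = 50, degree = 1, control = ctrl)
fit_lm <- lm(M ~ A)
max_gap <- max(abs(fitted(fit_huge) - fitted(fit_lm)))
print(signif(max_gap, 2))

# Normalized values are the residuals from the fitted curve.
normalize <- function(span) {
  residuals(loess(M ~ A, span = span, degree = 1, control = ctrl))
}
M_small <- normalize(0.4)  # flexible normalization curve
M_large <- normalize(50)   # nearly a straight line

# Claim 2: height range of a span = 0.4 loess curve drawn through the
# normalized values, as one would do to inspect them on an M-vs-A plot.
leftover <- function(Mnorm) {
  shown <- fitted(loess(Mnorm ~ A, span = 0.4, degree = 1))
  diff(range(shown))
}
print(round(c(small_span = leftover(M_small),
              large_span = leftover(M_large)), 3))
```

In this simulation the values normalized with the large span show a much taller leftover curve than those normalized with span 0.4, in line with the explanation above.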
Dear Naomi (and everybody),

Thank you for your reply. Since the normalized curve is displayed with the same overall command that performed the normalization, it is not clear to me why you suggest that the display curve is fitted with a different bandwidth than that which was input. Also, once the data is normalized, shouldn't the default parameters yield a flat curve?

In any event, I achieved flattening of the curve by two successive loess normalizations with default parameters. Do you see any disadvantage to that procedure?

Thanks and best wishes,
Rich
------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist
Herbert Irving Comprehensive Cancer Center
Oncoinformatics Core
Lecturer
Department of Biomedical Informatics
Box 95, Room 130BB or P&S 1-420C
Columbia University Medical Center
630 W. 168th St.
New York, NY 10032
(212)305-6901 (5-6901) (voice)
friedman@cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

In Memoriam, Julius Schwartz

On Feb 17, 2004, at 9:11 AM, Naomi Altman wrote:

> Dear Richard and other participants in this discussion,
> [...]
Naomi Altman ★ 6.0k
@naomi-altman-380
Last seen 13 months ago
United States
If the residuals are smoothed using the same bandwidth as the original curve, the result should be flat. I am puzzled as well.

Smoothing twice is the same as using a larger bandwidth and a somewhat different kernel weight. It should be fine.

--Naomi

At 10:43 AM 2/17/2004, Richard Friedman wrote:
>Dear Naomi (and everybody),
>
>Thank you for your reply.
>Since the normalized curve is displayed with the same overall command that
>performed the normalization, it is not clear to me why you suggest that the
>display curve is fitted with a different bandwidth than that which was input.
>Also, once the data is normalized, shouldn't the default parameters yield a
>flat curve?
>In any event, I achieved flattening the curve by two successive loess
>normalizations with default parameters. Do you see any disadvantage to that
>procedure?
>
>Thanks and best wishes,
>Rich
>------------------------------------------------------------
>Richard A. Friedman, PhD
>Associate Research Scientist
>Herbert Irving Comprehensive Cancer Center
>Oncoinformatics Core
>Lecturer
>Department of Biomedical Informatics
>Box 95, Room 130BB or P&S 1-420C
>Columbia University Medical Center
>630 W. 168th St.
>New York, NY 10032
>(212)305-6901 (5-6901) (voice)
>friedman@cancercenter.columbia.edu
>http://cancercenter.columbia.edu/~friedman/
>
>In Memoriam, Julius Schwartz
>
>On Feb 17, 2004, at 9:11 AM, Naomi Altman wrote:
>
>>Dear Richard and other participants in this discussion,
>>
>>Loess uses a kernel weight to downweight the effects of more distant data
>>values in the local regression. For bandwidth greater than 1, all of the
>>data values are used, but the more distant values are still downweighted.
>>As the bandwidth increases, there is less downweighting.
>>
>>If you increase the bandwidth during the normalization process, the curve
>>used for normalization gets flatter, until at very high bandwidth you are
>>just doing ordinary linear regression. The normalized values are the
>>residuals from this curve. As a result, if there is curvature on the MvA
>>plots, large bandwidths lead to normalized data that still have curvature
>>(since only the linear trend is removed). Small bandwidths lead to
>>normalized data that are flatter.
>>
>>My understanding is that Richard then visualized the curvature in the
>>normalized data also using loess. On these plots we are looking at the
>>curve fitted to the normalized data. So, if a large bandwidth is used to
>>fit the curves, these curves should be flat. However, if a large
>>bandwidth is used for normalization, and the default bandwidth is used to
>>visualize the normalized data, there will be excess curvature in the
>>normalized data.
>>
>>--Naomi
>>
>>At 05:03 PM 2/9/2004, Richard Friedman wrote:
>>>Dear Sean (Wolfgang, Naomi, and Everybody),
>>>
>>>The original command that I used was
>>>
>>> > ira.norm <- maNorm(ira.raw, norm = "p")
>>>
>>>The command that I used with the altered span is
>>>
>>> > ira.f8.norm <- maNormMain(ira.raw, f.loc = list(maNormLoess(x = "maA",
>>> +     y = "maM", z = "maPrintTip", w = NULL, subset = TRUE, span = 0.8)),
>>> +     Mloc = TRUE, Mscale = TRUE, echo = FALSE)
>>>
>>>This command still gave pronounced curvature in the middle of one of
>>>the print-tip blocks and at the ends of several print-tip blocks.
>>>I did not use a span greater than 0.8 because that was contraindicated
>>>in either the microarray or the loess literature.
>>>Thank you all for your suggestion of going to vsn. However, as this
>>>program is new to me, I ask if anyone knows a rule of thumb as to how
>>>flat the print-tip loess line should be in order to be acceptable? I
>>>would prefer not to change horses unless necessary.
>>>
>>>Thanks and best wishes,
>>>Rich
>>>
>>>On Jan 30, 2004, at 2:48 PM, Sean Davis wrote:
>>>
>>>>Richard,
>>>>
>>>>The print-tip-loess lines should (I think) be straight and on the x-axis
>>>>(y=0) after print-tip-normalization. If that isn't the case, perhaps you
>>>>could post exactly the commands you used to do your normalization. That
>>>>may help people determine better what is going on.
>>>>
>>>>In reference to ridding you of intensity-dependent variability,
>>>>loess-normalization is designed to locally center the data but does not,
>>>>in itself, deal with the variability that may be intensity-dependent.
>>>>For that problem, you may need to look into something like vsn or other
>>>>scaling method.
>>>>
>>>>Sean

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111
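
The bandwidth argument in this reply (the normalized values are the residuals from the loess curve, so a very large span removes little more than the linear trend, and a smaller span used for display then reveals the remaining curvature) can be illustrated with a minimal sketch in base R. Everything here is an assumption made for illustration: the data are simulated rather than taken from the poster's arrays, the span values 0.4 and 10 are arbitrary, and `normalize_loess` and `curve_spread` are hypothetical helpers built on `stats::loess`, not marray functions.

```r
# Simulated M-vs-A data for one hypothetical print-tip group (not the poster's data).
set.seed(1)
n <- 2000
A <- runif(n, min = 6, max = 16)
M <- 0.02 * (A - 11)^2 + rnorm(n, sd = 0.3)  # curved intensity-dependent bias plus noise

# Loess normalization: the normalized M values are the residuals from the fitted curve.
normalize_loess <- function(M, A, span) {
  residuals(loess(M ~ A, span = span, degree = 1))
}

# Spread (max - min) of a loess curve fitted to already-normalized values.
# A value near 0 means the displayed curve is flat.
curve_spread <- function(M_norm, A, span) {
  diff(range(fitted(loess(M_norm ~ A, span = span, degree = 1))))
}

display_span <- 0.4  # span used only to draw the curve on the normalized data

M_small <- normalize_loess(M, A, span = 0.4)        # small bandwidth
M_large <- normalize_loess(M, A, span = 10)         # very large bandwidth, close to a straight-line fit
M_twice <- normalize_loess(M_small, A, span = 0.4)  # two successive normalizations

spread_small <- curve_spread(M_small, A, display_span)
spread_large <- curve_spread(M_large, A, display_span)
spread_twice <- curve_spread(M_twice, A, display_span)

print(round(c(small = spread_small, large = spread_large, twice = spread_twice), 3))
```

In this simulation, the curve drawn on the data normalized with the very large span keeps most of the original curvature, while the curve drawn on the data normalized with the display span is close to flat. The `twice` entry applies the same normalization a second time, as described in the quoted message, so its effect can be compared directly.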