Defining Weights in marrayNorm.
Josef Walker
@josef-walker-397
Hi all,

My name is Joe Walker and I am a final-year PhD student attempting to use Bioconductor to analyse a large amount of cDNA microarray data from my thesis experiments.

For the normalisation stage, there is the option to use weights previously assigned to the genes. I wish to normalise my genes based on a quality-controlled subset that changes for each hybridisation, and I think one way to do this is to use the weights option during normalisation.

The "slot" for the weights (maW) is assigned/loaded during the marrayInput stage using the read.marrayRaw command (along with name.Gf etc.).

What I am unclear about is:

1) What form do these weights take, i.e. does 1 = use this gene and 0 = do not use this gene, are they graded, or do they have to be defined elsewhere?

2) Do you use these weights by simply setting maW = TRUE during the normalisation stage?

Am I at least on the right track? If anyone has advice for me it would be great.

Thanks in advance,
Joe

Josef Walker BSc (Hons)
PhD Student
Memory Group
The Edward Jenner Institute for Vaccine Research
Compton
Nr Newbury
Berkshire
RG20 7NN
Tel: 01635 577905
Fax: 01635 577901
E-mail: Josef.walker@jenner.ac.uk
@james-w-macdonald-5106
United States
From perusing the functions (particularly maNorm), it appears that the weights are used by all normalization procedures except for "median". By definition, a weight is in the range [0,1], so using only 0 and 1 is effectively the same as saying "don't use this" or "use this". You can also use intermediate values rather than completely eliminating the 'bad' spots (e.g., simply down-weight spots that look sketchy).

I think you pass the weights using the additional argument w="maW" in your call to maNorm.

HTH,
Jim

James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
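As a toy illustration of the difference between 0/1 weights and graded weights discussed in this reply, here is a minimal base R sketch. It uses a weighted mean rather than any marrayNorm routine, and the values and variable names (M, w_hard, w_soft) are made up for the example.

```r
# Four log-ratios; the last spot is a suspect outlier.
M <- c(0.1, -0.2, 0.0, 3.0)

# Hard weights: 1 = use this spot, 0 = do not use it.
w_hard <- c(1, 1, 1, 0)

# Graded weights: keep the suspect spot but down-weight it.
w_soft <- c(1, 1, 1, 0.1)

plain <- mean(M)                   # every spot counts equally
hard  <- weighted.mean(M, w_hard)  # identical to dropping the last spot
soft  <- weighted.mean(M, w_soft)  # the outlier still contributes, but only slightly

print(c(plain = plain, hard = hard, soft = soft))
```

With 0/1 weights the result matches the mean of the three retained spots, while the graded weights give something in between the two extremes.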
Dear James and Jim,

Actually the maNorm function doesn't make use of weights, even though weights might be set in the marrayRaw object. If you look at the code for maNorm you will see that the weights are set to NULL when the call is made to maNormMain.

If you want to use weights for normalization you need either to use the lower-level function maNormMain (which appears to use weights) or to use the normalization routines in the limma package instead.

In limma you use read.maimages to read the data in, perhaps picking up the quality weights from genepix or quantarray in the process. If you have made your own weights, you can simply assign them to the weights component, e.g.,

    # set source to your image analysis program
    RG <- read.maimages(files, source="genepix")
    RG$weights <- your.weights
    # info about array layout, e.g.:
    RG$printer <- list(ngrid.c=4, ngrid.r=4, nspot.r=20, nspot.c=20)
    MA <- normalizeWithinArrays(RG)

Gordon
@michael-watson-iah-c-378
Hi,

I think the problem that both Joe and I are having is that we want to know how to subset data, either in limma or in the marray* classes, such that only good-quality spots are used in the normalisation process.

The problem is that the spots that are "good quality" differ from array to array, so it's not something we can set in the layout object unless we create a different layout object for each array. So we started looking at the concept of using "weights", but really, the problem of not being able to subset our data successfully still remains.

So, as a more general question: how can I use Bioconductor to normalise microarray data based only on a subset of good-quality spots, the location of which will differ from array to array?

Thanks
M
@gordon-smyth
WEHI, Melbourne, Australia
Dear Michael,

I think you are not understanding exactly how the weights work. What you want to do really is accomplished using weights and cannot be accomplished by any subsetting operation. Subsetting operations have to, by their very nature, apply the same to every array, and this isn't what you want.

1. Let me say first of all that we generally do not recommend restricting normalisation only to "good" spots. The normalisation routines are written so that they are robust, i.e., they are able to ignore groups of poor quality or differentially expressed genes if they don't follow the trend of the rest of the data. This means that a minority of poor quality spots is unlikely to do much harm. Very often there is some information even in the poorer quality spots and it is best to leave them in. This also saves lots of time. There are exceptions of course ...

2. How are you choosing the "good quality" spots? Programs like genepix flag spots which they think are of questionable quality. If you are using flags provided by the image analysis program, then you can read in the weights as you read in the data. For example, if you have genepix data then

    RG <- read.maimages(files, source="genepix", wt.fun=wtflags(0))

will give zero weight to any spot flagged by genepix as being questionable. When you normalise the data using

    MA <- normalizeWithinArrays(RG)

the normalisation regressions will use only those spots which have weights greater than zero. This will vary between arrays and is exactly what you want to achieve. All the spots will be normalized, whether "good" or "bad" quality, but only the "good" spots will have any influence on the normalisation functions. The normalisation of the "good" spots will be exactly as if the "bad" spots were not there.

3. If you have constructed the spot flags yourself, then you'll have to proceed something like this. Suppose you have two arrays in two genepix output files. Suppose the flags for the first array are stored in a vector called 'flag1' with 1 for good spots and 0 for bad. Suppose the flags for the second array are stored in a vector 'flag2'. You will read in the intensity data using

    RG <- read.maimages(files, source="genepix")

Then you'll have to assemble the flags into a matrix with rows for genes and columns for arrays using 'cbind(flag1, flag2)'. Then you put this into the weight component:

    RG$weights <- cbind(flag1, flag2)

Now you can use

    MA <- normalizeWithinArrays(RG)

and normalisation will use, for each array, only those spots for which the flags are equal to 1.

4. If you have somehow constructed the flags externally to R, you will need to read them into R. Suppose you have the flags in a tab-delimited text file with one row for each gene and columns corresponding to arrays. Then you read them in:

    w <- as.matrix(read.table("myfile"))
    RG$weights <- w

and then proceed as before.

Hope this helps
Gordon
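The idea in points 2 and 3 (a weight matrix with one column per array, where zero-weight spots are still corrected but do not shape the fitted trend) can be sketched in base R. This is only an illustration on simulated data using stats::loess; it is not limma's implementation, and the helper name normalise_array and all the numbers are assumptions made for the example.

```r
# Toy data: 2 arrays, 400 spots each, with an intensity-dependent dye bias.
set.seed(42)
n <- 400
A <- cbind(runif(n, 6, 14), runif(n, 6, 14))           # average log-intensities
M <- 0.5 - 0.05 * (A - 10)^2 + rnorm(2 * n, sd = 0.1)  # log-ratios with a curved trend

# Per-array flags (1 = good, 0 = bad); the bad spots differ between arrays.
flag1 <- rep(1, n); flag1[1:40] <- 0
flag2 <- rep(1, n); flag2[201:260] <- 0
w <- cbind(flag1, flag2)            # rows = spots, columns = arrays
M[w == 0] <- M[w == 0] + 2          # corrupt the bad spots

# Hypothetical helper: fit the trend on spots with weight > 0 only,
# then subtract the fitted trend from every spot on that array.
normalise_array <- function(m, a, wt) {
  d <- data.frame(m = m, a = a)
  fit <- loess(m ~ a, data = d[wt > 0, ], span = 0.4,
               control = loess.control(surface = "direct"))
  m - predict(fit, newdata = d)
}

M_norm <- sapply(seq_len(ncol(M)),
                 function(j) normalise_array(M[, j], A[, j], w[, j]))

# Good spots end up centred near zero; bad spots keep their offset.
print(median(M_norm[w == 1]))
print(median(M_norm[w == 0]))
```

Because each column of w is applied to its own array, the set of spots driving the fit can differ from array to array without any subsetting of the data object.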
Josef Walker
@josef-walker-397
Dear Gordon,

The flags we (Michael Watson and I) use are self-defined and attached as an extra column, tagged on to the end of the rest of the data in each individual .gpr/.txt file.

It would probably also be prudent to explain more clearly our definitions of "Good" and "Bad". Spots could be considered good if they fit into any of a number of different categories, given that the data are derived from two separate images, representing the two different channels.

Our definition of a GOOD spot is one that passes all of our QC criteria and has a signal intensity above the thresholds we set for defining whether or not a spot is considered "expressed". A single spot could be considered good if it passed the QC elements for both channels. A single spot could be good in one channel and bad in the other channel, or BAD in both, ending up with an overall assignment of BAD.

However, if the signal in one channel is below the thresholds that define whether or not a spot is considered to be expressed, then the definitions change: a spot with good signal in one channel and below-threshold signal in the other channel (which might be considered BAD according to some of the QC criteria) would overall be considered GOOD.

The decision to use only those genes considered GOOD, i.e. expressed in both channels, is based on the fact that only these genes provide good reliable signal from both of the channels, and that ratios derived from these spots would also be reliable. Ratios derived from BAD spots are definitely unreliable. Ratios derived from spots with only background signal in both channels (unexpressed genes), or good in one channel and unexpressed in the other channel, are unreliable because they do not contain fluorescence intensity data derived from labelled cDNA in both channels and so could not account for any gross differences in signal intensity arising from this source.
So just to recap: am I correct in thinking that if we define the slot/column (w) in which our self-constructed "Flags" are contained in our marrayRaw objects (read into R using read.marrayRaw or read.GenePix and taken from the .gpr or .txt files derived from the raw images), and then use the maNormMain function in the marrayNorm library (not maNorm), setting maW = TRUE, then these weights WILL be used for calculating the normalised values?

Thank you for your help so far; this mailing list is a real life-saver. Unfortunately I am away for the next 7 days so won't be able to access my messages, but I will be looking forward to checking them when I get back.

Best wishes,
Joe
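One possible reading of the GOOD/BAD rules in this post, for the purpose of normalisation, is "passes QC in both channels and is expressed in both channels". A minimal base R sketch of turning that into a 0/1 weight column follows; the column names (qc_cy5, qc_cy3, cy5, cy3), the threshold, and the values are all hypothetical and would need to be replaced by whatever the .gpr/.txt files actually contain.

```r
# Hypothetical per-spot table for one array.
spots <- data.frame(
  qc_cy5 = c(TRUE, TRUE, FALSE, TRUE),   # QC result, channel 1
  qc_cy3 = c(TRUE, TRUE, TRUE,  TRUE),   # QC result, channel 2
  cy5    = c(1200,   90, 1500,   80),    # signal intensity, channel 1
  cy3    = c( 900, 1100, 1000,   70)     # signal intensity, channel 2
)

threshold <- 200  # assumed cut-off for calling a spot "expressed"

passes_qc      <- spots$qc_cy5 & spots$qc_cy3
expressed_both <- spots$cy5 > threshold & spots$cy3 > threshold

# 1 = use the spot to drive normalisation, 0 = ignore it in the fit.
w <- as.numeric(passes_qc & expressed_both)
print(w)
```

Here only the first spot gets weight 1: the second and fourth are below threshold in at least one channel, and the third fails QC in one channel. One such vector per array can then be bound column-wise into a weight matrix.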
@james-w-macdonald-5106
United States
Hey Gordon,

I see that weights are set to null in maNormMain, but if you input something like w="maW" in maNorm, doesn't that get passed as a variable (using the ... portion of the function call), and override the w=NULL in maNormMain?

Jim
@gordon-smyth
WEHI, Melbourne, Australia
At 11:07 PM 5/08/2003, James MacDonald wrote:
>Hey Gordon,
>
>I see that weights are set to null in maNormMain, but if you input something like w="maW" in maNorm, doesn't that get passed as a variable (using the ... portion of the function call), and override the w=NULL in maNormMain?

No. Why would you think that?

Gordon
@gordon-smyth
WEHI, Melbourne, Australia
At 09:40 PM 5/08/2003, Josef Walker wrote:
>Dear Gordon,
>
>The flags we (Michael Watson and I) use are self-defined and attached as an extra column, tagged on to the end of the rest of the data in each individual .gpr/.txt file.
>
>It would probably also be prudent to explain more clearly our definitions of "Good" and "Bad". Spots could be considered as good if they fit into a number of different categories, based on the fact the data is derived from two separate images, representing the two different channels.
>Our definition of a GOOD spot is one that passes all of our QC criteria and has a signal intensity above the thresholds we set for defining whether or not a spot is considered "expressed".
>
>A single spot could be considered good if it passed QC elements for both channels.
>A single spot could be good in one channel and bad in the other channel, or BAD in both, ending up with an overall assignment of BAD.
>However, if the signal in one channel is below the thresholds that define whether or not a spot is considered to be expressed, then the definitions would change, i.e. good signal in one channel, below threshold signal in the other channel (which might be considered BAD according to some of the QC criteria); overall this spot would be considered GOOD.
>
>The decision to use only those genes considered GOOD, i.e. expressed in both channels, is based on the fact that only these genes provide good reliable signal from both of the channels, and that ratios derived from these spots would also be reliable. Ratios derived from BAD spots are definitely unreliable. Ratios derived from those spots with only background signal in both channels (unexpressed genes), or good in one channel and unexpressed in the other channel, are unreliable as they do not contain fluorescence intensity data derived from labelled cDNA in both channels and so could not account for any gross differences in signal intensity arising from this source.

I don't want to get into an argument on this topic, but there is absolutely no reason to filter out low intensity spots before using loess normalisation. Loess normalisation is intensity-based; it is designed to accept the whole range of intensities.

>So just to recap, am I correct in thinking that if we define the slot/column (w), in which our self-constructed "Flags" are contained, in our marrayRaw objects (read into R using read.marrayRaw or read.GenePix and taken from the .gpr or .txt files derived from the raw images) and then use the maNormMain function in the marrayNorm library (not maNorm), setting maW = TRUE, then these weights WILL be used for calculating the normalised values.

Where did you get the piece of code "maW = TRUE" from? As far as I know, there is no such command in Bioconductor. Also 'w' isn't a slot for an marrayRaw object. You need to read the documentation carefully ...

In principle, if you put data into the 'maW' slot of a marrayRaw object, then the values should be used as weights by maNormMain, and this should happen automatically without you having to tell it to do so. However I haven't tried it out to check that it works and I'm not an author of that software. I will leave others to help you with maNormMain ...

Gordon

>Thank you for your help so far, this mailing list is a real life-saver.
>
>Unfortunately I am away for the next 7 days so won't be able to access my messages, but will be looking forward to checking them when I get back.
>
>Best Wishes
>
>Joe
>
>Josef Walker BSc (Hons)
>PhD Student
>Memory Group
>The Edward Jenner Institute for Vaccine Research
>Compton
>Nr Newbury
>Berkshire
>RG20 7NN
>
>Tel: 01635 577905
>Fax: 01635 577901
>E-mail: Josef.walker@jenner.ac.uk
>
>-----Original Message-----
>From: Gordon Smyth [mailto:smyth@wehi.edu.au]
>Sent: 05 August 2003 10:43
>To: michael watson (IAH-C)
>Cc: James MacDonald; bioconductor@stat.math.ethz.ch; Josef Walker
>Subject: RE: [BioC] Defining Weights in marrayNorm.
>
>Dear Michael,
>
>I think you are not understanding exactly how the weights work. What you want to do really is accomplished using weights and cannot be accomplished by any subsetting operation. Subsetting operations have to, by their very nature, apply the same to every array, and this isn't what you want.
>
>1. Let me say first of all that we generally do not recommend restricting normalisation only to "good" spots. The normalisation routines are written so that they are robust, i.e., they are able to ignore groups of poor quality or differentially expressed genes if they don't follow the trend of the rest of the data. This means that a minority of poor quality spots is unlikely to do much harm. Very often there is some information even in the poorer quality spots and it is best to leave them in. This also saves lots of time. There are exceptions of course ...
>
>2. How are you choosing the "good quality" spots? Programs like genepix flag spots which they think are of questionable quality. If you are using flags provided by the image analysis program, then you can read in the weights as you read in the data. For example, if you have genepix data then
>
>RG <- read.maimages(files, source="genepix", wt.fun=wtflags(0))
>
>will give zero weight to any spot flagged by genepix as being questionable. When you normalise the data using
>
>MA <- normalizeWithinArrays(RG)
>
>the normalisation regressions will use only those spots which have weights greater than zero. This will vary between arrays and is exactly what you want to achieve. All the spots will be normalized, whether "good" or "bad" quality, but only the "good" spots will have any influence on the normalisation functions. The normalisation of the "good" spots will be exactly as if the "bad" spots were not there.
>
>3. If you have constructed the spot flags yourself, then you'll have to proceed something like this. Suppose you have two arrays in two genepix output files. Suppose the flags for the first array are stored in a vector called 'flag1' with 1 for good spots and 0 for bad. Suppose the flags for the second array are stored in a vector 'flag2'. You will read in the intensity data using
>
>RG <- read.maimages(files, source="genepix")
>
>Then you'll have to assemble the flags into a matrix with rows for genes and columns for arrays using 'cbind(flag1, flag2)'. Then you put this into the weight component:
>
>RG$weights <- cbind( flag1, flag2 )
>
>Now you can use
>
>MA <- normalizeWithinArrays(RG)
>
>and normalisation will use, for each array, only those spots for which the flags are equal to 1.
>
>4. If you have somehow constructed the flags externally to R, you will need to read them into R. Suppose you have the flags in a tab-delimited text file with one row for each gene and columns corresponding to arrays. Then you read them in:
>
>w <- as.matrix(read.table("myfile"))
>RG$weights <- w
>
>and then proceed as before.
>
>Hope this helps
>Gordon
>
>At 06:28 PM 5/08/2003, michael watson (IAH-C) wrote:
> >Hi
> >
> >I think the problem that both Jo and myself are having is that we want to know how to subset data, either in limma or the marray* classes, such that we only use good quality spots in the normalisation process.
> >
> >The problem is, the spots that are "good quality" differ from array to array, so it's not something we can set in the layout object unless we create a different layout object for each array. So we started looking at the concept of using "weights", but really, the problem of not being able to subset our data successfully still remains.
> >
> >So as a more generalised question, how can I use Bioconductor to normalise microarray data based only on a subset of good quality spots, the location of which will differ from array to array?
> >
> >Thanks
> >M
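The point that every spot is normalised while only the "good" spots shape the curve can be sketched in base R. This is a stand-in for the idea and is not limma's normalizeWithinArrays: the function name `normalize_one_array`, the simulated M and A values, and the choice of `loess` with `surface = "direct"` (so the curve can be evaluated at spots outside the range of the good ones) are all assumptions made for illustration.

```r
# Normalise one array's log-ratios M against average log-intensities A,
# letting only spots with weight > 0 shape the loess curve.
normalize_one_array <- function(M, A, w, span = 0.4) {
  good <- w > 0
  d <- data.frame(M = M[good], A = A[good], w = w[good])
  fit <- loess(M ~ A, data = d, weights = w, span = span, degree = 1,
               control = loess.control(surface = "direct"))
  # Every spot is corrected, including the zero-weight ones.
  M - predict(fit, newdata = data.frame(A = A))
}

set.seed(1)
n <- 200
A <- runif(n, 6, 14)
M <- 0.3 * (A - 10) + rnorm(n, sd = 0.2)   # simulated intensity-dependent bias
w <- rep(1, n)
w[1:20] <- 0                                # hypothetical "bad" spots

M_norm <- normalize_one_array(M, A, w)

# Corrupt the bad spots: the good spots' normalised values should not move.
M_bad <- M
M_bad[1:20] <- M_bad[1:20] + 5
M_norm2 <- normalize_one_array(M_bad, A, w)
```

Because the weights belong to one array at a time, calling such a function once per column of a weights matrix gives a different set of "good" spots for every hybridisation, which is what a single subsetting operation cannot do.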
@michael-watson-iah-c-378
Hi Gordon

First of all, thanks for all your help on this matter. I think we're reaching a point where we are capable of taking this forward, though probably using limma now rather than the marray* classes :-)

I only have one question....

>I don't want to get into an argument on this topic, but there is absolutely no reason to filter out low intensity spots before using loess normalisation. Loess normalisation is intensity-based, it is designed to accept the whole range of intensities.

One case in particular I would want to avoid is the case where in one channel the background (BG) is above signal, and in the other channel it is not. In this case, we have very useful data (the gene is switched on in one channel, and off in the other) yet we do not have a reliable ratio: we either have a negative ratio, an infinite ratio (both utterly meaningless) or we set the negative channel intensity to, say, 1, and then the ratio is highly skewed. In all cases the ratio for that spot is very unreliable and would surely alter, even if only in a small way, the Lowess fit. Or am I wrong in thinking that? I read somewhere that Lowess gives less weight to outliers....

Regards
Mick
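The case described here can be made concrete with a small base-R sketch. The intensities are invented for illustration, and giving such spots zero weight is one possible way to handle them under the weighting scheme discussed in this thread, not something the thread prescribes.

```r
# Background-corrected intensities for five hypothetical spots.
# Spot 3: red channel below background; spot 5: green exactly at background.
R <- c(1200, 300, -15, 800, 950)
G <- c(1000, 350, 400, 760,   0)

# The log-ratio is undefined (NaN) or infinite for spots 3 and 5.
M <- suppressWarnings(log2(R / G))
usable <- R > 0 & G > 0

# Zero weight keeps such spots out of a weighted normalisation fit,
# while the spots themselves stay in the data set.
w <- as.numeric(usable)
```

On the last question in the post: base R's `lowess` runs robustifying iterations by default (`iter = 3`), which down-weight points with large residuals, so isolated outliers have limited pull on the fitted curve. Whether that is enough for a given data set is a separate judgment.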
@james-w-macdonald-5106
United States
I would think that because it appears to be true.

An example:

> tmp <- function(x, ...) tst(x, y=NULL)
> tst <- function(x, y=NULL) x + y
> tst(4, y=2)
[1] 6

Isn't the y=NULL being overwritten by the fact that I pass the variable y=2?

Jim

James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

>>> Gordon Smyth <smyth@wehi.edu.au> 08/05/03 10:19AM >>>
At 11:07 PM 5/08/2003, James MacDonald wrote:
>I see that weights are set to null in maNormMain, but if you input something like w="maW" in maNorm, doesn't that get passed as a variable (using the ... portion of the function call), and override the w=NULL in maNormMain?

No. Why would you think that?

Gordon
ADD COMMENT
0
Entering edit mode
I'm not quite sure that's true. I changed your example a little bit:

    > tst <- function(x, y=NULL) x + y
    > tst(4, y=2)
    [1] 6
    > tst(4)
    numeric(0)
    > tmp <- function(x, ...) tst(x, y=NULL)
    > tmp(4, y=2)
    numeric(0)
    > tmp <- function(x, ...) tst(x, y=3)
    > tmp(4, y=2)
    [1] 7

Apparently, y is not overwritten when you call tmp(). However, the following works fine:

    > tmp <- function(x, ...) tst(x, ...)
    > tmp(4, y=3)
    [1] 7

Johannes

Quoting James MacDonald <jmacdon@med.umich.edu>:

> I would think that because it appears to be true.
>
> An example:
>
>     > tmp <- function(x, ...) tst(x, y=NULL)
>     > tst <- function(x, y=NULL) x + y
>     > tst(4, y=2)
>     [1] 6
>
> Isn't the y=NULL being overwritten by the fact that I pass the variable y=2?
>
> Jim
>
> >>> Gordon Smyth <smyth@wehi.edu.au> 08/05/03 10:19AM >>>
> At 11:07 PM 5/08/2003, James MacDonald wrote:
> >Hey Gordon,
> >
> >I see that weights are set to null in maNormMain, but if you input
> >something like w="maW" in maNorm, doesn't that get passed as a variable
> >(using the ... portion of the function call), and override the w=NULL in
> >maNormMain?
>
> No. Why would you think that?
>
> Gordon
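One more variation may help explain why a wrapper cannot simply do both. The sketch below reuses tst() from the example above; tmp_fixed, tmp_dots and tmp_both are hypothetical names introduced only for illustration. Supplying y both explicitly and through ... is an error in R rather than an override.

```r
# Inner function, as in the example above
tst <- function(x, y = NULL) x + y

# Wrapper that hard-codes y: whatever the caller puts in ... is never forwarded
tmp_fixed <- function(x, ...) tst(x, y = NULL)

# Wrapper that forwards ...: the caller's y reaches tst()
tmp_dots <- function(x, ...) tst(x, ...)

# Wrapper that does both: y would be matched twice, so the call fails
tmp_both <- function(x, ...) tst(x, y = NULL, ...)

res_fixed <- tmp_fixed(4, y = 2)   # y = 2 is silently dropped
res_dots  <- tmp_dots(4, y = 2)    # y = 2 is used
res_both  <- tryCatch(tmp_both(4, y = 2),
                      error = function(e) conditionMessage(e))

print(res_fixed)
print(res_dots)
print(res_both)   # the error message, captured as a string
```

So a wrapper that fixes w=NULL in its inner call cannot receive a different w through ... unless the hard-coded argument is removed.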
0
Entering edit mode
@gordon-smyth
Last seen 6 minutes ago
WEHI, Melbourne, Australia
At 01:24 AM 6/08/2003, michael watson (IAH-C) wrote:
>Hi Gordon
>
>First of all, thanks for all your help on this matter. I think we're reaching a point where we are capable of taking this forward, though probably using limma now rather than the marray* classes :-)

As I understand it, your spot flags are 0/1 and they are added as an extra column to your image analysis output file. If this is so, and if the name of the extra column is "MyFlag" say, then doing what you want in limma is very simple. You simply add the argument

    wt.fun = function(x) x$MyFlag

when you read the data into R using read.maimages. That's it. Everything will work automatically: the weights will be read in and will be used automatically for normalisation and for other analyses.

>I only have one question....
>
> >I don't want to get into an argument on this topic, but there is absolutely no reason to filter out low intensity spots before using loess normalisation. Loess normalisation is intensity-based, it is designed to accept the whole range of intensities.
>
>One case in particular I would want to avoid - the case where in one channel BG is above signal, and in another channel it is not. In this case, we have very useful data (gene is switched on in one channel, and off in another) yet we do not have a reliable ratio - we either have a negative ratio, an infinite ratio (both utterly meaningless) or we set the negative channel intensity to say, 1, and then the ratio is highly skewed. In all cases the ratio for that spot is very unreliable and would surely alter, even if only in a small way, the Lowess fit - or am I wrong in thinking that? I read somewhere that Lowess gives less weight to outliers....

Your questions are sensible, but you're asking me to get involved in an ongoing discussion on filtering of spots. I hope you don't consider this unfriendly, but this is a big topic and life is much too short to debate it by email.

Regards
Gordon

>Regards
>Mick
>
> >Thank you for your help so far, this mailing list is a real life-saver.
> >
> >Unfortunately I am away for the next 7 days so won't be able to access my messages, but will be looking forward to checking them when I get back.
> >
> >Best Wishes
> >Joe
> >
> >-----Original Message-----
> >From: Gordon Smyth [mailto:smyth@wehi.edu.au]
> >Sent: 05 August 2003 10:43
> >To: michael watson (IAH-C)
> >Cc: James MacDonald; bioconductor@stat.math.ethz.ch; Josef Walker
> >Subject: RE: [BioC] Defining Weights in marrayNorm.
> >
> >Dear Michael,
> >
> >I think you are not understanding exactly how the weights work. What you want to do really is accomplished using weights and cannot be accomplished by any subsetting operation. Subsetting operations have to, by their very nature, apply the same to every array, and this isn't what you want.
> >
> >1. Let me say first of all that we generally do not recommend restricting normalisation only to "good" spots. The normalisation routines are written so that they are robust, i.e., they are able to ignore groups of poor quality or differentially expressed genes if they don't follow the trend of the rest of the data. This means that a minority of poor quality spots is unlikely to do much harm. Very often there is some information even in the poorer quality spots and it is best to leave them in. This also saves lots of time. There are exceptions of course ...
> >
> >2. How are you choosing the "good quality" spots? Programs like genepix flag spots which they think are of questionable quality. If you are using flags provided by the image analysis program, then you can read in the weights as you read in the data. For example, if you have genepix data then
> >
> >    RG <- read.maimages(files, source="genepix", wt.fun=wtflags(0))
> >
> >will give zero weight to any spot flagged by genepix as being questionable. When you normalise the data using
> >
> >    MA <- normalizeWithinArrays(RG)
> >
> >the normalisation regressions will use only those spots which have weights greater than zero. This will vary between arrays and is exactly what you want to achieve. All the spots will be normalized, whether "good" or "bad" quality, but only the "good" spots will have any influence on the normalisation functions. The normalisation of the "good" spots will be exactly as if the "bad" spots were not there.
> >
> >3. If you have constructed the spot flags yourself, then you'll have to proceed something like this. Suppose you have two arrays in two genepix output files. Suppose the flags for the first array are stored in a vector called 'flag1' with 1 for good spots and 0 for bad. Suppose the flags for the second array are stored in a vector 'flag2'. You will read in the intensity data using
> >
> >    RG <- read.maimages(files, source="genepix")
> >
> >Then you'll have to assemble the flags into a matrix with rows for genes and columns for arrays using 'cbind(flag1, flag2)'. Then you put this into the weight component:
> >
> >    RG$weights <- cbind( flag1, flag2 )
> >
> >Now you can use
> >
> >    MA <- normalizeWithinArrays(RG)
> >
> >and normalisation will use, for each array, only those spots for which the flags are equal to 1.
> >
> >4. If you have somehow constructed the flags externally to R, you will need to read them into R. Suppose you have the flags in a tab-delimited text file with one row for each gene and columns corresponding to arrays. Then you read them in:
> >
> >    w <- as.matrix(read.table("myfile"))
> >    RG$weights <- w
> >
> >and then proceed as before.
> >
> >Hope this helps
> >Gordon
> >
> >At 06:28 PM 5/08/2003, michael watson (IAH-C) wrote:
> > >Hi
> > >
> > >I think the problem that both Jo and myself are having is that we want to know how to subset data, either in limma or the marray* classes, such that we only use good quality spots in the normalisation process.
> > >
> > >The problem is, the spots that are "good quality" differ from array to array, so it's not something we can set in the layout object unless we create a different layout object for each array. So we started looking at the concept of using "weights", but really, the problem of not being able to subset our data successfully still remains.
> > >
> > >So as a more generalised question, how can I use Bioconductor to normalise microarray data based only on a subset of good quality spots, the location of which will differ from array to array?
> > >
> > >Thanks
> > >M
> > >
> > >-----Original Message-----
> > >From: Gordon Smyth [mailto:smyth@wehi.edu.au]
> > >Sent: 05 August 2003 01:26
> > >To: James MacDonald
> > >Cc: bioconductor@stat.math.ethz.ch; josef.walker@jenner.ac.uk
> > >Subject: Re: [BioC] Defining Weights in marrayNorm.
> > >
> > >Dear James and Jim,
> > >
> > >Actually the maNorm function doesn't make use of weights, even though weights might be set in the marrayRaw object. If you look at the code for maNorm you will see that the weights are set to NULL when the call is made to maNormMain.
> > >
> > >If you want to use weights for normalization you need either to use the lower level function maNormMain (which appears to use weights) or use the normalization routines in the limma package instead.
> > >
> > >In limma you use read.maimages to read the data in, perhaps picking up the quality weights from genepix or quantarray in the process. If you have made your own weights, you can simply assign them to the weights component, e.g.,
> > >
> > >    RG <- read.maimages(files, source="genepix")   # or whichever image analysis program you use
> > >    RG$weights <- your.weights
> > >    RG$printer <- list(ngrid.c=4, ngrid.r=4, nspot.r=20, nspot.c=20)   # info about array layout
> > >    MA <- normalizeWithinArrays(RG)
> > >
> > >Gordon
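Point 2 in the quoted message above (flagged spots are still normalised, but only the good spots shape the fit) can be illustrated without any microarray package. What follows is a base-R sketch on simulated data, not limma's implementation: the data, the flags and every variable name are made up for illustration, and stats::loess stands in for whatever smoother the normalisation routine really uses.

```r
set.seed(1)
n <- 500

# Simulated single array: A = average log-intensity, M = log-ratio
A <- runif(n, 6, 14)
M <- 0.3 * (A - 10) + rnorm(n, sd = 0.2)   # intensity-dependent trend plus noise

# Hypothetical quality flags: 1 = good spot, 0 = bad spot
flag <- rbinom(n, 1, 0.8)
good <- flag == 1
M[!good] <- M[!good] + 3                   # bad spots carry a gross artefact

d <- data.frame(A = A, M = M)

# Fit the trend on the good spots only ...
fit <- loess(M ~ A, data = d[good, ], span = 0.4, degree = 1,
             control = loess.control(surface = "direct"))

# ... but subtract it from every spot, good or bad
trend  <- predict(fit, newdata = d)
M.norm <- d$M - trend

# Good spots are centred near zero; bad spots keep their artefact
print(round(mean(M.norm[good]), 3))
print(round(mean(M.norm[!good]), 3))
```

Because the flags are just a logical vector per array, nothing stops them from differing between arrays, which is the point Gordon makes about weights versus subsetting.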
ADD COMMENT
0
Entering edit mode
@michael-watson-iah-c-378
Last seen 10.2 years ago
Hi Guys

Almost finished with this thread, thanks a lot for your help :-)

From the documentation on maNormMain and maNormLoess, I should be able to use something like this to get my weights considered:

    mydata.norm <- maNormMain(mydata, f.loc = list(maNormLoess(x="maA", y="maM", z=NULL, w=weights)))

where weights is my vector of weights. However, if I try that, I get:

    Error in model.frame(formula, rownames, variables, varnames, extras, extranames, :
            invalid variable type

The documentation states that w should be:

    w: An optional numeric vector of weights.

I have weights as a vector, 19200 long, all ones and zeros. My marrayRaw objects have 19200 spots. So I do wonder what could be going wrong. I am sure it is me doing something wrong rather than a bug, however the documentation doesn't really give many clues away :-(

Finally, if I try (without the weights):

    mydata.norm <- maNormMain(mydata, f.loc = list(maNormLoess(x="maA", y="maM", z=NULL)))

I get no errors, so it is the weights that are causing the problems!

Sorry to go on about this, I know I should use limma but I really would like to get this working!

Cheers
Mick
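The error message does not say which variable is at fault, so a quick sanity check on the weights before calling maNormMain may help rule out the obvious causes. The sketch below is base R only; check_weights is a hypothetical helper, not part of marray. One case worth knowing about: if no object called weights exists in the workspace, R resolves that name to the weights() function from the stats package, which is certainly not a numeric vector. Whether that is what happened here cannot be told from the message alone.

```r
# Hypothetical helper: returns a character vector of problems (empty if none)
check_weights <- function(w, n_spots) {
  problems <- character(0)
  if (!is.numeric(w))
    problems <- c(problems, "weights are not numeric")
  if (!is.null(dim(w)))
    problems <- c(problems, "weights have dimensions, not a plain vector")
  if (length(w) != n_spots)
    problems <- c(problems, "length does not match the number of spots")
  if (is.numeric(w) && anyNA(w))
    problems <- c(problems, "weights contain NA")
  if (is.numeric(w) && any(w < 0, na.rm = TRUE))
    problems <- c(problems, "weights contain negative values")
  problems
}

n_spots <- 19200
w_ok  <- rbinom(n_spots, 1, 0.8)     # plain 0/1 numeric vector
w_bad <- data.frame(w = w_ok)        # same numbers, wrong container

print(check_weights(w_ok, n_spots))            # no problems reported
print(check_weights(w_bad, n_spots))           # several problems reported
print(check_weights(stats::weights, n_spots))  # a function, not data
```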
ADD COMMENT
0
Entering edit mode
@michael-watson-iah-c-378
Last seen 10.2 years ago
Hi

In my continuing search to get this working, I have made progress :-D But I think I found a bug/feature...

OK, here is what I did. I took a GenePix file. I made a copy of it. I added a column "SpotWeight" to both. In one of the files I set the weights all to 1. In the other, I set all of the weights to be between 0 and 0.5 (random numbers). I just wanted to see if I could get it working. So:

    > data = read.GenePix(fnames = files, name.Gf = "F532 Median", name.Gb = "B532 Median", name.Rf = "F635 Median", name.Rb = "B635 Median", name.W = "SpotWeight", layout=layout)
    > maW(data)

produces a nice lovely vector of my weights, so far so good. By chance, the first column was the one with all 1's - I think this is significant.

    > data.norm = maNorm(data, norm = "printTipLoess")

This works great and just produces normalised data as if maW didn't exist. We expect this from the code: the maNorm() function does not use weights. Now:

    > data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data@maW)))

is my big hope. And it doesn't throw any errors :-D. However, it does just produce M values as if maW doesn't exist. I am about to throw in the towel when I think I should try something. So I try:

    > data.weight.norm = maNormMain(data[,1], f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data[,1]@maW)))

This again turns up the now familiar M values, unaffected by maW. But of course, in my first data set maW is all set to 1, so of course that's what it SHOULD produce. So I try:

    > data.weight.norm = maNormMain(data[,2], f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data[,2]@maW)))

and guess what? It works! Hurrah! My M values have been affected by maW, they are different to normal and I can only assume maNormMain is calculating weighted normalised M values according to maW.

But wait - isn't this a little incorrect? The marrayRaw class allows me to have different weights for different spots for all of my arrays. So why, when I normalise using maNormMain(), do I have to do it on an array-by-array basis? Surely:

    > data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data@maW)))

should work, in that when it is normalising the nth marrayRaw data set in "data", it should use the nth set of weights in data@maW...? Instead what it appears to have done is take the first column of data@maW, and by chance in my case that was all 1's so I noticed. If those hadn't been 1's but had been legitimate weights for the first array, I don't think I would have noticed.... and had all of my arrays normalised according to weights for the first array.... :-(

Anyway, I believe I have cracked it now in that I can weight normalise all ninety of my arrays. The fact that I have to make 90 calls to maNormMain and produce 90 normalised data sets is a nuisance rather than anything else, though I do believe what I have said above makes sense, I hope someone agrees :-)

In most other respects the marray* classes, and bioconductor in general, are fantastic, so I hope I don't appear unappreciative ;-)

Thanks
Mick
ADD COMMENT
0
Entering edit mode
Marcus Davy ▴ 680
@marcus-davy-374
Last seen 10.2 years ago
Hi,

For multiple arrays, as far as I can see there is currently no way that you can make a call to maNormMain that will allow maLoess to utilise weights from EACH column of your maW matrix.

maNormMain uses a controlling function maNormLoess, specified as a LIST of calls in the argument:

    f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=Flagweights, ...))

maNormLoess calls maLoess. Inside the code for maLoess the actual loess fit has an argument weights=args$w, where args$w will be the vector of Flagweights.

    fit <- loess(y ~ x, weights = args$w, subset = args$subset,
                 span = args$span, na.action = args$na.action,
                 degree = args$degree, family = args$family,
                 control = args$control)

What was probably happening in your call

    data.weight.norm = maNormMain(data, f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data@maW)))

was that only the first column of data@maW was being used as weights for the normalization vectors x="maA data for each array" and y="maM data for each array" in the loess smoother, e.g.

    data(cars)
    cars.lo <- loess(dist ~ speed, cars)
    plot(cars.lo$x, cars.lo$y)
    lines(cars.lo$x, cars.lo$fit)

    set.seed(1)
    weights <- matrix(rbinom(100, 1, 0.4), ncol=2)
    # matrix of weights: two columns
    cars.lo <- loess(dist ~ speed, cars, weights=weights)
    lines(cars.lo$x, cars.lo$fit, col="red")
    # first column of the matrix only
    cars.lo <- loess(dist ~ speed, cars, weights=weights[1:50,1])
    lines(cars.lo$x, cars.lo$fit, lty=8, col="green")

marcus
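The per-array behaviour being asked for (the nth array fitted with the nth column of weights) can be sketched in base R by looping over columns. This is only an illustration on simulated matrices using stats::loess directly; it does not call marray, and A, M, W and normalise_column are made-up names.

```r
set.seed(42)
n_spots  <- 300
n_arrays <- 3

# Simulated spots x arrays matrices of A (average log-intensity) and M (log-ratio)
A <- matrix(runif(n_spots * n_arrays, 6, 14), ncol = n_arrays)
M <- 0.2 * (A - 10) + matrix(rnorm(n_spots * n_arrays, sd = 0.3), ncol = n_arrays)

# One column of weights per array; array 1 gets all ones, as in the test above
W <- matrix(runif(n_spots * n_arrays, 0.1, 1), ncol = n_arrays)
W[, 1] <- 1

# Fit one weighted loess per array, each with its own column of weights
normalise_column <- function(j) {
  d <- data.frame(A = A[, j], M = M[, j], w = W[, j])
  fit <- loess(M ~ A, data = d, weights = w, span = 0.4, degree = 1)
  d$M - fit$fitted
}
M.norm <- sapply(seq_len(n_arrays), normalise_column)

# For comparison: the same fits with no weights at all
M.unweighted <- sapply(seq_len(n_arrays), function(j) {
  d <- data.frame(A = A[, j], M = M[, j])
  d$M - loess(M ~ A, data = d, span = 0.4, degree = 1)$fitted
})

# Array 1 (all weights 1) matches the unweighted fit; array 2 does not
print(isTRUE(all.equal(M.norm[, 1], M.unweighted[, 1])))
print(isTRUE(all.equal(M.norm[, 2], M.unweighted[, 2])))
```

This mirrors the observation earlier in the thread: an all-ones column makes the weighted result indistinguishable from the unweighted one, which is why the problem was easy to miss.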
ADD COMMENT
0
Entering edit mode
@michael-watson-iah-c-378
Last seen 10.2 years ago
Thank you, that is pretty much as I expected :-D

My only point being that the very nature of an marrayRaw object allows each individual array to have different weights to every other array, so it is a little confusing that the maNormMain function applies only one set of weights to multiple arrays.

Anyway, I have taken enough help from this list, so here is some contributed code that *should* do what I want:

    # raw data should be in an marrayRaw object called "data"
    # layout object should be in an marrayLayout object called "layout"
    # the marrayRaw object must have a maW slot

    # number of arrays
    x = 3

    # create layout here if none exists
    # layout = read.marrayLayout(ngr = 12, ngc = 4, nsr = 14, nsc = 15)

    # type of normalisation - 'maPrintTip' will give print-tip Loess,
    # NULL will give Loess
    z = "maPrintTip"

    # define matrices to hold the weighted data
    # 10080 is the number of spots on my array (12 * 4 * 14 * 15)
    mam = matrix(nrow = 10080, ncol = x)
    maa = matrix(nrow = 10080, ncol = x)
    maw = matrix(nrow = 10080, ncol = x)

    # normalise each array individually, using its own column of weights
    for (i in seq(1,x)) {
        temp = maNormMain(data[,i], f.loc = list(maNormLoess(x="maA", y="maM", z=z, w=data[,i]@maW)))
        mam[,i] = maM(temp)
        maa[,i] = maA(temp)
        maw[,i] = data[,i]@maW
    }

    # create a marrayNorm object
    # (the maNormCall slot is taken from a whole-object call to maNormMain)
    my.weight.norm = new('marrayNorm', maA = maa, maM = mam, maW = maw,
        maLayout = layout,
        maNormCall = maNormCall(maNormMain(data, f.loc = list(maNormLoess(x="maA", y="maM", z="maPrintTip", w=data@maW)))))
ADD COMMENT
0
Entering edit mode
Yuk Fai Leung ▴ 100
@yuk-fai-leung-416
Last seen 10.2 years ago
Hi there,

I have encountered the same problem of using different weights to normalize my data recently. As a naïve biologist with limited programming skills, I modified maNormMain, maNormLoess and maNormMAD in a "dirty way" so that I may use the respective weights for each array.

    maNormMain <- function (
        ... some codes omitted...
        for (i in 1:ncol(maM(mbatch))) {
            globali <<- i
        ... more codes omitted
    }

    maNormLoess <- function (x = "maA", y = "maM", z = "maPrintTip", w = NULL,
                             subset = TRUE, span = 0.4, ...)
    {
        function(m) {
            if (is.character(z))
                maLoess(x = eval(call(x, m)), y = eval(call(y, m)),
                        z = eval(call(z, m)), w = w,
                        subset = subset[,globali], span = span, ...)
            else
                maLoess(x = eval(call(x, m)), y = eval(call(y, m)),
                        z = TRUE, w = w,
                        subset = subset[,globali], span = span, ...)
        }
    }

    maNormMAD <- function (x = NULL, y = "maM", geo = TRUE, subset = TRUE)
    {
        function(m) {
            if (is.character(x))
                maMAD(x = eval(call(x, m)), y = eval(call(y, m)),
                      geo = geo, subset = subset[,globali])
            else
                maMAD(x = TRUE, y = eval(call(y, m)),
                      geo = geo, subset = subset[,globali])
        }
    }

After setting these commands, I can do my normalization using those non-control spots and weight >= 0:

    flag <- maControls(array.raw) == "N" & maW(array.raw) >= 0
    array.norm.ps <- maNormMain(array.raw,
        f.loc = list(maNormLoess(subset = flag)),
        f.scale = list(maNormMAD(x = "maPrintTip", subset = flag)),
        echo = TRUE)

Best regards,
Fai

________
Yuk Fai Leung
Bauer Center for Genomics Research
Harvard University
7 Divinity Avenue
Cambridge, MA 02138
Tel: 617-496-7134
Fax: 617-495-2196
email: yfleung@cgr.harvard.edu; yfleung@genomicshome.com
URL: http://genomicshome.com
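The globali variable set with <<- does the job, but the same per-array selection can be had without a global by making the array index an explicit argument of the returned closure. The sketch below shows only that design idea in base R on simulated data; make_normaliser is a hypothetical name, stats::loess stands in for maLoess, and none of this is marray code.

```r
# Hypothetical factory: the returned function takes the array index i explicitly,
# so no global counter is needed to pick the right column of the subset matrix
make_normaliser <- function(subset_matrix, span = 0.4) {
  function(A, M, i) {
    keep <- subset_matrix[, i]
    d <- data.frame(A = A[, i], M = M[, i])
    # fit on the selected spots of array i only
    fit <- loess(M ~ A, data = d[keep, ], span = span, degree = 1,
                 control = loess.control(surface = "direct"))
    # subtract the fitted trend from every spot of array i
    d$M - predict(fit, newdata = d)
  }
}

set.seed(7)
n_spots <- 200
A <- matrix(runif(n_spots * 2, 6, 14), ncol = 2)
M <- 0.25 * (A - 10) + matrix(rnorm(n_spots * 2, sd = 0.2), ncol = 2)

# Made-up per-array spot selection, standing in for the flag matrix above
flag <- matrix(runif(n_spots * 2) > 0.2, ncol = 2)

normalise <- make_normaliser(flag)
M.norm <- sapply(seq_len(ncol(M)), function(i) normalise(A, M, i))
print(dim(M.norm))
```

The trade-off is that the caller has to pass i, which in the real package would mean changing the loop inside maNormMain as well.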
ADD COMMENT
