Question

spot filtering

Brooke-Powell, Elizabeth ▴ 170

Michael,

I was interested in how you flag your data. When you load your files, do you read in your flag column as part of a standard GenePix-type output file, so that limma uses it when the linear model is fit? I use BlueFuse, and its flag column is quite different from that of GenePix and similar programs; at present it cannot be used in limma.

I am wondering how to mark (flag) the bad data and either leave it in, or what to put in the data file to get the data ignored, i.e. can you put NA in place of the data point and have it ignored? Is it as simple as creating a new flag column that converts the BlueFuse flags into GenePix-like flags?

If I load the data file using the "other" file type option in limmaGUI, it doesn't allow me to tell it where there is a flag column. Is this something that could be fixed, assuming the flag column conforms to the GenePix style of 0, +1 and -1 calls?

Thanks for the help and insight,
Liz

------------------------------
Message: 12
Date: Thu, 21 Jul 2005 10:56:09 +0100
From: "michael watson (IAH-C)"
Subject: Re: [BioC] Gene filtering for differential expression (limma)
To: "Spela Baebler" <spela.baebler at nib.si>
Cc: bioconductor at stat.math.ethz.ch

Gordon's right, if you post an example of your targets file we will have a better idea of what you mean.

>> Initial quality control and
>> spot filtering are performed in image analysis program.

Personally, I wouldn't recommend doing this. The way R works, it's better to have all your data points present in all files. I leave all data in and flag up bad spots, and remove them at the end of the analysis, not the beginning :-)

limma limmaGUI

updated 18.8 years ago by michael watson IAH-C ★ 3.4k • written 18.8 years ago by Brooke-Powell, Elizabeth ▴ 170

score 0 · Answer 1 · 2005-07-22

Actually, I don't use any of the Bioconductor functions for reading in flags or for weighting values by spot quality.

Generally, what I do is create a table of flags, with spots as the rows and arrays as the columns. These flags are sometimes GenePix flags, sometimes composite flags I made up. Then I do all of my analysis in limma using all the data: I don't weight anything, and I don't convert anything into NAs.

At the end, I output the data from topTable() into a text file, load it into MySQL or MS Access, link it to the flags data and decide which entries in my topTable list I believe, according to the flags. Note you *could* do this linking in R using the merge() function too.
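The merge() linking mentioned in this answer can be sketched in base R. This is only an illustration under stated assumptions: the `tt` data frame stands in for real topTable() output, the `ID` key column, the spot and array names, and the "negative means bad" flag coding are all hypothetical and not taken from the thread.

```r
# Hypothetical stand-in for topTable() output; in a real analysis this would
# come from limma, e.g. topTable(fit, number = Inf).
tt <- data.frame(
  ID      = c("spot3", "spot1", "spot4"),
  logFC   = c(2.1, -1.7, 1.2),
  P.Value = c(0.0004, 0.003, 0.02),
  stringsAsFactors = FALSE
)

# Hypothetical flag table: one row per spot, one column per array.
# Negative values mark bad spots here; that coding is an assumption.
flags <- data.frame(
  ID     = c("spot1", "spot2", "spot3", "spot4"),
  array1 = c(0, 0, -1, 0),
  array2 = c(0, -1, -1, 0),
  array3 = c(0, 0, 0, 1),
  stringsAsFactors = FALSE
)

# Link the ranked list to the flags by spot ID (keeps every row of tt).
linked <- merge(tt, flags, by = "ID", all.x = TRUE)

# Count the arrays on which each spot was flagged bad.
array_cols <- c("array1", "array2", "array3")
linked$n_bad <- rowSums(linked[, array_cols] < 0)

# Restore the ranking by p-value, since merge() sorts by the key column.
linked <- linked[order(linked$P.Value), ]
print(linked)
```

With the flags sitting next to each ranked spot, the decision about which results to believe can be made at the end of the analysis, for example by setting aside spots whose `n_bad` is above some threshold you choose.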

score 0 · Answer 2 · 2005-07-25

Yes - a colleague of mine filtered out all bad spots prior to normalisation (loess) and found that, after a lot of hard work, it made only very, very small differences to the end result. Sure, if you have a huge number of bad spots, leaving them in may be a bad idea, but if you have *that* many bad spots, perhaps analysing the data is not such a good idea anyway...

-----Original Message-----
From: Brooke-Powell, Elizabeth [mailto:etbp2 at borcim.wustl.edu]
Sent: Fri 22/07/2005 7:04 PM
To: michael watson (IAH-C)
Cc: bioconductor at stat.math.ethz.ch
Subject: RE: spot filtering

Thank you for replying. That is very interesting. I am not a statistician, but when I told some people (mainly biologists) that I used a similar approach of leaving all data in and filtering later, they heavily criticized it. They said that if you put junk into the system you'll get junk out. In my opinion this would matter more if you have a lot of bad spots, but how many is too many? Have you looked at the effect of leaving the "bad" data in, particularly on the data and the make-up of the lists you get out?

Liz
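The two options discussed in this thread, leaving flagged values in or replacing them with NA, can be compared on a toy example in base R. This is a sketch only: the matrix `M`, the flag matrix, the spot and array names, and the "negative means bad" coding are all made up for illustration, and a per-spot mean stands in for a real model fit.

```r
# Hypothetical log-ratios (M-values): rows are spots, columns are arrays.
M <- matrix(
  c( 1.0,  1.2,  0.8,
     0.1,  3.0,  0.3,
    -0.5, -0.4, -0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("spot1", "spot2", "spot3"),
                  c("array1", "array2", "array3"))
)

# Hypothetical flags, same shape as M; negative means "bad" (an assumption).
flags <- matrix(0, nrow = 3, ncol = 3, dimnames = dimnames(M))
flags["spot2", "array2"] <- -1

# Option A: leave everything in.
mean_all <- rowMeans(M)

# Option B: replace flagged values with NA and skip them.
M_na <- M
M_na[flags < 0] <- NA
mean_masked <- rowMeans(M_na, na.rm = TRUE)

# Per-spot change caused by masking; only the flagged spot can move.
comparison <- data.frame(mean_all, mean_masked,
                         shift = mean_masked - mean_all)
print(round(comparison, 3))
```

If I recall the limma interface correctly, a matrix containing NA values can be passed to lmFit(), which fits each spot on its non-missing values, and lmFit() also has a `weights` argument that accepts a 0/1 matrix of the same shape as an alternative to masking.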