Is flagging the same as filtering? In my Limma script it takes only
those ESTs with a flag of 0 (which are good spots).
myfun <- function(x) as.numeric ( x$Flag >0)
Is this not the same as filtering? If I actually remove the absent
spots from my Imagene files, then the files each have different
lengths and the order of genes is not the same in each file.
Sally
Hi Sally,
Your script is transferring the flags to weights, and in your script,
only ESTs with a flag of 0 get a weight of 1, and all other spots get
a weight of 0, which means they are not used at all in the analysis.
So yes, you are in effect "filtering" out these individual spots by
setting the weights to 0, which is exactly what Gordon said you
should not do. I second this opinion for the following reason. A
"bad" spot is one where you have no information whatsoever about what
the expression level might have been. The number that you get
(because you always get a number) has no relationship at all to the
real value, so you should throw it out by giving it a weight of 0.
However, if you don't measure anything above background
for a particular spot (which GenePix will flag -50), it's not a "bad"
spot, because you do have useful information that the expression
level is below detection, and the number that you get will be
relatively valid compared to other spots that had detectable
expression. Would you throw out values of 0 if you got them in any
other scientific measurement? Likely not, so why throw them out here?
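To make the flag-to-weight mapping concrete, here is a minimal sketch in base R. The data frame `spots` is a made-up stand-in for the columns of one Imagene file, and the names `wt_flag0_only` and `wt_as_posted` are hypothetical. Note one assumption: the behaviour described in this thread (weight 1 only for flag 0) corresponds to `x$Flag == 0`, whereas the function as posted, read literally, tests `x$Flag > 0`. Both are shown so the difference is visible.

```r
# Toy stand-in for the columns of one Imagene file (values are made up).
spots <- data.frame(
  ID   = c("est1", "est2", "est3", "est4"),
  Flag = c(0, 1, 2, 0)
)

# Weight function matching the behaviour described in the thread:
# weight 1 for Flag 0, weight 0 for every other flag.
wt_flag0_only <- function(x) as.numeric(x$Flag == 0)

# The function as posted, read literally: weight 1 for any Flag above 0.
wt_as_posted <- function(x) as.numeric(x$Flag > 0)

w_described <- wt_flag0_only(spots)
w_posted    <- wt_as_posted(spots)

# Side-by-side view of the flags and the two sets of weights.
print(cbind(spots, w_described, w_posted))
```

A spot with weight 0 stays in the file, so every array keeps the same length and gene order, but it contributes nothing to the fit.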
Cheers,
Jenny
At 11:30 AM 1/15/2009, Sally wrote:
>Is flagging the same as filtering? In my Limma script it takes only
>those ESTs with a flag of 0 (which are good spots).
>
>myfun <- function(x) as.numeric ( x$Flag >0)
>
> Is this not the same as filtering? If I actually remove the
> absent spots from my Imagene files, then the files each have
> different lengths and the order of genes is not the same in each
> file.
>
>Sally
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu
Hi Sally,
I'll answer both your e-mails here and post it back to the list, so
the answers can become part of the archives.
At 07:34 PM 1/17/2009, Sally wrote:
>Hi Jenny,
>
>Do you mean I should not have a myfun script? Should I be giving
>weights to spots at all? I'm a bit confused as to what I should do.
That depends on what the flags are in your data files...
>Thanks for the reply. I used Imagene to scan my slides. Imagene is
>fairly primitive. It only has 3 flags: 0, 1, 2. 0 is a 'good'
>spot, 1 is a spot which was marked as no good by the user during
>gridding, and 2 is a bad spot. Are you saying I should not 'filter
>out' these spots (before computing DE genes) using the script
>
>myfun <- function(x) as.numeric ( x$Flag >0)?
>
>When you say Gordon says not to flag out [?filter out] spots (in my
>case with Imagene) using
>
>myfun <- function(x) as.numeric ( x$Flag >0).
>
>How should I re-write this script to include all spots?
You should filter out spots that the user marked as no good during
the gridding (dust spots, scratches, etc.), but not the spots the
program automatically marked as bad (usually not above background).
So if you want to give a weight of 1 to all the spots that have flags
not equal to 1, then your function is:
myfun <- function(x) as.numeric(x$Flag != 1)
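As a quick check of what this function does, here is a small sketch in base R. The `spots` data frame is made-up toy data, not real Imagene output. The commented-out call at the end assumes limma's `read.maimages()` with its `wt.fun` argument and a hypothetical two-column matrix `files` of Imagene file names; adjust it to your own setup.

```r
# Toy flags using the three Imagene values (0 = good, 1 = user-marked, 2 = bad).
spots <- data.frame(Flag = c(0, 1, 2, 0, 2))

# Weight 0 only for spots the user marked as no good (Flag 1);
# everything else, including Flag 2, keeps weight 1.
myfun <- function(x) as.numeric(x$Flag != 1)

w <- myfun(spots)
print(data.frame(Flag = spots$Flag, weight = w))

# Assumed usage with limma (not run here; 'files' is hypothetical):
# library(limma)
# RG <- read.maimages(files, source = "imagene", wt.fun = myfun)
```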
Good luck,
Jenny
>Take care,
>
>Sally
>----- Original Message ----- From: "Jenny Drnevich" <drnevich at illinois.edu>
>To: "Sally" <sagoldes at shaw.ca>; <bioconductor at stat.math.ethz.ch>
>Cc: "Sally" <sagoldes at shaw.ca>
>Sent: Friday, January 16, 2009 7:55 AM
>Subject: Re: [BioC] Filtering before differential analysis