Dear all,
I have a burning question regarding the use of gene expression pattern
standardization (i.e., subtracting the mean or median and dividing by
the SD), as is commonly done prior to clustering.
During gene selection I typically apply a fold-change cutoff to a
list of differentially expressed genes to further enrich the list for
candidate genes with larger fold changes. I have noticed a very strong
positive correlation between fold change and variance (i.e., genes with
a big fold change have big variances). However, when I plot the fold
changes of standardized patterns, the correlation between fold change
and variance has, as expected, been eliminated. I was wondering whether
a fold-change filter is still meaningful once the patterns have been
standardized and, if not, what the reasons for this are (i.e., why
is it no longer meaningful)?
Any feedback is much appreciated!
Sincerely,
Johan
Department of Molecular and Cell Biology
University of Cape Town
South Africa
On Fri, Sep 19, 2008 at 9:56 AM, Johan van Heerden <jvhn1 at yahoo.com> wrote:
>
> Dear all,
>
> I have a burning question regarding the use of gene expression pattern
> standardization (i.e., subtracting the mean or median and dividing by
> the SD), as is commonly done prior to clustering.
>
> During gene selection I typically apply a fold-change cutoff to a
> list of differentially expressed genes to further enrich the list
> for candidate genes with larger fold changes. I have noticed a very
> strong positive correlation between fold change and variance (i.e.,
> genes with a big fold change have big variances). However, when I plot
> the fold changes of standardized patterns, the correlation between
> fold change and variance has, as expected, been eliminated. I was
> wondering whether a fold-change filter is still meaningful once the
> patterns have been standardized and, if not, what the reasons for
> this are (i.e., why is it no longer meaningful)?

Fold change is related to the variance, yes. More highly variable
genes can (but might not) show higher fold changes, while genes with
lower variance will necessarily show lower fold changes on average
(but these changes could still be highly significant). When gene
expression is standardized, every gene is rescaled to the same variance
across samples, so each gene's fold change is divided by that gene's
own SD. The fold changes are therefore affected in the same way, on
average: they no longer measure the size of the original change.
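To make the point above concrete, here is a minimal sketch in Python
with NumPy. Everything in it is an assumption for illustration: the
data are simulated log2 expression values for two groups of six
samples, and the names (expr, abs_log_fc, standardize) are hypothetical
and not taken from any package. It is not meant as a recommended
analysis, only as a way to see what standardization does to a
fold-change filter.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_per_group = 2000, 6

# Simulated (hypothetical) log2 expression: two groups of samples.
# Each gene has its own noise SD and its own true group difference.
noise_sd = rng.uniform(0.1, 1.0, size=n_genes)
true_diff = rng.normal(0.0, 1.0, size=n_genes)
group_a = rng.normal(0.0, noise_sd[:, None], size=(n_genes, n_per_group))
group_b = rng.normal(true_diff[:, None], noise_sd[:, None],
                     size=(n_genes, n_per_group))
expr = np.hstack([group_a, group_b])


def abs_log_fc(x):
    """Absolute difference of group means (log2 fold change on log data)."""
    return np.abs(x[:, n_per_group:].mean(axis=1)
                  - x[:, :n_per_group].mean(axis=1))


def standardize(x):
    """Per-gene z-score across samples: subtract mean, divide by SD."""
    centered = x - x.mean(axis=1, keepdims=True)
    return centered / x.std(axis=1, ddof=1, keepdims=True)


raw_fc = abs_log_fc(expr)
raw_var = expr.var(axis=1, ddof=1)

z = standardize(expr)
z_fc = abs_log_fc(z)
z_var = z.var(axis=1, ddof=1)

# On the raw scale, fold change and variance are positively correlated.
corr_raw = np.corrcoef(raw_fc, raw_var)[0, 1]
print("correlation of |log FC| with variance (raw):", corr_raw)

# After standardization every gene has variance 1, so there is nothing
# left for the fold change to correlate with.
print("variance range after standardization:", z_var.min(), z_var.max())

# The standardized "fold change" is the raw one divided by the gene's SD.
print("z_fc equals raw_fc / SD:", np.allclose(z_fc, raw_fc / np.sqrt(raw_var)))
print("largest standardized |FC|:", z_fc.max())
```

In this sketch the standardized fold change behaves like an effect size
(difference in units of SD) rather than a fold change, and it is bounded
above for a fixed number of samples. A cutoff applied to it therefore
answers a different question from a cutoff applied to the raw data.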
When clustering standardized gene expression, the clustering no longer
reflects the original relative changes. This can be fine for making a
heatmap look nice, but it has the disadvantage of distorting how
"good" a gene looks (depending on your definition of "good").
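A small sketch of the clustering point, again in Python with NumPy and
with made-up numbers: the three profiles below are hypothetical, and
standardize is an illustrative helper, not a library function. Two
genes with the same shape but a tenfold difference in amplitude end up
at distance zero after standardization, because the Euclidean distance
between standardized profiles depends only on their Pearson
correlation.

```python
import numpy as np


def standardize(x):
    """Z-score one expression profile across its samples."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)


# Hypothetical log2 expression profiles across six samples.
gene_small = np.array([0.0, 0.1, 0.0, 0.2, 0.3, 0.2])  # subtle change
gene_large = 10.0 * gene_small + 5.0                   # same shape, 10x amplitude
gene_other = np.array([0.3, 0.2, 0.2, 0.0, 0.1, 0.0])  # different shape

# Raw scale: the two same-shaped genes are far apart.
raw_dist = np.linalg.norm(gene_small - gene_large)

# Standardized scale: they become indistinguishable.
std_dist = np.linalg.norm(standardize(gene_small) - standardize(gene_large))
print("raw distance:", raw_dist)
print("standardized distance:", std_dist)

# For standardized profiles, squared distance = 2 * (n - 1) * (1 - r),
# where r is the Pearson correlation of the original profiles.
n = gene_small.size
r = np.corrcoef(gene_small, gene_other)[0, 1]
d2 = np.linalg.norm(standardize(gene_small) - standardize(gene_other)) ** 2
print("identity holds:", np.isclose(d2, 2 * (n - 1) * (1 - r)))
```

So clustering standardized data groups genes by the shape of their
pattern only; the size of the change, which a fold-change filter looks
at, no longer plays any role.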
Hope that helps.
Sean