Question: Hierarchical clustering of RMA data
11.2 years ago, Ryan Kirkbride wrote:
Hello all!

I have a basic conceptual question. I have a set of RMA-normalized data on which I would like to carry out hierarchical clustering. In the past we have usually worked with MAS5 data, which we import into dChip to carry out the clustering. I now want to do the same with RMA data, and I am wondering whether I should transform it to a linear scale or leave it on the usual log2 scale. dChip does a per-gene normalization (it subtracts the mean and then divides by the standard deviation), and it appears that the choice of linear or log2 scale affects the results. I assume most people just leave it on the log2 scale. Am I overthinking the whole issue?

Thanks,

Ryan Kirkbride
Plant Biology Graduate Student
Harada Lab
UC Davis
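To make the per-gene normalization described above concrete, here is a minimal Python sketch that standardizes one gene on both scales. The expression values are hypothetical, and the use of the sample standard deviation is an assumption; whether dChip uses exactly this formula is not confirmed here.

```python
import statistics

def standardize(values):
    """Subtract the mean, then divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - mean) / sd for v in values]

# Hypothetical log2-scale expression values for one gene across four arrays.
log2_values = [6.0, 7.0, 8.0, 11.0]
# The same measurements on the linear scale: 64, 128, 256, 2048.
linear_values = [2.0 ** v for v in log2_values]

z_log2 = standardize(log2_values)
z_linear = standardize(linear_values)

for z_l, z_n in zip(z_log2, z_linear):
    print(f"log2 z = {z_l:+.3f}   linear z = {z_n:+.3f}")
```

Both standardized profiles have mean 0 and standard deviation 1, but the individual values differ, because the log transform is nonlinear and changes the relative spacing of the measurements before they are standardized.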
Tags: normalization, clustering
Answer: Hierarchical clustering of RMA data
11.2 years ago, Deanne Taylor wrote:
Ryan:

This might be a naive question, as I'm not sure how dChip does the normalization, but is there a setting in dChip to tell it that the data are on a log2 scale? Otherwise the mathematics on the log and linear scales would be quite different, and that might be the source of the difference, since subtracting log2 values is akin to dividing on the linear scale.

Deanne Taylor PhD
Executive Director, Bioinformatics Core
Department of Biostatistics
Harvard School of Public Health
655 Huntington Avenue
Boston, MA 02115
dtaylor at hsph.harvard.edu
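The point that subtraction on the log2 scale corresponds to division on the linear scale can be checked with a short Python sketch. The intensities below are hypothetical and are not taken from any dataset in this thread.

```python
import math
import statistics

# Hypothetical linear-scale intensities for one gene across four arrays.
linear_values = [64.0, 128.0, 256.0, 2048.0]
log2_values = [math.log2(v) for v in linear_values]

# Subtracting two log2 values gives the log2 of their ratio.
log_difference = log2_values[3] - log2_values[0]
log_of_ratio = math.log2(linear_values[3] / linear_values[0])

# Mean-centering on the log2 scale is the same as dividing each linear value
# by the geometric mean and then taking log2.
mean_log2 = statistics.mean(log2_values)
geometric_mean = 2.0 ** mean_log2
centered_log2 = [v - mean_log2 for v in log2_values]
log2_of_scaled = [math.log2(v / geometric_mean) for v in linear_values]

print(log_difference, log_of_ratio)
print(centered_log2)
print(log2_of_scaled)
```

So a tool that subtracts the mean from log2 data is in effect dividing by the geometric mean, whereas subtracting the mean from linear data removes an arithmetic offset. The two operations are not interchangeable.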
RESENDING WITHOUT ATTACHMENT:

Scaling is well known to change hierarchical (and non-hierarchical) clustering results. The decision to transform the data has to be considered in terms of how the transformation will affect the distance calculations. We are very comfortable with transforming to induce properties such as normality or homoscedasticity; however, that is not necessarily why we would do it in a clustering problem. I attached (in a previous post) a review article on clustering microarray data that shows a simple example of how scaling results in different clusters, and why one would be used over the other (Pharmacogenomics, 2003, Vol. 4(1), pp. 41-52).

Bill Shannon
Associate Professor of Biostatistics in Medicine
Washington University School of Medicine
St Louis
President-Elect, Classification Society
Reply written 11.2 years ago by William Shannon
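As a small illustration of how a transformation feeds into the distance calculations, the following Python sketch compares Euclidean nearest neighbours on the linear and log2 scales. The three gene profiles are hypothetical, Euclidean distance is an assumed choice of metric, and this is not the example from the cited review.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest(name, profiles):
    """Name of the profile closest to profiles[name], excluding itself."""
    target = profiles[name]
    others = [k for k in profiles if k != name]
    return min(others, key=lambda k: euclidean(target, profiles[k]))

# Hypothetical linear-scale profiles across three arrays.
linear = {
    "geneX": [8.0, 1024.0, 64.0],
    "geneY": [512.0, 1024.0, 64.0],  # differs from geneX at a low intensity
    "geneZ": [8.0, 2048.0, 64.0],    # differs from geneX at a high intensity
}
log2 = {name: [math.log2(v) for v in values] for name, values in linear.items()}

nearest_linear = nearest("geneX", linear)
nearest_log2 = nearest("geneX", log2)

print("linear scale:", nearest_linear)
print("log2 scale:  ", nearest_log2)
```

On the linear scale the absolute difference at the high-intensity array dominates, so geneX is closest to geneY. On the log2 scale distances reflect fold changes, so geneX is closest to geneZ. A hierarchical clustering built from these distances would merge different pairs first, depending on the scale.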