Fwd: Re: Fwd: RE: Heatmap function
Marcus ▴ 150
@marcus-410
> Hmm, your answer left me thinking about how to measure distances. Why
> doesn't a distance function just calculate the distance between the values
> that are there and leave out the NAs? I have used the B-test to filter away
> the spots that are supposedly not differentially expressed, so I have only
> a subset of the total number of spots. Three of my 18 slides have many NAs.
> Should I therefore exclude them because the distances are too affected?
>
> / Marcus
>
>
> At 08:21 2003-10-30 +0100, you wrote:
>
>>> Subject: RE: [BioC] Heatmap function
>>> Date: Wed, 29 Oct 2003 13:55:43 -0500
>>> Thread-Topic: [BioC] Heatmap function
>>> From: "Furge, Kyle" <kyle.furge@vai.org>
>>> To: "Marcus" <marcusb@biotech.kth.se>
>>> Cc: <bioconductor@stat.math.ethz.ch>
>>>
>>> This question has come up often at our institute... so here goes for a
>>> brief, informal explanation.
>>>
>>> Clustering depends on some type of distance metric to determine how the
>>> samples are related.
>>>
>>> If you have a set of vectors:
>>> x <- c(1,NA,2,NA)
>>> y <- c(NA,2,NA,1)
>>>
>>> x and y share no position where both values are observed, so no type of
>>> distance between them can be computed. The dist functions return NA for
>>> these types of comparisons.
>>>
>>> Clustering functions don't like NA because their job is just to organize
>>> the data by distance, and a distance of NA cannot be interpreted.
>>>
>>> What some programs do (like Eisen's Cluster) is "threshold" the NA
>>> values to some arbitrary "large" distance.
>>>
>>> The following dist function computes distances and then replaces any NA
>>> values with an arbitrarily large distance (10% greater than the largest
>>> actual distance). This function may be helpful for input into hclust
>>> because the NA values are replaced.
>>>
>>> na.dist <- function(x,...)
>>> {
>>>   t.dist <- dist(x,...)
>>>   t.dist <- as.matrix(t.dist)
>>>   t.limit <- 1.1*max(t.dist,na.rm=TRUE)
>>>   t.dist[is.na(t.dist)] <- t.limit
>>>   t.dist <- as.dist(t.dist)
>>>   return(t.dist)
>>> }
>>>
>>> I typed this from memory, so it may contain typos, but you see the idea.
>>>
>>> I hope this helps,
>>>
>>> -kyle

**********************************************************************
Marcus Gry Björklund
Royal Institute of Technology
AlbaNova University Center
Stockholm Center for Physics, Astronomy and Biotechnology
Department of Molecular Biotechnology
106 91 Stockholm, Sweden

Phone (office): +46 8 553 783 39
Fax: +46 8 553 784 81
Visiting address: Roslagstullsbacken 21, Floor 3
Delivery address: Roslagsvägen 30B
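
For anyone who wants to try the idea from the quoted message, here is a minimal sketch in R. The matrix `m` and its row names `s1` to `s4` are made-up toy data, not values from the thread, and it assumes the samples to be clustered are the rows (transpose with `t()` if slides are columns). As far as I understand base R's `dist`, it already leaves out missing values pair by pair and only returns NA when two rows share no observed column, which is the case `na.dist` patches. The last line, the per-sample NA fraction, is just one possible way to look at how strongly individual slides are affected; it is an illustration, not a recommended cutoff.

```r
# Hypothetical toy data: 4 samples (rows) x 4 spots (columns).
# s1 and s2 are the x and y vectors from the message; s3 and s4 are invented.
m <- rbind(
  s1 = c(1, NA, 2, NA),
  s2 = c(NA, 2, NA, 1),
  s3 = c(1, 2, 2, 1),
  s4 = c(3, NA, 5, 2)
)

# dist() drops missing values pairwise; a pair with no shared
# observed column (s1 vs s2) gets a distance of NA.
d <- dist(m)
print(d)

# The na.dist idea from the message: replace NA distances with
# 1.1 times the largest distance that could be computed.
na.dist <- function(x, ...) {
  t.dist <- as.matrix(dist(x, ...))
  t.limit <- 1.1 * max(t.dist, na.rm = TRUE)
  t.dist[is.na(t.dist)] <- t.limit
  as.dist(t.dist)
}

# The filled distances contain no NA, so hclust() accepts them.
d.filled <- na.dist(m)
hc <- hclust(d.filled)

# Fraction of missing values per sample, to spot heavily affected slides.
na.frac <- rowMeans(is.na(m))
print(na.frac)
```

Note that the replaced distances are arbitrary by construction, so samples with many NAs will tend to sit far from everything else in the resulting tree.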