Clustering like samples with a lot of data
University of Arizona

I have a very large data frame/matrix with >100 samples and >50,000 data points per sample. I'd like to cluster the samples to see which have similar patterns. The data look like this:


Row.names   Region1   Region2   ...   RegionN
P1          3         3         ...   2
P2          4         4         ...   2
P3          4         4         ...   2


What I'd like is a graphical representation of the cell values, with similar individuals clustered together. Then I'd like to do it again, clustering both the samples and the regions. I've tried using this as a guide, but R has been choking. I assume there's a package out there that can do this from a table/matrix of data.


Any help would be appreciated.

clustering • 569 views

What do you mean by choking? If it is too slow, you may need to reduce the dimensionality of your data first, for example by variance filtering (keeping only the most variable regions). I use the aheatmap function from the NMF package; it is very good, but it too will struggle with a very large number of features.
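
To make the variance-filtering suggestion concrete, here is a minimal sketch in base R. It is only an illustration: the matrix `m` is simulated stand-in data (samples in rows, regions in columns), the cutoff `n_keep` is an arbitrary assumption to tune, and it draws with base `heatmap()` rather than `aheatmap` so that it runs without extra packages.

```r
# Simulated stand-in for the real data: 100 samples (rows) x 5,000 regions (columns).
set.seed(1)
m <- matrix(rpois(100 * 5000, lambda = 3), nrow = 100,
            dimnames = list(paste0("P", 1:100), paste0("Region", 1:5000)))

# Variance filtering: keep only the n_keep most variable regions.
# n_keep is an arbitrary illustrative cutoff, not a recommended value.
n_keep <- 1000
region_var <- apply(m, 2, var)
keep <- order(region_var, decreasing = TRUE)[seq_len(min(n_keep, ncol(m)))]
m_filt <- m[, keep, drop = FALSE]

# Hierarchical clustering of the samples on the filtered matrix
# (Euclidean distance, average linkage; both are choices, not requirements).
hc_samples <- hclust(dist(m_filt), method = "average")

# Heatmap 1: cluster samples (rows) only; Colv = NA leaves regions in place.
heatmap(m_filt, Rowv = as.dendrogram(hc_samples), Colv = NA, scale = "none")

# Heatmap 2: cluster both samples and regions (heatmap's default behaviour).
heatmap(m_filt, scale = "none")
```

Clustering the regions as well requires a distance matrix across all retained columns, which grows quadratically with their number, so filtering first is what keeps the second heatmap tractable.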

