Clustering like samples with a lot of data
@gaiusjaugustus-10041
University of Arizona

I have a very large dataframe/matrix with >100 samples and >50,000 datapoints for each sample.  I'd like to cluster the samples to get an idea of which have similar patterns.  The data looks like this:

Row.names   Region1   Region2   .....   Region N
P1          3         3                 2
P2          4         4                 2
P3          4         4                 2

What I'd like is a graphical representation of the values in the cells, with similar individuals clustered together. Then I'd like to do it again, clustering both the samples and the regions. I've tried using this as a guide, but R has been choking. I assume there's a package out there that can do this from a table/matrix of data.

Any help would be appreciated.

clustering • 843 views

What do you mean by choking? If it is too slow, you sometimes need to reduce the dimensionality of your data first, for example via variance filtering. I use the aheatmap function from the NMF package; it is very good, but again, it won't handle loads of features.
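To make the variance filtering idea concrete, here is a minimal sketch in plain Python, using the toy values from the table in the question. The function name `top_variance_columns` and the cutoff `k` are made up for illustration and are not part of any package. The same idea carries over to R: compute the variance of each region across samples, keep the most variable regions, and pass only those to the heatmap function.

```python
from statistics import pvariance


def top_variance_columns(rows, names, k):
    """Return the names of the k columns with the highest variance across rows.

    Hypothetical helper for illustration only.
    """
    columns = list(zip(*rows))  # transpose: one tuple of values per region
    variances = [pvariance(col) for col in columns]
    # Rank column indices by variance, highest first (ties keep input order)
    ranked = sorted(range(len(names)), key=lambda i: variances[i], reverse=True)
    keep = sorted(ranked[:k])  # restore the original column order
    return [names[i] for i in keep]


# Toy data copied from the table in the question (samples x regions)
regions = ["Region1", "Region2", "Region N"]
samples = {"P1": [3, 3, 2], "P2": [4, 4, 2], "P3": [4, 4, 2]}

# Keep the 2 most variable regions; "Region N" is constant, so it is dropped
kept = top_variance_columns(list(samples.values()), regions, k=2)
print(kept)
```

With real data, `k` would be something like a few thousand regions out of the 50,000, chosen by trial so that the clustering finishes in reasonable time.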
