Advice on cluster hardware
Ross Boylan ▴ 10
@ross-boylan-552
Last seen 9.7 years ago
The group I am in is about to purchase a cluster. If anyone on this list has any advice on what type of hardware (or software) would be best, I'd appreciate it. I didn't find any discussion of this in the archives, but I thought some people on this list might have relevant experience.

We will have two broad types of uses: simulation studies for epidemiology (with people or cases as the units) and genetic and protein studies, whose details I don't know but you all probably do. The simulation studies are likely to make heavy use of R. I suspect that the two uses have much different characteristics, e.g., in terms of the size of the datasets to manipulate and the best tradeoffs outlined below. Other uses are possible.

Among other issues we are wondering about:

* Tradeoffs between CPU speed, memory, internode communication speed, disk size, and disk speed. As a first cut, I expect the simulations suggest emphasizing processor power and ensuring adequate memory. On the other hand, the fact that it's easy to upgrade CPUs suggests putting more money into the network supporting the CPUs. And I suspect the genomics work puts more emphasis on the ability to move large amounts of data around quickly (across the network and to disk).
* Appropriate disk architecture (e.g., local disks vs. shared network disks or SANs).
* 32 vs. 64 bit; Intel vs. AMD.

We assume it will be some kind of Linux OS (we like Debian, but vendors tend to supply RH, and Debian lacks official support for 64-bit AMD, unlike SuSE or RH). If there's a good reason, we could use something else.

Our budget is relatively modest, enough perhaps for 10-15 dual-processor nodes. We hope to expand later.

As a side issue, more a personal curiosity, why do clusters all seem to be built on dual-processor nodes? Why not more CPUs per node?

Thanks for any help you can offer.
--
Ross Boylan
Dept of Epidemiology and Biostatistics, University of California, San Francisco
530 Parnassus Avenue (Library) rm 115-4, San Francisco, CA 94143-0840
ross@biostat.ucsf.edu
wk: (415) 502-4031 | fax: (415) 476-9856 | hm: (415) 550-1062
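[Editor's note: a minimal sketch, not from the thread, of why simulation workloads of the kind described tend to favor CPU and memory over interconnect. Each replicate is independent, so the work parallelizes with essentially no internode communication; the "model" below is a hypothetical stand-in, not any actual epidemiological simulation.]

```python
# Sketch of an embarrassingly parallel simulation study. Each replicate
# estimates an attack rate from 1000 Bernoulli draws -- a toy stand-in for
# a real model, but with the same communication pattern: a seed goes in,
# one number comes out, and replicates never talk to each other.
import random
from multiprocessing import Pool

def replicate(seed):
    """Run one independent simulation replicate with its own RNG stream."""
    rng = random.Random(seed)
    hits = sum(rng.random() < 0.3 for _ in range(1000))
    return hits / 1000

def run_study(n_replicates):
    # One worker per CPU; the only data crossing process (or node)
    # boundaries is the seed and the scalar result.
    with Pool() as pool:
        return pool.map(replicate, range(n_replicates))

if __name__ == "__main__":
    rates = run_study(100)
    print(sum(rates) / len(rates))  # estimate near the true rate of 0.3
```

Because the per-replicate payload is tiny, adding CPU speed and RAM scales this kind of study almost linearly, while a faster interconnect buys little; data-heavy genomics pipelines invert that tradeoff.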
• 950 views
Ramon Diaz ★ 1.1k
@ramon-diaz-159
Last seen 9.7 years ago
Dear Ross,

I don't have any relevant experience, but we are in the process of building a cluster too, so I'll share some of my confusion with you. Our cluster will be an openMosix cluster, probably also with LVS for applications that do not migrate well. The cluster will have about 30 nodes now, possibly going up to 50.

Because we wanted to ensure painless use of applications we already use and/or develop, and we are also very interested in using Debian (ease of administration, use, and upgrading), we decided to go for a 32-bit platform (dual Xeon machines; dual-CPU machines, among other things, somewhat decrease the load on the network, because every pair of CPUs is already connected by the machine bus). We have considered HP, Dell, and IBM (including their blades). The 64-bit AMD option seemed a bit risky at this moment. But if you can consider things other than GNU/Linux, I've heard that clusters built with G5 processors can be a great idea: a lot of bang for the buck (and 64 bit, and I think much larger amounts of RAM per process).

We were concerned with potential network problems and asked about it on the openMosix list (see the openMosix-general list, the thread "openMosix cluster with 50 nodes: network issues", starting about 2 weeks ago). After those answers and some additional research, it seems that a Gigabit solution will be enough for our needs with, for instance, a Cisco 4500 switch. The problem seems to be that scaling can be poor, and if your network grows large (e.g., > 48 nodes) you might need to change switches, which becomes expensive, etc. But I understand this is not a likely problem in your case in the near future.

Disk size and disk speed did not seem critical for our intended uses; we will combine local disks of moderate size (about 60 GB) with a "master node" with about 400 GB in several disks.
The oMFS seems to work well, and machines with local disks give us more flexibility, such as if we want to remove a few from the cluster and use them standalone for something else.

Once things are up and running (or trying to run), I will be glad to provide more details of our experience.

Best,

R.

--
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +34-91-224-6972 Phone: +34-91-224-6900
http://bioinfo.cnio.es/~rdiaz
PGP KeyID: 0xE89B3462 (http://bioinfo.cnio.es/~rdiaz/0xE89B3462.asc)
Ramon:

That sounds like a tremendously exciting project! You seem to have done much research, and your solution should be both scalable and reasonably priced.

I have seen reports that R is "compatible" with openMosix, but I have not found reports of real-world experience. Does R use shared memory? Does it spawn processes that migrate efficiently? What kind of performance enhancement do you anticipate with additional nodes?

Best wishes!

Michael Benjamin, MD
Emory University, Winship Cancer Institute
Atlanta, GA USA
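[Editor's note on the migration question: an R process is single-threaded and uses no shared memory by default, so the coarse-grained pattern that suits openMosix is to launch each simulation run as its own OS process; openMosix migrates whole processes, and independent processes migrate cleanly. A hedged sketch follows; a real job might be something like `R CMD BATCH sim.R` (hypothetical), replaced here by a trivial stand-in command so the example is self-contained.]

```python
# Sketch of coarse-grained, process-level parallelism: the granularity at
# which openMosix can migrate work. Each job is an independent OS process
# with its own address space, so nothing (no threads, no shared memory)
# ties it to the node where it started.
import subprocess
import sys

def launch_jobs(seeds):
    # Stand-in for launching one R batch job per seed; here each "job"
    # just squares its seed so the sketch runs anywhere Python does.
    cmds = [[sys.executable, "-c", f"print({s} * {s})"] for s in seeds]
    procs = [subprocess.Popen(c, stdout=subprocess.PIPE) for c in cmds]
    # Collect each job's output; jobs complete independently.
    return [int(p.communicate()[0]) for p in procs]

if __name__ == "__main__":
    print(launch_jobs(range(4)))  # -> [0, 1, 4, 9]
```

Packages such as snow or rpvm offer a similar master/worker pattern from inside R itself; either way, per-node speedup should be close to linear for independent simulation runs, since the processes share nothing.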
Dear All,

Regarding Apple G5 clustering, there is a rather much-talked-about cluster of 1100 dual 64-bit G5s at Virginia Tech, which can be studied in some depth (including design documents etc., if I recall correctly) at their homepage http://computing.vt.edu/research_computing/terascale/ . Should be interesting to take a look at. The G5s are supposed to run several distros of Linux, but I don't know the details. I think Virginia Tech runs OS X 10.3 on them.

Best regards,

Anders Sjögren
PhD Student
Dept. of Mathematical Statistics
Chalmers University of Technology
Göteborg, Sweden