A.J. Rossini
"Michael Benjamin" <msb1129@bellsouth.net> writes:
> Progress update (summarized from my forum for such matters at
> http://www.theschedule.net/forum/gforum.cgi?forum=20&do=forum_view):
>
> Briefly, I created a four-node cluster out of Pentium-III boxes and
> Debian Linux/openMosix. I saw no significant performance boost of
> ReadAffy or expresso using the set of 165 .CEL files from Harvard. None
> of the processes migrated, as they say in the world of high-performance
> computing. R.bin runs in one process, and everything it does seems to
> stay in that process. No real opportunity for parallelization here, at
> least not on openMosix.
>
> I'd like to analyze these chips in a reasonable amount of time, without
> paying Dell $45,000 for 4-Xeon SMP server.
>
> I worry what we'll do with 1,000 .CEL files. The analytical techniques
> work well, but pretty slow even if your amp "goes to 11."
>
> Any thoughts?
Explicitly parallelize the routine. OpenMOSIX is nice, but it's still
not a production environment with R.
That's why Michael Li and I wrote RPVM/RSPRNG as well as worked with
Luke Tierney on SNOW. The tools are there, but someone has to do the
programming. That means that you can hire someone with the money you
won't spend on software or hardware, or you can wait.
That being said, the 4-way Xeon server isn't going to help with
parallelization of a single process, and you'd get the same work done
with a remote execution shell (i.e. firing off R BATCH or using
Emacs/ESS-Elsewhere on other machines).
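
As a rough illustration of the remote-execution route, here is a minimal dry-run sketch in POSIX shell. It splits a list of .CEL files round-robin into one chunk per machine and prints the ssh command that would start an R batch job on each. Everything named here is an assumption for illustration only: the host names, the shared working directory, the script process_chunk.R, and the CEL_LIST environment variable are hypothetical, and the sketch assumes all nodes see the same directory (e.g. over NFS). It only suits steps that are independent per chip; anything that normalizes across all chips still needs the full set in one process.

```shell
#!/bin/sh
# Dry-run sketch: split .CEL files across hosts and print the remote
# R batch command for each host. Nothing is executed remotely.
# Hypothetical names: process_chunk.R, CEL_LIST, chunk_<i>.txt.

# split_round_robin N OUTDIR FILE...
# Writes OUTDIR/chunk_0.txt ... OUTDIR/chunk_(N-1).txt, one file name per line.
split_round_robin() {
  n=$1
  outdir=$2
  shift 2
  # Start from empty chunk files so reruns do not append to old lists.
  i=0
  while [ "$i" -lt "$n" ]; do
    : > "$outdir/chunk_$i.txt"
    i=$((i + 1))
  done
  # Deal the files out in turn, like cards, to balance the chunk sizes.
  i=0
  for f in "$@"; do
    printf '%s\n' "$f" >> "$outdir/chunk_$((i % n)).txt"
    i=$((i + 1))
  done
}

# remote_command HOST WORKDIR CHUNK_INDEX
# Prints (does not run) the ssh command that would start one R batch job.
# The R script is assumed to read its file list from Sys.getenv("CEL_LIST").
remote_command() {
  printf 'ssh %s "cd %s && CEL_LIST=chunk_%s.txt R CMD BATCH process_chunk.R chunk_%s.Rout"\n' \
    "$1" "$2" "$3" "$3"
}
```

To use it, one would call split_round_robin with the number of machines, the shared directory, and the .CEL file names, then loop over the hosts calling remote_command with each host's chunk index, and run the printed commands in the background once they look right. Each job is a separate R process, so no data crosses the wire beyond what the shared filesystem serves.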
The data-sharing/locality problem is an interesting one that will need
to be solved. Not sure how we'll go about that. See our tech report
for an anecdotal example of how one can naively end up twice as slow
on the parallel system (later pathological examples that I've
constructed show the slowdown increasing somewhat with the number of
processors) due to sending data "over the wire" between machines.
best,
-tony
--
rossini@u.washington.edu
http://www.analytics.washington.edu/
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email