Computer for the analysis of high-throughput genomic data
@capurro-alberto-dr-5673
Hello,

I need to purchase a desktop computer for the analysis of high-throughput genomic data using R/Bioconductor. Which specifications should I look for (for example RAM, processor type, number of cores)? Can you mention specific examples of state-of-the-art machines suited to this purpose?

Thank you very much.

Alberto Capurro
Marie Curie Research Fellow
Department of Cell Physiology and Pharmacology
College of Medicine, Biological Sciences and Psychology
Maurice Shock Medical Sciences Building, Room 319
University of Leicester
Leicester LE1 9HN
United Kingdom
https://sites.google.com/site/albertocapurro/
@steve-lianoglou-2771
Hi,

On Thu, Dec 27, 2012 at 11:14 AM, Capurro, Alberto (Dr.) <ac331 at leicester.ac.uk> wrote:
> Hello,
>
> I would need to purchase a desktop computer for the analysis of high-throughput genomic data using R/Bioconductor. I want to ask you the specifications that I should look for (as for example ram memory, processor type, number of cores). Can you mention specific examples of state of the art machines fitted for this purpose?

This is a hard question to answer. Is the machine for yourself, for a lab, or for more users? What types of analysis will you be doing? Will you be working with raw NGS data and, if so, doing sequence alignment, assembly, etc.? Do you have a compute cluster to push those types of jobs over to?

Anyway, absent any of that information, for us mere mortals who shop at "pro-sumer" levels, I wouldn't complain too much if I had, say, a 12-core, 32-64 GB RAM OS X or Linux machine sitting underneath my desk.

If you're working with NGS data, no matter how much HD space you get, you will always run out, so you will constantly have to add upgrades there, but start with several terabytes. You'll also have to think about how you plan to back up your data, but I'll leave that as another topic.

But, as I said, the more (of everything) the merrier.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
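As a rough illustration of the baseline suggested in this reply (12 cores, 32-64 GB RAM, several terabytes of disk), here is a minimal Python sketch that reports what an existing machine offers and compares it with those numbers. The 4 TB disk threshold and the helper names (`machine_report`, `compare`) are assumptions made for the example, not something from the thread, and the RAM lookup relies on `os.sysconf`, which is only available on Unix-like systems.

```python
import os
import shutil

# Baseline from the reply: 12 cores, 32 GB RAM (low end of 32-64 GB).
# The 4 TB figure is an assumed stand-in for "several terabytes".
BASELINE = {"cores": 12, "ram_gb": 32, "disk_tb": 4}


def total_ram_gb():
    """Return physical RAM in GB, or None where sysconf is unavailable (e.g. Windows)."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 1024**3


def machine_report(path="."):
    """Collect core count, RAM, and total size of the volume holding `path`."""
    return {
        "cores": os.cpu_count(),
        "ram_gb": total_ram_gb(),
        "disk_tb": shutil.disk_usage(path).total / 1024**4,
    }


def compare(report, baseline=BASELINE):
    """Map each resource to True/False (meets baseline) or None (unknown)."""
    return {
        key: None if report.get(key) is None else report[key] >= baseline[key]
        for key in baseline
    }


if __name__ == "__main__":
    report = machine_report()
    for key, ok in compare(report).items():
        status = "unknown" if ok is None else ("ok" if ok else "below baseline")
        print(f"{key}: {report[key]} ({status})")
```

Running it on a candidate machine (or on a colleague's workstation) gives a quick sense of how far it is from the suggested configuration; the thresholds are easy to change in `BASELINE`.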
Thank you very much.

I will do microarray analysis at first, but in the future we are also interested in sequencing. The computer is for the lab, and I will be in charge of the processing. I have experience in computational neuroscience but not in genomics, so I am learning now.

I think the university usually buys Windows machines. Regarding the operating system, is there an important reason to use Linux instead of Windows 7 to run Bioconductor and R? I can use Linux if it is better. I can get 10 TB of storage, with backups on an external disk and in space provided by the university network.

Thank you very much.

Best,
Alberto

Alberto Capurro
@steve-lianoglou-2771
Hi,

On Fri, Dec 28, 2012 at 4:36 AM, Capurro, Alberto (Dr.) <ac331 at leicester.ac.uk> wrote:
> Thank you very much. I will do microarray analysis at first but in the future we are also interested in sequencing. The computer is for the lab, I will be in charge of the processing, I have experience in computational neuroscience but not in genomics, so I am learning now. I think that the Uni usually buys windows machines. Regarding the operating system, is there an important reason to use linux instead of windows 7 to run bioconductor and R?. I can use linux if it is better. I can get 10 T and backup in and external disk and in space provided by the Uni network.

Without inciting a flamewar, I don't think it's too controversial to say that most scientific tools in this space are written for Linux first, then tweaked to run on OS X (us OS X folks are, by default, stuck on an older version of gcc, so some tweaks are harder than others), and Windows is likely the afterthought. Look at, for example, some of the aligners out there:

* Bowtie provides compiled binaries for Linux and OS X, no Windows: http://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.0.4/
* The STAR aligner runs on Linux, and was recently tweaked to run on OS X (not sure if it's entirely working).
* bwa's SF page suggests it only runs on Linux and BSD (OS X).
* "A unix system" is listed as a prerequisite for installing GSNAP.

For the most part, however, this isn't true for the R/Bioconductor packages you will likely be using. AFAIK, the majority of the Bioc packages work just fine on Unix, OS X, and Windows.

Also, if you're planning on having several people log into the machine to do work, then I think a *nix is likely going to be your best bet.

So, to be honest, even though I have a slight OS X bent, if I were in your shoes and was put in a position to buy a workhorse machine, I'd go Linux.

I assume you and the other members of the lab will have your own desktops/laptops for downstream analysis, which can run the OS of your choosing. After doing some of the heavy lifting on a compute server (I'm thinking of alignment/assembly), you can likely do most of your work on a lower-powered machine, especially if we're talking about more "canned"/routine analysis. I've done lots of downstream analysis on my 8 GB RAM, dual-core MacBook Pro, for instance, although having access to some big iron for heavy computing at times is totally necessary.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
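To see how the platform point plays out on a given machine, a small Python sketch like the following could check which of the aligners mentioned in this reply are installed and reachable on the `PATH`. The executable names (`bowtie2`, `STAR`, `bwa`, `gsnap`) are assumed to be the usual command names for these tools, and `find_tools` is a hypothetical helper written for this example.

```python
import platform
import shutil

# Assumed command names for the aligners discussed in the reply.
ALIGNERS = ["bowtie2", "STAR", "bwa", "gsnap"]


def find_tools(names=ALIGNERS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Platform:", platform.system())
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A tool showing up as "not found" only means it is not installed or not on the `PATH`; whether a build exists for the platform still has to be checked on the tool's own download page.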
Thank you very much, Steve. I will go for a Linux operating system then.

Best,
Alberto

Alberto Capurro