fastq upload time
@danielbernerunibasch-4268
Hi there,

I have a solexa fastq file containing some 27 million reads (file size approx. 4 GB). My plan is to upload this into R for subsequent editing with ShortRead tools. The R version is 64-bit linux, the computer has 8 GB RAM. Can anybody provide a rough estimate of how long the input will take? Hours, days...?

Thanks!

Daniel Berner
Zoological Institute
University of Basel
Vesalgasse 1
4051 Basel
Switzerland
+41 (0)61 267 0328
daniel.berner at unibas.ch
@sean-davis-490
On Wed, Sep 22, 2010 at 8:07 AM, <daniel.berner@unibas.ch> wrote:

> Hi there
> I have a solexa fastq file containing some 27 million reads (file size
> approx. 4 GB). my plan is to upload this into R for subsequent editing with
> ShortRead tools. The R version is 64-bit linux, the computer has 8 GB RAM.
> Can anybody provide a rough estimate of how long the input will take? hours,
> days...?

Depending on disk and network speeds, perhaps a few minutes. 8 GB is pretty small, though. You'll have to give it a try to see if it all fits into memory.

Sean
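To get a feel for the "disk speed" part of that estimate before committing to a full load, one option is to time a sequential read of the first part of the file and extrapolate to its full size. The sketch below is a standalone Python helper, not part of ShortRead; the function name `estimate_read_seconds` and its defaults are made up for illustration. It measures raw I/O only, so it is a lower bound: parsing the reads in R takes additional time, and a file already sitting in the operating system's cache will look faster than it really is.

```python
import os
import time


def estimate_read_seconds(path, sample_bytes=256 * 1024 * 1024, block=1 << 20):
    """Time a sequential read of up to sample_bytes and extrapolate to the whole file.

    Hypothetical helper for illustration; it measures raw disk/network
    throughput only, not the cost of parsing the reads.
    """
    total = os.path.getsize(path)
    read = 0
    start = time.perf_counter()
    with open(path, "rb") as fh:
        while read < sample_bytes:
            chunk = fh.read(min(block, sample_bytes - read))
            if not chunk:  # reached end of file before sample_bytes
                break
            read += len(chunk)
    elapsed = time.perf_counter() - start
    if read == 0:  # empty file: nothing to extrapolate from
        return 0.0
    return elapsed * total / read
```

For example, `estimate_read_seconds("reads.fastq")` (with `reads.fastq` standing in for the real file name) returns the projected number of seconds needed just to pull the bytes off disk.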
On Wed, 22 Sep 2010 08:12:27 -0400, Sean Davis <sdavis2 at mail.nih.gov> wrote:

> Depending on disk and network speeds, perhaps a few minutes. 8GB is pretty
> small, though. You'll have to give it a try to see if it all fits into
> memory.
>
> Sean

Yes, my experience with ShortRead and files this size was that 8 GB was not enough. If it is compatible with your planned analysis I would split the file according to chromosome and work from those.

--
Alex Gutteridge
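Splitting by chromosome presupposes that each read has already been assigned to a chromosome (for example by an alignment), which a raw FASTQ file does not record. A simpler variant of the same idea, and not the method described above, is to cut the file into consecutive pieces of a fixed number of reads and process them one at a time. The Python sketch below assumes plain four-line FASTQ records (no wrapped sequence lines); `split_fastq` and the output naming scheme are hypothetical.

```python
from itertools import islice


def split_fastq(path, reads_per_chunk, prefix):
    """Write consecutive blocks of reads_per_chunk records to numbered files.

    Assumes every record occupies exactly four lines. Returns the list of
    file names written (prefix_0000.fastq, prefix_0001.fastq, ...).
    """
    outputs = []
    with open(path, "rb") as src:
        while True:
            # Pull the next block of lines; only one block is held in memory.
            lines = list(islice(src, 4 * reads_per_chunk))
            if not lines:
                break
            name = f"{prefix}_{len(outputs):04d}.fastq"
            with open(name, "wb") as dst:
                dst.writelines(lines)
            outputs.append(name)
    return outputs
```

With 27 million reads, `split_fastq("reads.fastq", 1_000_000, "part")` would produce 27 files of one million reads each, every one of which is small enough to load comfortably on an 8 GB machine.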
thanks, appreciated!! daniel
If you want to check quality and base-position frequencies, I would suggest randomly sampling the fastq file to extract, let's say, 5-10% of it and running the quality analysis on that. Global totals won't be exact, but most of the information will still be informative, at much lower memory and CPU cost. I wrote some C code for this if you are interested.

marc

Alex Gutteridge wrote:

> Yes, my experience with ShortRead and files this size was that 8GB was not
> enough. If it is compatible with your planned analysis I would split the
> file according to chromosome and work from those.

--
Marc Noguera i Julian, PhD
Genomics unit / Bioinformatics
Institut de Medicina Predictiva i Personalitzada del Càncer (IMPPC)
B-10 Office
Carretera de Can Ruti
Camí de les Escoles s/n
08916 Badalona, Barcelona
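The C code mentioned above is not shown in the thread, so the following is only a sketch of how such random sampling might look, written in Python with reservoir sampling (Algorithm R). It picks `k` reads uniformly at random in a single pass without knowing the total number of reads in advance. It assumes plain four-line FASTQ records; the function name `sample_fastq` is made up for illustration.

```python
import random


def sample_fastq(path, k, seed=None):
    """Return k records chosen uniformly at random in one pass over the file.

    Each record is a tuple of its four lines. If the file holds fewer than
    k records, all of them are returned. Assumes four-line FASTQ records.
    """
    rng = random.Random(seed)
    reservoir = []
    n = 0  # number of complete records seen so far
    with open(path) as fh:
        while True:
            record = tuple(fh.readline() for _ in range(4))
            if not record[3]:  # end of file (or a truncated final record)
                break
            n += 1
            if len(reservoir) < k:
                reservoir.append(record)
            else:
                # Keep the new record with probability k / n.
                j = rng.randrange(n)
                if j < k:
                    reservoir[j] = record
    return reservoir
```

A 5% sample of 27 million reads would be `sample_fastq("reads.fastq", 1_350_000)`. Writing the result back out with `"".join("".join(r) for r in sample)` gives a much smaller FASTQ file that can then be loaded for the quality checks. Only the sampled records are held in memory, never the whole file.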