Question: memory exhausted for readAligned
Lana Schaffer wrote (8.8 years ago):
Hi,

I am trying to read the alignments in a lane of Solexa data and ran out of memory. I have 3.2 GB of memory on my desktop computer. Is there a setting I can use to give the readAligned command enough memory? How much memory do I need?

Lana Schaffer
Biostatistics/Informatics
The Scripps Research Institute
DNA Array Core Facility
La Jolla, CA 92037
(858) 784-2263
(858) 784-2994
schaffer at scripps.edu
Patrick Aboyoun wrote (8.8 years ago):
Lana,

Could you provide the session information for your R session, as well as the number of alignments you are trying to read in? My first guess is that you are running on a 32-bit architecture, but without more information we on the list can't help you very much.

Patrick
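For readers following along, the information requested here can be gathered with a few lines of base R. This is only a minimal sketch: the variable names `pointer_bytes` and `build_bits` are made up for illustration, and the pointer-size check is one common way to tell a 32-bit build from a 64-bit one, not something prescribed in the thread.

```r
# Pointer size distinguishes 32-bit (4 bytes) from 64-bit (8 bytes) builds of R.
# 'pointer_bytes' and 'build_bits' are illustrative names, not part of any API.
pointer_bytes <- .Machine$sizeof.pointer
build_bits <- 8 * pointer_bytes
cat("This R build is", build_bits, "bit on", R.version$platform, "\n")

# Session details (R version, platform, locale, attached packages)
# to paste into a mailing-list post.
print(sessionInfo())
```

A platform string such as i386-pc-mingw32 indicates a 32-bit Windows build, which matches the guess above.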
Patrick,

Yes, I am on a 32-bit machine.

length=35bp
298,610kb 50% PF and 12% Align

R version 2.9.0 Under development (unstable) (2009-02-12 r47905)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Thanks,
Lana
written 8.8 years ago by Lana Schaffer
Lana,

Given that there is typically a trade-off between memory and speed, and given the ubiquity of 64-bit machines, most of the Bioconductor sequencing software was designed with 64-bit architectures in mind in order to minimize computation time. Once you move to a 64-bit machine for this work, I'm pretty sure this issue will go away.

Patrick
written 8.8 years ago by Patrick Aboyoun
Patrick,

Do I need a 64-bit Windows computer? Where is the code for the R v2.9.0dev Unix version?

Lana
written 8.8 years ago by Lana Schaffer
Patrick,

I have a 64-bit Unix machine but not a 64-bit Windows machine. I can't find the R version code for either at the moment.

Lana
written 8.8 years ago by Lana Schaffer
Lana,

The definitive guide on R installation can be found in the "R Installation and Administration" manual:

http://cran.fhcrc.org/doc/manuals/R-admin.pdf

In particular, you can consult chapter 2, "Installing R under Unix-alikes". The basic idea is to grab the R-devel tarball from

ftp://ftp.stat.math.ethz.ch/Software/R/R-devel.tar.gz

and follow the installation instructions mentioned above. Even if Unix is somewhat foreign to you, it is pretty straightforward to create a typical installation of R. If you hit any snags, let me know.

Patrick
written 8.8 years ago by Patrick Aboyoun
Martin Morgan wrote (8.8 years ago):
Hi Lana --

> How much memory do i need?

It depends on the number of reads, their length, their ids, what data file you're reading the reads from, etc. As one data point, 7.5M 35-mers take up ~985MB; the reads themselves are about 300MB. The implementation of the short read representation means that this data won't get duplicated, so once in memory you should be ok.

If your desktop is a Windows box, I think you're probably severely handicapped by memory constraints and will be frustrated during the first steps of the analysis (usually the data collapse quite quickly, e.g., after using 'coverage'). You can visit the R Windows FAQ

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021

and the 'Memory' help page for hints.

Depending on your data source and what you intend to do, you might be able to read only some records (MAQ binary input), read just the sequence and / or quality scores (e.g., readFastq, readXStringColumns), or read just the alignment information (e.g., read.table with colClasses taking on NULL values to skip unwanted columns).

Also, you might want to make sure that you're reading just the files you think you are, e.g., a single lane, and not all files in a directory; the paradigm for readAligned and other ShortRead functions is to read in files that are the equivalent of list.files(dirPath, pattern).

Martin

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793
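The last two suggestions in this answer can be sketched in base R. The example below is an illustration under stated assumptions, not the layout of a real Solexa file: the directory, the file names (s_1_export.txt, s_2_export.txt), and the five columns are all made up so that the code runs on its own. It first checks which files a pattern matches with list.files, then uses "NULL" entries in colClasses so that read.table skips the unwanted columns.

```r
# Hypothetical stand-in for an alignment directory: two small tab-delimited
# files with made-up columns (id, sequence, quality, chrom, position).
dirPath <- tempfile("lanes")
dir.create(dirPath)
for (lane in 1:2) {
  fake <- data.frame(
    id       = sprintf("read%d_%d", lane, 1:3),
    sequence = c("ACGTACGT", "TTGACCAA", "GGCATCGA"),
    quality  = c("IIIIIIII", "IIIIHHHH", "HHHHGGGG"),
    chrom    = c("chr1", "chr1", "chr2"),
    position = c(100L, 250L, 75L)
  )
  write.table(fake, file.path(dirPath, sprintf("s_%d_export.txt", lane)),
              sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
}

# Step 1: check which files the pattern matches before reading anything.
# An anchored pattern selects a single lane rather than every file.
pattern <- "^s_1_export\\.txt$"
matched <- list.files(dirPath, pattern)
print(matched)

# Step 2: read only chrom and position; a "NULL" entry in colClasses
# tells read.table to skip that column entirely.
keep <- c("NULL", "NULL", "NULL", "character", "integer")
aln <- read.table(file.path(dirPath, matched), sep = "\t",
                  colClasses = keep,
                  col.names = c("id", "sequence", "quality", "chrom", "position"))
print(aln)

# Size of the reduced object in memory.
print(object.size(aln), units = "Kb")
```

Running the list.files step on its own before calling readAligned with the same dirPath and pattern is a cheap way to confirm that only one lane will be read.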