memory exhausted for readAligned
Lana Schaffer ★ 1.3k
@lana-schaffer-1056
Last seen 9.6 years ago
Hi,

I am trying to read the alignments from one lane of Solexa data and ran out of memory. I have 3.2 GB of memory on my desktop computer. Is there a setting I can use to give the readAligned command enough memory? How much memory do I need?

Lana Schaffer
Biostatistics/Informatics
The Scripps Research Institute
DNA Array Core Facility
La Jolla, CA 92037
(858) 784-2263
(858) 784-2994
schaffer at scripps.edu
Patrick Aboyoun ★ 1.6k
@patrick-aboyoun-6734
Last seen 9.6 years ago
United States
Lana,

Could you provide the session information for your R session, as well as the number of alignments you are trying to read in? My first guess is that you are running on a 32-bit architecture, but without more information we on the list can't help you very much.

Patrick

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
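For anyone following along, the information Patrick asks for can be collected with a few lines of base R. This is only a minimal sketch, not something posted in the thread; the variable name `pointer_bytes` is made up for illustration.

```r
# Report the R version, platform, locale and attached packages.
print(sessionInfo())

# Pointer size in bytes: 4 on a 32-bit build of R, 8 on a 64-bit build.
pointer_bytes <- .Machine$sizeof.pointer
cat("R build:", pointer_bytes * 8, "bit\n")

# The platform string, e.g. "i386-pc-mingw32" for 32-bit R on Windows.
cat("Platform:", R.version$platform, "\n")
```

Note that this reports how R itself was built; a 32-bit build of R remains limited to a 32-bit address space even when it runs on 64-bit hardware.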
Patrick,

Yes, I am on a 32-bit machine.

length=35bp
298,610kb 50% PF and 12% Align

R version 2.9.0 Under development (unstable) (2009-02-12 r47905)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Thanks,
Lana

(In reply to Patrick Aboyoun's message of Thursday, February 12, 2009, 4:20 PM.)
Lana,

Because there is typically a trade-off between memory and speed, and 64-bit machines are now ubiquitous, most of the Bioconductor sequencing software was designed with 64-bit architectures in mind in order to minimize computation time. Once you move to a 64-bit machine for this work, I'm pretty sure this issue will go away.

Patrick
Patrick,

Do I need a 64-bit Windows computer? Where can I find the code for the Unix version of R v2.9.0dev?

Lana

(In reply to Patrick Aboyoun's message of Thursday, February 12, 2009, 4:32 PM.)
Patrick,

I have a 64-bit Unix machine but not a 64-bit Windows machine. I can't find the R code for either platform at the moment.

Lana

(In reply to Patrick Aboyoun's message of Thursday, February 12, 2009, 4:32 PM.)
Lana,

The definitive guide to installing R is the "R Installation and Administration" manual:

http://cran.fhcrc.org/doc/manuals/R-admin.pdf

In particular, see chapter 2, "Installing R under Unix-alikes". The basic idea is to grab the R-devel tarball from

ftp://ftp.stat.math.ethz.ch/Software/R/R-devel.tar.gz

and follow the installation instructions in that chapter. Even if Unix is somewhat foreign to you, creating a typical installation of R is pretty straightforward. If you hit any snags, let me know.

Patrick
@martin-morgan-1513
Last seen 3 days ago
United States
Hi Lana --

"Lana Schaffer" <schaffer at scripps.edu> writes:

> How much memory do I need?

It depends on the number of reads, their length, their ids, the type of data file you're reading from, etc. As one data point, 7.5M 35mers take up ~985MB; the reads themselves are about 300MB. The implementation of the short read representation means that these data won't get duplicated, so once they are in memory you should be ok.

If your desktop is a Windows box, I think you're probably severely handicapped by memory constraints and will be frustrated during the first steps of the analysis (usually the data collapse quite quickly, e.g., after using 'coverage'). You can visit the R Windows FAQ

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021

and the 'Memory' help page for hints.

Depending on your data source and what you intend to do, you might be able to read only some records (MAQ binary input), read just the sequence and / or quality scores (e.g., readFastq, readXStringColumns), or read just the alignment information (e.g., read.table with colClasses taking on "NULL" values to skip unwanted columns).

Also, you might want to make sure that you're reading just the files you think you are, e.g., a single lane and not all files in a directory; the paradigm for readAligned and other ShortRead functions is to read in files that are the equivalent of list.files(dirPath, pattern).

Martin

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793
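Two of Martin's suggestions can be tried with base R alone: checking which files a directory/pattern pair selects, and using read.table with colClasses to skip unwanted columns. The following is only a sketch under stated assumptions. The file name, the six-column layout and the column names are invented for illustration and do not correspond to any particular aligner's output; the demo file is written to a temporary directory so the example runs on its own.

```r
# Hypothetical tab-delimited alignment file: id, sequence, quality,
# chromosome, position, strand. This layout is an assumption made for
# the example, not a real export format.
tmp_dir <- tempfile("lane_demo")
dir.create(tmp_dir)
aln_file <- file.path(tmp_dir, "s_1_demo_export.txt")
writeLines(c(
  "read1\tACGTACGT\tIIIIIIII\tchr1\t100\t+",
  "read2\tTTGACCAA\tIIIIIIII\tchr2\t2500\t-"
), aln_file)

# 1. Confirm which files the directory and pattern actually match,
#    e.g. one lane rather than every file in the directory.
files <- list.files(tmp_dir, pattern = "^s_1_.*_export\\.txt$",
                    full.names = TRUE)
print(basename(files))

# 2. Read only the alignment columns. A colClasses entry of "NULL"
#    (the character string) makes read.table skip that column, so the
#    ids, sequences and qualities are never stored.
aln <- read.table(
  files[1], sep = "\t", quote = "", comment.char = "",
  colClasses = c("NULL", "NULL", "NULL", "character", "integer", "character"),
  col.names  = c("id", "seq", "qual", "chrom", "pos", "strand")
)
str(aln)
```

Running the list.files call with the same directory and pattern before calling readAligned is a cheap way to see how many files would be read. How much memory the column skipping saves depends on the real file layout.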
