Hi,
I am trying to use Rsubread on the wheat genome (~17 Gbases). I have 32 GB of memory, most of which is available. I am running R version 3.6.1 (64-bit) and Rsubread version 2.0.1.
I figured that the indexSplit argument would allow for larger genome sizes, but the program seems to run out of memory even though I have more than enough for the operation.
I have tried the following two commands and received the following two errors:
Command 1: buildindex(basename = "wheatfullRsubreadtest", reference = "161010ChineseSpringv1.0_pseudomolecules.fasta", indexSplit = TRUE)
Error 1:
Check the integrity of provided reference sequences ...
No format issues were found
Scan uninformative subreads in reference sequences ...
ERROR: the provided reference sequences include more than 4 billion bases.
1.6 GB of memory is needed for index building.
Command 2: buildindex(basename = "wheatfullRsubreadtest", reference = "161010ChineseSpringv1.0_pseudomolecules.fasta", indexSplit = TRUE, memory = 4000)
Error 2:
Check the integrity of provided reference sequences ...
No format issues were found
Scan uninformative subreads in reference sequences ...
ERROR: the provided reference sequences include more than 4 billion bases.
1.5 GB of memory is needed for index building.
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsubread_2.0.1 BiocManager_1.30.10
loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1
Anyone have any ideas around this?
Thanks!
Bryan
Bryan, you and I have very similar computer setups, and although my problem with buildindex is different (my genomes are less than 4 Gbp), buildindex is failing for me as well. This might be a bug. My machine has 16 GB of RAM, and it would not work (it actually locked up) when I set indexSplit = TRUE, memory = 4000; however, it worked (or at least it completed without errors) using indexSplit = TRUE, memory = 10000. I don't know what's going on, but I would be interested to see what happens if you try those parameters.
Peter
Hi Peter,
Thanks! I will try the 10 GB memory (may also try 16 GB) and see if it helps!
Bryan
I think the cause of the problem isn't the amount of memory. The maximum genome size for building an index is 4 billion bases, and neither splitting the index nor specifying a memory limit changes this limit.
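
If it helps, here is a minimal base-R sketch for checking a reference against that limit before calling buildindex. The helper count_fasta_bases is hypothetical (it is not part of Rsubread), the demo file is made up, and the 4e9 threshold is just the round figure quoted in the error message; the exact cutoff inside Rsubread may differ slightly.

```r
# Hypothetical helper (not part of Rsubread): count the bases in a FASTA
# file, reading in chunks so a multi-gigabyte reference never has to be
# held in memory all at once.
count_fasta_bases <- function(path, chunk_lines = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0  # double, so totals above 2^31 do not overflow
  repeat {
    lines <- readLines(con, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break
    # Header lines start with ">"; everything else is sequence.
    seq_lines <- lines[!startsWith(lines, ">")]
    total <- total + sum(as.numeric(nchar(trimws(seq_lines))))
  }
  total
}

# Tiny made-up FASTA file, just to show the call.
demo <- tempfile(fileext = ".fasta")
writeLines(c(">chr1", "ACGTACGT", "ACGT", ">chr2", "GGCC"), demo)

n_bases <- count_fasta_bases(demo)
limit <- 4e9  # assumption: round figure taken from the error message
fits <- n_bases <= limit
```

For the real reference you would pass the path of the FASTA file instead of demo. A roughly 17 Gbase genome would give fits equal to FALSE, which matches the error reported above regardless of the indexSplit or memory settings.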