Rsubread buildIndex: Too many sections in reference?
Wei Shi ★ 3.6k
@wei-shi-2183
Australia/Melbourne
Dear Davis,

The buildindex function has a hard limit of 1000 on the number of chromosomes allowed. Your "rn5.fa" file contains more than 1000 chromosomes/contigs, so the function reported that message. As a consequence, the chromosomes/contigs located after the first 1000 in the file were not correctly indexed.

We have increased the limit to 50,000, which should be OK for your dataset now. The changes have been committed to bioc devel svn and should be available to you in a couple of days. Let us know if the problem persists.

Cheers,
Wei

On Oct 19, 2012, at 6:30 AM, Davis, Wade wrote:

> Dear Wei,
> I received the following message when building an index for rn5:
>
> >buildindex(basename="rn5_rsubread_index",reference="rn5.fa",memory=12000)
>
> Building a base-space index.
> Size of memory used=12000 MB
> Base name of the built index = rn5_rsubread_index
> Scanning non-informative reads in the chromosomes...
> completed=85.27%; time used=216.9s; rate=14099.1k bps/s; total=2926m bps
> There are too many sections in the chromosome data files (more than 1000 sections).
> There are 663648 non-informative subreads found in the chromosomes.
> Index items per partition = 1375180800
>
> My question is: What is the consequence of the message "There are too many sections in the chromosome data files (more than 1000 sections)."
>
> I imagine this is due to all of the "nonstandard" chromosomes in the reference. I could "clean up" the reference to get rid of them, but I am curious to know the (biological) opinion of others. This is to be used for a standard RNA-Seq run (on rat of course).
>
> I am running the development version of R and Rsubread, as shown below.
>
> Thanks,
> Wade
>
> > sessionInfo()
> R Under development (unstable) (2012-09-24 r60800)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Rsubread_1.9.0
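
To check in advance whether a reference is likely to hit the limit described in the reply, one option is to count the sequences (header lines starting with ">") in the FASTA file before building the index. The following is a minimal base-R sketch, not part of Rsubread: `count_fasta_records` is a hypothetical helper name, the demo file is made up for illustration, and the 1000 figure is simply the old limit quoted above.

```r
# Hypothetical helper (not an Rsubread function): count FASTA records by
# counting header lines that start with ">". The file is read in chunks so
# that a multi-gigabyte genome does not have to fit in memory at once.
count_fasta_records <- function(path, chunk_lines = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  n <- 0L
  repeat {
    lines <- readLines(con, n = chunk_lines, warn = FALSE)
    if (length(lines) == 0L) break
    n <- n + sum(startsWith(lines, ">"))
  }
  n
}

# Small made-up FASTA file so the sketch runs on its own.
demo_fa <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGT", ">chr2", "GGCC", ">chrUn_contig1", "TTAA"), demo_fa)

n_records <- count_fasta_records(demo_fa)

# Old limit quoted in the reply above; newer builds reportedly allow 50,000.
old_limit <- 1000L
exceeds_old_limit <- n_records > old_limit
```

For the real reference, the same helper would be called as `count_fasta_records("rn5.fa")`. If the count is above the limit of the installed version, the options are to update Rsubread or to remove the unplaced contigs from the reference before indexing.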