Question: Index genome generation with Subread for colorspace data
7 months ago, gokberk wrote:

Hi all,

I need to analyze some old SOLiD colorspace RNA-seq reads and have heard that Subread still supports colorspace data. So I downloaded version 1.6.4 and compiled it on my server. I've been trying to build a genome index with the command ./subread-buildindex -c -F -o macaca_fascicularis_5.0_index ../../bowtie_index/macaca_fascicularis_5.0_genome.fa and got the output below:

        ==========     _____ _    _ ____  _____  ______          _____  
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
      v1.6.4

//================================= setting ==================================\\
||                                                                            ||
||                Index name : macaca_fascicularis_5.0_index                  ||
||               Index space : color space                                    ||
||                    Memory : 8000 Mbytes                                    ||
||          Repeat threshold : 100 repeats                                    ||
||              Gapped index : no                                             ||
||                                                                            ||
||               Input files : 1 file in total                                ||
||                             o macaca_fascicularis_5.0_genome.fa            ||
||                                                                            ||
\\============================================================================//

//================================= Running ==================================\\
||                                                                            ||
|| Check the integrity of provided reference sequences ...                    ||
|| No format issues were found                                                ||
|| Scan uninformative subreads in reference sequences ...                     ||

However, it has been stuck at this point for about an hour and a half now, so I was wondering whether something is wrong or this is normal. The genome assembly I'm indexing is 3 GB.

I'd appreciate any help. Cheers, Gökberk

Answer: Index genome generation with Subread for colorspace data
7 months ago, Yang Liao (Australia) wrote:

Hi Gökberk,

I downloaded version 5.0 of the Macaca fascicularis genome from Ensembl (the top-level sequences, 867 MB gzipped). I then ran the index builder in Subread 1.6.4 with the same arguments you used. The index was built in 45 minutes with no errors, and the "scan uninformative subreads" step took less than 15 minutes (on a Xeon E5-2690 v3 machine with 512 GB of memory). If that is the same genome you used, the index builder appears to be running very slowly on your computer.

The index builder uses around 10 GB of memory under your settings, so please check whether your computer has enough memory to run it. When physical memory runs out, the operating system may fall back on swap space on the hard disk, which is very slow.
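
As a rough way to check this before starting the build, something like the sketch below might help. It assumes a Linux server where /proc/meminfo reports MemAvailable; the function name has_enough_memory is made up for illustration, and the 10240 MB threshold is just the "around 10 GB" figure mentioned above, not a value reported by Subread itself.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if available MB ($1) >= required MB ($2).
has_enough_memory() {
    [ "$1" -ge "$2" ]
}

required_mb=10240   # ~10 GB, the approximate figure quoted in this answer
available_mb=0

# MemAvailable (in kB) is Linux-specific; other systems keep the 0 default.
if [ -r /proc/meminfo ]; then
    available_mb=$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)
    available_mb=${available_mb:-0}
fi

if has_enough_memory "$available_mb" "$required_mb"; then
    echo "OK: ${available_mb} MB available"
else
    echo "WARNING: ${available_mb} MB available, below ${required_mb} MB; the build may swap"
fi
```

If the warning appears, or if swap usage keeps growing while the index builder runs, the slowdown is probably caused by swapping rather than by the index builder itself.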

BTW, if your computer has at least 24 GB of free memory, I suggest using the "-B" option to build a one-block index. This can substantially improve mapping speed.
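
For illustration, the choice could be sketched as below. The helper buildindex_options and the free_mb value are made up for this example; the options -c, -F and -B and the file paths are the ones already used in this thread. The script only prints the command (a dry run), so nothing is built until you run the printed line yourself.

```shell
#!/bin/sh
# Hypothetical helper: print index-builder options for a given amount of
# free memory in MB. 24576 MB = 24 GB, the threshold suggested above.
buildindex_options() {
    if [ "$1" -ge 24576 ]; then
        echo "-c -F -B"   # enough memory: add -B for a one-block index
    else
        echo "-c -F"      # otherwise keep the original options
    fi
}

free_mb=32000   # assumed value for illustration; replace with your own figure
opts=$(buildindex_options "$free_mb")

# Dry run: print the command instead of executing it.
# $opts is left unquoted on purpose so that it splits into separate options.
echo ./subread-buildindex $opts -o macaca_fascicularis_5.0_index \
    ../../bowtie_index/macaca_fascicularis_5.0_genome.fa
```

With the assumed 32000 MB this prints the original command with -B added after -F.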

Cheers, Yang
