Question

Regarding subread vignette

0


Wei Shi ★ 3.6k

@wei-shi-2183

Last seen 13 hours ago

Australia/Melbourne

Dear Rashi,

The buildindex function uses all chromosomal files to build an index for a reference genome, e.g. the mouse genome. You should run buildindex only once, either with a list of chromosomal file names or with a single file concatenated from all chromosomal files.

To create a single file that includes all the chromosomal sequences, you can use the following unix command (run it in a unix shell):

    cat chr1.fa chr2.fa chr3.fa ... > mm9_all.fa

The concatenation will take several minutes and the index building will take about 1 hour for the mouse genome.

Cheers,
Wei

> Dear Dr. Shi,
>
> I wanted to use the Rsubread vignette for alignment of single-read RNA
> sequencing data from a GAIIx. As I understand it, before aligning my reads I
> need to build an index for the reference genome (mouse in my case). Since I am
> very new to this field and this is the first time I am dealing with sequencing
> data, I have some doubts about how to do this, and your help in this regard is
> very much appreciated.
>
> From the example below, I understand that extdata is the directory in which
> the index for the genome is created, but what is "reference.fa"? Is this the
> FASTA file for a single chromosome (which would mean I have to repeat this for
> all the chromosomes one by one)?
>
> > library(Rsubread)
> > ref <- system.file("extdata", "reference.fa", package = "Rsubread")
> > path <- system.file("extdata", package = "Rsubread")
> > buildindex(basename = file.path(path, "reference_index"), reference = ref)
>
> Thanks in advance for your help.
>
> Regards,
> Rashi Halder
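The concatenation step can be sketched on toy data as follows. This is only an illustration: the three tiny FASTA files and the temporary directory are hypothetical stand-ins for the real per-chromosome files, and the header count is a suggested sanity check rather than something the thread or Rsubread requires.

```shell
#!/bin/sh
# Toy stand-ins for the real per-chromosome FASTA files (hypothetical contents).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '>chr1\nACGT\n' > chr1.fa
printf '>chr2\nGGCC\n' > chr2.fa
printf '>chr3\nTTAA\n' > chr3.fa

# Concatenate the files in an explicit order into one reference file.
cat chr1.fa chr2.fa chr3.fa > mm9_all.fa

# Sanity check: each input file contributes one '>' header line,
# so the combined file should contain as many headers as there are inputs.
n_headers=$(grep -c '^>' mm9_all.fa)
echo "sequences in mm9_all.fa: $n_headers"
```

Listing the files explicitly, rather than using a pattern such as `chr*.fa`, keeps the chromosome order under your control (a glob sorts chr10 before chr2).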

Sequencing Alignment Rsubread • 1.1k views

13.2 years ago Wei Shi ★ 3.6k

score 0 · Answer 1 · 2011-10-13

Dear Rashi,

Please copy your email to the list.

You don't necessarily need to copy the mm9_all.fa file to the extdata folder. You can copy it to any folder you like. Here is the correct command for using buildindex if you copy the file to the current working directory:

    buildindex(basename="mm9_index",reference="mm9_all.fa")

This will create index files with the basename "mm9_index". You should give this name to the 'index' parameter of the align() function when mapping your reads.

Cheers,
Wei

On Oct 12, 2011, at 8:42 PM, Halder, Rashi wrote:

> Dear Dr. Shi,
>
> Thanks for the kind reply. As you suggested, I prepared the single
> mm9_all.fa file by concatenating all chromosome files and copied this file
> into the extdata folder. But when I run buildindex(basename = file.path(path,
> "mm9_index"), reference = ref), it gives the error that the file is
> inaccessible and does not prepare the index file. I checked the permissions of
> the file; it has read and write permissions. I also checked the paths in the
> "path" and "ref" variables, and they are correct.
>
> Building the index in the base space.
> Size of memory requested=3700 MB
> Index base name =
> /home/rhalder/R/i686-pc-linux-gnu-library/2.13/Rsubread/extdata/mm9_all_index
> INDEX ITEMS PER PARTITION = 408392704
>
> File
> '/home/rhalder/R/i686-pc-linux-gnu-library/2.13/Rsubread/extdata/mm9_all.fa'
> is inaccessible.
>
> Index
> /home/rhalder/R/i686-pc-linux-gnu-library/2.13/Rsubread/extdata/mm9_all_index
> is successfully built
>
> Please suggest what is wrong with the file and how to prepare the index file
> now.
>
> Have a nice day.
>
> Regards,
> Rashi Halder
>
> -----Original Message-----
> From: Wei Shi [mailto:shi at wehi.EDU.AU]
> Sent: Wed 10/12/2011 12:29 AM
> To: Halder, Rashi
> Cc: bioconductor at r-project.org
> Subject: Re: Regarding subread vignette
>
> Dear Rashi,
>
> The buildindex function uses all chromosomal files to build an index for a
> reference genome, e.g. the mouse genome. You should run buildindex only once,
> either with a list of chromosomal file names or with a single file
> concatenated from all chromosomal files.
>
> To create a single file that includes all the chromosomal sequences, you
> can use the following unix command (run it in a unix shell):
>
> cat chr1.fa chr2.fa chr3.fa ... > mm9_all.fa
>
> The concatenation will take several minutes and the index building will
> take about 1 hour for the mouse genome.
>
> Cheers,
> Wei
>
>> Dear Dr. Shi,
>>
>> I wanted to use the Rsubread vignette for alignment of single-read RNA
>> sequencing data from a GAIIx. As I understand it, before aligning my reads I
>> need to build an index for the reference genome (mouse in my case). Since I
>> am very new to this field and this is the first time I am dealing with
>> sequencing data, I have some doubts about how to do this, and your help in
>> this regard is very much appreciated.
>>
>> From the example below, I understand that extdata is the directory in which
>> the index for the genome is created, but what is "reference.fa"? Is this the
>> FASTA file for a single chromosome (which would mean I have to repeat this
>> for all the chromosomes one by one)?
>>
>>> library(Rsubread)
>>> ref <- system.file("extdata", "reference.fa", package = "Rsubread")
>>> path <- system.file("extdata", package = "Rsubread")
>>> buildindex(basename = file.path(path, "reference_index"), reference = ref)
>>
>> Thanks in advance for your help.
>>
>> Regards,
>> Rashi Halder
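Before pointing buildindex at a reference file, it may help to confirm from the shell that the path really resolves to a readable, non-empty file. The helper below is a hypothetical sketch (the name `check_fasta` and the temporary demo file are assumptions for illustration). It is not the check that Rsubread itself performs, and it does not diagnose the specific cause of the "inaccessible" message reported above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path exists, is readable, and is non-empty.
check_fasta() {
    f=$1
    if [ ! -e "$f" ]; then echo "missing: $f"; return 1; fi
    if [ ! -r "$f" ]; then echo "not readable: $f"; return 1; fi
    if [ ! -s "$f" ]; then echo "empty: $f"; return 1; fi
    echo "ok: $f"
}

# Demo on a throwaway file standing in for mm9_all.fa.
tmp=$(mktemp)
printf '>chr1\nACGT\n' > "$tmp"
check_fasta "$tmp"

# A path that does not exist is reported and returns a non-zero status.
check_fasta "$tmp.missing" || echo "fix the path before building the index"
```

In practice you would call `check_fasta` on the exact path you intend to pass as the `reference` argument, from the same working directory in which R is running.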