Question: QuasR: how to use an indexed reference genome?
5.6 years ago by
Paul Shannon • 750
Paul Shannon • 750 wrote:
I am new to QuasR, and alos quite new to aligning short reads to reference genomes more generally. I cannot figure out how to use a pre-built indexed reference genome file with QuasR. The examples supplied with the package work nicely. Scaling up to using all of hg19 raises problems for me. I apologize if I am missing the obvious. To illustrate the problem, I call QuasR's qAlign method with just two arguments (quoting from the man page): sampleFile: a text file listing input sequence files and sample names genome: the reference genome for primary alignments, one of: * a string referring to a "BSgenome" package (e.g. ""BSgenome.Hsapiens.UCSC.hg19""), which will be downloaded automatically from Bioconductor if not present * the name of a fasta sequence file containing one or several sequences (chromosomes) to be used as a reference QuasR apparently invokes the bowtie indexing program when supplied either of the two "genome" options: a BSgenome package, or a fasta file. But since indexing takes a long time -- hours, apparently -- I hoped to provide a ready-made index file, and found some described here: http://bowtie-bio.sourceforge.net/tutorial.shtml specifically ftp://ftp.ccb.jhu.edu/pub/data/bowtie_indexes/hg19.ebwt.zip Various attempts to specify this file, or any of its contents (unzipped) to QuasR fail with these messages: Error: The specified genome /Users/pshannon/s/data/public/bowtie/indexes/hg19.1.ebwt does not have the extension of a fasta file (fa,fasta,fna)> Error: The specified genome has to be a file and not a directory: /Users/pshannon/s/data/public/bowtie/indexes I'll be grateful for advice on how to do this properly. Thanks, - Paul > sessionInfo() R version 3.0.0 (2013-04-03) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale:  C attached base packages:  parallel stats graphics grDevices utils datasets methods base other attached packages:  Rsamtools_1.13.14 BSgenome_1.29.0 Biostrings_2.29.2 QuasR_1.1.4 GenomicRanges_1.13.12 XVector_0.1.0  IRanges_1.19.8 BiocGenerics_0.7.2 Rbowtie_1.1.3 BiocInstaller_1.11.1 loaded via a namespace (and not attached):  AnnotationDbi_1.23.11 Biobase_2.21.2 DBI_0.2-7 GenomicFeatures_1.13.8 RCurl_1.95-4.1  RSQLite_0.11.3 ShortRead_1.19.3 XML_3.95-0.2 biomaRt_2.17.0 bitops_1.0-5  compiler_3.0.0 grid_3.0.0 hwriter_1.3 lattice_0.20-15 rtracklayer_1.21.5  stats4_3.0.0 tools_3.0.0 zlibbioc_1.7.0
ADD COMMENT • link •