error message with easyRNASeq use case


Richard Friedman ★ 2.0k

@richard-friedman-513

Last seen 10.5 years ago

Dear List,

I am working through the easyRNASeq use case (easyRNASeq: an overview, Oct 16, 2012, section 7). I am working on a Mac, so I could not do the alignment part of the use case; instead I started with BAM files produced by TopHat:

ccrfml1:learning_easyRNAseq friedman$ ls
490224.bam                  easyRNAseqworkingscripts.txt
490225.bam                  learningRNASeq.docx
easyRNASeqvignette2.txt

Here is my session record:

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] easyRNASeq_1.4.2       ShortRead_1.16.2       latticeExtra_0.6-19    RColorBrewer_1.0-5
 [5] lattice_0.20-10        Rsamtools_1.10.2       DESeq_1.10.1           locfit_1.5-8
 [9] BSgenome_1.26.1        GenomicRanges_1.10.5   Biostrings_2.26.2      IRanges_1.16.4
[13] edgeR_3.0.4            limma_3.12.1           biomaRt_2.14.0         Biobase_2.18.0
[17] genomeIntervals_1.14.0 BiocGenerics_0.4.0     intervals_0.13.3

loaded via a namespace (and not attached):
 [1] annotate_1.34.1      AnnotationDbi_1.20.2 bitops_1.0-4.1       DBI_0.2-5            genefilter_1.38.0
 [6] geneplotter_1.34.0   grid_2.15.2          hwriter_1.3          RCurl_1.91-1         RSQLite_0.11.1
[11] splines_2.15.2       stats4_2.15.2        survival_2.36-14     tools_2.15.2         XML_3.9-4
[16] xtable_1.7-0         zlibbioc_1.2.0

> chr.sizes=seqlengths(Hsapiens)
> chr.sizes
                 chr1                  chr2                  chr3                  chr4
            249250621             243199373             198022430             191154276
                 chr5                  chr6                  chr7                  chr8
            180915260             171115067             159138663             146364022
                 chr9                 chr10                 chr11                 chr12
            141213431             135534747             135006516             133851895
                chr13                 chr14                 chr15                 chr16
            115169878             107349540             102531392              90354753
                chr17                 chr18                 chr19                 chr20
             81195210              78077248              59128983              63025520
                chr21                 chr22                  chrX                  chrY
             48129895              51304566             155270560              59373566
                 chrM  chr1_gl000191_random  chr1_gl000192_random        chr4_ctg9_hap1
                16571                106433                547496                590426
 chr4_gl000193_random  chr4_gl000194_random         chr6_apd_hap1         chr6_cox_hap2
               189789                191469               4622290               4795371
        chr6_dbb_hap3        chr6_mann_hap4         chr6_mcf_hap5         chr6_qbl_hap6
              4610396               4683263               4833398               4611984
       chr6_ssto_hap7  chr7_gl000195_random  chr8_gl000196_random  chr8_gl000197_random
              4928567                182896                 38914                 37175
 chr9_gl000198_random  chr9_gl000199_random  chr9_gl000200_random  chr9_gl000201_random
                90085                169874                187035                 36148
chr11_gl000202_random       chr17_ctg5_hap1 chr17_gl000203_random chr17_gl000204_random
                40103               1680828                 37498                 81310
chr17_gl000205_random chr17_gl000206_random chr18_gl000207_random chr19_gl000208_random
               174588                 41001                  4262                 92689
chr19_gl000209_random chr21_gl000210_random        chrUn_gl000211        chrUn_gl000212
               159169                 27682                166566                186858
       chrUn_gl000213        chrUn_gl000214        chrUn_gl000215        chrUn_gl000216
               164239                137718                172545                172294
       chrUn_gl000217        chrUn_gl000218        chrUn_gl000219        chrUn_gl000220
               172149                161147                179198                161802
       chrUn_gl000221        chrUn_gl000222        chrUn_gl000223        chrUn_gl000224
               155397                186861                180455                179693
       chrUn_gl000225        chrUn_gl000226        chrUn_gl000227        chrUn_gl000228
               211173                 15008                128374                129120
       chrUn_gl000229        chrUn_gl000230        chrUn_gl000231        chrUn_gl000232
                19913                 43691                 27386                 40652
       chrUn_gl000233        chrUn_gl000234        chrUn_gl000235        chrUn_gl000236
                45941                 40531                 34474                 41934
       chrUn_gl000237        chrUn_gl000238        chrUn_gl000239        chrUn_gl000240
                45867                 39939                 33824                 41933
       chrUn_gl000241        chrUn_gl000242        chrUn_gl000243        chrUn_gl000244
                42152                 43523                 43341                 39929
       chrUn_gl000245        chrUn_gl000246        chrUn_gl000247        chrUn_gl000248
                36651                 38154                 36422                 39786
       chrUn_gl000249
                38502

> bamfiles=dir(getwd(),pattern="*\\.bam$")
> bamfiles
[1] "490224.bam" "490225.bam"
> rnaSeq <- easyRNASeq(filesDirectory=getwd(),
+                      organism="Hsapiens",
+                      chr.sizes=chr.sizes,
+                      readLength=58L,
+                      annotationMethod="biomaRt",
+                      count="exons",
+                      filenames=bamfiles[1],
+                      outputFormat="RNAseq"
+                      )
Checking arguments...
Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = chr.sizes, :
  You must indicate the format of you source files, by setting argument 'format'

COMMENT: I THOUGHT THAT BAM FILES WERE AUTOMATICALLY THE INPUT FILE FORMAT.

> rnaSeq <- easyRNASeq(filesDirectory=getwd(),
+                      organism="Hsapiens",
+                      chr.sizes=chr.sizes,
+                      readLength=58L,
+                      annotationMethod="biomaRt",
+                      count="exons",
+                      format="bam",
+                      filenames=bamfiles[1],
+                      outputFormat="RNAseq"
+                      )
Checking arguments...
Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = chr.sizes, :
  Index files (bai) are required. They are missing for the files: /Documents/clients/Phyllis/learning_easyRNAseq/490224.bam

QUESTION: HOW DO I OBTAIN OR PRODUCE THESE INDEX FILES?

Thanks and best wishes,
Rich

Richard A. Friedman, PhD
Associate Research Scientist, Biomedical Informatics Shared Resource
Herbert Irving Comprehensive Cancer Center (HICCC)
Lecturer, Department of Biomedical Informatics (DBMI)
Educational Coordinator, Center for Computational Biology and Bioinformatics (C2B2)/
National Center for Multiscale Analysis of Genomic Networks (MAGNet)
Room 824
Irving Cancer Research Center
Columbia University
1130 St. Nicholas Ave
New York, NY 10032
(212)851-4765 (voice)
friedman@cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

In memoriam, Ray Bradbury

RNASeq Cancer Organism easyRNASeq • 1.4k views

updated 12.3 years ago by James W. MacDonald 68k • written 12.3 years ago by Richard Friedman ★ 2.0k


James W. MacDonald 68k

@james-w-macdonald-5106

Last seen 12 hours ago

United States

On 11/27/2012 2:30 PM, Richard Friedman wrote:

> [...]
> Error in easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", chr.sizes = chr.sizes, :
>   Index files (bai) are required. They are missing for the files: /Documents/clients/Phyllis/learning_easyRNAseq/490224.bam
>
> QUESTION: HOW DO I OBTAIN OR PRODUCE THESE INPUT FILES?

You want indexBam() in Rsamtools. See ?BamFile.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
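A minimal sketch of what that suggestion could look like in practice. It assumes the BAM files sit in the working directory and are coordinate-sorted, which BAM indexing requires. The helper name missing_bai is hypothetical and not part of Rsamtools or easyRNASeq; it only checks for an index named "<file>.bam.bai" next to each BAM file.

```r
# Hypothetical helper (not part of any package): return the BAM files
# that have no "<file>.bam.bai" index sitting next to them.
missing_bai <- function(bams) {
  bams[!file.exists(paste0(bams, ".bai"))]
}

# Collect the BAM files in the working directory, as in the original post.
bamfiles <- dir(getwd(), pattern = "\\.bam$", full.names = TRUE)
todo <- missing_bai(bamfiles)

# Build the missing indexes with Rsamtools::indexBam().
# Assumption: the BAM files are coordinate-sorted; if they are not,
# they would need to be sorted first (see ?sortBam in Rsamtools).
if (length(todo) > 0 && requireNamespace("Rsamtools", quietly = TRUE)) {
  Rsamtools::indexBam(todo)
}
```

Once the .bai files exist alongside the BAM files, the easyRNASeq() call with format="bam" from the original post should get past the index check; whether later steps succeed depends on the annotation and the data.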

12.3 years ago • James W. MacDonald 68k


Jim and Tim,

Thank you both for your answers.

Best wishes,
Rich

On Nov 27, 2012, at 2:45 PM, James W. MacDonald wrote:

> You want indexBam() in Rsamtools. See ?BamFile.
>
> Best,
>
> Jim

12.3 years ago • Richard Friedman ★ 2.0k
