summarizeOverlaps colData does NOT contain countBam() summary data
@martin-morgan-1513
United States
Thanks Malcolm. The documentation at this point is not accurate; there's a parameter count.mapped.reads=TRUE that needs to be set; it _is_ documented on ?BamFile and has been clarified in devel (where summarizeOverlaps is in the new package 'GenomicAlignments').

Martin

----- Malcolm Cook <mec at stowers.org> wrote:
> Valerie and other Genomics,
>
> I read in ?summarizeOverlaps that
>
>     'colData' is a DataFrame with columns of 'object' (class of
>     'reads') and 'records' (length of 'reads'). When 'reads' is a
>     BamFile or BamFileList the 'colData' holds the output of a call to
>     'countBam' with columns of 'records' (total records in file),
>     'nucleotides' and 'mapped'. The number in 'mapped' is the number
>     of records returned when 'isUnmappedQuery=FALSE' in the
>     'ScanBamParam'.
>
> and also,
>
>     ## When the reads are Bam files, the 'colData' contains summary
>     ## information from a call to countBam().
>
> However, I find this NOT to be true. Viz (in a fresh R session)
>
> > library(GenomicRanges)
> > example(summarizeOverlaps)
> ....
> > colData(se)
> DataFrame with 2 rows and 0 columns
>
> # but yet:
>
> > do.call(rbind,lapply(fls,countBam))
>                   space start end width              file records nucleotides
> sm_treated1.bam      NA    NA  NA    NA   sm_treated1.bam    1800       80260
> sm_untreated1.bam    NA    NA  NA    NA sm_untreated1.bam    1800      135000
>
> Can you advise?
>
> Thanks!
>
> ~ Malcolm Cook
> Computational Biology / Shilatifard Lab - Stowers Institute for Medical Research - Kansas City
>
> PS
>
> > sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices datasets utils methods base
>
> other attached packages:
> [1] edgeR_3.4.2 limma_3.18.9 DESeq_1.14.0 lattice_0.20-24 locfit_1.5-9.1 TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.10.1 GenomicFeatures_1.14.2 AnnotationDbi_1.24.0 Biobase_2.22.0 pasillaBamSubset_0.0.8 BiocInstaller_1.12.0 Rsamtools_1.14.2 Biostrings_2.30.1
> [14] GenomicRanges_1.14.4 XVector_0.2.0 IRanges_1.20.6 BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
> [1] annotate_1.40.0 biomaRt_2.18.0 bitops_1.0-6 BSgenome_1.30.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2 RColorBrewer_1.0-5 RCurl_1.95-4.1 RSQLite_0.11.4 rtracklayer_1.22.2 splines_3.0.2 stats4_3.0.2 survival_2.37-7 tools_3.0.2 XML_3.98-1.1 xtable_1.7-1 zlibbioc_1.8.0
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
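For anyone landing on this thread later, the fix Martin describes (passing count.mapped.reads=TRUE) might look roughly like the sketch below. It is not taken from the package documentation verbatim. It assumes that summarizeOverlaps is loaded from 'GenomicAlignments' (as Martin says is the case in devel), that the sm_*.bam example files ship in that package's extdata directory, and that they contain reads on chr2L; the three `features` ranges are made up purely for illustration.

```r
## Sketch only: assumes GenomicAlignments and Rsamtools are installed and that
## the sm_*.bam example files live in GenomicAlignments' extdata directory.
suppressPackageStartupMessages({
    library(GenomicAlignments)  # summarizeOverlaps() (per Martin's note on devel)
    library(Rsamtools)          # BamFileList(), countBam()
})

fls <- list.files(system.file("extdata", package = "GenomicAlignments"),
                  recursive = TRUE, pattern = "bam$", full.names = TRUE)
names(fls) <- basename(fls)
bfl <- BamFileList(fls, index = character(), yieldSize = 1000)

## Hypothetical features, chosen only so the call has something to count against.
features <- GRanges("chr2L", IRanges(c(1000, 3000, 4000), width = 500))

## Default call: colData is expected to carry no countBam() columns,
## which is the behaviour reported in the original question.
se_default <- summarizeOverlaps(features, bfl)
colnames(colData(se_default))

## Opting in: colData should now hold 'records', 'nucleotides' and 'mapped'.
se <- summarizeOverlaps(features, bfl, count.mapped.reads = TRUE)
colData(se)
```

The extra counting pass is presumably why the flag is off by default, so it seems reasonable to set it only when the per-file totals are actually needed (for example, for normalising counts by the number of mapped reads).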