pulling out specific lines from a bam file
1
0
Entering edit mode
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
I would like to enter a list of SNPs and a bam file and have Bioconductor return all the lines from the bam file, in human readable form, that contain the SNP in question. Here is my attempt: [code]library(GenomicRanges) genes <- GRanges(seqnames = c("13", "1", "12"), ranges = IRanges( start = c(111942495, 114516752, 116434901), width = 1), strand = rep("*", 3), seqlengths = c("1" = 249250621, "13" = 115169878, "12" = 133851895)) alnFile <- "first_10k_PASRTP_RNA_790790_exome6500SNPtolerant_062913_1_ clean_sorted.bam" aln <- readGappedAlignments(alnFile) bamGRanges <- GRanges(seqnames = seqnames(aln), ranges = ranges(aln), strand = strand(aln), seqlengths = seqlengths(aln)) answers <- findOverlaps(query = genes, subject = bamGRanges) final <- aln[subjectHits(answers)] [/code] my "final" variable consists of almost what I want, but not quite. It has only some of the fields from the bam file. It doesn't, for example, have the sequence. Can someone point me in the right direction? Thank you. Eric -- output of sessionInfo(): R version 3.0.1 (2013-05-16) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats graphics grDevices utils datasets methods base other attached packages: [1] Rsamtools_1.12.3 Biostrings_2.28.0 GenomicRanges_1.12.4 IRanges_1.18.1 [5] BiocGenerics_0.6.0 loaded via a namespace (and not attached): [1] bitops_1.0-5 stats4_3.0.1 tools_3.0.1 zlibbioc_1.6.0 -- Sent via the guest posting facility at bioconductor.org.
SNP SNP • 702 views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 4 days ago
United States
On 06/29/2013 05:29 PM, Eric Foss [guest] wrote: > > I would like to enter a list of SNPs and a bam file and have Bioconductor return all the lines from the bam file, in human readable form, that contain the SNP in question. Here is my attempt: > > [code]library(GenomicRanges) > > genes <- GRanges(seqnames = c("13", "1", "12"), > ranges = IRanges( > start = c(111942495, 114516752, 116434901), > width = 1), > strand = rep("*", 3), > seqlengths = c("1" = 249250621, "13" = 115169878, "12" = 1 33851895)) > > alnFile <- "first_10k_PASRTP_RNA_790790_exome6500SNPtolerant_062913_ 1_clean_sorted.bam" > > aln <- readGappedAlignments(alnFile) here you can specify what you'd like to read in, along the lines of param = ScanBamParam(what="seq") readGappedAlignments(alnFile, param=param) see scanBamWhat() for other things one might read in. Martin > > bamGRanges <- GRanges(seqnames = seqnames(aln), > ranges = ranges(aln), > strand = strand(aln), > seqlengths = seqlengths(aln)) > > answers <- findOverlaps(query = genes, subject = bamGRanges) > > final <- aln[subjectHits(answers)] > [/code] > > my "final" variable consists of almost what I want, but not quite. It has only some of the fields from the bam file. It doesn't, for example, have the sequence. Can someone point me in the right direction? > > Thank you. > > Eric > > > -- output of sessionInfo(): > > R version 3.0.1 (2013-05-16) > Platform: x86_64-apple-darwin10.8.0 (64-bit) > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] parallel stats graphics grDevices utils datasets methods base > > other attached packages: > [1] Rsamtools_1.12.3 Biostrings_2.28.0 GenomicRanges_1.12.4 IRanges_1.18.1 > [5] BiocGenerics_0.6.0 > > loaded via a namespace (and not attached): > [1] bitops_1.0-5 stats4_3.0.1 tools_3.0.1 zlibbioc_1.6.0 > > -- > Sent via the guest posting facility at bioconductor.org. > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
ADD COMMENT

Login before adding your answer.

Traffic: 918 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6