Rsamtools bam to ShortReadQ for export
1
0
Entering edit mode
Marcus Davy ▴ 390
@marcus-davy-5153
Last seen 6.1 years ago
Hi, I would like to export certain reads filtered using Rsamtools, for example aligning against contaminant references to exclude a set of sequences filtering on the bitwise flag=0x4 (similar to SamToFastq.jar in the picard http://picard.sourceforge.net/ tools), but couldn't find much on exporting/writing to file in the Rsamtools-Overview.pdf vignette or on google. Is there a function/utility which converts an Rsamtools bam list region to ShortReadQ object(s) for export using writeFastq() with single end *and* paired end alignments which have been read in via the Rsamtools interface? ## Load a bam bam <- scanBam(...) ## Filter using R's subsetting, bamRegion <- bam[[1]][ ... ] ## list of list ## Should work for single end alignments, but not paired end alignments ## Read pairs may have to index on mpos = PNEXT field rfq <- ShortReadQ(bamRegion[["seq"]], bamRegion[["qual"]], BStringSet(bamRegion[["qname"]])) If not is the area of exporting from Rsamtools worth developing futher? cheers, Marcus [[alternative HTML version deleted]]
Rsamtools Rsamtools • 1.2k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 5 days ago
United States
On 11/27/2012 01:17 PM, Marcus Davy wrote: > Hi, > I would like to export certain reads filtered using Rsamtools, for example > aligning against contaminant references > to exclude a set of sequences filtering on the bitwise flag=0x4 (similar to > SamToFastq.jar in the picard http://picard.sourceforge.net/ tools), > but couldn't find much on exporting/writing to file in the > Rsamtools-Overview.pdf vignette or on google. > > Is there a function/utility which converts an Rsamtools bam list region to > ShortReadQ object(s) for export using writeFastq() > with single end *and* paired end alignments which have been read in via the > Rsamtools interface? > > ## Load a bam > bam <- scanBam(...) > > ## Filter using R's subsetting, > bamRegion <- bam[[1]][ ... ] ## list of list > > ## Should work for single end alignments, but not paired end alignments > ## Read pairs may have to index on mpos = PNEXT field > rfq <- ShortReadQ(bamRegion[["seq"]], bamRegion[["qual"]], > BStringSet(bamRegion[["qname"]])) > > If not is the area of exporting from Rsamtools worth developing futher? I think I would use readBamGappedAlignments, specifying what=c("seq", "qual") and maybe "qname", flag=scanBamFlag(<...>) and perhaps the tag argument to filter / load relevant records, and then create a fastq object ShortReadQ(<...>) and writeFastq. This could be made space efficient using BamFile(, yieldSize=) and iterating through bf = open(BamFile(...)) out = FastqFile("..." while (nrow(ga <- readBamGappedAlignments(bf, ...))) { ## filter, create fq = ShortReadQ writeFastq(fq, out, mode ="a") } close(bf) close(out) > > cheers, > > Marcus > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor > -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
ADD COMMENT
0
Entering edit mode
Thanks Martin, that gives me a new lead to explore further. Marcus On Wed, Nov 28, 2012 at 11:26 AM, Martin Morgan <mtmorgan@fhcrc.org> wrote: > On 11/27/2012 01:17 PM, Marcus Davy wrote: > >> Hi, >> I would like to export certain reads filtered using Rsamtools, for example >> aligning against contaminant references >> to exclude a set of sequences filtering on the bitwise flag=0x4 (similar >> to >> SamToFastq.jar in the picard http://picard.sourceforge.net/ tools), >> but couldn't find much on exporting/writing to file in the >> Rsamtools-Overview.pdf vignette or on google. >> >> Is there a function/utility which converts an Rsamtools bam list region to >> ShortReadQ object(s) for export using writeFastq() >> with single end *and* paired end alignments which have been read in via >> the >> Rsamtools interface? >> >> ## Load a bam >> bam <- scanBam(...) >> >> ## Filter using R's subsetting, >> bamRegion <- bam[[1]][ ... ] ## list of list >> >> ## Should work for single end alignments, but not paired end alignments >> ## Read pairs may have to index on mpos = PNEXT field >> rfq <- ShortReadQ(bamRegion[["seq"]], bamRegion[["qual"]], >> BStringSet(bamRegion[["qname"]**])) >> >> If not is the area of exporting from Rsamtools worth developing futher? >> > > I think I would use readBamGappedAlignments, specifying what=c("seq", > "qual") and maybe "qname", flag=scanBamFlag(<...>) and perhaps the tag > argument to filter / load relevant records, and then create a fastq object > ShortReadQ(<...>) and writeFastq. > > This could be made space efficient using BamFile(, yieldSize=) and > iterating through > > bf = open(BamFile(...)) > out = FastqFile("..." > while (nrow(ga <- readBamGappedAlignments(bf, ...))) { > ## filter, create fq = ShortReadQ > writeFastq(fq, out, mode ="a") > } > close(bf) > close(out) > > >> cheers, >> >> Marcus >> >> [[alternative HTML version deleted]] >> >> ______________________________**_________________ >> Bioconductor mailing list >> Bioconductor@r-project.org >> https://stat.ethz.ch/mailman/**listinfo/bioconductor<https: stat.e="" thz.ch="" mailman="" listinfo="" bioconductor=""> >> Search the archives: http://news.gmane.org/gmane.** >> science.biology.informatics.**conductor<http: news.gmane.org="" gmane="" .science.biology.informatics.conductor=""> >> >> > > -- > Computational Biology / Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 > [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 1045 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6