Accessing next-gen sequence data remotely via Bioconductor
@ruppert-valentino-1376
Last seen 9.6 years ago
Hello,

I am trying to access next-gen sequencing data remotely via R/Bioconductor, but I can't seem to send queries to it the way I can with biomaRt. I tried Rsamtools, but even with that I found no way to query the sequence file directly.

What I am trying to do is get sequence data for specific regions, e.g. chromosome 5, positions 150100000 to 150101000, for http://www.1000genomes.org/ samples such as NA19240. However, there doesn't seem to be any tool that does this easily.

The Rsamtools documentation mentions that the data were initially downloaded using samtools view bamfile.

Does anyone know of a way to access next-gen sequence data remotely without having to download it locally? If so, I would appreciate it if they could email me the R script for that.

Thanks
Sequencing Rsamtools • 834 views
@martin-morgan-1513
Last seen 4 days ago
United States
On 12/17/2010 07:23 AM, Ruppert Valentino wrote:
> Does anyone know of a way to access next gen sequence data remotely
> without having to download them locally, if so I would appreciate it
> if they email me the R script for that.

Pointing to the BAM URL as the 'file' argument to scanBam will first download the index and then perform the query. It is better to download the index ('.bai') file once and then call scanBam(remoteUrl, localIndex), so that the index is not fetched again for each query.

It also makes sense to do the arithmetic on the volume of data to be downloaded: if you are going to download most of the data anyway, it is far better to use the 'aspera' plugin provided by 1000genomes to pull the BAM files down quickly and then access them locally.

The basic workflow is sketched in the Rsamtools vignette; look for na19240url.

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
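The scanBam(remoteUrl, localIndex) suggestion above could look roughly like the following. This is only a sketch under stated assumptions: the URL and local index path in the usage comment are placeholders (the real NA19240 URL is the na19240url value in the Rsamtools vignette), the sequence name "5" is a guess that has to match the BAM header, fetch_region is a hypothetical helper name, and remote access only works if the installed Rsamtools build supports the URL scheme.

```r
library(Rsamtools)      # scanBam(), ScanBamParam(), bamWhich()
library(GenomicRanges)  # GRanges(); IRanges() comes along with it

# Region from the question: chromosome 5, 150100000 to 150101000.
# ASSUMPTION: the BAM header names the chromosome "5" (it may be "chr5").
region <- GRanges("5", IRanges(start = 150100000, end = 150101000))

# Restrict the query to that region and to the fields we want back.
param <- ScanBamParam(which = region,
                      what = c("rname", "pos", "qwidth", "seq"))

# Hypothetical helper: query a remote BAM using an index stored locally.
# The Rsamtools documentation describes 'index' as the index file name
# given without the '.bai' extension.
fetch_region <- function(bam_url, local_index, param) {
  scanBam(bam_url, index = local_index, param = param)
}

# Usage (placeholders, not real locations):
# reads <- fetch_region("ftp://example.org/path/NA19240.bam",
#                       "local/NA19240.bam", param)
# reads[[1]]$seq   # sequences of reads overlapping the region
```

Only the reads overlapping the requested range are transferred, which is what makes the volume arithmetic favourable for small regions like this one.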