How to open a 500 MB SNP data file on Windows, or extract just part of the data
xiangxue Guo ▴ 70
@xiangxue-guo-3524
Last seen 10.2 years ago
Hi there,

Does anybody know how to open a SNP data file as large as 500 MB on a Windows computer? The file contains SNPs for many chromosomes, and we need only one of them. If someone knows how to extract the data for just one chromosome, that would also work for us.

Thanks in advance,

Guo
SNP SNP • 947 views
@michael-imbeault-3593
Last seen 10.2 years ago
You could try http://www.editpadpro.com/ - it opens arbitrarily large files; I have opened 1 GB text files with it before.

Michael
Another solution is to install the Rtools toolset and use grep or sed.

http://www.murdoch-sutherland.com/Rtools/

Something like

    grep <your snp name here> <snp file name here>

will get the SNP data without having to open the entire file at one time. An alternative is

    sed -n '/<snp name here>/p' <snp file name here>

which will do the same, and is usually faster than opening the entire file just to find one line.

You can of course redirect the output into a new file by adding

    > mynewfile.txt

at the end of either of the above.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
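For the original goal of pulling out one whole chromosome rather than a single SNP, a column test may be safer than a plain text search, since a bare pattern such as 7 would also match 17 or any position containing a 7. A minimal sketch, assuming a tab-delimited file with one header line and the chromosome in column 2 (the layout, the file names, and the extract_chromosome helper are all hypothetical), and assuming awk and mktemp are available in the same shell as grep and sed:

```shell
#!/bin/sh
# Hypothetical layout: tab-delimited, one header line, chromosome in column 2.
# Usage: extract_chromosome CHR INFILE OUTFILE
extract_chromosome() {
    # NR == 1 keeps the header line; $col == chr keeps rows whose
    # chromosome field equals the requested value (so 7 does not match 17).
    awk -F '\t' -v chr="$1" -v col=2 'NR == 1 || $col == chr' "$2" > "$3"
}

# Tiny made-up file standing in for the real 500 MB file.
demo_dir=$(mktemp -d)
printf 'snp_id\tchrom\tposition\n' >  "$demo_dir/snps.txt"
printf 'rs0001\t1\t10500\n'        >> "$demo_dir/snps.txt"
printf 'rs0002\t7\t20400\n'        >> "$demo_dir/snps.txt"
printf 'rs0003\t17\t30100\n'       >> "$demo_dir/snps.txt"
printf 'rs0004\t7\t45900\n'        >> "$demo_dir/snps.txt"

extract_chromosome 7 "$demo_dir/snps.txt" "$demo_dir/chr7.txt"
cat "$demo_dir/chr7.txt"
```

awk reads the input one line at a time, so the whole file never has to fit in memory. Change col=2 to whichever column holds the chromosome in the real file.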
Or in R, open a connection to the file (possibly compressed or remote) and process chunks until satisfied:

    # snpId holds the SNP identifier (a grep pattern) to search for
    con <- file("c:\\some\\file", "r")
    value <- character(0)
    repeat {
        chunk <- readLines(con, 1000000)   # read up to one million lines at a time
        if (0 == length(chunk)) break      # end of file reached without a match
        value <- grep(snpId, chunk, value=TRUE)
        if (0 != length(value)) break      # stop at the first chunk containing a match
    }
    close(con)
    value

Or, if your file is structured like a table, read.table with colClasses (columns whose class is given as "NULL" are skipped) might let you read in just the necessary columns, so that the relevant parts of the entire file fit in memory. Or use the approach above, where 'con' is used with read.table to process chunks at a time and accumulate matching records.

Martin
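The idea of keeping only the necessary columns also carries over to the command-line tools mentioned earlier in the thread. A minimal sketch, assuming a tab-delimited file in which only the first three columns are needed (the layout, the file names, and the keep_columns helper are hypothetical), and assuming cut and mktemp are available in the same shell as grep and sed:

```shell
#!/bin/sh
# Hypothetical layout: tab-delimited, with per-sample columns we do not need.
# Usage: keep_columns LIST INFILE OUTFILE
# LIST is a cut-style field list such as 1-3 or 1,2,5.
keep_columns() {
    # cut splits on tabs by default and streams the file line by line.
    cut -f "$1" "$2" > "$3"
}

# Tiny made-up file standing in for the real data.
work_dir=$(mktemp -d)
printf 'snp_id\tchrom\tposition\tsample_a\tsample_b\n' >  "$work_dir/wide.txt"
printf 'rs0001\t1\t10500\tAA\tAG\n'                    >> "$work_dir/wide.txt"
printf 'rs0002\t7\t20400\tCC\tCT\n'                    >> "$work_dir/wide.txt"

keep_columns 1-3 "$work_dir/wide.txt" "$work_dir/narrow.txt"
cat "$work_dir/narrow.txt"
```

Dropping unneeded columns this way may shrink the file enough to open it in an ordinary editor or to read it into R in one go.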