Search
Question: Write the "map" DataFrame from genotypeToSnpMatrix() to a text file
0
gravatar for TimothéeFlutre
2.0 years ago by
France
TimothéeFlutre70 wrote:

I would like to export the genotypes (GT field) from a VCF file to allele dosage (0, 1 or 2 copies of a given allele for bi-allelic variants). As the VCF file can be large, I need to do it by batch:

gdose.con <- file("gdose.txt", open="a")
amap.con <- file("amap.txt", open="a")
cat("snp.names\tallele.1\tallele.2\tignore\n", file=amap.con, append=TRUE)
tabix.file <- TabixFile(file="variants.vcf.gz", yieldSize=10^3)
open(tabix.file)
while(nrow(vcf <- readVcf(file=tabix.file, genome="test"))){
  res <- genotypeToSnpMatrix(vcf)
  write.SnpMatrix(x=res$genotypes, file=gdose.con, append=TRUE,
                  quote=FALSE, sep="\t", row.names=TRUE,
                  col.names=FALSE)
  write.table(x=res$map, file=amap.con, append=TRUE,
              quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)
}
close(tabix.file)
close(gdose.con)
close(amap.con)

The output "gdose" file is fine. However, the output "amap" file contains <S4 object of class "DNAStringSet"> in its 2nd column, even if I use as.data.frame. What should I do to have instead the alternate alleles?

ADD COMMENTlink modified 24 months ago • written 2.0 years ago by TimothéeFlutre70
1
gravatar for TimothéeFlutre
24 months ago by
France
TimothéeFlutre70 wrote:

Instead of using genotypeToSnpMatrix(vcf), I used VariantAnnotation::geno(vcf) and GenomicRanges::rowRanges(vcf). This answer from Martin Morgan was useful.

ADD COMMENTlink written 24 months ago by TimothéeFlutre70
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 254 users visited in the last hour