Question: "set genotype" function for gds?
0
gravatar for Lna
2.8 years ago by
Lna0
Lna0 wrote:

I'm using GWASTools for my GWAS analysis. It turned out that there are some genotyping problems with a few hundred SNPs on the X-chromosome. I already have all my data stored in gds files for use in GWASTools and I want to change the genotypes of the erroneous SNPs directly in the gds file. I was hoping there is kind of a "setGenotype" function with works exactly like the "getGenotype" function that I use to obtain specific SNP genotypes, but couldn't find anything like that.

Did I miss anything? How can genotype changes of specific SNPs be done in gds files without going back to raw data file and creating a new gds file based on them?

Thank you for your help!

gwastools gds • 552 views
ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by Lna0
Answer: "set genotype" function for gds?
1
gravatar for Stephanie M. Gogarten
2.8 years ago by
University of Washington
Stephanie M. Gogarten740 wrote:

You need to use the gdsfmt package to modify GDS files directly.

library(gdsfmt)

file <- system.file("extdata", "illumina_geno.gds", package="GWASdata")
gdsfile <- tempfile()
file.copy(file, gdsfile)

# open file in write mode
gdsobj <- openfn.gds(gdsfile, readonly=FALSE)

# get node of the genotype variable
node <- index.gdsn(gdsobj, "genotype")

# uncompress node so we can write to it
compression.gdsn(node, "")

# change some values for 100 SNPs
nsnp <- 100
nsamp <- 77
newsnps <- matrix(sample(c(0,1,2), size=nsnp*nsamp, replace=TRUE), nrow=nsnp, ncol=nsamp)

# write new values for SNPs 101:200
write.gdsn(node, newsnps, start=c(101,1), count=c(100,77))

# close
closefn.gds(gdsobj)
cleanup.gds(gdsfile)

file.remove(gdsfile)

 

ADD COMMENTlink written 2.8 years ago by Stephanie M. Gogarten740

There's one thing... When I read an entry for one SNP which includes "NA"s 

(like: NA   NA   NA   NA   NA   NA   NA   NA    2    NA    NA    NA    NA    NA    NA    NA     1    NA    NA    NA     2    NA     2   2  NA    NA    NA    NA)

from a gds file and try to write it with write.gdsn to another, the "NA"s are exchanged by "0"s when I read the same data from the changed gds file
(0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 1 0 0 0 2 0 2 2 0 0 0 0)

I checked the help entry for write.gdsn, because I hoped I can choose what happens to missing entries and it says:

"GDS format does not support missing characters NA, and any NA will be converted to a blank string "". "

I don't really get it. In general there can be missing values in a gds file (that is where my data came from), but how can I set them with write.gdsn?

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by Lna0

The standard missing code for genotype gds files is 3. The GWASTools getGenotype function takes care of converting a value of 3 to NA, but the gdsfmt package does not. When writing the new genotype values, set the missing genotypes to 3 before calling write.gdsn. For example:

geno[is.na(geno)] <- 3

write.gdsn(node, geno, ...)
ADD REPLYlink written 2.8 years ago by Adrienne Stilp30

Thank you! Now everything seems to work. Maybe it would make sense to include this information in the help text for write.gdsn...

ADD REPLYlink written 2.8 years ago by Lna0
Answer: "set genotype" function for gds?
0
gravatar for Lna
2.8 years ago by
Lna0
Lna0 wrote:

Thank you! That worked fine!

ADD COMMENTlink written 2.8 years ago by Lna0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 174 users visited in the last hour