Meaning of NA for LRR and BAF values estimated with GWASTools
I am estimating LRR and BAF values using the GWASTools package. After all steps, approximately 3% of the LRR values and 2% of the BAF values are NA. How exactly should those NA values be interpreted? Should I assume that an NA means a lack of signal or hybridization (just noise)?
University of Washington
NA values in LRR and BAF are probably due to poorly performing SNPs or very low minor allele frequency. For a high-quality SNP we expect to see well-defined clusters in intensity space (R vs. Theta, which is a polar coordinate transformation of X and Y). The centers of the three clusters corresponding to the AA, AB, and BB genotypes are used to define LRR and BAF for each sample (so the values for a given sample depend, in part, on the intensity values and genotype calls for all other samples). Figure 1 in this paper has an excellent illustration of how LRR and BAF are determined.
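To make the role of the cluster centers concrete, here is a minimal Python sketch of the commonly used piecewise-linear definition of BAF and LRR. It is an illustration only, not the GWASTools code: the function names (`to_polar`, `cluster_centers`, `baf_lrr`) are hypothetical, the polar transformation R = X + Y, Theta = (2/pi) * atan(Y/X) is an assumed convention, and taking each center as the mean R and Theta of the samples called with that genotype is also an assumption.

```python
import math
from statistics import mean


def to_polar(x, y):
    # Assumed convention: R = X + Y, Theta = (2/pi) * atan(Y/X), so Theta lies in [0, 1].
    return x + y, (2.0 / math.pi) * math.atan2(y, x)


def cluster_centers(samples):
    """samples: list of (genotype, theta, r) for ONE SNP across all samples.

    Returns {genotype: (mean_theta, mean_r)}. Raises statistics.StatisticsError
    if a genotype has no calls, which is why a minimum count per genotype matters.
    """
    centers = {}
    for g in ("AA", "AB", "BB"):
        pts = [(t, r) for geno, t, r in samples if geno == g]
        centers[g] = (mean(t for t, _ in pts), mean(r for _, r in pts))
    return centers


def baf_lrr(theta, r, centers):
    """Interpolate BAF and the expected R between neighbouring cluster centers."""
    t_aa, r_aa = centers["AA"]
    t_ab, r_ab = centers["AB"]
    t_bb, r_bb = centers["BB"]
    if theta < t_aa:
        baf, r_exp = 0.0, r_aa
    elif theta < t_ab:
        frac = (theta - t_aa) / (t_ab - t_aa)
        baf, r_exp = 0.5 * frac, r_aa + frac * (r_ab - r_aa)
    elif theta < t_bb:
        frac = (theta - t_ab) / (t_bb - t_ab)
        baf, r_exp = 0.5 + 0.5 * frac, r_ab + frac * (r_bb - r_ab)
    else:
        baf, r_exp = 1.0, r_bb
    # LRR compares the observed intensity with the intensity expected at this Theta.
    return baf, math.log2(r / r_exp)
```

Because the centers are computed from every sample's calls, a SNP with a missing or badly placed cluster gives unreliable values for all samples at that SNP, which is the situation the NA values guard against.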
BAFfromGenotypes will not attempt to calculate LRR and BAF if the number of genotype calls for AA, AB, or BB is less than min.n.genotypes (default 2). The lack of genotype calls can be due to poor clustering, which you could identify with a cluster plot (chromIntensityPlot in GWASTools). You could also have missing LRR and BAF values from some good SNPs with very low minor allele frequency, for example if you had only a few heterozygotes and zero or one minor homozygotes.
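The threshold rule can be sketched in a few lines of Python. This is an illustration of the logic described above, not the GWASTools implementation; `can_estimate` and its `min_n_genotypes` argument are hypothetical names that mirror the `min.n.genotypes` parameter, and missing calls are represented as `None`.

```python
from collections import Counter


def can_estimate(genotypes, min_n_genotypes=2):
    """Return True if every genotype class has enough calls at this SNP.

    genotypes: calls for one SNP across all samples, each "AA", "AB", "BB",
    or None for a missing call. If any class falls below the threshold,
    BAF and LRR at this SNP would be left as NA for every sample.
    """
    counts = Counter(g for g in genotypes if g is not None)
    return all(counts[g] >= min_n_genotypes for g in ("AA", "AB", "BB"))


# A well-clustered SNP with a very low minor allele frequency:
# 95 major homozygotes, 4 heterozygotes, 1 minor homozygote.
rare_snp = ["AA"] * 95 + ["AB"] * 4 + ["BB"] * 1
print(can_estimate(rare_snp))
```

With only one BB call the check fails, so this SNP would end up with NA values even though nothing is wrong with its signal. Tabulating the genotype counts per SNP in your own data is a quick way to separate this case from poorly clustering SNPs.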