Question

Meaning of NA for LRR and BAF values estimated with GWASTools

Vinicius Henrique da Silva ▴ 40

@vinicius-henrique-da-silva-6713

Brazil

I am estimating LRR and BAF values using the GWASTools package. After all steps, approximately 3% of LRR and 2% of BAF values came out as NA. How exactly should those NA values be interpreted? Should I assume a lack of signal or hybridization (just noise) for NA values?

gwastools lrr baf NA • 1.7k views

updated 8.4 years ago by Stephanie M. Gogarten ▴ 870 • written 8.4 years ago by Vinicius Henrique da Silva ▴ 40

score 2 · Accepted Answer · 2015-11-24

NA values in LRR and BAF are probably due to poorly performing SNPs or very low minor allele frequency.

For a high-quality SNP we expect to see well-defined clusters in intensity space (R vs. Theta, which is a polar coordinate transformation of X and Y). The centers of the three clusters corresponding to the AA, AB, and BB genotypes are used to define LRR and BAF for each sample (so the values for a given sample depend, in part, on the intensity values and genotype calls for all other samples). Figure 1 in this paper has an excellent illustration of how LRR and BAF are determined.

BAFfromGenotypes will not attempt to calculate LRR and BAF for a SNP if the number of genotype calls for AA, AB, or BB is less than min.n.genotypes (default 2). The lack of genotype calls can be due to poor clustering, which you could identify with a cluster plot (genoClusterPlot in GWASTools). You could also have missing LRR and BAF values for some good SNPs with very low minor allele frequency, for example if you had only a few heterozygotes and zero or one minor homozygotes.
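To make the rule above concrete, here is a minimal Python sketch of the logic for a single SNP. It is not GWASTools code: the function names `cluster_centers` and `baf_lrr` are hypothetical, and the formulas are the commonly described linear interpolation between genotype cluster centers, which may differ in detail from what BAFfromGenotypes does (for example, in how samples are grouped when computing the centers). It only illustrates why a SNP with too few calls in any genotype class ends up with missing LRR and BAF for every sample.

```python
import math
from statistics import mean


def cluster_centers(theta, r, genotypes, min_n_genotypes=2):
    """Mean (theta, R) per genotype cluster for one SNP.

    Returns None if AA, AB, or BB has fewer than min_n_genotypes calls,
    mirroring the rule described above (assumed, not copied from GWASTools).
    """
    centers = {}
    for g in ("AA", "AB", "BB"):
        idx = [i for i, call in enumerate(genotypes) if call == g]
        if len(idx) < min_n_genotypes:
            return None
        centers[g] = (mean(theta[i] for i in idx), mean(r[i] for i in idx))
    return centers


def baf_lrr(theta_i, r_i, centers):
    """BAF and LRR for one sample, interpolating between cluster centers."""
    if centers is None:
        # No usable cluster centers: both values are missing (NA in R).
        return math.nan, math.nan
    (t_aa, r_aa), (t_ab, r_ab), (t_bb, r_bb) = (
        centers["AA"], centers["AB"], centers["BB"])
    if theta_i < t_aa:
        return 0.0, math.log2(r_i / r_aa)
    if theta_i >= t_bb:
        return 1.0, math.log2(r_i / r_bb)
    if theta_i < t_ab:
        frac = (theta_i - t_aa) / (t_ab - t_aa)
        baf = 0.5 * frac
        r_expected = r_aa + frac * (r_ab - r_aa)
    else:
        frac = (theta_i - t_ab) / (t_bb - t_ab)
        baf = 0.5 + 0.5 * frac
        r_expected = r_ab + frac * (r_bb - r_ab)
    # LRR is the log2 ratio of observed to expected intensity.
    return baf, math.log2(r_i / r_expected)


# Made-up example: a rare SNP with five AA calls, one AB call, and no BB calls.
rare = cluster_centers(
    theta=[0.05, 0.06, 0.04, 0.07, 0.05, 0.50],
    r=[1.0, 1.1, 0.9, 1.2, 1.0, 1.5],
    genotypes=["AA", "AA", "AA", "AA", "AA", "AB"],
)
print(rare)                      # None, so every sample is missing at this SNP
print(baf_lrr(0.05, 1.0, rare))  # (nan, nan)
```

In this toy example the intensities cluster cleanly, yet the SNP still yields missing values because the AB and BB clusters are too small to define centers, which is the low minor allele frequency case rather than a lack of signal.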