Dear all, I am working on allele-specific expression analysis in 60+ human samples (unrelated individuals; no parental genotypes are available).
One of my applications involves comparing expression of the reference and alternative alleles at around 50k heterozygous SNPs between patients who develop cancer and those who do not. I have created a count matrix: for each individual who is heterozygous for a given SNP, I have RNA read counts for the ref and alt alleles; if the individual is not heterozygous for that SNP, the counts are listed as NA.
I would prefer to use DESeq2 (or edgeR) for this, based on good prior experience with this software.
Of course, for each SNP several individuals are homozygous and therefore carry no information on ref/alt counts. Simply replacing the NA with 0 would bias the count variability and the dispersion estimates, because a 0 is treated as an observed lack of expression rather than as missing data. Neither package accepts NA counts.
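To illustrate the concern, here is a minimal Python sketch with made-up numbers (the counts and the `summarize` helper are hypothetical, not from my data, and this is not how DESeq2 or edgeR estimate dispersion). It only compares the mean and sample variance of the ref counts for one SNP when homozygous individuals are dropped versus when their NA is replaced by 0.

```python
import statistics

# Hypothetical ref-allele counts for one SNP across six individuals.
# None marks individuals who are homozygous (no ref/alt information).
ref_counts = [52, None, 47, None, None, 60]


def summarize(values):
    """Return (mean, sample variance) of a list of counts."""
    return statistics.mean(values), statistics.variance(values)


# Option 1: keep only the heterozygous individuals.
het_only = [c for c in ref_counts if c is not None]

# Option 2: replace NA with 0, as if zero reads had been observed.
zero_filled = [0 if c is None else c for c in ref_counts]

het_mean, het_var = summarize(het_only)
zero_mean, zero_var = summarize(zero_filled)

print(f"het only:    mean={het_mean:.1f}, variance={het_var:.1f}")
print(f"zero filled: mean={zero_mean:.1f}, variance={zero_var:.1f}")
```

In this toy case the zero-filled version has a much lower mean and a much larger variance than the heterozygous-only version, which is the kind of distortion I am worried about.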
Any thoughts on how to handle this, or suggestions for alternative approaches?