I'm trying to perform an allelic imbalance analysis on some ATAC-seq data generated from a population of human individuals. As such for each SNP there will be some individuals that are heterozygous and some that are homozygous. In this case my table contains counts for the individuals that are heterozygous but NA values for the individuals that are homozygous.
I've tried following the apeglm vignette, Modeling ratios of counts
section to perform a beta-binomial fit, but if I present a table with NA values it will fail with the following error:
Error in seq_len(sum(nonzero)): argument must be coercible to non-negative integer
Traceback:
1. system.time({
. for (i in 1:niter) {
. param <- cbind(theta.hat, cts)
. fit.mle <- apeglm(Y = ase.cts, x = x, log.lik = NULL,
. param = param, no.shrink = TRUE, log.link = FALSE,
. method = "betabinCR")
. theta.hat <- bbEstDisp(success = ase.cts, size = cts,
. x = x, beta = fit.mle$map, minDisp = 0.01, maxDisp = 5000)
. }
. })
2. apeglm(Y = ase.cts, x = x, log.lik = NULL, param = param, no.shrink = TRUE,
. log.link = FALSE, method = "betabinCR") # at line 7-8 of file <text>
3. betabinCppRoutine(Y, x, weights, offset, param, prior.control,
. method, result, bounds, optim.method)
4. sapply(seq_len(sum(nonzero)), function(i) {
. betabinFn(initC, x = x, y = YNZ[, i], size = sizeNZ[, i],
. theta = theta[i], weights = weightsNZ[, i], sigma = sigma,
. S = S, no.shrink = no.shrink, shrink = shrink, cnst = 0)
. })
5. lapply(X = X, FUN = FUN, ...)
What's the correct procedure to analyse this data? Do I need to fill in the NA values with something?
I wonder if there is any code/tutorial to run this kind of analysis, including the estimation of p-values?
Michael Love could you help me? Thanks!