Hi, I am attempting to analyze 4 files with a total of 12,800,011 elements using BiSeq. Currently, this is the code I am using to obtain beta results:
rrbs.clust <- clusterSites(object = rrbs, groups = colData(rrbs)$group, perc.samples = 4/5, min.sites = 4, max.dist = 30)
ind.cov <- totalReads(rrbs.clust) > 0
quant <- quantile(totalReads(rrbs.clust)[ind.cov], 0.9)
rrbs.clust.lim <- limitCov(rrbs.clust, maxCov = quant)
predictedMeth <- predictMeth(object = rrbs.clust.lim)
betaResults <- betaRegression(formula = ~group, link = "probit", object = predictedMeth, type = "BR")
I am not sure why, but when I view betaResults, I see a table that is mostly NAs instead of actual values:
head(betaResults)
    chr   pos p.val meth.group1 meth.group2 meth.diff
1.1  17 30458    NA          NA          NA        NA
1.2  17 30462    NA          NA          NA        NA
1.3  17 30464    NA          NA          NA        NA
1.4  17 30475    NA          NA          NA        NA
1.5  17 30480    NA          NA          NA        NA
1.6  17 30487    NA          NA          NA        NA
    estimate std.error pseudo.R.sqrt cluster.id
1.1       NA        NA            NA       17_1
1.2       NA        NA            NA       17_1
1.3       NA        NA            NA       17_1
1.4       NA        NA            NA       17_1
1.5       NA        NA            NA       17_1
1.6       NA        NA            NA       17_1
Very few of the rows actually contain numbers (about 840 of 5670), and the ones that do have very high p-values. Does anyone have any idea why I'm seeing all these NAs, or what might be causing the p-values to be so high?