I have a large number of heirarchical(nested) covariates. I would like to do some glm model testing to see if any of the levels are impact any genes in an RNA-seq experiment. I have been using DESeq2 to do differential tests. I have been using the bestglm package.
This is an example, I will scale this to the desired number of covariates
bestglm(cbind(y=counts(dds)[1,],phenotypes[,1:8]))
The above defaults to a gaussian family. I would like to use a negative binomial. If I used the negative.binomial from the MASS package, it requires a theta value, which works.
bestglm(cbind(y=counts(dds)[1,],p1[,1:8]),family=negative.ninomial(theta=1))
However, I am not sure what to use with the theta parameter here, can I use the gene-wise dispersion from DESeq2. Also is there a better "family " function to use for DESeq2 dispersions?
Yes, theta=1/dispersion. Note that dispersions(dds) is a better estimate of dispersion than mcols(dds)$dispGeneEst.