IHW and correlated covariates
1
1
Entering edit mode
@mikelove
Last seen 3 minutes ago
United States

I'm trying IHW for use with eQTL analysis, using the genomic distance.

I'm also interested in using the expression level of the gene as a covariate.

I could use a summary, such as the average normalized count for the gene, but also I'm interested in trying a number of quantiles of normalized count (e.g. 0.75, 0.9, 0.95, 0.99). Do you have any advice here, or have you thought about how IHW works when a number of the covariates are correlated?

ihw • 722 views
0
Entering edit mode
Last seen 2.3 years ago
Heidelberg

Hi Mike,

I am afraid this will largely be a non-answer, since right now the support for multivariate covariates is minimal: You have to manually bin covariates into higher dimensional rectangles, with the obvious repercussions in terms of the curse of dimensionality.

One thing you could do is just follow the approach used for finding the independent filtering threshold (try different covariates, see which one works best). I think this will not bias FDR control in any practically relevant way.

From a more theoretical point of view, what are the difficulties in getting higher dimensional covariates to work (and how does this relate to correlated covariates)?

1) We need to estimate the conditional distribution of the p-values given the covariates. Right now we do this non-parametrically (assuming concavity of the conditional CDFs). Unfortunately this does not really  extend well to higher dimensions (again curse of dimensionality), so possibly some parametric approach will be needed. Correlation among covariates however helps, since that reduces the dimensionality of the problem.

2) Once we have the conditional CDF, we need to find the optimal weight function w*(x). One way to think about it is as a regression problem and indeed, the convex optimization framework of IHW in principle could accomodate all the usual suspects (smoothing splines, LASSO, etc.) instead of the regressogram type estimator used now. Here the usual caveats for correlated variables in regression problems apply.

One possibly way out of 1 and 2 is by assuming an additive or multiplicative structure of the weights, but we have not pursued that.

Best,

Nikos

0
Entering edit mode

hi Nikos,

I have a new problem that would benefit IHW with two ordinal covariates, do you have any thoughts on approaching this while preserving the ordinal nature of both variables? Or any new work in this area you would recommend we try?

thanks, Mike

0
Entering edit mode

FWIW, here is some toy code playing with just splitting two covariates into nominal, to see how it does. Seems to work fine