I have been reading about IHW (Independent Hypothesis Weighting) as a method to adjust p-values for multiple testing by assigning a weight to each hypothesis based on prior (informative) knowledge. My rationale is that, given a list of differentially expressed genes (as returned by DESeq2, but also by other methods), I would like to adjust their p-values while accounting for prior knowledge of which pathways the genes belong to. The idea is that I would expect genes belonging to the same pathway to be more or less co-regulated.
That said, I immediately ran into the harsh truth that most genes are annotated for more than one pathway, and sometimes for dozens. The only way I could come up with to pass this information as a vector-like object is to generate, for each gene, a binary string as long as the total number of pathways observed for my list of genes, where each position is either a zero (the gene does not belong to that pathway) or a one (the gene belongs to that pathway).
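To make the encoding concrete, here is a minimal sketch in plain Python. The gene and pathway names are made up for illustration, and the helper `membership_strings` is hypothetical. It only builds the binary strings described above and does not address whether such a string is a suitable covariate for IHW.

```python
# Hypothetical gene-to-pathway annotations (names are illustrative only).
gene_pathways = {
    "geneA": {"glycolysis", "tca_cycle"},
    "geneB": {"tca_cycle"},
    "geneC": {"apoptosis", "glycolysis", "p53_signaling"},
}


def membership_strings(annotations):
    """Return the sorted pathway list and one binary string per gene.

    Position i of a gene's string is "1" if the gene is annotated to
    pathways[i], and "0" otherwise.
    """
    # Union of all pathways observed across the gene list, in a fixed order.
    pathways = sorted(set().union(*annotations.values()))
    encoded = {
        gene: "".join("1" if p in members else "0" for p in pathways)
        for gene, members in annotations.items()
    }
    return pathways, encoded


pathways, encoded = membership_strings(gene_pathways)
print(pathways)
for gene, bits in encoded.items():
    print(gene, bits)
```

With many pathways, most genes will likely end up with a unique string, so each "group" defined this way may contain only one or a few genes.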
I am currently trying this out. Does anyone have a better idea for encoding pathway membership as a covariate?
Thanks in advance!