No, there's no explicit way to include a grouping variable in a standard baySeq analysis, because the philosophy underlying the baySeq models does not really allow for it: it's not clear to me that there is any reason to expect a (log?-)linear effect on gene expression from a grouping variable. Including an interaction effect removes that objection, but at that point you are effectively constructing every possible model (see the 'allModels' function in baySeq and the consensus = TRUE option in the getPriors function).
There are two approaches that I think make sense here, and a third that will very rarely be the right thing to do. First, you can analyse the data for each site separately and combine the posterior likelihoods. This will find genes that behave similarly across sites; e.g., if a gene shows a high probability of increasing expression with the level of the categorical variable in site A and a high probability of the same in site B, then the product of those probabilities gives a high probability of increasing expression in both sites, though the amplitude of the increase may differ considerably between sites. This is the approach I would generally recommend.
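To make the "product of probabilities" step concrete, here is a minimal sketch in Python rather than R. It assumes you have already run a separate analysis per site and extracted, for each gene, the posterior probability of the model of interest as an ordinary probability (not a log value). The function name, site names, gene names and numbers are all invented for illustration; none of this is part of baySeq.

```python
from math import prod


def combine_site_posteriors(per_site):
    """Multiply per-site posterior probabilities gene by gene.

    per_site maps a site name to a dict of {gene: posterior probability}.
    Only genes present in every site are kept.
    """
    shared = set.intersection(*(set(p) for p in per_site.values()))
    return {g: prod(p[g] for p in per_site.values()) for g in sorted(shared)}


# Invented example values, not real results.
per_site = {
    "siteA": {"gene1": 0.95, "gene2": 0.90, "gene3": 0.10},
    "siteB": {"gene1": 0.90, "gene2": 0.20, "gene3": 0.05},
}

combined = combine_site_posteriors(per_site)
for gene, prob in combined.items():
    print(gene, round(prob, 3))
```

Only gene1 keeps a high combined probability here, because it is well supported in both sites; gene2 is strong in site A alone and drops out. Note that nothing in the product says the size of the change is the same in the two sites.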
Alternatively, you can construct all possible models for the site/variable interaction and run the analysis using consensus priors. This will probably work if you have three or fewer sites; with more than that, you will have to find some way to filter the models to keep their number manageable. This analysis discriminates between cases where a gene's expression goes up more or less identically in site A and site B, and cases where the gene's expression goes up in both site A and site B, but at different rates.
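To give a feel for why the number of models gets out of hand, here is a rough sketch in Python rather than R. It assumes that "all possible models" means every way of partitioning the site/level combinations into groups of equivalent expression, and that the variable has two levels per site; both are assumptions for illustration, and the helper names are hypothetical, not baySeq functions.

```python
def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # 'first' forms a block on its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def count_models(n_sites, n_levels):
    """Count partitions of the site-by-level groups (assumed model space)."""
    groups = [f"site{s}_level{l}" for s in range(n_sites) for l in range(n_levels)]
    return sum(1 for _ in set_partitions(groups))


for n_sites in (1, 2, 3, 4):
    print(n_sites, "sites:", count_models(n_sites, n_levels=2), "models")
```

Under these assumptions the count goes from 2 models for one site to 15, 203 and then 4140 for two, three and four sites, which is why some filtering becomes necessary beyond three sites.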
The last option is to create a new 'densityFunction' object (see the vignette at http://bioconductor.org/packages/release/bioc/vignettes/baySeq/inst/doc/baySeq_generic.pdf) which incorporates grouping variables. For the reasons given above, I don't think this is the right route for this particular data set, but there may occasionally be times when it is the right approach.