Hi,
I wonder if analyses of "non-linear"-like dose-response data is or could be covered by limma. By dose response gene expression data I mean data with 5-10 doses (3-5 biological replicates) of a chemical treatment per time-point. Novel technologies, conceptually related to the LINCS project, e.g., the TempO-seq are making it possible to generate this data economically.
The objective is to fit each gene in this data set to a number of different dose response models (10-12 sometimes) and then to both 1) test whether there is a statistically significant dose response (to a particular model) but also to 2) determine the lowest dose where a certain level of response can be expected (i.e. set response y to obtain concentration x, along with uncertainty).
https://www.epa.gov/sites/production/files/2015-01/documents/benchmark_dose_guidance.pdf
https://github.com/auerbachs/BMDExpress-2.0/wiki/Benchmark-Dose-Analysis
These doses can be important for drug development as well as for toxicity evaluations. It is starting to look like this kind of data will be incredibly useful for instance for 3R i.e., replacing animal testing with cellular assays. Problem is that software currently in use e.g., U.S. EPA BMDS and BMDExpress 2.0 does not use any of the developments pioneered by limma but imports models directly from the pre-microarray era (i.e. like using the straight-up t-test).
Question: 1) Can you extend or even directly use limma to these other models, i.e., are they in fact linear models. The goal would be to use all the adjustments of limma covered in Ritchie et al. 2015. 2) can you use R's vector capabilities to estimate these models in the same way as limma is doing and greatly speed up the analysis (doing the full genome analysis with 10-12 of these models can take over a day)
Models include:
- Power
- Linear
- Polynomial 2-4
- Hill[1]
- Exponential 2-5
This is a bit complicated question. Maybe even a whole new research field :-). Just wanted to put it out there since I don't personally have the expertise to do the modifications to limma (I think). Doing gene expression data over multiple doses gives a whole new perspective on mechanisms of compounds ... and has likely some implications for connectivity mapping, which has used single-dose data sets. So far.
Formulae for these models:
where n is the degree of the polynomial.
Linear model: The linear model is a special case of the polynomial model with n fixed at 1.
Power model: μ(dose)=γ+ β_ 〖dose〗^δ
where 0 < γ < 1, β≥ 0, and 18 ≥ δ > 0.
Note: For the the exponential 2 and 3 models, ‘sign’ is the adverse direction
Hill model: μ(dose)=γ+ (v 〖dose〗^n)/(k^n+ 〖dose〗^n )
Exponential 2: μ(dose)=aexp(sign b*dose)
Exponential 3: μ(dose)=aexp(sign (b*dose)^d)
Exponential 4: μ(dose)=a*(c-(c-1)exp(-1 b*dose))
Exponential 5: μ(dose)=a*(c-(c-1)exp(-1 (b*dose)^d ))