Question

Extending limma to cover "non-linear" dose response models e.g. Hill, Exponential, etc...

0

Entering edit mode

Pekka Kohonen ▴ 190

@pekka-kohonen-5862

Last seen 7.2 years ago

Sweden

Hi,

I wonder if analyses of "non-linear"-like dose-response data is or could be covered by limma. By dose response gene expression data I mean data with 5-10 doses (3-5 biological replicates) of a chemical treatment per time-point. Novel technologies, conceptually related to the LINCS project, e.g., the TempO-seq are making it possible to generate this data economically.

The objective is to fit each gene in this data set to a number of different dose response models (10-12 sometimes) and then to both 1) test whether there is a statistically significant dose response (to a particular model) but also to 2) determine the lowest dose where a certain level of response can be expected (i.e. set response y to obtain concentration x, along with uncertainty).

https://www.epa.gov/sites/production/files/2015-01/documents/benchmark_dose_guidance.pdf

https://github.com/auerbachs/BMDExpress-2.0/wiki/Benchmark-Dose-Analysis

These doses can be important for drug development as well as for toxicity evaluations. It is starting to look like this kind of data will be incredibly useful for instance for 3R i.e., replacing animal testing with cellular assays. Problem is that software currently in use e.g., U.S. EPA BMDS and BMDExpress 2.0 does not use any of the developments pioneered by limma but imports models directly from the pre-microarray era (i.e. like using the straight-up t-test).

Question: 1) Can you extend or even directly use limma to these other models, i.e., are they in fact linear models. The goal would be to use all the adjustments of limma covered in Ritchie et al. 2015. 2) can you use R's vector capabilities to estimate these models in the same way as limma is doing and greatly speed up the analysis (doing the full genome analysis with 10-12 of these models can take over a day)

Models include:

Power
Linear
Polynomial 2-4
Hill[1]
Exponential 2-5

limma dose-response lincs tempO-seq high-throughput transcriptomics • 2.5k views

ADD COMMENT • link updated 7.5 years ago by Gordon Smyth 52k • written 7.5 years ago by Pekka Kohonen ▴ 190

0

Entering edit mode

This is a bit complicated question. Maybe even a whole new research field :-). Just wanted to put it out there since I don't personally have the expertise to do the modifications to limma (I think). Doing gene expression data over multiple doses gives a whole new perspective on mechanisms of compounds ... and has likely some implications for connectivity mapping, which has used single-dose data sets. So far.

ADD REPLY • link 7.5 years ago Pekka Kohonen ▴ 190

0

Entering edit mode

Formulae for these models:

Polynomial model: μ(dose)=β_0+β_(1 ) dose+ β_(2 ) 〖dose〗^2+ ⋯ + β_(n ) 〖dose〗^n

where n is the degree of the polynomial.

Linear model: The linear model is a special case of the polynomial model with n fixed at 1.
Power model: μ(dose)=γ+ β_ 〖dose〗^δ

where 0 < γ < 1, β≥ 0, and 18 ≥ δ > 0.

Note: For the the exponential 2 and 3 models, ‘sign’ is the adverse direction

Hill model: μ(dose)=γ+ (v 〖dose〗^n)/(k^n+ 〖dose〗^n )
Exponential 2: μ(dose)=aexp⁡(sign b*dose)
Exponential 3: μ(dose)=aexp⁡(sign (b*dose)^d)
Exponential 4: μ(dose)=a*(c-(c-1)exp⁡(-1 b*dose))
Exponential 5: μ(dose)=a*(c-(c-1)exp⁡(-1 (b*dose)^d ))

ADD REPLY • link 7.5 years ago Pekka Kohonen ▴ 190

score 2 · Answer 1 · 2017-10-10

2

Entering edit mode

Gordon Smyth 52k

@gordon-smyth

Last seen 3 hours ago

WEHI, Melbourne, Australia

As you already know, limma only fits linear models.

Polynomial models are linear in the parameters so there is no problem.

Your exponential models 2 and 3 are linearizable by taking logs. Just log the response and apply a linear regression in dose.

The other models are strictly nonlinear. These nonlinear models are all intended to model strictly positive responses and they're not linearizable.

limma is designed for expression data that isn't bounded, i.e., it's not designed for strictly positive data unless you can take logs.