edge, timeseries analysis, model specification question
@warrenanderson-7158
Last seen 9.2 years ago
European Union

Hi,

I am seeking advice on model specification for a gene expression data set (24 genes) with the following covariates: age (n=5) and phenotype (n=2, control and disease). The samples are independent (one organ per animal). I would like to evaluate time-dependent (age-dependent) between-class (between-phenotype) gene expression differences. From the vignette, it appears that the following model specification is appropriate:

cov <- data.frame(tme=age, grp=phenotype)

null_model <- ~grp + ns(tme, df=4, intercept=F)

full_model <- ~grp + ns(tme, df=4, intercept=F) + (grp):ns(tme, df=4, intercept=F)

However, I recognize that this is the model specification for longitudinal sampling rather than independent sampling.

Could someone please advise?

Thanks, Warren

edge timecourse fittimeseries • 1.5k views
@andrew-bass-8155
Last seen 9.0 years ago
United States

Posting the response I gave through emails (in case someone else has a similar question):

Hi Warren,

I am a maintainer of edge and would be happy to help. There is a great discussion of the difference between longitudinal and independent sampling on page 12 (http://www.pnas.org/content/102/36/12837.full.pdf?with-ds=yes). To summarize, independent sampling is a special case of longitudinal sampling in which each individual is sampled at a single time point. With longitudinal sampling, where each individual is observed at several time points, we need to adjust the intercepts for the individuals (through the input "ind" in the "build_models" function). The point is that the model formulation remains the same:

- Between class differential expression

cov <- data.frame(tme=age, grp=phenotype)
null_model <- ~ns(tme, df=4, intercept=F)
full_model <- ~ns(tme, df=4, intercept=F) + grp

- Within class differential expression

cov <- data.frame(tme=age, grp=phenotype)
null_model <- ~grp + ns(tme, df=4, intercept=F)
full_model <- ~grp + ns(tme, df=4, intercept=F) + grp:ns(tme, df=4, intercept=F)
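To make the nested comparison concrete, here is a minimal base-R sketch for a single gene. It does not call edge at all: the design (5 ages, 2 phenotypes, 3 animals per cell), the simulated expression values, and the use of `lm()` with `anova()` are assumptions chosen only to show how the null and full design matrices differ and what the extra interaction columns test.

```r
library(splines)  # bundled with R; provides ns()

set.seed(1)

# Hypothetical independent-sampling design: one sample per animal,
# 5 ages x 2 phenotypes x 3 animals per cell = 30 samples.
cov <- expand.grid(rep = 1:3,
                   tme = c(2, 4, 8, 12, 16),
                   grp = factor(c("control", "disease")))

# Simulated expression for one gene (illustrative only): the disease
# group drifts away from control as age increases.
y <- 5 + 0.1 * cov$tme +
  ifelse(cov$grp == "disease", 0.05 * cov$tme, 0) +
  rnorm(nrow(cov), sd = 0.3)

null_model <- ~ grp + ns(tme, df = 4, intercept = FALSE)
full_model <- ~ grp + ns(tme, df = 4, intercept = FALSE) +
  grp:ns(tme, df = 4, intercept = FALSE)

# Design matrices implied by the two formulas.
X0 <- model.matrix(null_model, data = cov)  # intercept + grp + 4 spline columns
X1 <- model.matrix(full_model, data = cov)  # X0 plus 4 grp-by-spline columns

# Ordinary least squares F-test of the 4 extra interaction columns.
fit0 <- lm(y ~ 0 + X0)
fit1 <- lm(y ~ 0 + X1)
cmp  <- anova(fit0, fit1)
print(cmp)
```

The full model adds one set of spline coefficients for the second phenotype, so the comparison asks whether the age trajectory differs between groups. Note that with only 5 distinct ages, `df = 4` plus the intercept already fits every age mean exactly, which is one reason the choice of `df` deserves attention.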

It's important to note that "df" should be chosen carefully by looking at the data. Please let me know if you have additional questions.
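One possible way to look at the data when picking "df" is to fit the spline at several candidate values for a gene and compare the fits. The sketch below uses AIC on simulated data; AIC is an assumption on my part as a convenient yardstick, not a procedure taken from edge or from the PNAS paper, and the time points and noise level are made up.

```r
library(splines)  # bundled with R; provides ns()

set.seed(2)

# Hypothetical data: 5 ages, 6 independent samples per age, one gene
# with a smooth nonlinear trend plus noise.
tme <- rep(c(2, 4, 8, 12, 16), each = 6)
y   <- 5 + sin(tme / 5) + rnorm(length(tme), sd = 0.2)

# With 5 distinct time points, df = 4 is the largest value worth trying.
candidate_df <- 1:4

# Fit a natural cubic spline at each candidate df and record the AIC.
aic <- sapply(candidate_df, function(d) {
  AIC(lm(y ~ ns(tme, df = d, intercept = FALSE)))
})
names(aic) <- candidate_df

best_df <- candidate_df[which.min(aic)]
print(aic)
print(best_df)
```

In practice one would repeat this over many genes and also plot the fitted curves over the observed points, since a single summary number can hide over- or under-smoothing.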

Andy


Hi Andy,

Could you provide a little more information about the basis for choosing df? I looked through the 2005 PNAS paper and its SI and didn't see a justification for using 2 df for the endotoxin analysis. For what it's worth, I'm doing a longitudinal time course qPCR experiment.

Thanks for any help.

Joe
