Question

setting contrast in edgeR for time series data

wangzhang1988 • 0

Hello,

I'm analyzing time series RNA-Seq data with repeated measures at six time points, corresponding to pre-treatment, on-treatment and post-treatment phases, using edgeR:

Timepoint1 Pretreat
Timepoint2 Pretreat
Timepoint3 Pretreat
Timepoint4 Ontreat
Timepoint5 Ontreat
Timepoint6 Posttreat

My aim is to find DEGs for two comparisons: Ontreat vs Pretreat, and Posttreat vs Ontreat.

There are two options I can think of to do this:

1). Include the timepoint variable in the GLM (along with subject, since this is repeated-measures data) and set the contrasts as:

design<-model.matrix(~0+timepoint+subject)

mycontrast<-makeContrasts(OnvsPre=(timepoint4+timepoint5)/2-(timepoint1+timepoint2+timepoint3)/3, PostvsOn=timepoint6-(timepoint4+timepoint5)/2, levels=design)

2). Include the treatment phase variable in the model (which effectively pools the time points within each treatment phase into one group):

design<-model.matrix(~0+treatment+subject)
mycontrast<-makeContrasts(OnvsPre=Ontreat-Pretreat, PostvsOn=Posttreat-Ontreat, levels=design)

Since I am new to RNA-Seq analysis and not well versed in statistics, my questions are:

1). For the first method, am I setting up the contrasts correctly?

2). For the second method, is it justified to combine different time points into one group, or does that run into problems with repeated measures?

Thanks very much for any help here!

rnaseq edger makecontrasts

written 8.5 years ago by wangzhang1988 • updated 8.5 years ago by Aaron Lun

score 0 · Answer 1 · 2017-06-07

Aaron Lun ★ 29k

Use the first model. If expression differs between time points within the same treatment category, treating those time points as "replicates" in the second model will inflate the dispersion estimates.
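As a rough illustration of that inflation, here is a toy sketch in base R. It uses an ordinary linear model on made-up log-expression values for a single gene, not edgeR's negative binomial dispersion, so it only shows the general mechanism. The number of subjects, the per-timepoint means and the noise levels are all assumptions chosen for the example.

```r
set.seed(42)

# Hypothetical layout: 5 subjects, each measured at all 6 time points.
n_subj    <- 5
timepoint <- factor(rep(1:6, times = n_subj))
subject   <- factor(rep(paste0("S", 1:n_subj), each = 6))

# Map time points to treatment phases as listed in the question.
phase_of  <- c("Pretreat", "Pretreat", "Pretreat", "Ontreat", "Ontreat", "Posttreat")
treatment <- factor(phase_of[as.integer(timepoint)],
                    levels = c("Pretreat", "Ontreat", "Posttreat"))

# Assumed gene: the true log-expression differs between time points
# inside the same phase (5, 6, 7 pre-treatment; 9, 10 on-treatment).
tp_mean  <- c(5, 6, 7, 9, 10, 8)
subj_eff <- rnorm(n_subj, sd = 0.5)
logexpr  <- tp_mean[as.integer(timepoint)] +
  subj_eff[as.integer(subject)] +
  rnorm(length(timepoint), sd = 0.2)

# Model 1: one coefficient per time point. Model 2: time points pooled by phase.
fit_tp    <- lm(logexpr ~ 0 + timepoint + subject)
fit_phase <- lm(logexpr ~ 0 + treatment + subject)

# Residual variance under each model; pooling pushes the
# between-timepoint differences into the residuals.
var_tp    <- sigma(fit_tp)^2
var_phase <- sigma(fit_phase)^2
print(c(timepoint_model = var_tp, phase_model = var_phase))
```

In this toy setup the phase model has the larger residual variance, because the systematic differences between time points within a phase are counted as noise. The same kind of effect would be expected to show up as larger dispersion estimates in the second edgeR model.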

The contrasts for the first model look fine to me.
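For reference, a minimal sketch of how the first model might be run end to end. The count matrix is simulated and the layout of 4 subjects is an assumption. The quasi-likelihood functions (`glmQLFit`, `glmQLFTest`) are one possible choice that the thread does not specify, and `glmFit` with `glmLRT` would follow the same pattern. The contrast names match the question only if the `timepoint` factor has levels 1 to 6.

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

set.seed(1)

# Hypothetical layout: 4 subjects, each sampled at all 6 time points.
n_subj    <- 4
timepoint <- factor(rep(1:6, times = n_subj))
subject   <- factor(rep(paste0("S", 1:n_subj), each = 6))

# Simulated counts for 200 genes, standing in for the real count matrix.
counts <- matrix(rnbinom(200 * 24, mu = 100, size = 10), nrow = 200)
rownames(counts) <- paste0("gene", 1:200)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Columns: timepoint1..timepoint6, then subjectS2..subjectS4.
design <- model.matrix(~ 0 + timepoint + subject)

y   <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# Contrasts from the question: average on-treatment vs average pre-treatment,
# and post-treatment vs average on-treatment.
mycontrast <- makeContrasts(
  OnvsPre  = (timepoint4 + timepoint5) / 2 - (timepoint1 + timepoint2 + timepoint3) / 3,
  PostvsOn = timepoint6 - (timepoint4 + timepoint5) / 2,
  levels   = design
)

res_on_pre  <- glmQLFTest(fit, contrast = mycontrast[, "OnvsPre"])
res_post_on <- glmQLFTest(fit, contrast = mycontrast[, "PostvsOn"])

topTags(res_on_pre)
```

Because the subject columns get a weight of zero in both contrasts, the subject effects cancel out and each comparison is made within subjects.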

Aaron Lun • 8.5 years ago

Thanks Aaron. Much appreciated.

wangzhang1988 • 8.5 years ago

A follow-up question:

To interpret the first model: does it essentially average the expression values over the three pre-treatment time points and over the two on-treatment time points, and then identify DE genes by comparing those two averages?

Thanks again.

wangzhang1988 • 8.5 years ago

Yes. If you need more stringency, you can test each pair of on-treatment vs pre-treatment timepoints to verify that genes are consistently DE (in the same direction) between groups.
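A small base-R sketch of what those pairwise comparisons could look like as contrast vectors. It assumes the coefficient names `timepoint1` to `timepoint6` from the first model and leaves out the subject columns, which would all get a weight of zero. It also checks that the averaged OnvsPre contrast is the mean of the six pairwise contrasts.

```r
# Coefficient names as produced by model.matrix(~ 0 + timepoint + subject)
# when the timepoint factor has levels 1..6 (subject columns omitted here).
coefs <- paste0("timepoint", 1:6)
pre   <- coefs[1:3]
on    <- coefs[4:5]

# One contrast per on-treatment vs pre-treatment pair (2 x 3 = 6 pairs).
pairs <- expand.grid(on = on, pre = pre, stringsAsFactors = FALSE)
pairwise <- sapply(seq_len(nrow(pairs)), function(i) {
  v <- setNames(numeric(length(coefs)), coefs)
  v[pairs$on[i]]  <-  1
  v[pairs$pre[i]] <- -1
  v
})
colnames(pairwise) <- paste0(pairs$on, "_vs_", pairs$pre)

# The averaged OnvsPre contrast from the question.
averaged <- setNames(c(-1/3, -1/3, -1/3, 1/2, 1/2, 0), coefs)

# The averaged contrast equals the mean of the six pairwise contrasts.
same <- isTRUE(all.equal(rowMeans(pairwise), averaged))
print(pairwise)
print(same)
```

To use one of these columns with a fitted edgeR model, it would need to be padded with zeros for the subject columns of the design. Alternatively, the same strings (for example `"timepoint4-timepoint1"`) can be passed to limma's `makeContrasts()` through its `contrasts=` argument together with `levels=design`.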

Aaron Lun • 8.5 years ago