I have RNA-seq data for ~250 samples, with the following design:
Patient  Trial  Time-point  Response  Time_Resp
A        1      T0          R         T0_R
A        1      T1          R         T1_R
B        1      T0          NR        T0_NR
B        1      T1          NR        T1_NR
C        1      T0          R         T0_R
C        1      T1          R         T1_R
C        1      T1          R         T1_R
D        1      T0          NR        T0_NR
D        1      T1          NR        T1_NR
D        2      T0          R         T0_R
D        2      T1          R         T1_R
I would like to identify significant associations between the change in expression from baseline (i.e. T1 - T0) and response outcome. If every patient contributed only two samples (like A and B), I would use the following design:
0 + Time_Resp + Patient
and perform the following contrast:
(T1_R - T0_R) - (T1_NR - T0_NR)
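To make this concrete, here is a minimal sketch in base R that builds this design matrix and contrast for the example rows in the table. The object names (targets, design, contr) are placeholders chosen for illustration, and the sketch only sets up the naive design; it does not handle the repeated T1 samples or multi-trial patients.

```r
# Example rows from the table (one row per sample).
targets <- data.frame(
  Patient   = c("A", "A", "B", "B", "C", "C", "C", "D", "D", "D", "D"),
  Trial     = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2),
  TimePoint = c("T0", "T1", "T0", "T1", "T0", "T1", "T1", "T0", "T1", "T0", "T1"),
  Response  = c("R", "R", "NR", "NR", "R", "R", "R", "NR", "NR", "R", "R")
)

# Combined time/response factor, as in the Time_Resp column.
targets$Time_Resp <- factor(paste(targets$TimePoint, targets$Response, sep = "_"))
targets$Patient   <- factor(targets$Patient)

# Design: one column per Time_Resp level plus patient effects.
design <- model.matrix(~ 0 + Time_Resp + Patient, data = targets)
colnames(design) <- sub("^Time_Resp", "", colnames(design))

# Contrast (T1_R - T0_R) - (T1_NR - T0_NR), written as a weight per design column.
# limma's makeContrasts() can build the same vector from the expression.
contr <- setNames(numeric(ncol(design)), colnames(design))
contr[c("T1_R", "T0_NR")] <- 1
contr[c("T0_R", "T1_NR")] <- -1

print(colnames(design))
print(contr)
```

With an expression object in hand, design and contr would then be passed to limma's lmFit(), contrasts.fit() and eBayes() in the usual way.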
However, despite consulting the limma manual, I am unsure how to average across multiple T1 samples from the same patient within a trial (e.g. patient C), and how to account for the fact that some patients contribute to multiple trials (e.g. patient D).