Question: Model matrix for longitudinal paired RNA-seq data: linear regression or linear mixed model?
agm0 wrote (5 months ago):

Hello,

I am looking for statistical expertise to figure out if I have set up my model matrix correctly. I am going to give an example info table that represents my experimental design. I have read the limma and edgeR documentation through and through and cannot find a clear example of this.

DESIGN: 3 individuals, each measured at baseline and at 3 later time points. Baseline is pre-infection; the later time points correspond to stages of disease progression. I am interested in differential expression at each time point, with time point 1 being the baseline.

PSEUDOCODE (I did not run this in R; it is a made-up example):

file individual timepoint status
file <- 1:12
individual <- c(rep(1,4),rep(2,4),rep(3,4))
timepoint <- rep(0:4,3)
status <- rep(c("uninfected",4),rep("infected",8))
df <- cbind(file,individual,timepoint,status)
model.matrix(~individual+timepoint)
model.matrix(timepoint~(1|individual))


The main effect I want to see is the effect between time points, not between individuals, so I understand I can treat the individuals as a random effect in a mixed model. I am just not sure how to correctly integrate multiple independent variables, paired samples, and longitudinal data with baseline measures.

Any help and/or explanations would be greatly appreciated!


You say that you're interested in DE between timepoints, but do you want separate time effects for infected and uninfected individuals?

It would help considerably if you check your pseudo code so that it runs and doesn't have syntax errors. At the moment, individual has length 12, timepoint has length 15 and status isn't defined because of syntax errors, so the code doesn't help us to understand your experimental design.
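The length mismatch is easy to confirm in base R. A quick check, using the two vectors exactly as posted:

```r
# Vectors copied from the original pseudocode
individual <- c(rep(1, 4), rep(2, 4), rep(3, 4))
timepoint <- rep(0:4, 3)   # 0:4 has five values, so this yields 15 elements

length(individual)  # 12
length(timepoint)   # 15
```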

You are right. Sorry about that.

Fixed example:

file <- 1:12
individual <- c(rep(1, 4), rep(2, 4), rep(3, 4))
timepoint <- rep(1:4, 3)
status <- rep(c("uninfected", "infected", "infected", "infected"), 3)
# data.frame() keeps numeric columns numeric; cbind() would coerce everything to character
df <- data.frame(file, individual, timepoint, status)

Comparisons to consider: 1) differential expression between time points, taking into account the paired nature of the data; 2) of less interest to us, uninfected vs. infected.

model.matrix(~individual+timepoint)
model.matrix(timepoint~(1|individual))
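For reference, here is a minimal base-R sketch of the fixed-effect (blocking) version of the design above. It assumes that `individual` and `timepoint` should be treated as factors rather than numbers, which is an assumption and not something stated in the post. The names `design` and `design_status` are illustrative only. The sketch also checks whether `status` can be added as a further term, since in this layout status is fully determined by time point.

```r
# Rebuild the example design from the post (3 individuals x 4 time points).
# Assumption: both variables are categorical, so they are coded as factors.
individual <- factor(rep(1:3, each = 4))
timepoint  <- factor(rep(1:4, times = 3))
status     <- factor(rep(c("uninfected", "infected", "infected", "infected"), 3))

# Additive design: individual acts as a blocking factor, so time point
# effects are estimated within individuals (a paired comparison).
design <- model.matrix(~ individual + timepoint)
colnames(design)
# "(Intercept)" "individual2" "individual3" "timepoint2" "timepoint3" "timepoint4"

# The timepoint2..4 coefficients are each time point versus baseline (time point 1).

# status is "uninfected" only at time point 1, so adding it creates an extra
# column but no new information: the rank does not increase.
design_status <- model.matrix(~ individual + timepoint + status)
ncol(design_status)      # 7 columns
qr(design_status)$rank   # rank 6, so one coefficient is not estimable
```

In this layout, an infected versus uninfected contrast amounts to comparing the later time points with baseline, so it may be simplest to express it as a contrast of the time point coefficients. Note also that the `(1|individual)` notation is lme4-style random-effect syntax, which base `model.matrix()` does not interpret as a random effect. In limma, the usual route to a random individual effect is `duplicateCorrelation()` with the `block` argument, with the estimated correlation then passed to `lmFit()`. Whether that is preferable to the additive blocking design here is a judgment call.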