Question: Differential gene expression of longitudinal data
I am trying to perform differential gene expression analysis of longitudinal data. In this longitudinal study, 62 consecutive patients were enrolled and followed up prospectively. Age and disease activity were documented for every patient at each visit. The main challenge is that some patients have 2 visits, some have 3, and some have more than 3. Some patients also have missing visits; for example, one patient has visit 1 and visit 3 but no visit 2. I would like to find the genes that are differentially expressed in these patients over time. There are no controls.

Title Accession Patient.ID Visit cohort SLEDAI A.DSD Age
C1PL10V1 GSM1199504 1 1 SLE 2 4 26
C1PL10V2 GSM1199472 1 2 SLE 2 4 27
C1PL11V1 GSM1199537 2 1 SLE 2 29 25
C1PL11V2 GSM1199358 2 2 SLE 2 20 25
C1PL11V3 GSM1199519 2 3 SLE 0 14 26
C1PL12V1 GSM1199513 3 1 SLE 10 86 59
C1PL12V3 GSM1199452 3 3 SLE 10 74 60
C1PL13V1 GSM1199540 4 1 SLE 10 196 54
C1PL13V2 GSM1199493 4 2 SLE 4 228 54
C1PL14V1 GSM1199480 5 1 SLE 4 110 39
C1PL14V2 GSM1199539 5 2 SLE 6 258 40
C1PL15V1 GSM1199498 6 1 SLE 8 69 67
C1PL16V1 GSM1199403 7 1 SLE 4 50 32
C1PL16V2 GSM1199365 7 2 SLE 4 52 32
C1PL1V1 GSM1199435 8 1 SLE 12 14 26
C1PL1V2 GSM1199442 8 2 SLE 8 13 26
C1PL1V3 GSM1199406 8 3 SLE 12 7 27
C1PL2V2 GSM1199461 9 2 SLE 0 11 31
C1PL2V3 GSM1199373 9 3 SLE 2 10 31
C1PL3V1 GSM1199482 10 1 SLE 4 3 38
C1PL3V2 GSM1199477 10 2 SLE 6 5 38


I have looked at other threads on time-course and longitudinal data, but did not find anything on how to handle missing data. I am not sure whether ANOVA is appropriate or whether there are packages better suited to this kind of data. I was thinking of ANOVA because the design is similar to a pre-post analysis, although there is no drug treatment here. I would like to see how the patients change over time with respect to disease activity. In the table above, disease activity stays the same across visits for some patient IDs and varies for others.

Is the second visit for every patient qualitatively the same? Why do some patients have three or more visits?

The samples are paired within patients. The patients were enrolled between 2009 and 2011 and then had follow-up visits. For example, the samples for patient ID 1 come from the same patient, but they differ quantitatively: the age is different at each visit. The reason why some patients have more than two visits was not given. I am assuming that some patients missed their follow-up visits. The major challenge is the missing follow-up data (unequally spaced time points). I am not sure how to deal with this problem or which statistical test is appropriate for this kind of data.

Thanks in advance.

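I think you could look into mixed models with repeated measures instead of ANOVA. As far as I know they can handle missing data, because they use whatever visits are available for each patient, so patients with a missed visit do not have to be dropped.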
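
To make the suggestion concrete, here is a minimal sketch of a random-intercept mixed model for a single gene, written in Python with statsmodels. The data are simulated, and the column names (`patient`, `visit`, `sledai`, `expr`) and the effect size are assumptions for illustration only, not values from the study. In a real analysis the same model would be fitted gene by gene, followed by multiple-testing correction.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated long-format table: one row per sample.
# Each hypothetical patient has 2 to 4 visits, and visits may be skipped.
rows = []
for patient in range(1, 31):
    n_visits = int(rng.integers(2, 5))
    visits = sorted(rng.choice([1, 2, 3, 4], size=n_visits, replace=False))
    baseline = rng.normal(8.0, 1.0)  # patient-specific expression level
    for visit in visits:
        sledai = int(rng.integers(0, 13))
        # Assumed true effect: +0.15 expression units per SLEDAI point.
        expr = baseline + 0.15 * sledai + rng.normal(0.0, 0.3)
        rows.append(
            {"patient": patient, "visit": int(visit), "sledai": sledai, "expr": expr}
        )
samples = pd.DataFrame(rows)

# Random intercept per patient; disease activity as the fixed effect.
# Every available row is used, so unbalanced visits are not a problem.
model = smf.mixedlm("expr ~ sledai", data=samples, groups=samples["patient"])
fit = model.fit(reml=True)

sledai_slope = fit.params["sledai"]
sledai_pvalue = fit.pvalues["sledai"]
print(f"slope per SLEDAI point: {sledai_slope:.3f}, p-value: {sledai_pvalue:.3g}")
```

The random intercept absorbs each patient's own baseline, so the slope reflects how expression tracks disease activity within patients. A visit or time term could be added to the formula if a trend over time is also of interest.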

Another option is to impute the missing data before running a repeated-measures ANOVA, but I think the mixed-model option above is usually preferred.
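
For illustration, here is a minimal sketch of what simple imputation could look like with pandas, using linear interpolation between a patient's neighbouring visits. The expression values are made up, the column names are assumptions, and linear interpolation is just one possible imputation method, chosen here for simplicity.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data for one gene; patient 3 has no visit 2.
long_table = pd.DataFrame(
    {
        "patient": [1, 1, 2, 2, 2, 3, 3],
        "visit": [1, 2, 1, 2, 3, 1, 3],
        "expr": [7.9, 8.1, 6.5, 6.9, 7.0, 9.0, 9.4],
    }
)

# Wide layout: one row per patient, one column per visit, NaN where missing.
wide = long_table.pivot(index="patient", columns="visit", values="expr")

# Fill only gaps that lie between two observed visits of the same patient.
# Visits after the last observed one stay missing (no extrapolation).
imputed = wide.interpolate(axis=1, limit_area="inside")
print(imputed)
```

Here patient 3's visit 2 is filled with the midpoint of visits 1 and 3, while patient 1's visit 3 stays missing because there is no later value to interpolate towards. Patients with remaining gaps would still have to be dropped or handled some other way before a repeated-measures ANOVA, which is one reason the mixed model is attractive.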

Thank you, Chris. I will try mixed models with repeated measures.

