Hi, I have a dataset like this:
             Subject_ID Treatment Duration Pasient_Kontroll Age CTP_NPC
NSC-2019-290 K5137      DMSO      6hours   Control          47  0.28845794
NSC-2019-291 K5039      DMSO      6hours   Control          62  0.06154937
NSC-2019-292 K50128     DMSO      6hours   Control          45  0.25351167
NSC-2019-293 U815       DMSO      6hours   Patient          32  0.22540896
NSC-2019-294 U793       DMSO      6hours   Patient          23  0.25467268
NSC-2019-295 U162       DMSO      6hours   Patient          38  0.20031763
NSC-2019-299 K5137      DMSO      1week    Control          47  0.21114240
NSC-2019-300 K5039      DMSO      1week    Control          62  0.11281822
NSC-2019-301 K50128     DMSO      1week    Control          45  0.23429267
NSC-2019-302 U815       DMSO      1week    Patient          32  0.20184951
NSC-2019-303 U793       DMSO      1week    Patient          23  0.35476381
NSC-2019-304 U162       DMSO      1week    Patient          38  0.19938699
NSC-2019-308 K5137      Li        6hours   Control          47  0.30317057
NSC-2019-309 K5039      Li        6hours   Control          62  0.28853031
NSC-2019-310 K50128     Li        6hours   Control          45  0.31602120
NSC-2019-311 U815       Li        6hours   Patient          32  0.42000711
NSC-2019-312 U793       Li        6hours   Patient          23  0.36429453
NSC-2019-313 U162       Li        6hours   Patient          38  0.37220548
NSC-2019-317 K5137      Li        1week    Control          47  0.18130258
NSC-2019-318 K5039      Li        1week    Control          62  0.12880417
NSC-2019-319 K50128     Li        1week    Control          45  0.31397647
NSC-2019-320 U815       Li        1week    Patient          32  0.14236717
NSC-2019-321 U793       Li        1week    Patient          23  0.28045362
NSC-2019-322 U162       Li        1week    Patient          38  0.30689310
That is, there are 6 donors: 3 controls and 3 patients. For each donor, we generated iPSC-derived NPCs and split them into 4 samples before running RNA-seq: one treated with DMSO (the control treatment) for 6 hours, one treated with lithium (Li) for 6 hours, one treated with DMSO for 1 week, and one treated with lithium for 1 week. In total, there are 24 samples.
We want to identify the genes whose Li vs. DMSO response, at 6 hours and at 1 week, differs between patients and controls (i.e. a different treatment response in patients and controls), while adjusting for age and CTP_NPC effects. I understand that this is a very complex design (including paired samples), and I have not seen anyone do something similar before. What would be the appropriate R package and procedure? Any help is highly appreciated.
Hi Kevin Blighe,
Your solution does not seem to take the paired design into account, i.e. multiple samples from the same subject. I am pretty sure DESeq2 cannot handle paired designs while accounting for additional variables. I think limma is more suitable for this kind of analysis, but I am not sure how to proceed.
Best, Ibrahim
Paired designs are usually handled by including the sample pairing as a covariate / blocking factor (Subject_ID). In this case, your design formula is already expansive, but you may want to try that. There are also many previous questions on this topic. If you have a specific question regarding pairing in, e.g., limma, then you could create a new question on that very topic. The current question (above) was somewhat general.
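As a rough illustration of what adding Subject_ID as a blocking factor does to this particular design, here is a minimal sketch in plain Python (standard library only, no R packages involved). The helper names `rank` and `build_design` are made up for this example, and the model is simplified to a single status-by-treatment interaction rather than one per time point. It rebuilds the sample layout from the table above and compares the number of design columns with the matrix rank, on the assumption that Pasient_Kontroll and Age are constant within each donor, as the table shows.

```python
from fractions import Fraction
from itertools import product

# Donor-level annotations copied from the table above: Subject_ID -> (status, age).
DONORS = {
    "K5137": ("Control", 47),
    "K5039": ("Control", 62),
    "K50128": ("Control", 45),
    "U815": ("Patient", 32),
    "U793": ("Patient", 23),
    "U162": ("Patient", 38),
}


def rank(rows):
    """Matrix rank via exact Gaussian elimination (hypothetical helper)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # this column depends on the earlier ones
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def build_design(include_donor_level_terms):
    """One row per sample (6 donors x 2 treatments x 2 durations = 24 rows)."""
    subjects = list(DONORS)
    rows = []
    for subject, treatment, duration in product(
        subjects, ("DMSO", "Li"), ("6hours", "1week")
    ):
        status, age = DONORS[subject]
        is_patient = int(status == "Patient")
        is_li = int(treatment == "Li")
        row = [1]  # intercept
        # Subject_ID blocking factor; the first donor is the reference level.
        row += [int(subject == s) for s in subjects[1:]]
        row += [is_li, int(duration == "1week")]
        row.append(is_patient * is_li)  # status-by-treatment interaction
        if include_donor_level_terms:
            row += [is_patient, age]  # main effects that are constant per donor
        rows.append(row)
    return rows


for label, flag in (("with status and age", True), ("without them", False)):
    design = build_design(flag)
    print(label, "- columns:", len(design[0]), "rank:", rank(design))
```

If the layout is as in the table, the design with status and age main effects comes out rank-deficient, because those columns are linear combinations of the Subject_ID columns, while the design without them is full rank and still contains the status-by-treatment interaction. If the donor-level main effects are needed as well, one alternative to a fixed Subject_ID term is to treat the donor as a random effect, which in limma is done with duplicateCorrelation and its block argument.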