Nested paired design with covariates in limma
@ibrahimakkouh-23133

Hi, I have a dataset like this:

              Subject_ID Treatment Duration Pasient_Kontroll Age    CTP_NPC
NSC-2019-290      K5137      DMSO   6hours          Control  47 0.28845794
NSC-2019-291      K5039      DMSO   6hours          Control  62 0.06154937
NSC-2019-292     K50128      DMSO   6hours          Control  45 0.25351167
NSC-2019-293       U815      DMSO   6hours          Patient  32 0.22540896
NSC-2019-294       U793      DMSO   6hours          Patient  23 0.25467268
NSC-2019-295       U162      DMSO   6hours          Patient  38 0.20031763
NSC-2019-299      K5137      DMSO    1week          Control  47 0.21114240
NSC-2019-300      K5039      DMSO    1week          Control  62 0.11281822
NSC-2019-301     K50128      DMSO    1week          Control  45 0.23429267
NSC-2019-302       U815      DMSO    1week          Patient  32 0.20184951
NSC-2019-303       U793      DMSO    1week          Patient  23 0.35476381
NSC-2019-304       U162      DMSO    1week          Patient  38 0.19938699
NSC-2019-308      K5137        Li   6hours          Control  47 0.30317057
NSC-2019-309      K5039        Li   6hours          Control  62 0.28853031
NSC-2019-310     K50128        Li   6hours          Control  45 0.31602120
NSC-2019-311       U815        Li   6hours          Patient  32 0.42000711
NSC-2019-312       U793        Li   6hours          Patient  23 0.36429453
NSC-2019-313       U162        Li   6hours          Patient  38 0.37220548
NSC-2019-317      K5137        Li    1week          Control  47 0.18130258
NSC-2019-318      K5039        Li    1week          Control  62 0.12880417
NSC-2019-319     K50128        Li    1week          Control  45 0.31397647
NSC-2019-320       U815        Li    1week          Patient  32 0.14236717
NSC-2019-321       U793        Li    1week          Patient  23 0.28045362
NSC-2019-322       U162        Li    1week          Patient  38 0.30689310


That is, 6 donors: 3 controls and 3 patients. For each donor, we generated iPSC-derived NPCs and split them into 4 samples before running RNA-seq. One sample was treated with DMSO (control treatment) for 6 hours, one with lithium (Li) for 6 hours, one with DMSO for 1 week, and one with lithium for 1 week. In total, 24 samples.

We want to identify the genes whose Li vs. DMSO response, at 6 hours and at 1 week, differs between patients and controls (i.e. a different treatment response in patients and controls), while adjusting for age and CTP_NPC effects. What makes this design different from the other posted designs I have seen is that it is a nested paired design: I need to somehow separate the DMSO effect from the treatment effect (since these are separate samples from the same subject) before running the comparison. I am pretty sure DESeq2 cannot do this, but maybe it is possible with limma?

I have seen some authors generate robust Z scores (RZS), which combine the DMSO values and the treatment values into a single score, and then run the DE analysis on those scores. This is mostly done with microarray measurements, but could a similar approach be applied to RNA-seq data before running limma?

@gordon-smyth
WEHI, Melbourne, Australia

This type of design is called "multi-level" in the limma User's Guide and there have actually been quite a few questions on this forum about designs like this over the years.

The easiest way to analyse the experiment is to estimate intra-subject correlations using duplicateCorrelation. Then Subject doesn't need to be included in the design matrix and the analysis becomes quite straightforward.
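A minimal sketch of the duplicateCorrelation approach, under stated assumptions: the sample annotation is rebuilt from the table in the question, while the expression matrix `y`, the `Status` column name, the combined `Group` factor, and the choice to put Age and CTP_NPC in the design as plain covariates are illustrative and not from the original posts. The values in `y` are simulated log-expression values; with real RNA-seq counts one would normally obtain them from a step such as voom first. duplicateCorrelation also needs the statmod package to be installed.

```r
library(limma)

# Sample annotation rebuilt from the table in the question
# (6 subjects x 2 treatments x 2 durations = 24 samples).
subjects <- c("K5137", "K5039", "K50128", "U815", "U793", "U162")
targets <- data.frame(
  Subject   = factor(rep(subjects, times = 4)),
  Status    = rep(rep(c("Control", "Patient"), each = 3), times = 4),
  Treatment = rep(c("DMSO", "Li"), each = 12),
  Duration  = rep(rep(c("6hours", "1week"), each = 6), times = 2),
  Age       = rep(c(47, 62, 45, 32, 23, 38), times = 4),
  CTP_NPC   = c(0.28845794, 0.06154937, 0.25351167, 0.22540896, 0.25467268, 0.20031763,
                0.21114240, 0.11281822, 0.23429267, 0.20184951, 0.35476381, 0.19938699,
                0.30317057, 0.28853031, 0.31602120, 0.42000711, 0.36429453, 0.37220548,
                0.18130258, 0.12880417, 0.31397647, 0.14236717, 0.28045362, 0.30689310)
)

# One combined factor with 8 levels, e.g. "Patient.Li.6hours".
targets$Group <- factor(paste(targets$Status, targets$Treatment,
                              targets$Duration, sep = "."))

# ASSUMPTION: simulated log-expression values (200 genes) with a
# per-subject random effect, standing in for real data.
set.seed(1)
n_genes  <- 200
subj_eff <- matrix(rnorm(n_genes * 6), n_genes, 6)[, as.integer(targets$Subject)]
y <- matrix(rnorm(n_genes * 24), n_genes, 24) + subj_eff

# Subject is NOT in the design; it enters only as the blocking factor.
design <- model.matrix(~ 0 + Group + Age + CTP_NPC, data = targets)
colnames(design) <- sub("^Group", "", colnames(design))

# Estimate the consensus intra-subject correlation, then fit with it.
corfit <- duplicateCorrelation(y, design, block = targets$Subject)
fit <- lmFit(y, design, block = targets$Subject,
             correlation = corfit$consensus.correlation)
```

Because Subject is left out of the design matrix, a subject-level covariate such as Age is not confounded with the subject effects and stays estimable in this sketch.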

It is also possible to include Subject in the design matrix, and to estimate treatment effects within the patient/control groups.

There is certainly no need to generate Z scores -- that is not a recommended approach. Correcting the Li treatment effects for the DMSO control is instead done by forming interaction contrasts.
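As a hedged illustration of what an interaction contrast looks like here, the sketch below builds "difference of differences" contrasts with makeContrasts. The group names (e.g. `Patient.Li.6hours`), the contrast names, and the toy group means in `mu` are assumptions made for this example, not values from the thread.

```r
library(limma)

# ASSUMPTION: one coefficient per Status.Treatment.Duration group.
groups <- c("Control.DMSO.6hours", "Control.Li.6hours",
            "Control.DMSO.1week",  "Control.Li.1week",
            "Patient.DMSO.6hours", "Patient.Li.6hours",
            "Patient.DMSO.1week",  "Patient.Li.1week")

cont <- makeContrasts(
  # Li effect (relative to DMSO) within each status at 6 hours
  Li.6h.Control = Control.Li.6hours - Control.DMSO.6hours,
  Li.6h.Patient = Patient.Li.6hours - Patient.DMSO.6hours,
  # Interaction contrasts: does the Li effect differ between patients and controls?
  Diff.6h = (Patient.Li.6hours - Patient.DMSO.6hours) -
            (Control.Li.6hours - Control.DMSO.6hours),
  Diff.1w = (Patient.Li.1week - Patient.DMSO.1week) -
            (Control.Li.1week - Control.DMSO.1week),
  levels = groups
)

# Toy group means (hypothetical log-expression values for a single gene).
mu <- c(Control.DMSO.6hours = 5, Control.Li.6hours = 6,
        Control.DMSO.1week  = 5, Control.Li.1week  = 5.5,
        Patient.DMSO.6hours = 5, Patient.Li.6hours = 8,
        Patient.DMSO.1week  = 5, Patient.Li.1week  = 5.5)

# Apply each contrast to the toy means.
effects <- drop(t(cont) %*% mu[rownames(cont)])
print(effects)
```

With a fitted model from lmFit, a contrast matrix like this would typically be passed to contrasts.fit and then eBayes and topTable. In that case `levels` should list every coefficient in the design, including any covariate columns, so that the rows of the contrast matrix line up with the fit.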


Hello Ibrahim, Gordon (this is an extension of the question)

I have a data set that is similarly nested within subject (like "Subject_ID" above), but that also has real analytical batches (39 samples were processed in 4 batches). My recollection is that duplicateCorrelation can only handle one (random?) blocking factor. Some guidance would be appreciated. Should I include 'batch' as a variable in the model so that it can be regressed out?

Thanks, Nathan


Please post a new question with enough detail for it to be answered. If you want bespoke advice, then you need to post the details of your experiment, as the poster above did. Depending on the experimental design and your questions of interest, you might not need duplicateCorrelation or batch correction.