Nested paired design with covariates in limma

Hi, I have a dataset like this:

              Subject_ID Treatment Duration Pasient_Kontroll Age    CTP_NPC
 NSC-2019-290      K5137      DMSO   6hours          Control  47 0.28845794
 NSC-2019-291      K5039      DMSO   6hours          Control  62 0.06154937
 NSC-2019-292     K50128      DMSO   6hours          Control  45 0.25351167
 NSC-2019-293       U815      DMSO   6hours          Patient  32 0.22540896
 NSC-2019-294       U793      DMSO   6hours          Patient  23 0.25467268
 NSC-2019-295       U162      DMSO   6hours          Patient  38 0.20031763
 NSC-2019-299      K5137      DMSO    1week          Control  47 0.21114240
 NSC-2019-300      K5039      DMSO    1week          Control  62 0.11281822
 NSC-2019-301     K50128      DMSO    1week          Control  45 0.23429267
 NSC-2019-302       U815      DMSO    1week          Patient  32 0.20184951
 NSC-2019-303       U793      DMSO    1week          Patient  23 0.35476381
 NSC-2019-304       U162      DMSO    1week          Patient  38 0.19938699
 NSC-2019-308      K5137        Li   6hours          Control  47 0.30317057
 NSC-2019-309      K5039        Li   6hours          Control  62 0.28853031
 NSC-2019-310     K50128        Li   6hours          Control  45 0.31602120
 NSC-2019-311       U815        Li   6hours          Patient  32 0.42000711
 NSC-2019-312       U793        Li   6hours          Patient  23 0.36429453
 NSC-2019-313       U162        Li   6hours          Patient  38 0.37220548
 NSC-2019-317      K5137        Li    1week          Control  47 0.18130258
 NSC-2019-318      K5039        Li    1week          Control  62 0.12880417
 NSC-2019-319     K50128        Li    1week          Control  45 0.31397647
 NSC-2019-320       U815        Li    1week          Patient  32 0.14236717
 NSC-2019-321       U793        Li    1week          Patient  23 0.28045362
 NSC-2019-322       U162        Li    1week          Patient  38 0.30689310

That is, 6 donors: 3 controls and 3 patients. For each donor, we generated iPSC-derived NPCs and split these into 4 samples before running RNA-seq: one sample treated with DMSO (control treatment) for 6 hours, one treated with lithium (Li) for 6 hours, one treated with DMSO for 1 week, and one treated with lithium for 1 week. In total, 24 samples.

We want to identify the genes whose Li-vs-DMSO response at 6 hours, and at 1 week, differs between patients and controls (i.e. a different treatment response in patients and controls) while adjusting for Age and CTP_NPC effects. What makes this design different from all other posted designs I have seen is that it is a nested paired design. That is, I need to somehow subtract the DMSO effect from the Li treatment effect (since these are separate samples from the same subject) before running the comparison. I am pretty sure DESeq2 cannot do this, but maybe it is possible with limma?

I have seen some authors generate robust Z scores (RZS), which combine the DMSO values and the treatment values into a single score, and then run the DE analysis on those scores. This is mostly done with microarray measurements, but can a similar approach be followed for RNA-seq data before running limma?

RNA-seq Nested paired design limma • 549 views
WEHI, Melbourne, Australia

This type of design is called "multi-level" in the limma User's Guide and there have actually been quite a few questions on this forum about designs like this over the years.

The easiest way to analyse the experiment is to estimate intra-subject correlations using duplicateCorrelation. Then Subject doesn't need to be included in the design matrix and the analysis becomes quite straightforward.
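As a rough illustration of this approach, here is a minimal sketch in R. The sample table is transcribed from the question (with `Pasient_Kontroll` renamed to `Status` for readability), but the count matrix is simulated because the real counts were not posted, so `counts`, `group`, and the contrast names `Diff.6hours` and `Diff.1week` are all assumptions made for the example. Age can be included here because Subject is not in the design matrix.

```r
library(limma)

# Sample annotation transcribed from the table in the question.
# Rows are ordered DMSO/6hours, DMSO/1week, Li/6hours, Li/1week, six donors each.
targets <- data.frame(
  Subject   = rep(c("K5137", "K5039", "K50128", "U815", "U793", "U162"), times = 4),
  Status    = rep(rep(c("Control", "Patient"), each = 3), times = 4),
  Treatment = rep(c("DMSO", "DMSO", "Li", "Li"), each = 6),
  Duration  = rep(c("6hours", "1week", "6hours", "1week"), each = 6),
  Age       = rep(c(47, 62, 45, 32, 23, 38), times = 4),
  CTP_NPC   = c(0.28845794, 0.06154937, 0.25351167, 0.22540896, 0.25467268, 0.20031763,
                0.21114240, 0.11281822, 0.23429267, 0.20184951, 0.35476381, 0.19938699,
                0.30317057, 0.28853031, 0.31602120, 0.42000711, 0.36429453, 0.37220548,
                0.18130258, 0.12880417, 0.31397647, 0.14236717, 0.28045362, 0.30689310)
)

# One combined factor with 8 levels: Status x Treatment x Duration.
group <- factor(paste(targets$Status, targets$Treatment, targets$Duration, sep = "."))

# Group means plus the two covariates; Subject is deliberately left out.
design <- model.matrix(~ 0 + group + Age + CTP_NPC, data = targets)
colnames(design) <- c(levels(group), "Age", "CTP_NPC")

# ASSUMPTION: simulated counts stand in for the real RNA-seq count matrix.
set.seed(1)
counts <- matrix(rnbinom(500 * 24, mu = 200, size = 10), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), NULL))

# voom transformation, then estimate the within-subject correlation.
v <- voom(counts, design)
dupcor <- duplicateCorrelation(v, design, block = targets$Subject)

# Fit the model, treating Subject as a random effect via the consensus correlation.
fit <- lmFit(v, design, block = targets$Subject,
             correlation = dupcor$consensus.correlation)

# Interaction contrasts: Li-vs-DMSO response in patients minus that in controls.
cm <- makeContrasts(
  Diff.6hours = (Patient.Li.6hours - Patient.DMSO.6hours) -
                (Control.Li.6hours - Control.DMSO.6hours),
  Diff.1week  = (Patient.Li.1week - Patient.DMSO.1week) -
                (Control.Li.1week - Control.DMSO.1week),
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cm))
top <- topTable(fit2, coef = "Diff.6hours")
```

With real data one would normally filter low-count genes and normalize library sizes before `voom`. Some workflows also repeat the `voom` and `duplicateCorrelation` steps a second time, so that the voom weights take the estimated correlation into account.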

It is also possible to include Subject in the design matrix, and to estimate treatment effects within the patient/control groups.
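A possible way to set up that alternative is sketched below. It builds only the design and contrast matrices, not the full fit. It assumes DMSO at 6 hours is taken as the within-subject baseline, and the column and contrast names are invented for the example. Note that Age is constant within each donor, so it is absorbed by the Subject effects and cannot be estimated separately in this parametrization; only CTP_NPC, which varies between samples from the same donor, is kept as a covariate.

```r
library(limma)

# Factors in the same row order as the table in the question.
Subject   <- factor(rep(c("K5137", "K5039", "K50128", "U815", "U793", "U162"), times = 4))
Status    <- factor(rep(rep(c("Control", "Patient"), each = 3), times = 4))
Treatment <- rep(c("DMSO", "DMSO", "Li", "Li"), each = 6)
Duration  <- rep(c("6hours", "1week", "6hours", "1week"), each = 6)
CTP_NPC   <- c(0.28845794, 0.06154937, 0.25351167, 0.22540896, 0.25467268, 0.20031763,
               0.21114240, 0.11281822, 0.23429267, 0.20184951, 0.35476381, 0.19938699,
               0.30317057, 0.28853031, 0.31602120, 0.42000711, 0.36429453, 0.37220548,
               0.18130258, 0.12880417, 0.31397647, 0.14236717, 0.28045362, 0.30689310)

# Subject effects (each donor's DMSO/6hours baseline) plus the sample-level covariate.
design <- model.matrix(~ Subject + CTP_NPC)
colnames(design)[1] <- "Intercept"

# Add one indicator column per Status and per non-baseline condition, so that
# each coefficient is an effect relative to DMSO/6hours within that Status.
Cond <- paste(Treatment, Duration, sep = ".")
for (s in levels(Status)) {
  for (cn in c("Li.6hours", "DMSO.1week", "Li.1week")) {
    design <- cbind(design, as.numeric(Status == s & Cond == cn))
    colnames(design)[ncol(design)] <- paste(s, cn, sep = ".")
  }
}

# Check that every coefficient is estimable (24 samples, 13 columns).
stopifnot(is.fullrank(design))

# Li-vs-DMSO response in patients minus that in controls, at each duration.
cm <- makeContrasts(
  Diff.6hours = Patient.Li.6hours - Control.Li.6hours,
  Diff.1week  = (Patient.Li.1week - Patient.DMSO.1week) -
                (Control.Li.1week - Control.DMSO.1week),
  levels = design
)
```

These matrices would then be passed to `lmFit` and `contrasts.fit` in the usual way, with no `block` argument, since Subject is now a fixed effect in the design.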

There is certainly no need to generate Z scores -- that is not a recommended approach. Correcting the Li treatment effects for the DMSO control is instead done by forming interaction contrasts.
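To spell out what an interaction contrast means here, write $\mu_{s,t,d}$ for the expected log-expression of a gene for status $s$ (patient P or control C), treatment $t$, and duration $d$. This notation is introduced only for illustration. The quantity tested at each duration is the difference of two differences:

```latex
\Delta_{d} =
  \left( \mu_{\mathrm{P},\,\mathrm{Li},\,d} - \mu_{\mathrm{P},\,\mathrm{DMSO},\,d} \right)
  - \left( \mu_{\mathrm{C},\,\mathrm{Li},\,d} - \mu_{\mathrm{C},\,\mathrm{DMSO},\,d} \right),
\qquad d \in \{\text{6 hours},\ \text{1 week}\}
```

A gene with $\Delta_d \neq 0$ responds to lithium differently in patients than in controls at duration $d$, with the DMSO control already subtracted inside each bracket.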


Hello Ibrahim, Gordon (this is an extension of the question)

I have a data set that is similarly nested in subject (like "Subject_ID" above), but that also has real analytical batches (39 samples were processed in 4 batches). My recollection is that duplicateCorrelation can only address one (random?) factor. Some guidance would be appreciated. Should I include 'batch' as a variable in the model so it can be regressed out?

Thanks, Nathan


Please post a new question with enough detail for it to be answered. If you want bespoke advice, then you need to post the details of your experiment like the poster above did. Depending on the experimental design and your questions of interest, you might not need duplicateCorrelation or batch correction.

