I have a complex design question about limma. We have 12 patients and several RNA-seq samples from each patient. Patients differ in sex and race. From each patient we collected samples from different parts of the diseased and healthy tissue. The idea is to study the heterogeneity of the disease, so each diseased sample from each patient is expected to be different, with no replicates. The phenotype table looks like this:
Sample.ID (unique) | Patient (1:12) | Tissue_Type (Diseased or Healthy) | Gender | Race | Label (assigned manually, details below)
The idea is to compare each individual diseased sample to its control, and also to see which diseased samples from different patients are similar. Another problem is that for some patients we do not have a healthy tissue sample. To overcome that, I treated all the healthy tissues from different patients as biological replicates and gave them the same label, H. The diseased samples I labeled separately, e.g. 1_D_1 (first diseased sample from patient 1), 1_D_2 (second diseased sample from patient 1), 2_D_1 (first diseased sample from patient 2), etc. I want to get DE genes for contrasts such as (1_D_2-H) or, in some cases, (1_D_1-2_D_1).
I am using limma/voom with duplicateCorrelation, blocking on patient, and the design formula (~0+Label). However, this is quite complex for me, and I am not sure whether I should add the effects of Gender and Race and use the formula (~0+Label+Gender+Race), or whether the effects of gender and race are already accounted for by the duplicate correlation. Any help is appreciated.
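
To make the setup concrete, here is a minimal base-R sketch of the two candidate design matrices. It uses a made-up 7-sample phenotype table (3 patients, patient 3 without a healthy sample); the sample IDs, genders, and the `make.names()` step are assumptions for illustration, not the real data. Race is left out to keep the toy table small, but it would enter the formula the same way as Gender. The sketch only shows how the matrices are built and how their rank and residual degrees of freedom can be checked. It does not say which model is appropriate.

```r
# Hypothetical phenotype table (illustrative values only, not the real study).
pheno <- data.frame(
  Sample.ID   = paste0("S", 1:7),
  Patient     = factor(c(1, 1, 1, 2, 2, 3, 3)),
  Tissue_Type = c("Healthy", "Diseased", "Diseased",
                  "Healthy", "Diseased",
                  "Diseased", "Diseased"),
  Gender      = factor(c("F", "F", "F", "M", "M", "F", "F")),
  Label       = c("H", "1_D_1", "1_D_2", "H", "2_D_1", "3_D_1", "3_D_2"),
  stringsAsFactors = FALSE
)

# Labels that start with a digit are not syntactically valid R names, which
# matters if they are later used as contrast terms; make.names() turns
# "1_D_1" into "X1_D_1" and leaves "H" unchanged.
pheno$Label <- factor(make.names(pheno$Label))

# Design 1: one coefficient per label, no intercept (~0+Label).
design_label <- model.matrix(~ 0 + Label, data = pheno)
colnames(design_label) <- levels(pheno$Label)

# Design 2: the same labels plus a Gender term (~0+Label+Gender).
design_gender <- model.matrix(~ 0 + Label + Gender, data = pheno)

# Small helper (hypothetical name) that summarises a design matrix:
# number of columns, rank, and residual degrees of freedom.
summarise_design <- function(design) {
  rank <- qr(design)$rank
  c(columns = ncol(design), rank = rank, residual_df = nrow(design) - rank)
}

print(summarise_design(design_label))
print(summarise_design(design_gender))
```

Because every diseased sample has its own label, the residual degrees of freedom in either design come only from the pooled H samples, so it is worth running this kind of check on the real phenotype table before fitting anything.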