I have the following samples for a differential expression analysis and would like to check whether my design matrix makes sense. The cell samples come from three different donors, and each went through 2 different cell culturing processes and 5 different treatments. The goal is to look at the differences between treatments and also between processes. Samples that went through process A have data for all 5 treatments, while samples that went through process B only have data for 2 of the 5 treatments (treatments 2 and 5). Is the design matrix here constructed correctly? Thanks a lot!

    library(readr)  # provides read_csv()

    sampleInfo <- read_csv(<samplemanifest_csvfile>, col_names = TRUE)
    Donor <- factor(sampleInfo$Donor)
    Treatment <- factor(sampleInfo$Treatment)
    Process <- factor(sampleInfo$Process)
    design <- model.matrix(~0 + Treatment + Process + Donor)

Donors | Process | Treatment |
--- | --- | --- |
P01 | A | 1 |
P01 | A | 2 |
P01 | A | 3 |
P01 | A | 4 |
P01 | A | 5 |
P02 | A | 1 |
P02 | A | 2 |
P02 | A | 3 |
P02 | A | 4 |
P02 | A | 5 |
P03 | A | 1 |
P03 | A | 2 |
P03 | A | 3 |
P03 | A | 4 |
P03 | A | 5 |
P01 | B | 2 |
P01 | B | 5 |
P02 | B | 2 |
P02 | B | 5 |
P03 | B | 2 |
P03 | B | 5 |
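
For anyone who wants to reproduce the layout without the CSV file, here is a minimal sketch that rebuilds the sample table in base R and checks whether the design matrix is full rank. The `expand.grid` reconstruction is only a hypothetical stand-in for the real manifest, and the rank check is a generic sanity test, not a statement that this is the right model for the biology.

```r
# Hypothetical stand-in for the sample manifest: same rows as the table above.
sampleInfo <- rbind(
  expand.grid(Treatment = 1:5, Donor = c("P01", "P02", "P03"),
              Process = "A", stringsAsFactors = FALSE),
  expand.grid(Treatment = c(2, 5), Donor = c("P01", "P02", "P03"),
              Process = "B", stringsAsFactors = FALSE)
)

Donor     <- factor(sampleInfo$Donor)
Treatment <- factor(sampleInfo$Treatment)
Process   <- factor(sampleInfo$Process)

# Cross-tabulate to see which Process x Treatment cells have samples at all.
print(table(Process, Treatment))

# Same additive design as in the question.
design <- model.matrix(~0 + Treatment + Process + Donor)
print(colnames(design))

# A design is estimable only if its rank equals its number of columns.
design_rank <- qr(design)$rank
cat("rows:", nrow(design), "columns:", ncol(design), "rank:", design_rank, "\n")
stopifnot(design_rank == ncol(design))
```

Note that this additive formula gives a single `ProcessB` coefficient, so it assumes the process effect is the same under every treatment; it cannot show whether process B behaves differently under treatment 2 than under treatment 5.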

