Question

Question about DESeq2 Design for Paired Treatment Study

0

Entering edit mode

Junmo • 0

@87b5baf9

Last seen 1 day ago

United States

Hi, I have a question about the best practice design for my dataset.

I have total 300 samples with

Medicine group: 150 samples (75 pre-treatment, 75 post-treatment)
Placebo group: 150 samples (75 pre-placebo, 75 post-placebo)

My goal is to determine whether medicine has a treatment effect compared to placebo. My samples have following characteristics:

Generally low sequencing depth, which initially produced a winged volcano plot due to high variability in low-count genes
PCA analysis (using RSEM TPM values) revealed the following covariates:

PC1: Batch effects
PC2: Age
PC2 + PC3: Pre/post treatment timing (regardless of medicine vs. placebo)

Therefore, I designed a DESeq2 interaction model including the covariates (paired sample information, batch, sex, age). The code is shown below. However, I'm encountering a "model matrix not full rank" error. While I found some guidance online for resolving this issue, I'm not confident my overall design is appropriate for my case. Could you provide advice on the best way to analyze this paired treatment dataset?

library(DESeq2)
countData <- read.csv('count.csv', row.names=1, check.names=FALSE)
colData <- read.csv("meta.csv", row.names=1)
countData <- round(countData)
colData$subject_id <- factor(colData$subject_id)
colData$batch <- factor(colData$batch)
colData$sex <- factor(colData$sex)
colData$age <- as.numeric(colData$age)
colData$age <- scale(colData$age) # Center and scale the age variable
colData$time <- factor(colData$time, levels = c("pre", "post")) # pre as reference
colData$condition <- factor(colData$condition, levels = c("Placebo", "Medicine")) # Placebo as reference
dds <- DESeqDataSetFromMatrix(countData = countData,
colData = colData,
design = ~ subject_id(paired_sample info) + batch + sex + age + time(pre/post) + condition(placebo/medicine) + time:condition
keep <- rowSums(counts(dds) >= 10) >= 30 # %10 of my total samples. Might be bit strigent?
dds <- dds[keep,]
dds <- DESeq(dds)
res <- results(dds, name = "timepost.conditionDFO")

Thanks for your help

Regards,

Junmo

DESeq2 • 180 views

ADD COMMENT • link updated 4 weeks ago by ATpoint ★ 5.0k • written 4 weeks ago by Junmo • 0

score 0 · Answer 1 · 2025-11-30

It's a paired analysis so you don't need any patient-specific covariates such as age and sex. What you call batch effect might just be the donor effect? I would inspect PCA after regressing donor and then see whether you can capture unwanted variations with RUVseq or svaseq and include a few factors into the design. A full-factorial model, see vignette on interactions, might be easier to interpret here.