Subsetting DESeq data set to compare treatments within one group (multi-group experiment)
Entering edit mode
Last seen 17 days ago

Hi, I am having issues subsetting my DESeqDataSet to compare treatments within just one group of samples in my multi-group experiment.

My coldata contains the following factors: CellType, PatientGroup, and Treatment.

  • 2 cell types ($CellType): epithelial cells (EEC) and stromal cells (ESC)
  • 4 patient groups ($PatientGroup): A, B, C, D
  • 2 treatments ($Treatment): untreated and treated

I used the DESeq2 package in R to analyze differential gene expression following RNA-seq:

#DESeq design formula
dds <- DESeqDataSetFromMatrix(countData = cts_clean,
  colData = coldata,
  design = ~ CellType+PatientGroup+Treatment)

#setting "untreated" as the reference level and running DESeq
dds$Treatment <- relevel(dds$Treatment, ref = "untreated")
dds <- DESeq(dds)

I want to subset the data and only get the differential expression results when comparing treated to untreated samples within each CellType separately, i.e. identify DEGs between untreated and treated epithelial cells (EEC only). This is the code I used to subset the EEC and ESC samples from the original data set and obtain the separate results:

#subset EEC and ESC samples separately; 66 EEC samples and 69 ESC samples
dds_EEC <- dds[, dds$CellType %in% c("EEC")]
dds_EEC$CellType <- droplevels(dds_$CellType)

dds_ESC <- dds[, dds$CellType %in% c("ESC")]
dds_ESC$CellType <- droplevels(dds_$CellType)

#identify differentially expressed genes using the results function
results_EEC <- results(dds_EEC, contrast=c("Treatment","HPL","SFM"))
results_ESC <- results(dds_ESC, contrast=c("Treatment","HPL","SFM"))

I also tried an alternative line of code for subsetting before using the results function:

dds_EEC <- subset(dds, select=colData(dds)$CellType=="EEC")
dds_EEC$CellType <- droplevels(dds_EEC$CellType)

dds_ESC <- subset(dds, select=colData(dds)$CellType=="ESC")
dds_ESC$CellType <- droplevels(dds_ESC$CellType)

However, when I view the summary of my dds_EEC or dds_ESC, it is still showing all of my samples (total N=135). So for my results, it is still giving me the combined results of both cell types, as if I ran the results function as:

results_wholedataset <- results(dds, contrast=c("Treatment","HPL","SFM"))

Because my samples are distinctly different depending on their cell type, I want to analyze the DEGs separately but still run DESeq on the entire data set (as is recommended in the DESeq2 vignette and FAQs for multiple groups).

Downstream I also want to look at contrasting untreated vs. treated cells per patient group (A, B, C, D) but still within each cell type (EEC or ESC), but I need to get the code to subset the data correctly first before trying a second level of subsetting for individual patient groups.

Can anyone please help identify what I am doing incorrectly with the code?

Thank you!

DifferentialExpression DESeq2 • 226 views
Entering edit mode

Michael Love If you can please help me sort this out, would be very much appreciated!

Entering edit mode
ATpoint ★ 2.9k
Last seen 16 hours ago

Please post over at -- the support site is for technical problems with the packages, not for consultations or hands-on guidance with your particular analysis.

Entering edit mode

Login before adding your answer.

Traffic: 339 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6