Subsetting DESeq data set to compare treatments within one group (multi-group experiment)
@9dc4c0e9
Canada

Hi, I am having issues subsetting my DESeqDataSet to compare treatments within just one group of samples in my multi-group experiment.

My coldata contains the following factors: CellType, PatientGroup, and Treatment.

  • 2 cell types ($CellType): epithelial cells (EEC) and stromal cells (ESC)
  • 4 patient groups ($PatientGroup): A, B, C, D
  • 2 treatments ($Treatment): untreated and treated

I used the DESeq2 package in R to analyze differential gene expression following RNA-seq:

#DESeq design formula
dds <- DESeqDataSetFromMatrix(countData = cts_clean,
  colData = coldata,
  design = ~ CellType+PatientGroup+Treatment)


#setting "untreated" as the reference level and running DESeq
dds$Treatment <- relevel(dds$Treatment, ref = "untreated")
dds <- DESeq(dds)

I want to subset the data and get differential expression results comparing treated to untreated samples within each CellType separately, e.g. identify DEGs between untreated and treated epithelial cells (EEC only). This is the code I used to subset the EEC and ESC samples from the original data set and obtain the separate results:

#subset EEC and ESC samples separately; 66 EEC samples and 69 ESC samples
dds_EEC <- dds[, dds$CellType %in% c("EEC")]
#drop the now-unused "ESC" level from the subsetted object
dds_EEC$CellType <- droplevels(dds_EEC$CellType)

dds_ESC <- dds[, dds$CellType %in% c("ESC")]
#drop the now-unused "EEC" level from the subsetted object
dds_ESC$CellType <- droplevels(dds_ESC$CellType)

#identify differentially expressed genes using the results function
#contrast levels match the Treatment levels listed above (treated vs. untreated)
results_EEC <- results(dds_EEC, contrast=c("Treatment","treated","untreated"))
results_ESC <- results(dds_ESC, contrast=c("Treatment","treated","untreated"))

I also tried an alternative line of code for subsetting before using the results function:

dds_EEC <- subset(dds, select=colData(dds)$CellType=="EEC")
dds_EEC$CellType <- droplevels(dds_EEC$CellType)

dds_ESC <- subset(dds, select=colData(dds)$CellType=="ESC")
dds_ESC$CellType <- droplevels(dds_ESC$CellType)

However, when I view the summary of dds_EEC or dds_ESC, it still shows all of my samples (total N=135). My results are therefore still the combined results of both cell types, as if I had run the results function as:

results_wholedataset <- results(dds, contrast=c("Treatment","treated","untreated"))
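One possible explanation, offered as a sketch rather than a confirmed diagnosis: results() reads per-gene coefficients that DESeq() has already stored in the object, so dropping sample columns after the fit would not be expected to change the fold changes. The example below uses simulated data from makeExampleDESeqDataSet(); its condition column stands in for Treatment, and the CellType labels are assigned arbitrarily for illustration only.

```r
suppressPackageStartupMessages(library(DESeq2))

set.seed(1)
# Simulated data: 1000 genes, 12 samples, condition A (x6) then B (x6)
dds_demo <- makeExampleDESeqDataSet(n = 1000, m = 12)
# Hypothetical cell-type labels, alternating so both conditions occur in each type
dds_demo$CellType <- factor(rep(c("EEC", "ESC"), times = 6))
dds_demo <- DESeq(dds_demo, quiet = TRUE)

# Subset the samples AFTER fitting, as in the question
dds_sub <- dds_demo[, dds_demo$CellType == "EEC"]

res_full <- results(dds_demo, contrast = c("condition", "B", "A"))
res_sub  <- results(dds_sub,  contrast = c("condition", "B", "A"))

# The subset has fewer columns, but the stored model fit is untouched
print(dim(dds_demo))
print(dim(dds_sub))
print(all.equal(res_full$log2FoldChange, res_sub$log2FoldChange))
```

Checking dim() or table(dds_sub$CellType) on the subsetted object is a quick way to tell a subsetting failure apart from results that simply reuse the original fit.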

Because my samples are distinctly different depending on their cell type, I want to analyze the DEGs separately but still run DESeq on the entire data set (as is recommended in the DESeq2 vignette and FAQs for multiple groups).
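For reference, the DESeq2 vignette describes combining the factors of interest into a single grouping factor and contrasting its levels, which keeps all samples in one fit while still giving a within-cell-type comparison. A minimal sketch of that idea on simulated data follows; the sample layout, the dds_demo object, and the group level names are assumptions made up for the example, not my real data.

```r
suppressPackageStartupMessages(library(DESeq2))

set.seed(1)
# Simulated data: 1000 genes, 16 samples (layout below is hypothetical)
dds_demo <- makeExampleDESeqDataSet(n = 1000, m = 16)
dds_demo$CellType  <- factor(rep(rep(c("EEC", "ESC"), each = 4), times = 2))
dds_demo$Treatment <- factor(rep(c("untreated", "treated"), each = 8),
                             levels = c("untreated", "treated"))

# Combine cell type and treatment into one factor, e.g. "EEC_treated"
dds_demo$group <- factor(paste(dds_demo$CellType, dds_demo$Treatment, sep = "_"))
design(dds_demo) <- ~ group

# Fit once on all samples, then contrast group levels within each cell type
dds_demo <- DESeq(dds_demo, quiet = TRUE)
res_EEC <- results(dds_demo, contrast = c("group", "EEC_treated", "EEC_untreated"))
res_ESC <- results(dds_demo, contrast = c("group", "ESC_treated", "ESC_untreated"))
```

The alternative would be to subset the samples first and then run DESeq() again on each subset, which estimates dispersions from that cell type alone; which of the two is preferable depends on how different the cell types are.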

Downstream, I also want to contrast untreated vs. treated cells within each patient group (A, B, C, D), still separately for each cell type (EEC or ESC). Before attempting that second level of subsetting, I first need the cell-type subsetting to work correctly.
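If a combined grouping factor were used for this second level, the per-stratum contrasts could be generated programmatically. The sketch below uses base R only; coldata_demo is a made-up sample table with one row per factor combination, and the "group" naming scheme is an assumption for illustration.

```r
# Hypothetical sample table: one row per CellType x PatientGroup x Treatment
coldata_demo <- expand.grid(
  CellType     = c("EEC", "ESC"),
  PatientGroup = c("A", "B", "C", "D"),
  Treatment    = c("untreated", "treated"),
  stringsAsFactors = FALSE
)

# Three-way grouping factor, e.g. "EEC_A_treated"
coldata_demo$group <- factor(paste(coldata_demo$CellType,
                                   coldata_demo$PatientGroup,
                                   coldata_demo$Treatment, sep = "_"))

# One treated-vs-untreated contrast per cell type and patient group
strata <- unique(coldata_demo[, c("CellType", "PatientGroup")])
contrasts_list <- lapply(seq_len(nrow(strata)), function(i) {
  prefix <- paste(strata$CellType[i], strata$PatientGroup[i], sep = "_")
  c("group", paste0(prefix, "_treated"), paste0(prefix, "_untreated"))
})
names(contrasts_list) <- paste(strata$CellType, strata$PatientGroup, sep = "_")

print(contrasts_list[["EEC_A"]])
```

Each element has the three-part form that results(dds, contrast = ...) accepts, so it could be passed to results() after fitting a model with design ~ group.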

Can anyone please help identify what I am doing incorrectly with the code?

Thank you!

DifferentialExpression DESeq2 • 875 views
Michael Love, if you could please help me sort this out, it would be very much appreciated!

ATpoint ★ 4.0k
@atpoint-13662
Germany

Please post over at biostars.org -- the support site is for technical problems with the packages, not for consultations or hands-on guidance with your particular analysis.
