Question

DESeq2 design matrix for normal & enriched samples

0

Entering edit mode

newmanss • 0

@newmanss-7190

Last seen 6.3 years ago

United States

Hello,

I am working with a dataset composed of three treatment groups. I have paired samples that consist of a baseline sample and the same sample which has been enriched for active transcription. The batch variable accounts for batches during sample preparation. The experimental design looks like this:

SampleName	Treatment	Batch	Pair.nested
RT1_Input	RT	2	1
RT1_IP	RT	2	1
RT2_Input	RT	2	2
RT2_IP	RT	2	2
RT3_Input	RT	1	3
RT3_IP	RT	1	3
RT4_Input	RT	1	4
RT4_IP	RT	1	4
C1_Input	C	2	1
C1_IP	C	2	1
C2_Input	C	2	2
C2_IP	C	2	2
C3_Input	C	1	3
C3_IP	C	1	3
C4_Input	C	1	4
C4_IP	C	1	4
W1_Input	W	2	1
W1_IP	W	2	1
W2_Input	W	2	2
W2_IP	W	2	2
W3_Input	W	1	3
W3_IP	W	1	3
W4_Input	W	1	4
W4_IP	W	1	4

I need to compare C vs RT, W vs RT, and C vs W for differential expression due to treatment while controlling for the enrichment and the batch effects. I've used design = ~Pair.nested + Batch + Treatment. Does this design properly control for the IP/Input enrichment? Should I approach this differently? How should I specify the contrasts?

R version 3.1.1 (2014-07-10)

DESeq2 1.4.5

## Platform: i386-w64-mingw32/i386 (32-bit)

##

## locale:

## [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252

## [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C

## [5] LC_TIME=English_United States.1252

##

## attached base packages:

## [1] parallel stats graphics grDevices utils datasets methods base

##

## other attached packages:

## [1] RColorBrewer_1.1-2 gplots_2.16.0 DESeq2_1.4.5

## [4] RcppArmadillo_0.4.650.1.1 Rcpp_0.11.4 GenomicRanges_1.16.4

## [7] GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0

## [10] genefilter_1.44.0 lattice_0.20-29 gtools_3.4.1

##

## loaded via a namespace (and not attached):

## [1] annotate_1.40.1 AnnotationDbi_1.26.1 Biobase_2.22.0 BiocStyle_1.0.0

## [5] bitops_1.0-6 caTools_1.17.1 DBI_0.3.1 digest_0.6.8

## [9] evaluate_0.5.5 formatR_1.0 gdata_2.13.3 geneplotter_1.40.0

## [13] grid_3.1.1 htmltools_0.2.6 KernSmooth_2.23-14 knitr_1.9

## [17] locfit_1.5-9.1 rmarkdown_0.5.1 RSQLite_1.0.0 splines_3.1.1

## [21] stats4_3.1.1 stringr_0.6.2 survival_2.37-7 tools_3.1.1

## [25] XML_3.98-1.1 xtable_1.7-4 XVector_0.2.0

deseq2 multiple factor design • 1.7k views

ADD COMMENT • link updated 9.0 years ago by Michael Love 41k • written 9.0 years ago by newmanss • 0

0

Entering edit mode

"differential expression due to treatment while controlling for the enrichment and the batch effects"

Just to check: you mean that you want to find genes which have differential expression due to treatment in the enrichment samples relative to baseline?

ADD REPLY • link 9.0 years ago Michael Love 41k

0

Entering edit mode

I am looking for differential expression in the enriched samples due to treatment after normalizing (adjusting?) for the enrichment. I have tried using FC ratios from DESeq2 run for each condition individually using enriched vs unenriched (IP/Input) for the contrast. The resulting fold change would be the normalized enrichment ratio. Then manually, divide the treatment ratios, C by RT etc. for an expression ratio for each contrast. I'm using DESeq2 with the design above and finding very few genes meeting padj<0.1 for any contrast. I have also run a standard gene expression analysis on the input (unenriched) samples to obtain a list of expressed genes as one would normally see.

Many thanks for any assistance you can offer,

Susan

ADD REPLY • link 9.0 years ago newmanss • 0

score 0 · Answer 1 · 2015-05-01

hi Susan,

Looking for DE due to treatment in the enrichment samples relative to baseline can be accomplished by an interaction term. You have a 'treatment' column, but you will also need a new column, say 'type' which is Input or IP.

Then use relevel() to make sure Input is the base level (it should thanks to alphabetical order, but just to note this).

Then use a design of ~ treatment + type + treatment:pair.nested + treatment:type

Note that you can't add a batch term, because this is collinear with treatment:pair.nested. Which means that by controlling for treatment:pair.nested, you will control for differences across batch.

When running DESeq(), use betaPrior=FALSE, so that you have the standard terms (this will just make it easier for extracting contrasts with the interactions in the design).

You should then have in resultsNames(dds), [edit] two terms which look something like treatmentRT.typeIP, treatmentW.typeIP

The two terms are compared against the reference level treatment C, and so RT vs C can be extracted with:

results(dds, name="treatmentRT.typeIP")

You can compare RT vs W with

results(dds, contrast=list("treatmentRT.typeIP","treatmentW.typeIP"))

Remember, these are *interaction terms*, so results tables are for the *extra* change in enrichment samples controlling for changes seen in Input.