DESeq2 design matrix for normal & enriched samples
1
0
Entering edit mode
newmanss • 0
@newmanss-7190
Last seen 4.6 years ago
United States

Hello,

I am working with a dataset composed of three treatment groups.  I have paired samples that consist of a baseline sample and the same sample which has been enriched for active transcription.  The batch variable accounts for batches during sample preparation. The experimental design looks like this:

 SampleName Treatment Batch Pair.nested RT1_Input RT 2 1 RT1_IP RT 2 1 RT2_Input RT 2 2 RT2_IP RT 2 2 RT3_Input RT 1 3 RT3_IP RT 1 3 RT4_Input RT 1 4 RT4_IP RT 1 4 C1_Input C 2 1 C1_IP C 2 1 C2_Input C 2 2 C2_IP C 2 2 C3_Input C 1 3 C3_IP C 1 3 C4_Input C 1 4 C4_IP C 1 4 W1_Input W 2 1 W1_IP W 2 1 W2_Input W 2 2 W2_IP W 2 2 W3_Input W 1 3 W3_IP W 1 3 W4_Input W 1 4 W4_IP W 1 4

I need to compare C vs RT, W vs RT, and C vs W for differential expression due to treatment while controlling for the enrichment and the batch effects.   I've used design = ~Pair.nested + Batch + Treatment.  Does this design properly control for the IP/Input enrichment?  Should I approach this differently?  How should I specify the contrasts?

R version 3.1.1 (2014-07-10)

DESeq2 1.4.5

## Platform: i386-w64-mingw32/i386 (32-bit)

##

## locale:

## [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252

## [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C

## [5] LC_TIME=English_United States.1252

##

## attached base packages:

## [1] parallel stats graphics grDevices utils datasets methods base

##

## other attached packages:

## [1] RColorBrewer_1.1-2 gplots_2.16.0 DESeq2_1.4.5

## [7] GenomeInfoDb_1.0.2 IRanges_1.22.10 BiocGenerics_0.10.0

## [10] genefilter_1.44.0 lattice_0.20-29 gtools_3.4.1

##

## loaded via a namespace (and not attached):

## [1] annotate_1.40.1 AnnotationDbi_1.26.1 Biobase_2.22.0 BiocStyle_1.0.0

## [5] bitops_1.0-6 caTools_1.17.1 DBI_0.3.1 digest_0.6.8

## [9] evaluate_0.5.5 formatR_1.0 gdata_2.13.3 geneplotter_1.40.0

## [13] grid_3.1.1 htmltools_0.2.6 KernSmooth_2.23-14 knitr_1.9

## [17] locfit_1.5-9.1 rmarkdown_0.5.1 RSQLite_1.0.0 splines_3.1.1

## [21] stats4_3.1.1 stringr_0.6.2 survival_2.37-7 tools_3.1.1

## [25] XML_3.98-1.1 xtable_1.7-4 XVector_0.2.0

deseq2 multiple factor design • 1.3k views
0
Entering edit mode

"differential expression due to treatment while controlling for the enrichment and the batch effects"

Just to check: you mean that you want to find genes which have differential expression due to treatment in the enrichment samples relative to baseline?

0
Entering edit mode

I am looking for differential expression in the enriched samples due to treatment after normalizing (adjusting?)  for the enrichment.  I have tried using FC ratios from DESeq2 run for each condition individually using enriched vs unenriched (IP/Input) for the contrast.  The resulting fold change would be the normalized enrichment ratio.  Then manually, divide the treatment ratios, C by RT etc. for an expression ratio for each contrast.  I'm using DESeq2 with the design above and finding very few genes meeting padj<0.1 for any contrast.  I have also run a standard gene expression analysis on the input (unenriched) samples to obtain a list of expressed genes as one would normally see.

Many thanks for any assistance you can offer,

Susan

0
Entering edit mode
@mikelove
Last seen 3 days ago
United States

hi Susan,

Looking for DE due to treatment in the enrichment samples relative to baseline can be accomplished by an interaction term. You have a 'treatment' column, but you will also need a new column, say 'type' which is Input or IP.

Then use relevel() to make sure Input is the base level (it should thanks to alphabetical order, but just to note this).

Then use a design of ~ treatment + type + treatment:pair.nested + treatment:type

Note that you can't add a batch term, because this is collinear with treatment:pair.nested. Which means that by controlling for treatment:pair.nested, you will control for differences across batch.

When running DESeq(), use betaPrior=FALSE, so that you have the standard terms (this will just make it easier for extracting contrasts with the interactions in the design).

You should then have in resultsNames(dds),  two terms which look something like treatmentRT.typeIP, treatmentW.typeIP

The two terms are compared against the reference level treatment C, and so RT vs C can be extracted with:

results(dds, name="treatmentRT.typeIP")

You can compare RT vs W with

results(dds, contrast=list("treatmentRT.typeIP","treatmentW.typeIP"))

Remember, these are *interaction terms*, so results tables are for the *extra* change in enrichment samples controlling for changes seen in Input.

0
Entering edit mode

Many thanks for your detailed answer.  I'm looking forward to seeing the result.

Susan