Using DESeq2 to analyse multi-variate design resulting in testing the wrong parameter
1
0
Entering edit mode
Jonas • 0
@f91ff31c
Last seen 19 months ago
United Kingdom

Enter the body of text here Hi, I am analysing a RNA Seq dataset coming from 3 independent cell isolates (isolate1, isolate2, isolate3), each given 3 different treatments (control, drug1, drug2).

We are testing drug 1 against control in the first instance:

We also observed that there is some variation among isolates - i.e., batch effect

We load the data into DESeq2, the coldata looks like this:

             drug            isolate
Sample 1     control       isolate1
Sample 2     control       isolate2
Sample 3     control       isolate3
Sample 4     drug1        isolate1
Sample 5     drug1        isolate2
Sample 6     drug1        isolate3

Code should be placed in three backticks as shown below

configure = data.frame(drug=factor(c("control","control","control","drug1","drug1","drug1")), isolate=c("isolate1","isolate2","isolate3","isolate1","isolate2","isolate3"))
dds = DESeqDataSetFromMatrix(countData = readcount[-1],colData = configure,design = ~ drug + isolate)

Then I performed DEG calling later

dds <- DESeq(dds)
res <- results(dds)

However it showed this message: log2 fold change (MLE): isolate isolate1 vs isolate2
Wald test p-value: isolate isolate1 vs isolate2
DataFrame with 15793 rows and 6 columns

I suppose as the formula design ~ drug + isolate suggests, it is going to test the supposed major effet which is the drug, rather than the isolate, and in the same time account for the batch effect brought in by isolates. But it has tested the effect of isolate1 vs isolate 2 instead, even though there are only 2 samples per isolate.

Could you please help how it would happen and how should I correct it? Or should I rearrange the design to ~ isolate + drug.

I am also very curious, given the 3X3 nested design, do we have an option to include all 9 samples under one dds and specify with R which ones we would like to involve in the test each time? Thank you very much!

batch RNASeq BatchEffect DESeq2 • 1.0k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 11 hours ago
United States

See the vignette about how to extract multiple results tables. Running results() without any additional arguments will only return a single table — it can’t guess which table you want.

I’d recommend to work out your statistical design with a local statistician or someone familiar with linear models in R.

ADD COMMENT
0
Entering edit mode

Thank you very much for your suggestions!

ADD REPLY

Login before adding your answer.

Traffic: 362 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6