Hello,
I've read the DESeq2 vignette and everything that Google returned for "DESeq2 paired samples" and "DESeq2 multi-factor designs", and I still don't understand how to extract the data from the resultsNames(dds) structure. I understand how contrast works, but I can't grasp the meaning of the underlying pairs or how to extract them.
Here we go:
> sampleTable
COL_NAME SAMPLE PATIENT SEX INFLAMMATION LOADING
WOP_202_CONT_S3 WOP_202_CONT_S3 S03 WOP_202 F LOW CONT
WOP_214_CONT_S6 WOP_214_CONT_S6 S06 WOP_214 M HIGH CONT
WOP_225_CONT_S9 WOP_225_CONT_S9 S09 WOP_225 M LOW CONT
WOP_187_CONT_S12 WOP_187_CONT_S12 S12 WOP_187 M HIGH CONT
WOP_192_CONT_S15 WOP_192_CONT_S15 S15 WOP_192 M HIGH CONT
WOP_201_CONT_S18 WOP_201_CONT_S18 S18 WOP_201 F LOW CONT
WOP_202_HIGH_S2 WOP_202_HIGH_S2 S02 WOP_202 F LOW HIGH
WOP_214_HIGH_S5 WOP_214_HIGH_S5 S05 WOP_214 M HIGH HIGH
WOP_225_HIGH_S8 WOP_225_HIGH_S8 S08 WOP_225 M LOW HIGH
WOP_187_HIGH_S11 WOP_187_HIGH_S11 S11 WOP_187 M HIGH HIGH
WOP_192_HIGH_S14 WOP_192_HIGH_S14 S14 WOP_192 M HIGH HIGH
WOP_201_HIGH_S17 WOP_201_HIGH_S17 S17 WOP_201 F LOW HIGH
WOP_202_LOW_S1 WOP_202_LOW_S1 S01 WOP_202 F LOW LOW
WOP_214_LOW_S4 WOP_214_LOW_S4 S04 WOP_214 M HIGH LOW
WOP_225_LOW_S7 WOP_225_LOW_S7 S07 WOP_225 M LOW LOW
WOP_187_LOW_S10 WOP_187_LOW_S10 S10 WOP_187 M HIGH LOW
WOP_192_LOW_S13 WOP_192_LOW_S13 S13 WOP_192 M HIGH LOW
WOP_201_LOW_S16 WOP_201_LOW_S16 S16 WOP_201 F LOW LOW
> str(sampleTable)
'data.frame': 18 obs. of 6 variables:
$ COL_NAME : Factor w/ 18 levels "WOP_187_CONT_S12",..: 10 13 16 1 4 7 11 14 17 2 ...
$ SAMPLE : Factor w/ 18 levels "S01","S02","S03",..: 3 6 9 12 15 18 2 5 8 11 ...
$ PATIENT : Factor w/ 6 levels "WOP_187","WOP_192",..: 4 5 6 1 2 3 4 5 6 1 ...
$ SEX : Factor w/ 2 levels "F","M": 1 2 2 2 2 1 1 2 2 2 ...
$ INFLAMMATION: Factor w/ 2 levels "HIGH","LOW": 2 1 2 1 1 2 2 1 2 1 ...
$ LOADING : Factor w/ 3 levels "CONT","HIGH",..: 1 1 1 1 1 1 2 2 2 2 ...
This gets fed to DESeq2 with a design based on the paired-samples FAQ in the DESeq2 vignette:
> ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design = ~ PATIENT + LOADING)
> dds<-DESeq(ddsHTSeq,parallel = TRUE)
And this spits out the following:
> as.data.frame(colData(dds))
SAMPLE PATIENT SEX INFLAMMATION LOADING sizeFactor
WOP_202_CONT_S3 S03 WOP_202 F LOW CONT 0.6649501
WOP_214_CONT_S6 S06 WOP_214 M HIGH CONT 0.7315580
WOP_225_CONT_S9 S09 WOP_225 M LOW CONT 1.4301472
WOP_187_CONT_S12 S12 WOP_187 M HIGH CONT 0.9628168
WOP_192_CONT_S15 S15 WOP_192 M HIGH CONT 0.8955194
WOP_201_CONT_S18 S18 WOP_201 F LOW CONT 1.2467095
WOP_202_HIGH_S2 S02 WOP_202 F LOW HIGH 0.5552934
WOP_214_HIGH_S5 S05 WOP_214 M HIGH HIGH 0.9598039
WOP_225_HIGH_S8 S08 WOP_225 M LOW HIGH 1.0751180
WOP_187_HIGH_S11 S11 WOP_187 M HIGH HIGH 1.2218595
WOP_192_HIGH_S14 S14 WOP_192 M HIGH HIGH 1.0896355
WOP_201_HIGH_S17 S17 WOP_201 F LOW HIGH 1.7084590
WOP_202_LOW_S1 S01 WOP_202 F LOW LOW 0.5507257
WOP_214_LOW_S4 S04 WOP_214 M HIGH LOW 1.0685809
WOP_225_LOW_S7 S07 WOP_225 M LOW LOW 1.1032990
WOP_187_LOW_S10 S10 WOP_187 M HIGH LOW 1.2393120
WOP_192_LOW_S13 S13 WOP_192 M HIGH LOW 1.0462885
WOP_201_LOW_S16 S16 WOP_201 F LOW LOW 1.5054519
> as.data.frame(resultsNames(dds))
resultsNames(dds)
1 Intercept
2 PATIENT_WOP_192_vs_WOP_187
3 PATIENT_WOP_201_vs_WOP_187
4 PATIENT_WOP_202_vs_WOP_187
5 PATIENT_WOP_214_vs_WOP_187
6 PATIENT_WOP_225_vs_WOP_187
7 LOADING_HIGH_vs_CONT
8 LOADING_LOW_vs_CONT
Now, what I want to achieve with this is a pairwise comparison. Looking at resultsNames(dds), the design formula design = ~ PATIENT + LOADING seems to give, from what I can grasp, all conditions (CONT, LOW, HIGH) between all PATIENTs, plus all HIGH vs CONT and all LOW vs CONT.
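To see where those eight names come from, here is a minimal base R sketch. It assumes DESeq2 fits one coefficient per column of the standard model matrix for the design formula, which is consistent with the resultsNames(dds) output above; sampleTableSketch and designMatrix are names made up for illustration, and only the PATIENT and LOADING columns are rebuilt.

```r
# Rebuild only the two design columns of sampleTable (values copied from the post).
patient <- c("WOP_202", "WOP_214", "WOP_225", "WOP_187", "WOP_192", "WOP_201")
sampleTableSketch <- data.frame(
  PATIENT = factor(rep(patient, times = 3)),
  LOADING = factor(rep(c("CONT", "HIGH", "LOW"), each = 6))
)

# One column per fitted coefficient: an intercept, five PATIENT terms
# (each relative to the reference level WOP_187) and two LOADING terms
# (each relative to the reference level CONT).
designMatrix <- model.matrix(~ PATIENT + LOADING, data = sampleTableSketch)
colnames(designMatrix)
dim(designMatrix)
```

Read this way, each PATIENT term is that patient's baseline shift relative to WOP_187, and each LOADING term is a single HIGH (or LOW) vs CONT effect shared by all six patients after the patient baselines are accounted for.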
Now my question: with these results, how do I use contrast to extract paired-sample analyses? I'm looking to do comparisons like these:
1- PATIENT WOP_201 CONT vs PATIENT WOP_201 LOW
2- PATIENT WOP_201 CONT vs PATIENT WOP_201 HIGH
3- PATIENT WOP_201 HIGH vs PATIENT WOP_201 LOW
4- PATIENT WOP_202 CONT vs PATIENT WOP_202 LOW
5- PATIENT WOP_202 CONT vs PATIENT WOP_202 HIGH
6- PATIENT WOP_202 HIGH vs PATIENT WOP_202 LOW
7-18: Rinse and repeat the same pattern for all the patients.
19- Is it possible to get condition (HIGH + LOW) vs CONT?
I don't know what to put in the contrast argument to extract the pairwise data within PATIENT:
res<-results(dds,parallel = TRUE,contrast=c( ???????? ,"CONT","HIGH"))
From the output of resultsNames(dds) I see that I can do something like:
> res<-results(dds,parallel = TRUE,contrast=c("PATIENT","WOP_192","WOP_187"))
But my guess is that these results are for (CONT + HIGH + LOW) for PATIENT WOP_192 vs (CONT + HIGH + LOW) for PATIENT WOP_187.
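For the LOADING comparisons under design = ~ PATIENT + LOADING, the sketch below spells out which coefficients each contrast would combine, including a (HIGH + LOW) vs CONT average as in item 19. It is base R only; makeContrast is a hypothetical helper, and the commented results() calls assume the fitted dds from this post. Because PATIENT is in the design, these contrasts describe the loading effect across all six patients with patient differences controlled for, not a separate result per patient.

```r
# Coefficient names copied from the resultsNames(dds) output in the post.
coefNames <- c("Intercept",
               "PATIENT_WOP_192_vs_WOP_187", "PATIENT_WOP_201_vs_WOP_187",
               "PATIENT_WOP_202_vs_WOP_187", "PATIENT_WOP_214_vs_WOP_187",
               "PATIENT_WOP_225_vs_WOP_187",
               "LOADING_HIGH_vs_CONT", "LOADING_LOW_vs_CONT")

# Hypothetical helper: a numeric contrast with one weight per coefficient,
# zero everywhere except for the named weights supplied.
makeContrast <- function(weights) {
  contrastVector <- setNames(numeric(length(coefNames)), coefNames)
  contrastVector[names(weights)] <- weights
  contrastVector
}

# HIGH vs CONT: just the HIGH coefficient.
highVsCont <- makeContrast(c(LOADING_HIGH_vs_CONT = 1))
# HIGH vs LOW: difference of the two LOADING coefficients.
highVsLow <- makeContrast(c(LOADING_HIGH_vs_CONT = 1, LOADING_LOW_vs_CONT = -1))
# Average of HIGH and LOW vs CONT: mean of the two LOADING coefficients.
treatedVsCont <- makeContrast(c(LOADING_HIGH_vs_CONT = 0.5, LOADING_LOW_vs_CONT = 0.5))

# With the fitted object from the post these could be passed as, for example:
# res <- results(dds, contrast = unname(highVsLow))
# which should be the same comparison as
# res <- results(dds, contrast = c("LOADING", "HIGH", "LOW"))
```

Note that in the character form the numerator level comes first, so c("LOADING", "CONT", "HIGH") reports CONT relative to HIGH, the opposite sign of LOADING_HIGH_vs_CONT.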
With another design, design = ~ LOADING, and the same sampleTable:
> ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design = ~ LOADING)
> as.data.frame(resultsNames(dds))
resultsNames(dds)
1 Intercept
2 LOADING_HIGH_vs_CONT
3 LOADING_LOW_vs_CONT
With this design I used the following contrasts, and they work and give the results I wanted:
> res<-results(dds,parallel = TRUE,contrast=c("LOADING","CONT","LOW"))
> res<-results(dds,parallel = TRUE,contrast=c("LOADING","CONT","HIGH"))
> res<-results(dds,parallel = TRUE,contrast=c("LOADING","HIGH","LOW"))
Thanks for the help and steering me towards an enlightened path,
B.
So what you are saying is:
So this is pairwise between the LOADING conditions... 6 x CONT vs 6 x HIGH in the code above.
But can I get to the granularity of a pairwise analysis within PATIENT, for example:
Thanks,
You can’t do this analysis. I’d recommend consulting with a local statistician if you’d like to understand why.
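One likely reason, offered as a sketch rather than as the full statistical argument, is replication: the sample table has exactly one sample per PATIENT and LOADING combination. The base R snippet below rebuilds just those two columns (sampleTableSketch and the other object names are made up for illustration) and counts how many samples remain to estimate variability once every patient-by-loading cell gets its own coefficient.

```r
# Rebuild only the two design columns of sampleTable (values copied from the post).
patient <- c("WOP_202", "WOP_214", "WOP_225", "WOP_187", "WOP_192", "WOP_201")
sampleTableSketch <- data.frame(
  PATIENT = factor(rep(patient, times = 3)),
  LOADING = factor(rep(c("CONT", "HIGH", "LOW"), each = 6))
)

# Number of samples in each PATIENT x LOADING cell.
replicates <- table(sampleTableSketch$PATIENT, sampleTableSketch$LOADING)
replicates

# Additive design used in the post: samples minus coefficients.
additiveMatrix <- model.matrix(~ PATIENT + LOADING, data = sampleTableSketch)
residualDfAdditive <- nrow(additiveMatrix) - ncol(additiveMatrix)

# A design with a separate loading effect for every patient needs one
# coefficient per cell, which uses up all of the samples.
perPatientMatrix <- model.matrix(~ PATIENT * LOADING, data = sampleTableSketch)
residualDfPerPatient <- nrow(perPatientMatrix) - ncol(perPatientMatrix)

c(additive = residualDfAdditive, perPatient = residualDfPerPatient)
```

With no samples left over in the per-patient model, there is nothing from which to estimate how much a gene varies within a single patient and condition, so a test such as WOP_201 CONT vs WOP_201 LOW has no replicates to rest on.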