Once again: Analyze Paired Samples from DESeq2 Vignette, design formula and how to interpret it
Benoit.Fiset ▴ 30
@benoitfiset-18473
Last seen 4.4 years ago

Hello,

I've read the DESeq2 vignette and everything Google returned for "DESeq2 paired samples" and "DESeq2 Multi-factor designs", and I still don't understand how to extract results using the names listed in resultsNames(dds). I understand how contrast works, but I can't grasp how to extract the underlying pairs.

Here we go:

>  sampleTable
                         COL_NAME SAMPLE PATIENT SEX INFLAMMATION LOADING
WOP_202_CONT_S3   WOP_202_CONT_S3    S03 WOP_202   F          LOW    CONT
WOP_214_CONT_S6   WOP_214_CONT_S6    S06 WOP_214   M         HIGH    CONT
WOP_225_CONT_S9   WOP_225_CONT_S9    S09 WOP_225   M          LOW    CONT
WOP_187_CONT_S12 WOP_187_CONT_S12    S12 WOP_187   M         HIGH    CONT
WOP_192_CONT_S15 WOP_192_CONT_S15    S15 WOP_192   M         HIGH    CONT
WOP_201_CONT_S18 WOP_201_CONT_S18    S18 WOP_201   F          LOW    CONT
WOP_202_HIGH_S2   WOP_202_HIGH_S2    S02 WOP_202   F          LOW    HIGH
WOP_214_HIGH_S5   WOP_214_HIGH_S5    S05 WOP_214   M         HIGH    HIGH
WOP_225_HIGH_S8   WOP_225_HIGH_S8    S08 WOP_225   M          LOW    HIGH
WOP_187_HIGH_S11 WOP_187_HIGH_S11    S11 WOP_187   M         HIGH    HIGH
WOP_192_HIGH_S14 WOP_192_HIGH_S14    S14 WOP_192   M         HIGH    HIGH
WOP_201_HIGH_S17 WOP_201_HIGH_S17    S17 WOP_201   F          LOW    HIGH
WOP_202_LOW_S1     WOP_202_LOW_S1    S01 WOP_202   F          LOW     LOW
WOP_214_LOW_S4     WOP_214_LOW_S4    S04 WOP_214   M         HIGH     LOW
WOP_225_LOW_S7     WOP_225_LOW_S7    S07 WOP_225   M          LOW     LOW
WOP_187_LOW_S10   WOP_187_LOW_S10    S10 WOP_187   M         HIGH     LOW
WOP_192_LOW_S13   WOP_192_LOW_S13    S13 WOP_192   M         HIGH     LOW
WOP_201_LOW_S16   WOP_201_LOW_S16    S16 WOP_201   F          LOW     LOW

> str(sampleTable)
'data.frame':   18 obs. of  6 variables:
 $ COL_NAME    : Factor w/ 18 levels "WOP_187_CONT_S12",..: 10 13 16 1 4 7 11 14 17 2 ...
 $ SAMPLE      : Factor w/ 18 levels "S01","S02","S03",..: 3 6 9 12 15 18 2 5 8 11 ...
 $ PATIENT     : Factor w/ 6 levels "WOP_187","WOP_192",..: 4 5 6 1 2 3 4 5 6 1 ...
 $ SEX         : Factor w/ 2 levels "F","M": 1 2 2 2 2 1 1 2 2 2 ...
 $ INFLAMMATION: Factor w/ 2 levels "HIGH","LOW": 2 1 2 1 1 2 2 1 2 1 ...
 $ LOADING     : Factor w/ 3 levels "CONT","HIGH",..: 1 1 1 1 1 1 2 2 2 2 ...

This gets fed to DESeq2 with a design based on the paired-samples FAQ in the DESeq2 vignette:

> ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design = ~ PATIENT + LOADING)
> dds<-DESeq(ddsHTSeq,parallel = TRUE)

And this spits out the following:

> as.data.frame(colData(dds))
                 SAMPLE PATIENT SEX INFLAMMATION LOADING sizeFactor
WOP_202_CONT_S3     S03 WOP_202   F          LOW    CONT  0.6649501
WOP_214_CONT_S6     S06 WOP_214   M         HIGH    CONT  0.7315580
WOP_225_CONT_S9     S09 WOP_225   M          LOW    CONT  1.4301472
WOP_187_CONT_S12    S12 WOP_187   M         HIGH    CONT  0.9628168
WOP_192_CONT_S15    S15 WOP_192   M         HIGH    CONT  0.8955194
WOP_201_CONT_S18    S18 WOP_201   F          LOW    CONT  1.2467095
WOP_202_HIGH_S2     S02 WOP_202   F          LOW    HIGH  0.5552934
WOP_214_HIGH_S5     S05 WOP_214   M         HIGH    HIGH  0.9598039
WOP_225_HIGH_S8     S08 WOP_225   M          LOW    HIGH  1.0751180
WOP_187_HIGH_S11    S11 WOP_187   M         HIGH    HIGH  1.2218595
WOP_192_HIGH_S14    S14 WOP_192   M         HIGH    HIGH  1.0896355
WOP_201_HIGH_S17    S17 WOP_201   F          LOW    HIGH  1.7084590
WOP_202_LOW_S1      S01 WOP_202   F          LOW     LOW  0.5507257
WOP_214_LOW_S4      S04 WOP_214   M         HIGH     LOW  1.0685809
WOP_225_LOW_S7      S07 WOP_225   M          LOW     LOW  1.1032990
WOP_187_LOW_S10     S10 WOP_187   M         HIGH     LOW  1.2393120
WOP_192_LOW_S13     S13 WOP_192   M         HIGH     LOW  1.0462885
WOP_201_LOW_S16     S16 WOP_201   F          LOW     LOW  1.5054519

> as.data.frame(resultsNames(dds))
           resultsNames(dds)
1                  Intercept
2 PATIENT_WOP_192_vs_WOP_187
3 PATIENT_WOP_201_vs_WOP_187
4 PATIENT_WOP_202_vs_WOP_187
5 PATIENT_WOP_214_vs_WOP_187
6 PATIENT_WOP_225_vs_WOP_187
7       LOADING_HIGH_vs_CONT
8        LOADING_LOW_vs_CONT
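
The eight names above can be traced back to the columns of the design matrix. The following is a minimal base R sketch, assuming a hypothetical `design_df` that reconstructs only the PATIENT and LOADING columns of sampleTable in the same row order. `model.matrix` names its columns differently from DESeq2 (for example `LOADINGHIGH` rather than `LOADING_HIGH_vs_CONT`), but the structure is the same: one intercept, five patient terms relative to WOP_187, and two loading terms relative to CONT.

```r
# Hypothetical reconstruction of the two design columns of sampleTable
patients <- c("WOP_202", "WOP_214", "WOP_225", "WOP_187", "WOP_192", "WOP_201")
design_df <- data.frame(
  PATIENT = factor(rep(patients, times = 3)),              # 6 patients, each seen 3 times
  LOADING = factor(rep(c("CONT", "HIGH", "LOW"), each = 6)) # CONT is the reference level
)

# Additive design: a patient baseline plus a loading effect shared by all patients
mm <- model.matrix(~ PATIENT + LOADING, data = design_df)

colnames(mm)  # intercept, 5 PATIENT columns, LOADINGHIGH, LOADINGLOW
dim(mm)       # 18 samples by 8 coefficients
```

Because each loading term is a single column shared by all six patients, it describes the average within-patient change, not a separate change for each patient.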

What I want to achieve is a pairwise comparison. Looking at resultsNames(dds), the design formula design = ~ PATIENT + LOADING seems to give, as far as I can tell, each PATIENT compared against WOP_187 across all conditions (CONT, LOW, HIGH), plus all HIGH vs CONT and all LOW vs CONT.

My question: given these results, how do I use contrast to extract paired-sample analyses? I'm looking to do comparisons like these:

1- PATIENT WOP_201 CONT vs PATIENT WOP_201 LOW
2- PATIENT WOP_201 CONT vs PATIENT WOP_201 HIGH
3- PATIENT WOP_201 HIGH vs PATIENT WOP_201 LOW

4- PATIENT WOP_202 CONT vs PATIENT WOP_202 LOW
5- PATIENT WOP_202 CONT vs PATIENT WOP_202 HIGH
6- PATIENT WOP_202 HIGH vs PATIENT WOP_202 LOW

7-18: Rinse and repeat the same pattern for all the patients.

19- Is it possible to get condition (HIGH + LOW) vs CONT?

I don't know what to put in the contrast section to extract the pairwise data within PATIENT?

 res<-results(dds,parallel = TRUE,contrast=c( ????????  ,"CONT","HIGH"))

From the output of resultsNames(dds) I see that I can do something like:

> res<-results(dds,parallel = TRUE,contrast=c("PATIENT","WOP_192","WOP_187"))

But my guess is that these results compare (CONT + HIGH + LOW) for PATIENT WOP_192 vs (CONT + HIGH + LOW) for PATIENT WOP_187.

With another design [ design = ~ LOADING ] with the same sampleTable:

> ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design = ~ LOADING)
> dds<-DESeq(ddsHTSeq,parallel = TRUE)
> as.data.frame(resultsNames(dds))
     resultsNames(dds)
1            Intercept
2 LOADING_HIGH_vs_CONT
3  LOADING_LOW_vs_CONT

I used these contrasts, and things work out and I get the results I want:

> res<-results(dds,parallel = TRUE,contrast=c("LOADING","CONT","LOW"))
> res<-results(dds,parallel = TRUE,contrast=c("LOADING","CONT","HIGH"))
> res<-results(dds,parallel = TRUE,contrast=c("LOADING","HIGH","LOW"))

Thanks for the help and steering me towards an enlightened path,

B.

deseq2 • 723 views
@mikelove
Last seen 4 hours ago
United States

I don't know what to put in the contrast section to extract the pairwise data within PATIENT?

You have already performed a paired analysis by using the design you have chosen: including PATIENT in the design accounts for patient-to-patient differences. You can now compare CONT, LOW and HIGH using the contrast argument. The first of the three elements you provide to contrast should be the name of the factor, so LOADING in your case.

See here:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis

or here:

https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#building-the-results-table
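
As a rough illustration of the advice above, here is a sketch that uses simulated counts from `makeExampleDESeqDataSet` in place of the HTSeq files, which are not available here. The PATIENT and LOADING columns are rebuilt by hand to mirror sampleTable, so the numbers are meaningless and only the calls matter. In a character contrast, the second element is the numerator and the third the denominator, so `c("LOADING", "HIGH", "CONT")` reports HIGH relative to CONT. The last call is one possible way to get the average of HIGH and LOW against CONT, using a list contrast with `listValues`.

```r
library(DESeq2)

set.seed(1)
# Simulated data standing in for the real counts: 200 genes, 18 samples
dds <- makeExampleDESeqDataSet(n = 200, m = 18)

# Rebuild the paired layout of sampleTable (assumed row order: CONT, HIGH, LOW blocks)
patients <- c("WOP_202", "WOP_214", "WOP_225", "WOP_187", "WOP_192", "WOP_201")
dds$PATIENT <- factor(rep(patients, times = 3))
dds$LOADING <- factor(rep(c("CONT", "HIGH", "LOW"), each = 6))
design(dds) <- ~ PATIENT + LOADING

dds <- DESeq(dds)
resultsNames(dds)

# Paired comparisons between loading conditions, adjusted for patient
res_high_cont <- results(dds, contrast = c("LOADING", "HIGH", "CONT"))
res_low_cont  <- results(dds, contrast = c("LOADING", "LOW", "CONT"))
res_high_low  <- results(dds, contrast = c("LOADING", "HIGH", "LOW"))

# Average of HIGH and LOW versus CONT: half of each coefficient
res_avg_cont <- results(
  dds,
  contrast = list(c("LOADING_HIGH_vs_CONT", "LOADING_LOW_vs_CONT")),
  listValues = c(1 / 2, -1 / 2)
)
```

Each of these results tables tests one loading effect across all six patients; none of them isolates a single patient.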


So what you are saying is:

 res<-results(dds,parallel = TRUE,contrast=c("LOADING" ,"CONT","HIGH"))

So this is pairwise between the LOADING conditions: 6 x CONT vs 6 x HIGH in the code above.

But can I get to the granularity of a pairwise analysis within PATIENT, for example:

PATIENT WOP_201 CONT vs PATIENT WOP_201 LOW

Thanks,


You can’t do this analysis. I’d recommend consulting with a local statistician if you’d like to understand why.
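
The thread does not spell out the reason, so the following is offered as one likely explanation rather than the answerer's own words. A separate loading effect for each patient requires a PATIENT:LOADING interaction, and with a single sample per patient per condition that model has as many coefficients as samples. A base R sketch, again using a hypothetical `design_df` that mirrors sampleTable:

```r
# Hypothetical reconstruction of the two design columns of sampleTable
patients <- c("WOP_202", "WOP_214", "WOP_225", "WOP_187", "WOP_192", "WOP_201")
design_df <- data.frame(
  PATIENT = factor(rep(patients, times = 3)),
  LOADING = factor(rep(c("CONT", "HIGH", "LOW"), each = 6))
)

# Additive model used in the thread: loading effects shared across patients
mm_additive <- model.matrix(~ PATIENT + LOADING, data = design_df)

# Interaction model: a separate loading effect for every patient
mm_interaction <- model.matrix(~ PATIENT * LOADING, data = design_df)

# Residual degrees of freedom = samples minus coefficients
df_additive    <- nrow(mm_additive) - ncol(mm_additive)
df_interaction <- nrow(mm_interaction) - ncol(mm_interaction)

df_additive     # 10
df_interaction  # 0
```

With zero residual degrees of freedom there is nothing left to estimate variability from, so a comparison such as WOP_201 CONT vs WOP_201 LOW amounts to comparing one sample against one sample.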
