Input for DESeq2 differential expression and multiple comparisons between samples
Safa.A • 0
@safaa-13866
Last seen 8.3 years ago
United States

Hello all,
I have RNA-seq data for 24 samples: 8 conditions with 3 biological replicates each. The 8 conditions are two different plants, each sampled at the same 4 ages. Phenotypically, each plant is either susceptible (S) or resistant (R) depending on its age: the first plant has two susceptible ages and two resistant ages, while the second plant has three susceptible ages and one resistant age.
I am using the HTSeq-DESeq2 pipeline for differential expression. My goals are:
First, for plant 1, compare the ages S with S and R with R, then compare the list of genes from S with the list of genes from R. Do the same for plant 2, except that it has 3 S ages and 1 R age, so I need to compare all three S ages with the R age.
Second, compare ages 14 and 21 between the two plants, as 14 vs. 14 and 21 vs. 21.
Third, I need to know whether the resistance phenotype is plant dependent, meaning whether there are unique R genes for each plant or common R genes shared by both plants.
Here is my R code for DESeq2:

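# List the HTSeq count files (list.files() returns them in alphabetical order)
sampleFiles <- list.files(path = "/to/htseq-output")
directory <- "/main directory/"
# Read the phenotype table (columns: sampleID, cultivar, phenotype, condition)
sampleCondition <- read.table("path/to/phenodata.txt", header = TRUE)
# The columns of sampleCondition get the prefix "condition." in sampleTable
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)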

My sampleCondition file is:

sampleID    cultivar    phenotype    condition
C7-1    C    S    C7
C7-2    C    S    C7
C7-3    C    S    C7
C10-1    C    S    C10
C10-2    C    S    C10
C10-3    C    S    C10
C14-1    C    R    C14
C14-2    C    R    C14
C14-3    C    R    C14
C21-1    C    R    C21
C21-2    C    R    C21
C21-3    C    R    C21
D7-1    D    S    D7
D7-2    D    S    D7
D7-3    D    S    D7
D10-1    D    S    D10
D10-2    D    S    D10
D10-3    D    S    D10
D14-1    D    S    D14
D14-2    D    S    D14
D14-3    D    S    D14
D21-1    D    R    D21
D21-2    D    R    D21
D21-3    D    R    D21

The resulting sampleTable:

    sampleName    fileName    condition.sampleID    condition.cultivar    condition.phenotype    condition.condition
1    C10_1.txt    C10_1.txt    C10-1    C    S    C10
2    C10_2.txt    C10_2.txt    C10-2    C    S    C10
3    C10_3.txt    C10_3.txt    C10-3    C    S    C10
4    C14_1.txt    C14_1.txt    C14-1    C    R   C14
5    C14_2.txt    C14_2.txt    C14-2    C    R    C14
6    C14_3.txt    C14_3.txt    C14-3    C    R    C14
7    C21_1.txt    C21_1.txt    C21-1    C    R    C21
8    C21_2.txt    C21_2.txt    C21-2    C    R    C21
9    C21_3.txt    C21_3.txt    C21-3    C    R    C21
10    C7_1.txt    C7_1.txt    C7-1    C    S    C7
11    C7_2.txt    C7_2.txt    C7-2    C    S    C7
12    C7_3.txt    C7_3.txt    C7-3    C    S    C7
13    D10_1.txt    D10_1.txt    D10-1    D    S    D10
14    D10_2.txt    D10_2.txt    D10-2    D    S    D10
15    D10_3.txt    D10_3.txt    D10-3    D    S    D10
16    D14_1.txt    D14_1.txt    D14-1    D    S    D14
17    D14_2.txt    D14_2.txt    D14-2    D    S    D14
18    D14_3.txt    D14_3.txt    D14-3    D    S    D14
19    D21_1.txt    D21_1.txt    D21-1    D    R    D21
20    D21_2.txt    D21_2.txt    D21-2    D    R    D21
21    D21_3.txt    D21_3.txt    D21-3    D    R    D21
22    D7_1.txt    D7_1.txt    D7-1    D    S    D7
23    D7_2.txt    D7_2.txt    D7-2    D    S    D7
24    D7_3.txt    D7_3.txt    D7-3    D    S    D7
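
Note that the phenodata file above starts at C7, while list.files() returns file names in alphabetical order (C10 first), so binding the two tables row by row only works if the rows happen to line up. A minimal sketch of matching by sample ID instead, using a small made-up excerpt of the two tables (the names `pheno`, `fileID` and `idx` are illustrative and not from the original code):

```r
# Hypothetical file names, in the alphabetical order list.files() would return
sampleFiles <- c("C10_1.txt", "C14_1.txt", "C7_1.txt", "D7_1.txt")

# Hypothetical excerpt of the phenodata table, in its own (age) order
pheno <- data.frame(
  sampleID  = c("C7-1", "C10-1", "C14-1", "D7-1"),
  cultivar  = c("C", "C", "C", "D"),
  phenotype = c("S", "S", "R", "S"),
  condition = c("C7", "C10", "C14", "D7"),
  stringsAsFactors = FALSE
)

# Derive the sample ID from each file name: "C10_1.txt" -> "C10-1"
fileID <- sub("_", "-", sub("\\.txt$", "", sampleFiles))

# Look up the phenodata row that belongs to each file
idx <- match(fileID, pheno$sampleID)
stopifnot(!anyNA(idx))  # every file must have a phenodata entry

# First column: sample name, second: file name, rest: design variables
sampleTable <- data.frame(
  sampleName = fileID,
  fileName   = sampleFiles,
  cultivar   = factor(pheno$cultivar[idx]),
  phenotype  = factor(pheno$phenotype[idx]),
  condition  = factor(pheno$condition[idx])
)
```

DESeqDataSetFromHTSeqCount expects the sample name in the first column, the file name in the second, and the sample variables in the remaining columns, which this layout follows. With this version the design variables are named cultivar, phenotype and condition, without the "condition." prefix.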

DESeq2 differential expression:

dds <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ )

I am not sure what I should put in the design to achieve my goals. I have tried:

dds <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition.cultivar + condition.phenotype + condition.cultivar:condition.phenotype)  # but I am not convinced

Then:

dds <- dds[ rowSums(counts(dds)) > 1, ]
gene_de_comparisons <- DESeq(dds)
resultsNames(gene_de_comparisons)

The result is:

[1] "Intercept"                                       
[2] "condition.cultivar_D_vs_C"       
[3] "condition.phenotype_S_vs_R"                      
[4] "condition.cultivarD.condition.phenotypeS"

This result is not what I want. I don't understand why I get these particular comparisons, and I am not sure what the correct code is to achieve my goals.
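
For reference, the four names follow from how R expands the design formula: with two two-level factors, a design of the form ~ cultivar + phenotype + cultivar:phenotype gives an intercept, one cultivar effect (D vs C), one phenotype effect (S vs R, since R comes first alphabetically and becomes the reference level) and one interaction term. Age is not in the design, so the different ages are pooled within each cultivar/phenotype combination. A small base-R sketch with the same factor layout as the phenodata (variable names shortened here for illustration):

```r
# Same layout as the phenodata: 12 samples per cultivar
cultivar  <- factor(rep(c("C", "D"), each = 12))
phenotype <- factor(c(
  rep(c("S", "R"), each = 6),          # cultivar C: C7, C10 are S; C14, C21 are R
  rep(c("S", "R"), times = c(9, 3))    # cultivar D: D7, D10, D14 are S; D21 is R
))

# Expand the design formula into its model matrix
mm <- model.matrix(~ cultivar + phenotype + cultivar:phenotype)
colnames(mm)
# "(Intercept)" "cultivarD" "phenotypeS" "cultivarD:phenotypeS"
```

These four columns correspond one to one to the four entries reported by resultsNames().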

Any help is appreciated, and sorry for the long question; I just wanted to provide all the details.
Thanks.

deseq2 • 1.0k views
@mikelove
Last seen 4 days ago
United States

Hi,

I unfortunately don't have much time right now to provide involved statistical support. As you have a complex design, I'd recommend you partner with a statistical collaborator to formulate how to compare the samples. There are simple comparisons you can easily make, e.g. by combining variables and contrasting groups (see the vignette). Testing interactions is also simple to do in DESeq2 (see the relevant sections of the vignette). But if you've never seen interactions before, it's important you get it right, and this really requires a statistical collaborator to at least explain what the terms in the model mean, before you can interpret the results.
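
As a rough illustration of the "combining variables and contrasting groups" route, here is a minimal sketch. It assumes a single grouping factor with one level per cultivar/age combination, like the condition column in the phenodata, and it uses simulated counts from makeExampleDESeqDataSet() as a stand-in for the real HTSeq files, so the numbers it produces mean nothing. The result object names are made up for the example.

```r
library(DESeq2)

# Simulated stand-in for the HTSeq counts (NOT the poster's data):
# 1000 genes x 24 samples
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 24)

# One level per cultivar/age group, 3 replicates each
groups <- c("C7", "C10", "C14", "C21", "D7", "D10", "D14", "D21")
dds$condition <- factor(rep(groups, each = 3), levels = groups)
design(dds) <- ~ condition

dds <- DESeq(dds)

# Pairwise contrasts: c(variable, numerator level, denominator level)
res_C14_vs_C7  <- results(dds, contrast = c("condition", "C14", "C7"))   # R age vs S age, plant C
res_D21_vs_D14 <- results(dds, contrast = c("condition", "D21", "D14"))  # R age vs S age, plant D
res_D14_vs_C14 <- results(dds, contrast = c("condition", "D14", "C14"))  # same age, between plants
```

Which of these contrasts actually address the three goals, and how the resulting gene lists should be compared, remains the statistical question raised above.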
