Question

Loading samples into DESeq2

deena ▴ 20

@deena-7415

Last seen 6.6 years ago

Germany

Hi all,

I have RNA-seq data collected at different time points. At each time point I have 4 conditions (Control, Knock-down_1, Knock-down_2, Knock-down_3), so altogether there are 36 samples across the 3 time points.

Now I want to compare every knock-down sample at a particular time point to the Control samples from the same time point.
For this type of analysis:

a) Should I include all 36 samples together in the count matrix and later use contrasts to specify which samples to compare in DESeq2?

OR

b) Should I include only the samples from a particular time point (12 samples) in the count matrix and compare all knock-down samples to the Control samples using DESeq2?

OR

c) Should I include only two samples from a time point in the count matrix and compute differentially expressed genes using DESeq2?

Thanks in advance

deseq2 • 1.2k views

ADD COMMENT • link updated 7.1 years ago by Michael Love 41k • written 7.1 years ago by deena ▴ 20

score 0 · Answer 1 · 2017-03-09

Michael Love 41k

@mikelove

Last seen 3 hours ago

United States

4 conditions, 3 time points => 12, then 3 biological replicates? Just want to make sure I understand what the experimental design looks like.

Include all the samples in the design, and you can follow the example here:

http://www.bioconductor.org/help/workflows/rnaseqGene/#time

You are interested in the Wald test results, which are demonstrated later in that section, e.g.:

res <- results(dds, name="strainmut.minute30", test="Wald")
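
A minimal sketch of what "include all the samples in the design" could look like for a 4 x 3 x 3 layout. The sample sheet, the level names (KD1, KD2, KD3, T1 to T3) and the object names are assumptions made for illustration, not taken from the poster's data. Base R's model.matrix is used only to show which coefficients an interaction design produces; DESeq2 reports the corresponding names through resultsNames(dds), with slightly different punctuation (for example "strainmut.minute30" in the workflow).

```r
# Hypothetical sample sheet: 4 conditions x 3 time points x 3 replicates = 36 samples.
coldata <- expand.grid(
  replicate = 1:3,
  time      = c("T1", "T2", "T3"),
  condition = c("Control", "KD1", "KD2", "KD3"),
  stringsAsFactors = FALSE
)
# Put the reference levels first so that effects are relative to Control at T1.
coldata$condition <- factor(coldata$condition, levels = c("Control", "KD1", "KD2", "KD3"))
coldata$time      <- factor(coldata$time, levels = c("T1", "T2", "T3"))
rownames(coldata) <- paste(coldata$condition, coldata$time, coldata$replicate, sep = "_")

# Interaction design in the style of the time-course example in the workflow.
design_formula <- ~ condition + time + condition:time
mm <- model.matrix(design_formula, data = coldata)
colnames(mm)  # one intercept, 3 condition, 2 time and 6 interaction columns

# With DESeq2 installed and a count matrix `counts` (hypothetical) whose columns
# are in the same order as rownames(coldata), the same formula would be used as:
#   dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
#                                 design = design_formula)
#   dds <- DESeq(dds)
#   resultsNames(dds)  # coefficient names accepted by results(dds, name = ...)
```

Each interaction column measures how a knock-down effect at a later time point differs from its effect at the reference time point, which is what a Wald test on a single named coefficient evaluates.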

ADD COMMENT • link 7.1 years ago Michael Love 41k

Thank you, Michael. You understood my experimental design correctly. So far I have made a data frame called conditions in R, as follows:

                                             Conditions
Treated1_T2_1                      Treated1_T2
Treated1_T2_2                      Treated1_T2
Treated1_T2_3                      Treated1_T2         
Treated1_T3_1                      Treated1_T3          
Treated1_T3_2                      Treated1_T3      
Treated1_T3_3                       Treated1_T3         
Treated1_T1_1                       Treated1_T1          
Treated1_T1_2                       Treated1_T1          
Treated1_T1_3                       Treated1_T1          
Treated1Treated2_T2_1         Treated1Treated2_T2     
Treated1Treated2_T2_2         Treated1Treated2_T2     
Treated1Treated2_T2_3         Treated1Treated2_T2     
Treated1Treated2_T3_1          Treated1Treated2_T3      
Treated1Treated2_T3_2          Treated1Treated2_T3    
Treated1Treated2_T3_3          Treated1Treated2_T3      
Treated1Treated2_T1_1          Treated1Treated2_T1      
Treated1Treated2_T1_2          Treated1Treated2_T1      
Treated1Treated2_T1_3          Treated1Treated2_T1      
Treated2_T2_1                        Treated2_T2         
Treated2_T2_2                        Treated2_T2         
Treated2_T2_3                        Treated2_T2         
Treated2_T3_1                        Treated2_T3          
Treated2_T3_2                        Treated2_T3          
Treated2_T3_3                        Treated2_T3          
Treated2_T1_1                        Treated2_T1          
Treated2_T1_2                        Treated2_T1          
Treated2_T1_3                        Treated2_T1          
Control_T2_1                          Control_T2          
Control_T2_2                          Control_T2          
Control_T2_3                          Control_T2          
Control_T3_1                          Control_T3           
Control_T3_2                          Control_T3           
Control_T3_3                          Control_T3           
Control_T1_1                         Control_T1           
Control_T1_2                         Control_T1           
Control_T1_3                         Control_T1   
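
A table like the one above can also be derived from the sample names instead of being typed by hand. The following is only a sketch: the sample names are regenerated here from the naming pattern shown, the count matrix is random toy data, and the column is named conditions (lower case) on the assumption that it has to match the variable used in the design formula, since R is case-sensitive.

```r
set.seed(1)

# Rebuild the 36 sample names from the pattern <treatment>_<time>_<replicate>.
treatments <- c("Treated1", "Treated1Treated2", "Treated2", "Control")
times      <- c("T1", "T2", "T3")
sheet <- expand.grid(rep = 1:3, time = times, treatment = treatments,
                     stringsAsFactors = FALSE)
samples <- paste(sheet$treatment, sheet$time, sheet$rep, sep = "_")

# Strip the trailing replicate number to get the group label for each sample.
conditions <- data.frame(
  conditions = factor(sub("_[0-9]+$", "", samples)),
  row.names  = samples
)

# Toy count matrix (hypothetical data) with one column per sample.
rnaseqMatrix <- matrix(rpois(5 * length(samples), lambda = 10), nrow = 5,
                       dimnames = list(paste0("gene", 1:5), samples))

# The columns of the count matrix and the rows of the sample table must be
# in the same order before they are handed to DESeqDataSetFromMatrix.
stopifnot(identical(colnames(rnaseqMatrix), rownames(conditions)))

table(conditions$conditions)  # number of replicates per group
```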

This will be passed into DESeqDataSetFromMatrix as follows:

ddsFullCountTable <- DESeqDataSetFromMatrix(countData = rnaseqMatrix, colData = conditions, design = ~ conditions)

When I want to compute the differentially expressed genes between two conditions, I use results in the following way:

res <- results(dds, contrast = c("conditions", "Treated1_T3", "Control_T3"))
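
To keep every comparison within a single time point, the contrast vectors can be generated rather than written one by one. This is a sketch that assumes the group labels follow the <treatment>_<time> pattern from the table above and that the design variable is called conditions; the names pairs, contrast_list and res_list are made up for illustration.

```r
treatments <- c("Treated1", "Treated1Treated2", "Treated2")
times      <- c("T1", "T2", "T3")

# Every treatment paired with the Control of the SAME time point.
pairs <- expand.grid(treatment = treatments, time = times,
                     stringsAsFactors = FALSE)

contrast_list <- lapply(seq_len(nrow(pairs)), function(i) {
  c("conditions",
    paste(pairs$treatment[i], pairs$time[i], sep = "_"),  # numerator
    paste("Control", pairs$time[i], sep = "_"))           # denominator
})
names(contrast_list) <- paste0(pairs$treatment, "_vs_Control_", pairs$time)

contrast_list[["Treated1_vs_Control_T3"]]

# With a fitted DESeq2 object `dds` (not created here), each contrast could
# then be passed to results():
#   res_list <- lapply(contrast_list, function(k) results(dds, contrast = k))
```

Because numerator and denominator are built from the same time value, none of the nine comparisons can cross time points.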

Now, using the "name" parameter, how can this be achieved with my conditions? I just want to make sure that I am comparing samples within a time point, not between time points.

ADD REPLY • link 7.1 years ago deena ▴ 20

Your code is correct for the way you have set it up. You should use 'contrast' as you have.

The example I pointed you to is going at it a different way, but you can go ahead and use the code you have.
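
As a rough illustration of how the two approaches relate, the sketch below fits both parameterizations to toy data with an ordinary linear model. lm is a stand-in chosen so the example runs in base R; DESeq2 fits a negative binomial GLM, so this shows only the design-matrix algebra, not DESeq2's estimates. All data and names are hypothetical.

```r
set.seed(1)

# Toy data: 2 conditions x 3 time points x 3 replicates, one simulated "gene".
d <- expand.grid(rep = 1:3, time = c("T1", "T2", "T3"),
                 condition = c("Control", "Treated1"),
                 stringsAsFactors = FALSE)
d$condition <- factor(d$condition, levels = c("Control", "Treated1"))
d$time      <- factor(d$time, levels = c("T1", "T2", "T3"))
d$group     <- factor(paste(d$condition, d$time, sep = "_"))
d$y         <- rnorm(nrow(d), mean = 5)

# Approach 1: one combined factor, compared with a contrast between two levels.
fit_group <- lm(y ~ group, data = d)
bg <- coef(fit_group)
diff_group <- unname(bg["groupTreated1_T3"] - bg["groupControl_T3"])

# Approach 2: separate factors plus interaction, as in the time-course example.
fit_inter <- lm(y ~ condition + time + condition:time, data = d)
bi <- coef(fit_inter)
# Treated1 vs Control at T3 = main condition effect + T3 interaction term.
diff_inter <- unname(bi["conditionTreated1"] + bi["conditionTreated1:timeT3"])

all.equal(diff_group, diff_inter)
```

In the interaction design, the within-time-point comparison at a non-reference time is the sum of the main effect and the matching interaction term. In DESeq2 such a sum can be requested by passing a list of coefficient names to the contrast argument of results().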

ADD REPLY • link 7.1 years ago Michael Love 41k

Thanks a lot, Michael. The strange thing I found in this large data set is that many replicates of particular samples at a particular time point cluster with samples from a different time point. I haven't observed this kind of clustering before, and I am wondering how to handle replicates that don't cluster within their own group. Kindly guide me.

ADD REPLY • link 7.1 years ago deena ▴ 20

I don't have any particular advice. Remember that PCA is not the data itself, but a reduction into 2 dimensions. So it doesn't imply a problem necessarily.
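
One way to see how much of the data a two-dimensional PCA plot actually shows is to check the fraction of variance carried by the first two components and compare the picture with a clustering on the full sample-to-sample distances. The sketch below uses a random matrix as a stand-in for transformed counts (for example the output of a variance-stabilizing transformation); the matrix and all names are hypothetical.

```r
set.seed(42)

# Stand-in for a transformed expression matrix: genes in rows, samples in columns.
n_genes   <- 500
n_samples <- 12
x <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)),
                            paste0("S", seq_len(n_samples))))

# PCA with samples as observations.
pca <- prcomp(t(x))
pct_var <- pca$sdev^2 / sum(pca$sdev^2)

# Share of the total variance visible in a PC1 vs PC2 plot.
round(100 * sum(pct_var[1:2]), 1)

# Hierarchical clustering on distances computed from ALL genes,
# not only from the first two components.
hc <- hclust(dist(t(x)))
hc$labels[hc$order]  # sample order along the dendrogram
```

If the first two components hold only a modest share of the variance, samples that look close in the plot may still be well separated in the remaining dimensions.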

ADD REPLY • link 7.1 years ago Michael Love 41k