Analysis with DESeq2: should I put all the samples from three conditions in the colData and countData, or perform the analyses separately?
@aroa-suarez-vega-6484
Last seen 6.1 years ago

Hello! I am performing my differential expression RNA-Seq analysis with DESeq2.

I have the following design, with three conditions:

ID_seq            ID_animal   Condition
C8DRYANXX_4_22    LUFO209     FO
C8DRYANXX_4_25    LUFO219     FO
C8DRYANXX_5_14    LUFO169     FO
C8DRYANXX_6_18    LUFO177     FO
C8DRYANXX_4_23    LUFO218     Control
C8DRYANXX_5_15    LUFO171     Control
C8DRYANXX_6_20    LUFO181     Control
C8DRYANXX_6_21    LUFO197     Control
C8F23ACXX_7_20    LUFO181     Control
C8DRYANXX_4_27    LUFO238     LU
C8DRYANXX_5_16    LUFO173     LU
C8DRYANXX_6_19    LUFO179     LU
C8EB2ANXX_5_27    LUFO238     LU
C8F23ACXX_7_19    LUFO179     LU
C8F23ACXX_8_13    LUFO163     LU
HHGFTBBXX_6_11    LUFO215     LU
HHGFTBBXX_6_12    LUFO234     LU

And the PCA of my data:

When I perform the analysis like this:

> dds <- DESeqDataSetFromMatrix(countData = DE_genesCondition, colData = colData, design = ~ Condition)
> DESeq.dsCollapsed <- collapseReplicates(dds, groupby = dds$ID_animal)
> DESeq.dsCollapsed <- DESeq(DESeq.dsCollapsed)
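For readers following along: several animals in the table (LUFO181, LUFO238, LUFO179) were sequenced in more than one run, which is what the collapseReplicates call handles. A minimal sketch on made-up toy counts (the matrix and the animal IDs A1 to A3 are hypothetical, not the data from this post) shows that the runs of one animal are summed into a single column:

```r
library(DESeq2)

# Toy counts (hypothetical): 3 genes x 4 sequencing runs.
# matrix() fills column by column, so r1 = (10, 0, 5), r2 = (20, 1, 7), ...
counts <- matrix(c(10L, 0L, 5L,
                   20L, 1L, 7L,
                   30L, 2L, 9L,
                   40L, 3L, 11L),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("r", 1:4)))

# Runs r2 and r3 belong to the same animal (A2)
colData <- data.frame(ID_animal = factor(c("A1", "A2", "A2", "A3")),
                      Condition = factor(c("Control", "LU", "LU", "FO")),
                      row.names = colnames(counts))

dds <- DESeqDataSetFromMatrix(countData = counts, colData = colData,
                              design = ~ Condition)

# Sum the counts of all runs that share an ID_animal
collapsed <- collapseReplicates(dds, groupby = dds$ID_animal)

counts(collapsed)  # one column per animal; column A2 holds r2 + r3
```

Collapsing happens on the raw counts, before DESeq is run, as in the commands above.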

I obtain the following results:

FOvsControl: 37 differentially expressed genes (DEG)
LUvsControl: 2515 DEG
LUvsFO: 817 DEG
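The post does not show how the three comparisons were pulled out of the single fit. One common way is the contrast argument of results(). The sketch below uses simulated stand-in data from makeExampleDESeqDataSet, relabelled into three groups of four purely for illustration, and counts genes at padj < 0.1 (the default alpha of results(); the threshold used in the post is not stated):

```r
library(DESeq2)
set.seed(42)

# Simulated stand-in data (NOT the counts from this post): 1000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical relabelling into the three groups from the question
dds$Condition <- factor(rep(c("Control", "FO", "LU"), each = 4),
                        levels = c("Control", "FO", "LU"))
design(dds) <- ~ Condition

# One model fit on all samples
dds <- DESeq(dds, quiet = TRUE)

# Three pairwise comparisons extracted from that single fit
comparisons <- list(FOvsControl = c("Condition", "FO", "Control"),
                    LUvsControl = c("Condition", "LU", "Control"),
                    LUvsFO      = c("Condition", "LU", "FO"))
res <- lapply(comparisons, function(p) results(dds, contrast = p))

# Number of genes with adjusted p-value below 0.1 in each comparison
n_deg <- sapply(res, function(r) sum(r$padj < 0.1, na.rm = TRUE))
n_deg
```

Because the simulated groups carry no real differences, the counts printed here say nothing about the numbers reported in the post.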

However, when I perform the analyses independently, that is, with the colData data frame containing only the samples involved in each contrast (for example, only the Control and LU samples) and running DESeq separately three times, I obtain these results:

FOvsControl: 237 DEG
LUvsControl: 1992 DEG
LUvsFO: 672 DEG
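The separate analysis described here might look roughly like the following for one pair of groups. This is a sketch on simulated stand-in data with a hypothetical relabelling into three groups, not the commands actually used in the post. The key steps are subsetting the samples, dropping the unused factor level, and refitting:

```r
library(DESeq2)
set.seed(42)

# Simulated stand-in data (NOT the counts from this post): 1000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$Condition <- factor(rep(c("Control", "FO", "LU"), each = 4),
                        levels = c("Control", "FO", "LU"))
design(dds) <- ~ Condition

# Keep only the two groups being compared
keep <- dds$Condition %in% c("Control", "FO")
dds_pair <- dds[, keep]

# Drop the now-empty "LU" level so the design matrix stays full rank
dds_pair$Condition <- droplevels(dds_pair$Condition)

# Size factors and dispersions are now estimated from these 8 samples only
dds_pair <- DESeq(dds_pair, quiet = TRUE)
res_pair <- results(dds_pair, contrast = c("Condition", "FO", "Control"))

# Genes with adjusted p-value below 0.1 (default alpha; an assumption here)
sum(res_pair$padj < 0.1, na.rm = TRUE)
```

The same steps would be repeated for the other two pairs, each time with its own subset of samples.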

As can be seen, the results differ between the two approaches. The first thing that draws my attention is the large increase in DEG for FOvsControl (from 37 to 237), which could be due to removing the dispersion contributed by the LU samples when the DESeq function is run on all samples in the first approach. However, I do not know which of these two approaches is more appropriate for analysing my experiment. Could anyone help me?
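The dispersion hypothesis can be checked on simulated data. The sketch below assumes, purely for illustration, that LU is much more variable (true dispersion 0.5) than Control and FO (0.05). It compares the fitted dispersions with and without the LU samples. The group sizes, dispersion values and fitType = "mean" (chosen to keep the toy fit simple) are all assumptions, not properties of the data in this post:

```r
library(DESeq2)
set.seed(1)

n_genes <- 500
cond <- factor(rep(c("Control", "FO", "LU"), times = c(4, 4, 6)),
               levels = c("Control", "FO", "LU"))

# Assumed for illustration: LU samples are far noisier than the other groups
true_disp <- ifelse(cond == "LU", 0.5, 0.05)

# Gene-wise mean counts; no true expression differences between groups
mu <- 2^runif(n_genes, 6, 12)

# One column of negative binomial counts per sample
counts <- sapply(true_disp, function(d) rnbinom(n_genes, mu = mu, size = 1 / d))
dimnames(counts) <- list(paste0("gene", seq_len(n_genes)),
                         paste0("s", seq_along(cond)))
colData <- data.frame(Condition = cond, row.names = colnames(counts))

# Helper (hypothetical name): fit DESeq2 on a subset of the samples
fit_subset <- function(keep) {
  cd <- droplevels(colData[keep, , drop = FALSE])
  dds <- DESeqDataSetFromMatrix(counts[, keep], cd, design = ~ Condition)
  DESeq(dds, fitType = "mean", quiet = TRUE)
}

dds_all <- fit_subset(rep(TRUE, length(cond)))  # all three groups together
dds_sub <- fit_subset(cond != "LU")             # Control and FO only

med_all <- median(dispersions(dds_all), na.rm = TRUE)
med_sub <- median(dispersions(dds_sub), na.rm = TRUE)
c(all_samples = med_all, without_LU = med_sub)
```

DESeq2 estimates one dispersion per gene across all samples in the fit, so under these assumptions the noisy group raises the estimate that is also used for the FO versus Control comparison.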

 

@mikelove
Last seen 1 hour ago
United States

This is discussed in one of the Frequently Asked Questions (FAQ) in the DESeq2 vignette. Please check there for the explanation, and post a comment on this answer if you have more questions.


Thank you very much for your help. Sorry, I hadn't read that.
