Question: Uncertain about PCA plot for DESeq2 analysis
pepere40 wrote, 13 days ago:

Hi, I have an RNA-seq dataset from patients treated with a specific drug (the sequencing data are of very good quality). The samples fall into 3 groups: healthy controls, good responders and bad responders. Each patient was sampled both pre-treatment and post-treatment.

After aligning the reads and counting them, I looked at the PCA plot and noticed that the groups do not separate well.

A standard DESeq2 analysis for differential expression between the groups yields no results, which is very strange given that we are also comparing healthy and sick people. What could have gone wrong?

thanks

Tags: bioconductor, deseq2, pca, rnaseq
Answer: Uncertain about PCA plot for DESeq2 analysis
Michael Love wrote, 13 days ago:

The most plausible explanation is that there isn’t the signal you expect in this dataset, and that’s a question to bring back to the team.

How did you perform the DE analysis? Please post your code. Did you control for patient baseline? There is an example of how to do this in the vignette.


Thanks for the quick reply.

My main concern was not seeing differences with the healthy controls. We did a similar experiment in the past with another drug, and differences were evident in both the PCA and the DE results.

I performed the DE analysis using this coldata:

Sample  Patient  Responder      Time   Control
m01     15       BadResponder   Basal  No
m02     14       BadResponder   Basal  No
m03     11       BadResponder   Basal  No
m04     6        GoodResponder  Basal  No
m05     7        GoodResponder  Basal  No
m06     8        GoodResponder  Basal  No
m07     9        GoodResponder  Basal  No
m08     10       GoodResponder  Basal  No
m09     13       BadResponder   Basal  No
m10     12       BadResponder   Basal  No
m11     6        GoodResponder  1year  No
m12     14       BadResponder   1year  No
m13     7        GoodResponder  1year  No
m14     13       BadResponder   1year  No
m15     10       GoodResponder  1year  No
m16     11       BadResponder   1year  No
m17     8        GoodResponder  1year  No
m18     15       BadResponder   1year  No
m19     9        GoodResponder  1year  No
m20     12       BadResponder   1year  No
m21     1        Control        None   Yes
m22     2        Control        None   Yes
m23     3        Control        None   Yes
m24     4        Control        None   Yes
m25     5        Control        None   Yes

and then simply ran the following (the cts variable contains the count data for each gene):

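library(DESeq2)
# Build the dataset; this design compares controls (Yes) with all patient samples (No)
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Control)
# Drop genes with at most one read in total across all samples
dds <- dds[ rowSums(counts(dds)) > 1, ]
dds <- DESeq(dds)
# With no contrast given, results() reports the last design variable (here Control)
res <- results(dds)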

Almost no genes were found to be DE, which is very strange.

I did not control for patient baseline; I will look into that.
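
Controlling for patient baseline, as suggested in the answer above, could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the 20 patient samples from the posted coldata (the controls have one sample each and cannot be paired), the helper column `ind.n` is a hypothetical name, and the formula follows the nested-individuals pattern rather than anything confirmed for this dataset. The code uses base R only and checks that the resulting model matrix is full rank.

```r
# Patient samples only (m01 to m20), transcribed from the coldata above.
patients <- data.frame(
  Sample  = sprintf("m%02d", 1:20),
  Patient = c(15, 14, 11, 6, 7, 8, 9, 10, 13, 12,
              6, 14, 7, 13, 10, 11, 8, 15, 9, 12),
  Time    = factor(rep(c("Basal", "1year"), each = 10),
                   levels = c("Basal", "1year"))
)

# In the posted table, patients 6-10 are good responders and 11-15 are bad responders.
patients$Responder <- factor(
  ifelse(patients$Patient <= 10, "GoodResponder", "BadResponder"),
  levels = c("GoodResponder", "BadResponder")
)

# Hypothetical helper column: number the patients 1..5 within each responder
# group, so that patient baselines can be nested within Responder.
patients$ind.n <- factor(ifelse(patients$Patient <= 10,
                                patients$Patient - 5,
                                patients$Patient - 10))

# Candidate design: group effect, patient baseline within group, and a
# treatment (Time) effect estimated separately for each responder group.
design <- ~ Responder + Responder:ind.n + Responder:Time
mm <- model.matrix(design, data = patients)

# The design is only usable if the model matrix is full rank.
full_rank <- qr(mm)$rank == ncol(mm)
print(dim(mm))
print(full_rank)
```

If the columns of cts are named by sample (an assumption), the formula could be passed as the `design` argument of `DESeqDataSetFromMatrix` together with `cts[, patients$Sample]`. The DESeq2 vignette covers this setup in its section on group-specific condition effects with individuals nested within groups. If only the overall treatment effect is of interest, a simpler paired design of the form `~ Patient + Time`, with Patient as a factor, may be enough.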

 

Reply written 13 days ago by pepere40

As Michael implied, the differences that you expect may simply not exist in this dataset. Also, you should not draw any major conclusions about your data from the PCA bi-plot alone. In most cases, major differences between healthy controls and other samples will simply not be revealed by PCA. What you can at least say, looking at your plot, is that your dataset does not contain outliers.

I also note that your dataset is imbalanced, with only 5 control samples versus 20 non-control samples.
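
The group sizes can be made explicit by tabulating the posted sample table. The sketch below is a base R illustration, not the poster's analysis: it transcribes the Responder and Time columns from the thread and builds a combined factor (the name `Group` is hypothetical) with the healthy controls as the reference level.

```r
# Sample table transcribed from the coldata posted in this thread.
coldata <- data.frame(
  Sample    = sprintf("m%02d", 1:25),
  Responder = c(rep("BadResponder", 3), rep("GoodResponder", 5),
                rep("BadResponder", 2),
                rep(c("GoodResponder", "BadResponder"), 5),
                rep("Control", 5)),
  Time      = c(rep("Basal", 10), rep("1year", 10), rep("None", 5))
)

# Hypothetical combined factor: one level per responder group and time point.
coldata$Group <- factor(paste(coldata$Responder, coldata$Time, sep = "_"))

# Make the healthy controls the reference level for later comparisons.
coldata$Group <- relevel(coldata$Group, ref = "Control_None")

# Samples per group, and controls versus all other samples.
group_sizes <- table(coldata$Group)
print(group_sizes)
print(table(coldata$Responder == "Control"))

# With DESeq2, this factor could be used as design = ~ Group, and a single
# comparison requested with, for example,
#   results(dds, contrast = c("Group", "BadResponder_Basal", "Control_None"))
```

A design such as `~ Control` pools pre-treatment and post-treatment samples from the same patients into one "non-control" group. A combined factor allows the controls to be compared with baseline samples only, although it does not by itself account for the pairing between a patient's two samples.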

Reply written 13 days ago by Kevin Blighe