DESeq2 results

I am a beginner in R. I ran a differential gene expression analysis and I have two questions:

1- Judging from the graphs, is the list of differentially expressed genes reliable enough for further analysis?

I am disappointed because the samples do not segregate in the heatmap, and PC1 and PC2 are low.

I noticed that sample S1 differs from the other samples, resulting in a low PC1, but it is also different from the 10 controls. Is there any way to handle it?

2- When I export the list of DEGs, some genes appear many times. For example, if the gene ID is df3t_00100, I get records as follows:

df3t_00100

df3t_00100.1

df3t_00100.2

df3t_00100.3

What are those, and how can I handle them?

@kevin

Regarding the PCA bi-plot, I see no major issue, assuming that you have generated this PCA bi-plot in an unbiased ('unsupervised') way using all genes. Can you share the code that you used? Your 2 groups (Control + Sample) are almost exclusively segregated along PC1. The sample at the bottom-right is behaving differently, but it is still not grouping with Control.

Then, in your second figure, generated with pheatmap(), it seems that, yes, your groups are segregated perfectly via hierarchical clustering, and the heatmap colour shade also indicates this.

Regarding the gene naming issue, which species is this? Can you confirm how the read count quantification was performed and with which reference GTF? Generally, to help, please explain your broader analysis pipeline so that we can begin to try to solve this.

Kevin

PCA code:

vsd <- vst(dds, blind = TRUE) # Variance-stabilizing transformation

plotPCA(vsd, intgroup = "C.S")
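For context, plotPCA() in DESeq2 does not use all genes by default: it keeps the `ntop` most variable rows (500 by default) and runs prcomp() on them. A minimal base-R sketch of that computation follows. The matrix is simulated and only stands in for assay(vsd); the group labels, sample names, and effect size are made up for illustration.

```r
# Base-R sketch of the computation behind a PCA plot of transformed counts.
# 'mat' is simulated; in a real analysis it would be assay(vsd).
set.seed(1)
n_genes <- 2000
n_samples <- 12
mat <- matrix(rnorm(n_genes * n_samples, mean = 8, sd = 1),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("S", seq_len(n_samples))))
group <- factor(rep(c("Control", "Sample"), each = n_samples / 2))

# Add an artificial group effect to the first 100 genes
mat[1:100, group == "Sample"] <- mat[1:100, group == "Sample"] + 3

# Keep the ntop most variable genes (plotPCA defaults to ntop = 500)
ntop <- 500
gene_var <- apply(mat, 1, var)
top <- order(gene_var, decreasing = TRUE)[seq_len(min(ntop, length(gene_var)))]

# PCA on samples (rows) by genes (columns), centred but not scaled
pca <- prcomp(t(mat[top, ]))

# Fraction of variance explained by each component
percent_var <- pca$sdev^2 / sum(pca$sdev^2)
round(100 * percent_var[1:2], 1)

# Scores for the first two components, ready for plotting
scores <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = group)
head(scores)
```

The percentages on the axes of the plotPCA() figure correspond to `percent_var` here, so they depend on which genes were selected as well as on the samples.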

2- The organism is Plasmodium falciparum.

design(dds) <- ~ C.S

dds <- DESeq(dds)

res <- results(dds)

summary(res)

library("org.Pf.plasmo.db")

geneinfo <- select(org.Pf.plasmo.db, keys=rownames(resSort)[1:20], columns=c("SYMBOL","GENENAME","GO"), keytype="SYMBOL")

geneinfo

geneinfo returns some repeated genes, and some with a decimal suffix.
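One possible explanation, which this thread does not confirm: requesting a one-to-many column such as GO from select() returns several rows per gene, and base R appends .1, .2, ... whenever repeated row names have to be made unique, for example when a data.frame is indexed by a vector of repeated IDs. A small base-R sketch with invented values (the tables `geneinfo` and `res_df` below are hypothetical stand-ins, not the poster's data):

```r
# Hypothetical annotation table: one gene mapped to several GO terms,
# as select() returns for one-to-many columns (values invented).
geneinfo <- data.frame(
  SYMBOL = c("df3t_00100", "df3t_00100", "df3t_00100", "df3t_00200"),
  GO     = c("GO:0000001", "GO:0000002", "GO:0000003", "GO:0000004"),
  stringsAsFactors = FALSE
)

# Hypothetical results table with one row per gene
res_df <- data.frame(
  log2FoldChange = c(1.5, -0.7),
  row.names = c("df3t_00100", "df3t_00200")
)

# Indexing by repeated IDs forces R to de-duplicate the row names
dup <- res_df[geneinfo$SYMBOL, , drop = FALSE]
rownames(dup)
# "df3t_00100" "df3t_00100.1" "df3t_00100.2" "df3t_00200"

# One way back to a single row per gene: collapse the GO terms first,
# then join on the gene ID
go_per_gene <- aggregate(GO ~ SYMBOL, data = geneinfo,
                         FUN = paste, collapse = ";")
merged <- merge(go_per_gene, res_df, by.x = "SYMBOL", by.y = "row.names")
merged
```

If this is what happened, the suffixed rows are copies of the same gene rather than separate features, and dropping the GO column from the select() call would be another way to avoid them.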

Thanks, you are evidently not following the typical DESeq2 analysis pipeline - you are missing the lfcShrink() stage. Please take a look at the Quick start.

Are you showing all of the output of geneinfo? There seem to be at least 2 columns missing.

Thank you,

Can you kindly tell me where I should use the lfcShrink() stage?

The geneinfo output is OK; it is just cut to show gene_id.

Hi, regarding lfcShrink, the information is in the Quick start (please see my other comment)
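As a rough sketch of where the step goes, following the order used in the DESeq2 vignette's Quick start: lfcShrink() is called after DESeq(), on the same dds object. The example below runs on simulated data from makeExampleDESeqDataSet(), not on the data in this thread, so the coefficient name condition_B_vs_A is specific to the example; with the design ~ C.S the name would be whatever resultsNames(dds) reports. type = "normal" is used here only because it needs no additional package; the Quick start uses type = "apeglm", which requires the apeglm package to be installed.

```r
library(DESeq2)

# Simulated data set shipped with DESeq2; a stand-in for the real dds.
# Its design is ~ condition, with levels A and B.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 8)

# Fit the model first
dds <- DESeq(dds)

# List the coefficients that are available for shrinkage
resultsNames(dds)

# Unshrunken results for the comparison of interest
res <- results(dds, name = "condition_B_vs_A")

# Shrink the log2 fold changes for the same coefficient
res_shrunk <- lfcShrink(dds, coef = "condition_B_vs_A", type = "normal")

summary(res_shrunk)
```

The shrunken log2 fold changes are the ones usually used for ranking genes and for plots such as plotMA().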