DESeq2 results
MOHAMMAD • 0

I am a beginner in R. I ran a differential gene expression analysis and have two questions:

1- Based on the graphs, is the list of differentially expressed genes reliable enough for further analysis?

I am disappointed because the samples do not segregate in the heatmap, and the variance explained by PC1 and PC2 is low.

[Four images, including the PCA plot and heatmap, not preserved]

I noticed that sample S1 differs from the other samples, which lowers PC1, but it is also different from the 10 controls. Is there any way to handle it?

2- When I export the list of DEGs, some genes appear many times. For example, if the gene ID is df3t_00100, I get records such as the following:

df3t_00100

df3t_00100.1

df3t_00100.2

df3t_00100.3

What are those, and how can I handle them?

Thank you in advance!

heatmaps DEGs PCA • 1.3k views
Kevin Blighe ★ 3.9k

Hi Mohammad,

Regarding the PCA bi-plot, I see no major issue, assuming that you have generated this PCA bi-plot in an unbiased ('unsupervised') way using all genes. Can you share the code that you used? Your 2 groups (Control + Sample) are almost exclusively segregated along PC1. The sample at the bottom-right is behaving differently, but it is still not grouping with Control.

Then, in your second figure generated with pheatmap(), it seems that —yes— your groups are segregated perfectly via hierarchical clustering, and the heatmap colour shade also indicates this.

Regarding the gene naming issue, which species is this? Can you confirm how the read count quantification was performed and with which reference GTF? Generally, to help, please explain your broader analysis pipeline so that we can begin to try to solve this.

Kevin


PCA code:

vsd <- vst(dds, blind = TRUE) # Variance-stabilizing transformation

plotPCA(vsd, intgroup = "C.S")
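On the "all genes" point: by default, plotPCA() in DESeq2 uses only the 500 most variable genes (its ntop argument), so the percentages on the axes depend on that selection. The sketch below uses a simulated matrix as a stand-in for assay(vsd) (the data, the group shift, and the helper name pct_var are all made up for illustration) to show how the percent variance per PC can be computed with all genes versus the top 500 by variance.

```r
# Simulated stand-in for assay(vsd): 1000 genes x 12 samples (hypothetical data)
set.seed(1)
mat <- matrix(rnorm(1000 * 12), nrow = 1000,
              dimnames = list(paste0("gene", 1:1000), paste0("S", 1:12)))
# Shift the first 100 genes in samples S1..S6 to mimic a group effect
mat[1:100, 1:6] <- mat[1:100, 1:6] + 3

# Hypothetical helper: percent variance explained by each PC (samples as rows)
pct_var <- function(m) {
  pca <- prcomp(t(m))
  100 * pca$sdev^2 / sum(pca$sdev^2)
}

# PCA on all genes
pv_all <- pct_var(mat)

# PCA on the 500 genes with the highest variance across samples
rv  <- apply(mat, 1, var)
top <- order(rv, decreasing = TRUE)[1:500]
pv_top <- pct_var(mat[top, ])

round(pv_all[1:2], 1)  # percent variance on PC1 and PC2, all genes
round(pv_top[1:2], 1)  # percent variance on PC1 and PC2, top 500 genes
```

With a real object, mat would be replaced by assay(vsd); passing a larger ntop to plotPCA() is another way to include more genes.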

2- The organism is Plasmodium falciparum

design(dds) <- ~ C.S

dds <- DESeq(dds)

res <- results(dds)

summary(res)

resSort <- res[order(res$padj),]

library("org.Pf.plasmo.db")

geneinfo <- select(org.Pf.plasmo.db, keys=rownames(resSort)[1:20], columns=c("SYMBOL","GENENAME","GO"), keytype="SYMBOL")

geneinfo

geneinfo returns some repeated genes and some with a decimal suffix:

[Image of the geneinfo output not preserved]
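One possible explanation, which the thread does not confirm: requesting a column such as GO from select() returns one row per gene and GO term, so a gene with several GO terms is repeated. If those repeated IDs are later forced to be unique (for example as row names), R appends .1, .2 and so on, which is the same suffixing that make.unique() produces. The sketch below reproduces this in base R with a made-up annotation table (the GO values are placeholders) and shows one way to collapse it to a single row per gene.

```r
# Hypothetical one-to-many annotation table, mimicking select() output when
# a gene has several GO terms (the GO values here are placeholders)
annot <- data.frame(
  SYMBOL = c("df3t_00100", "df3t_00100", "df3t_00100", "df3t_00200"),
  GO     = c("GO:0000001", "GO:0000002", "GO:0000003", "GO:0000004"),
  stringsAsFactors = FALSE
)

# Forcing repeated IDs to be unique appends .1, .2, ... to the repeats
unique_ids <- make.unique(annot$SYMBOL)
unique_ids

# Collapse to one row per gene, joining its GO terms into a single string
collapsed <- aggregate(GO ~ SYMBOL, data = annot,
                       FUN = function(x) paste(unique(x), collapse = ";"))
collapsed
```

Whether to collapse the terms or simply drop the GO column from the select() call depends on what the exported table is needed for.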


Thanks, you are evidently not following the typical DESeq2 analysis pipeline - you are missing the lfcShrink() stage. Please take a look at the Quick start.

Are you showing all of the output of geneinfo? There seem to be at least 2 columns missing.


Thank you,

Could you kindly tell me where I should use the lfcShrink() stage?

The geneinfo output is OK; it is just cropped to show the gene_id.


Hi, regarding lfcShrink(), the information is in the Quick start (please see my other comment).
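For readers following along, here is a minimal sketch of where that step sits, modelled on the order used in the DESeq2 Quick start: lfcShrink() is called after DESeq(), on a coefficient listed by resultsNames(). It runs on a simulated dataset from makeExampleDESeqDataSet(), not on the poster's data, so the coefficient name "condition_B_vs_A" belongs to that example only; for a design of ~ C.S the name would have to be read from resultsNames(dds). It also assumes the apeglm package is installed.

```r
library(DESeq2)

# Simulated dataset standing in for the poster's dds (not their data)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds <- DESeq(dds)

# List the coefficient names that can be passed to lfcShrink()
resultsNames(dds)

# Unshrunken results, as in the original code
res <- results(dds)

# Shrunken log2 fold changes; type = "apeglm" needs the apeglm package
resLFC <- lfcShrink(dds, coef = "condition_B_vs_A", type = "apeglm")

# Sort by adjusted p-value, as done with resSort in the thread
resLFC <- resLFC[order(resLFC$padj), ]
head(resLFC)
```

The shrunken fold changes are mainly useful for ranking and plotting; the p-values and adjusted p-values are carried over from the unshrunken test.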
