Dear colleagues,
Please help me!
I am using DESeq2 to analyze gene expression in a batch of iPSCs and have observed some results that I can't explain.
The study design is as follows: 3 iPSCs were derived from a skin biopsy of patient A, who carries mutation X, and another 3 iPSCs from patient B, who also carries mutation X, giving 6 iPSCs in total, all supposedly with mutation X. In addition, 7 iPSCs derived from healthy individuals (5 individuals in total) serve as controls.
First, I ran DESeq2 on the 3 iPSCs derived from patient A (mutation X) versus all control iPSCs and got the following volcano plot: https://photos.app.goo.gl/2VpiBRhbDcw1Zy468
Second, I ran DESeq2 on the 3 iPSCs derived from patient B (mutation X) versus all control iPSCs and got the following volcano plot: https://photos.app.goo.gl/f9YD6CQquxWwQRkU7
Finally, I ran DESeq2 on all 6 iPSCs derived from patients A and B (mutation X) versus all control iPSCs and got the following volcano plot: https://photos.app.goo.gl/FdJbm9ReymU4Twrr8
I have two questions:
1) Why is the volcano plot skewed so strongly to the left when I analyze all 16 samples together, when this is not the case if I analyze the samples from a single patient versus controls?
2) Why are there many more statistically significant candidates when I analyze all 16 samples together than when I analyze the samples from a single patient versus controls?
Any answers or suggestions would be very helpful. Thank you very much for your time and patience.
Best regards, Weng-Tein Gi