Hello,
I want to generate dummy sample data, but I want to base it on a DESeq2 result.
Suppose I have a gene, gene A, and I have done a DESeq2 analysis between cancer and normal. The result is that gene A has a log fold change of 2.5, i.e. it is upregulated in cancer, with a p-value of 0.01. The baseMean is around 35.
What I want to do is generate dummy data for each category, cancer and normal, that has a baseMean of 35 and that, when run back through the analysis, gives a log fold change of 2.5 with a p-value of 0.01.
Is it possible to do that?
I imagine something like this: because DESeq2 uses the negative binomial distribution, I just need to draw random samples from a negative binomial with a mean of 35 for normal and 37.5 for cancer (35 + 2.5 fold change). Is that possible? I don't know what to use for the variance of the distribution, though. I don't think it is written in the output of the DESeq2 DEG analysis.
Thank you.
Thank you! I thought the baseMean was on the log scale, so I added the LFC to it. I'm not familiar with the dispersion value. I need to check the paper once more, but is it similar to the variance?
Sorry yes, I got confused because you had 2.5 as the "fold change" in the last sentence.
The base mean is not in log scale.
It should be:
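A minimal sketch of the arithmetic, assuming equal numbers of normal and cancer samples and size factors of 1, so that the baseMean is simply the average of the two group means (the variable names here are only illustrative):

```r
base_mean <- 35   # baseMean from the DESeq2 result (not on the log scale)
lfc <- 2.5        # log2 fold change, cancer vs normal

# Assumption: equal group sizes, so
#   base_mean = (mu_normal + mu_cancer) / 2
#   mu_cancer = mu_normal * 2^lfc
mu_normal <- 2 * base_mean / (1 + 2^lfc)
mu_cancer <- mu_normal * 2^lfc

# Sanity checks: both should recover the inputs
mean(c(mu_normal, mu_cancer))   # average of the group means
log2(mu_cancer / mu_normal)     # log2 fold change
```

With unequal group sizes or size factors other than 1, the two means would need to be weighted accordingly.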
Dispersion is the second parameter of the negative binomial. It is not the variance itself, but it controls it: variance = mean + dispersion * mean^2. Yes, you should check the paper again.
In R, the dispersion and mean can be used like so:
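A minimal sketch using base R's `rnbinom()`, whose `size` argument is the reciprocal of the dispersion. The sample count, the dispersion value, and the normal-group mean below are made-up placeholders, not values from the analysis above; the fitted per-gene dispersion can be read from the DESeqDataSet with `dispersions(dds)`.

```r
set.seed(1)            # only so the draws are reproducible

n <- 10                # hypothetical number of samples per group
dispersion <- 0.1      # hypothetical; take the real value from dispersions(dds)
lfc <- 2.5             # log2 fold change, cancer vs normal

mu_normal <- 10.5                 # hypothetical mean count in normal
mu_cancer <- mu_normal * 2^lfc    # fold change is multiplicative, not additive

# rnbinom() with mu/size has variance mu + mu^2 / size,
# so size = 1 / dispersion matches the DESeq2 parameterization.
normal <- rnbinom(n, mu = mu_normal, size = 1 / dispersion)
cancer <- rnbinom(n, mu = mu_cancer, size = 1 / dispersion)
```

Because the counts are random draws, the log fold change and p-value estimated from any one simulated data set will scatter around the target values rather than match them exactly.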