Entering edit mode

Liu, XiaoChuan
▴
30

@liu-xiaochuan-5812
Last seen 8.7 years ago

Dear Dr. Simon Anders,
I have some questions on the multi-factor test in DESeq.
Here I have total 8 RNA-seq samples. There are two factors:
condition(Wildtype and Knockout), and treatment(Picrotoxin and
Control). And each has two biological replicates. Details I have
attached (see Table 1 in attachment).
Now I want to use the generalized linear models to do the ANOVA-like
analysis for these two factors and want to find the differentially
expressed genes caused by the two factors. The part 2 in attachment
are the different GLMs tests by DESeq. And the cut-off are chosen as
P-value<0.05 and FDR<0.05 separately. The number of significant
differentially expressed genes are counted by cut-offs. Details see
Part 2 in attachment.
I also pick up raw counts for all the genes, then I try to use the
function getVarianceStabilizedData() to transfer the raw counts to
variance stabilizing transformation (VST). After that, I directly use
the ANOVA function in R to calculate the P-value for each genes based
on the VST value. When I get all P-value for all genes, I also do the
p.adjust to calculate the FDR by method = "BH". Results see Part 3 in
attachment.
Here are my questions:
1. What is the meaning when I use ?count ~ 1?? Here 1 is a cut-
off? Or other meaning? I saw you give an example like this in DESeq
Reference Manual. So I try to follow using it. But I do not know the
meaning for test.
2. In Part 2, I did 6 different GLMs tests by DESeq. And I also
do the overlap amongst results. The overlap sometimes is very small.
Why are they so different? How to explain it? Could you give me some
comments?
3. In Part 3, I also do the overlap with 6 results in Part 2. But
the overlap are very small. I wonder if I make a mistake to misuse the
variance stabilizing transformation? If I want to directly use the
ANOVA function in R to calculate co-factor P-value, could I use the
raw count? Or How to normalize the raw counts then I can use ANOVA
function in R?
Thank you very much!
Looking forward to your response!
Best,
Leo
1. Table1
condition treatment
WP1 Wildtype Picrotoxin
WC1 Wildtype Control
KP1 Knockout Picrotoxin
KC1 Knockout Control
WP2 Wildtype Picrotoxin
WC2 Wildtype Control
KP2 Knockout Picrotoxin
KC2 Knockout Control
2. Multi-factor test
fit1 = fitNbinomGLMs( cds, count ~ treatment + condition )
fit0 = fitNbinomGLMs( cds, count ~ treatment )
pvalsGLM1 = nbinomGLMTest( fit1, fit0 )
padjGLM1 = p.adjust( pvalsGLM1, method="BH" )
Dataset1
P-value<0.05: 615
FDR<0.05: 20
fit3 = fitNbinomGLMs( cds, count ~ condition + treatment )
fit2 = fitNbinomGLMs( cds, count ~ condition )
pvalsGLM2 = nbinomGLMTest( fit3, fit2 )
padjGLM2 = p.adjust( pvalsGLM2, method="BH" )
Dataset2
P-value<0.05: 1959
FDR<0.05: 870
fit1 = fitNbinomGLMs( cds, count ~ treatment + condition )
fit4 = fitNbinomGLMs( cds, count ~ 1 )
pvalsGLM3 = nbinomGLMTest( fit1, fit4 )
padjGLM3 = p.adjust( pvalsGLM3, method="BH" )
Dataset3
P-value<0.05: 1767
FDR<0.05: 778
fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit0 = fitNbinomGLMs( cds, count ~ treatment )
pvalsGLM4 = nbinomGLMTest( fit5, fit0 )
padjGLM4 = p.adjust( pvalsGLM4, method="BH" )
Dataset4
P-value<0.05: 393
FDR<0.05: 26
fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit2 = fitNbinomGLMs( cds, count ~ condition )
pvalsGLM5 = nbinomGLMTest( fit5, fit2 )
padjGLM5 = p.adjust( pvalsGLM5, method="BH" )
Dataset5
P-value<0.05: 1450
FDR<0.05: 665
fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit4 = fitNbinomGLMs( cds, count ~ 1 )
pvalsGLM6 = nbinomGLMTest( fit5, fit4 )
padjGLM6 = p.adjust( pvalsGLM6, method="BH" )
Dataset6
P-value<0.05: 1465
FDR<0.05: 657
Overlap(P-value<0.05)
Dataset1
Dataset2
Dataset3
Dataset4
Dataset5
Dataset6
Dataset1
Dataset2
240
Dataset3
455
1511
Dataset4
356
178
342
Dataset5
220
1423
1378
184
Dataset6
396
1282
1432
328
1247
Overlap(FDR<0.05)
Dataset1
Dataset2
Dataset3
Dataset4
Dataset5
Dataset6
Dataset1
Dataset2
10
Dataset3
20
720
Dataset4
18
15
26
Dataset5
9
658
652
15
Dataset6
20
618
653
26
602
3. Two way ANOVA After VST?
P-value<0.05: 671
FDR<0.05: 0
Overlap(P-value<0.05)
Dataset1
Dataset2
Dataset3
Dataset4
Dataset5
Dataset6
Two way ANOVA After VST
35
77
68
34
69
69
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Anova.pdf
Type: application/pdf
Size: 106447 bytes
Desc: Anova.pdf
URL: <https: stat.ethz.ch="" pipermail="" bioconductor="" attachments="" 20130308="" e5c68473="" attachment.pdf="">