Fisher test in RNA-seq
g.k • 0
@gk-13275

Hello,

I would like to run a Fisher test on RNA-seq data for several genes, measured in control and treated samples.


To run fisher.test I need a contingency table for each gene. Is there a way to do this in R without building a contingency table for each gene by hand?
I am new to this, so any advice would be helpful.

I have the count data and the sample info data:

                    control1 treated1 control2 treat2 control3 treat3
    ENSG00000000003        723        486        904        445       1170       1097 
    ENSG00000000005          0          0          0          0          0          0
    ENSG00000000419        467        523        616        371        582        781

where the ENSG rows are the genes.

Thank you

rna-seq rnaseq r • 22k views

Hello, it's not clear from your question what you are testing with fisher.test. Are you trying to test enrichment of a transcript in one condition versus another? If so, it would be better to use a dedicated RNA-seq package such as DESeq2 rather than fisher.test.

I have used fisher.test to test enrichment of candidate gene sets compared to the reference for things such as GO terms or similar classification terms. If you want some advice on setting that up let me know.
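As a rough sketch of that kind of set-up, here is a gene-set enrichment test with fisher.test in base R. All four numbers are hypothetical and only there to show how the 2x2 table is laid out; the sketch also assumes the candidate genes are a subset of the reference set.

```r
# Hypothetical numbers, for illustration only
n_reference  <- 20000  # genes in the reference (background) set
n_term       <- 400    # reference genes annotated with the GO term
n_candidates <- 150    # genes in the candidate set (subset of the reference)
n_overlap    <- 12     # candidate genes annotated with the term

# Rows: candidate vs. all other reference genes
# Columns: annotated with the term vs. not annotated
tab <- matrix(
  c(n_overlap,          n_candidates - n_overlap,
    n_term - n_overlap, n_reference - n_candidates - (n_term - n_overlap)),
  nrow = 2, byrow = TRUE,
  dimnames = list(set  = c("candidate", "other"),
                  term = c("annotated", "not_annotated"))
)

# One-sided test: is the term over-represented among the candidates?
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

With many terms you would repeat this per term and then adjust the p-values, for example with p.adjust(..., method = "BH").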


I'll echo what Anna said, but with more conviction: you absolutely should not use fisher.test for this. Use edgeR, limma/voom, or DESeq2.

@gordon-smyth
WEHI, Melbourne, Australia

The function binomTest in the edgeR package does the Fisher test you suggest. If counts is your matrix, then

    library(edgeR)
    Control <- rowSums( counts[,c(1,3,5)] )
    Treated <- rowSums( counts[,c(2,4,6)] )
    out <- binomTest(Control, Treated)  # vector of p-values, one per gene

does all the Fisher tests. However, as Steve and Anna have commented, we strongly advise against this because it ignores biological variation and will drastically over-estimate the significance of any differences found.
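
Purely to show how the per-gene contingency tables can be built without writing each one by hand, here is a minimal base R sketch using the three genes from the question. The 2x2 layout (counts for the gene versus counts for everything else, in pooled control versus pooled treated) is an assumption about what the test is meant to compare, and with real data the totals would be taken over all genes rather than these three rows. The caveat above about ignoring biological variation applies to this sketch just the same.

```r
# The three example genes from the question
counts <- matrix(
  c(723, 486, 904, 445, 1170, 1097,
      0,   0,   0,   0,    0,    0,
    467, 523, 616, 371,  582,  781),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
    c("control1", "treated1", "control2", "treat2", "control3", "treat3"))
)

# Pool the replicates within each condition
control <- rowSums(counts[, c(1, 3, 5)])
treated <- rowSums(counts[, c(2, 4, 6)])

# Pooled totals per condition (here over 3 genes only; use all genes in practice)
n_control <- sum(control)
n_treated <- sum(treated)

# One 2x2 table per gene: this gene vs. all other counts, control vs. treated
fisher_p <- mapply(function(x, y) {
  tab <- matrix(c(x, n_control - x,
                  y, n_treated - y), nrow = 2)
  fisher.test(tab)$p.value
}, control, treated)

fisher_p
```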

It would seem from a casual look at the data you give that you actually have paired data whereby each treated sample is paired with a control sample. You should use limma, edgeR or DESeq2 to undertake a paired analysis with proper estimation of replicate to replicate variability.
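
A minimal sketch of what such a paired analysis could look like in edgeR is given below. It assumes that samples sharing a number (control1/treated1, and so on) form a pair, which is only a guess from the column names, and the helper name paired_edgeR is made up for this example. The design matrix part is base R; the function body uses the usual edgeR quasi-likelihood steps and needs the edgeR package installed when it is called.

```r
# Sample layout guessed from the column names in the question
samples <- c("control1", "treated1", "control2", "treat2", "control3", "treat3")
pair    <- factor(c(1, 1, 2, 2, 3, 3))               # assumed pairing
group   <- factor(rep(c("control", "treated"), times = 3),
                  levels = c("control", "treated"))

# Additive model: pair as a blocking factor, treatment effect adjusted for pair
design <- model.matrix(~ pair + group)
rownames(design) <- samples

# Hypothetical helper wrapping a standard edgeR quasi-likelihood workflow
paired_edgeR <- function(counts, design) {
  y    <- edgeR::DGEList(counts = counts)
  keep <- edgeR::filterByExpr(y, design)             # drop very low-count genes
  y    <- y[keep, , keep.lib.sizes = FALSE]
  y    <- edgeR::calcNormFactors(y)                  # TMM normalization
  y    <- edgeR::estimateDisp(y, design)             # replicate-to-replicate variability
  fit  <- edgeR::glmQLFit(y, design)
  res  <- edgeR::glmQLFTest(fit, coef = "grouptreated")
  edgeR::topTags(res, n = Inf)
}
```

It would be called on the full count matrix, for example paired_edgeR(counts, design), with the columns of counts in the same order as the rows of design.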
