Which test fits best here?
Fereshteh ▴ 30
@fereshteh-15803
United Kingdom

I have two independent matrices, each with patients in rows and oncogenic signalling pathways in columns:

one for responders to a drug

one for non-responders to the same drug

If a patient has a mutation in pathway X, we code that as 1, otherwise 0.

I want to know whether pathway X is significantly altered between the two groups.

I have tried three things:

wilcox.test(group1$pathwayX, group2$pathwayX)
t.test(group1$pathwayX, group2$pathwayX)
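# 2 x 2 table: mutated vs. non-mutated sample counts in each group
fisher.test(x = matrix(
  c(
    group1_pathwayX_mutated_samples,
    group1_sample_size - group1_pathwayX_mutated_samples,
    group2_pathwayX_mutated_samples,
    group2_sample_size - group2_pathwayX_mutated_samples
  ),
  nrow = 2
))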

Basically, I have one Boolean matrix for each group.

I am not sure which statistical test I can use to say which pathways are significantly altered between the two groups.

Any help? Thanks

My matrices look like this:

    > head(group1)
                           patients BER CPF CR CS FA HR MMR NER NHEJ OD p53 TLS TM UR DR AM
1 2SKsnsuD9my3Mona.vep.txt_1   0   0  0  0  0  0   0   0    0  0   1   0  0  0  0  0
2 4Pyv3CFxV1xnub78.vep.txt_1   0   0  0  0  0  0   0   0    0  0   0   0  1  0  0  0
3 8X6mBq2k2pJ07trv.vep.txt_1   0   1  0  0  1  1   0   0    0  0   0   0  0  0  0  0
4 aoZMTHJebqIv4XPB.vep.txt_1   0   0  1  0  1  1   0   0    0  0   0   0  1  0  0  0
5 eI178OJnaJgJiChV.vep.txt_1   1   0  0  0  0  0   0   1    1  0   0   0  0  0  0  0
6 iwyHwDFnhwBqHpiY.vep.txt_1   0   0  0  0  1  0   0   0    0  0   1   0  0  0  0  0
> 

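library(caret)     # provides createDataPartition()
library(magrittr)  # provides the %>% pipe

set.seed(123)
# `data` is assumed to hold both groups plus a Response column
# 80/20 split, stratified on Response
training.samples <- data$Response %>%
  createDataPartition(p = 0.8, list = FALSE)
train.data <- data[training.samples, ]
test.data <- data[-training.samples, ]

# Logistic regression of Response on all other columns
model <- glm(Response ~ ., data = train.data, family = binomial)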
@kevin
Republic of Ireland

Hi again,

I would try binary logistic regression, with all variables encoded as binary factors. In my view, neither a Wilcoxon test nor a Student's t-test is appropriate here, because the data take only the values 0 and 1.
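As a rough illustration of the logistic regression idea, here is a minimal sketch. The toy data frame, its size, and the two pathway columns are made up for the example; only the `Response` column name and the `glm()` call follow the question. With real data, the patient ID column would need to be dropped before fitting.

```r
# Hypothetical toy data: 20 responders and 20 non-responders,
# with 0/1 mutation status for two pathways
data <- data.frame(
  Response = factor(rep(c("responder", "non_responder"), each = 20)),
  BER = c(rep(c(1, 0), times = c(6, 14)), rep(c(1, 0), times = c(12, 8))),
  p53 = rep(c(1, 0), times = 20)
)

# Encode every pathway column as a binary factor with levels 0 and 1
pathway_cols <- setdiff(names(data), "Response")
data[pathway_cols] <- lapply(data[pathway_cols], factor, levels = c(0, 1))

# Binary logistic regression; glm() models the probability of the second
# factor level of Response (here "responder", by alphabetical order)
model <- glm(Response ~ ., data = data, family = binomial)

# One row per pathway: estimate, standard error, z value, and p-value
coefs <- summary(model)$coefficients
print(coefs)
```

Each pathway row in the coefficient table gives the log odds ratio for mutated versus non-mutated, adjusted for the other pathways in the model.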

Please also keep in mind that your question does not relate to any Bioconductor package.

Kevin

Edit: if you tabulate the data into counts (How many have the 0 condition? vs How many have the 1 condition?), then you could justify a Fisher's or Chi-squared test here, in my opinion. You may have to think about what, exactly, you want to compare. To me, it seems like there will be many pairwise comparisons here.
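The tabulation approach could be sketched roughly as follows, assuming two 0/1 data frames named `group1` and `group2` as in the question. The toy values below are invented for illustration, and the Benjamini-Hochberg adjustment is just one possible way to handle the many comparisons.

```r
# Hypothetical toy data: 1 = pathway mutated, 0 = not mutated
group1 <- data.frame(BER = c(0, 0, 1, 0, 1, 0), p53 = c(1, 0, 0, 0, 0, 1))  # responders
group2 <- data.frame(BER = c(1, 1, 0, 1, 1, 1), p53 = c(0, 1, 0, 0, 1, 0))  # non-responders

pathways <- intersect(names(group1), names(group2))

# One Fisher's exact test per pathway on the 2 x 2 table of counts
p_values <- sapply(pathways, function(pw) {
  counts <- matrix(
    c(sum(group1[[pw]] == 1), sum(group1[[pw]] == 0),
      sum(group2[[pw]] == 1), sum(group2[[pw]] == 0)),
    nrow = 2,
    dimnames = list(status = c("mutated", "not_mutated"),
                    group = c("responders", "non_responders"))
  )
  fisher.test(counts)$p.value
})

# Adjust for testing many pathways (Benjamini-Hochberg, as one option)
results <- data.frame(
  pathway = pathways,
  p_value = p_values,
  p_adjusted = p.adjust(p_values, method = "BH")
)
print(results)
```

With the real matrices, the `patients` column is not a pathway and would need to be excluded from `pathways` before running the loop.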
