Which test fits best here?
AZ ▴ 30
@fereshteh-15803
United Kingdom

I have two independent matrices, each with patients in rows and oncogenic signalling pathways in columns:

one for responders to a drug, and

one for non-responders to the same drug.

If a patient has a mutation in pathway X, the entry is 1; otherwise it is 0.

I want to know whether pathway X is significantly altered between the two groups.

I have tried three things:

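wilcox.test(group1$pathwayX, group2$pathwayX)
t.test(group1$pathwayX, group2$pathwayX)
# 2x2 table of mutated vs. non-mutated counts per group
# (group1_pathwayX_mutated and group2_pathwayX_mutated are placeholder counts)
fisher.test(x = matrix(
  c(
    group1_pathwayX_mutated,
    group1_sample_size - group1_pathwayX_mutated,
    group2_pathwayX_mutated,
    group2_sample_size - group2_pathwayX_mutated
  ),
  nrow = 2
))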

Basically, I have one Boolean matrix per group.

I am not sure which statistical test would let me say which pathways are significantly altered between the two groups.

Any help? Thanks

My matrices look like this:

    > head(group1)
                        patients BER CPF CR CS FA HR MMR NER NHEJ OD p53 TLS TM UR DR AM
    1 2SKsnsuD9my3Mona.vep.txt_1   0   0  0  0  0  0   0   0    0  0   1   0  0  0  0  0
    2 4Pyv3CFxV1xnub78.vep.txt_1   0   0  0  0  0  0   0   0    0  0   0   0  1  0  0  0
    3 8X6mBq2k2pJ07trv.vep.txt_1   0   1  0  0  1  1   0   0    0  0   0   0  0  0  0  0
    4 aoZMTHJebqIv4XPB.vep.txt_1   0   0  1  0  1  1   0   0    0  0   0   0  1  0  0  0
    5 eI178OJnaJgJiChV.vep.txt_1   1   0  0  0  0  0   0   1    1  0   0   0  0  0  0  0
    6 iwyHwDFnhwBqHpiY.vep.txt_1   0   0  0  0  1  0   0   0    0  0   1   0  0  0  0  0

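library(caret)     # createDataPartition()
library(magrittr)  # %>% pipe

# Split the data 80/20 into training and test sets, stratified on Response
set.seed(123)
training.samples <- data$Response %>%
  createDataPartition(p = 0.8, list = FALSE)
train.data <- data[training.samples, ]
test.data  <- data[-training.samples, ]

# Logistic regression of Response on all other columns
model <- glm(Response ~ ., data = train.data, family = binomial)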

@kevin
Republic of Ireland

Hi again,

I would try binary logistic regression, with all variables encoded as binary factors. In my view, neither a Wilcoxon test nor a Student's t-test is appropriate here, because the data consist only of 0s and 1s.
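As a rough illustration of the logistic-regression suggestion, here is a minimal sketch on simulated data, since the real matrices are only shown in part. The data frame `dat`, the `response` column, the group sizes, and the mutation rates are all assumptions made up for this example; only the pathway names `p53` and `HR` come from the question.

```r
# Simulated stand-in for the two groups (illustrative values, not real data)
set.seed(123)
n <- 100
dat <- data.frame(
  response = factor(rep(c("non_responder", "responder"), each = n / 2)),
  p53 = rbinom(n, 1, 0.3),
  HR  = rbinom(n, 1, 0.2)
)

# Encode each pathway as a binary factor (0 = not mutated, 1 = mutated)
dat$p53 <- factor(dat$p53, levels = c(0, 1))
dat$HR  <- factor(dat$HR,  levels = c(0, 1))

# Binary logistic regression: response status modelled on pathway status.
# With a factor response, glm() treats the first level as the reference,
# so this models the probability of being a responder.
fit <- glm(response ~ p53 + HR, data = dat, family = binomial)

# Coefficient table: estimate, standard error, z value and p-value per term
coefs <- summary(fit)$coefficients
print(coefs)

# Odds ratios for each term
print(exp(coef(fit)))
```

With the real data, the two group matrices would first be stacked into one data frame with a column marking responder status, and the formula could list whichever pathways are of interest.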

Please also keep in mind that your question does not relate to any Bioconductor package.

Kevin

Edit: if you tabulate the data into counts (how many patients have the 0 condition versus how many have the 1 condition, in each group), then you could justify a Fisher's exact or chi-squared test here, in my opinion. You may have to think about exactly what you want to compare. To me, it seems like there will be many pairwise comparisons here.
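The tabulation idea could be sketched roughly as follows: one 2x2 table of counts per pathway, then a Fisher's exact test on each. The matrices here are simulated, and the group sizes and mutation rates are invented for illustration. Adjusting the p-values with Benjamini-Hochberg is one possible way to handle the many comparisons, not something prescribed in this thread.

```r
# Simulated 0/1 matrices standing in for the two groups (not real data)
set.seed(123)
pathways <- c("BER", "HR", "p53")
group1 <- as.data.frame(matrix(rbinom(40 * 3, 1, 0.4), ncol = 3,
                               dimnames = list(NULL, pathways)))
group2 <- as.data.frame(matrix(rbinom(35 * 3, 1, 0.2), ncol = 3,
                               dimnames = list(NULL, pathways)))

# One Fisher's exact test per pathway, on a 2x2 table of counts
p_values <- sapply(pathways, function(pw) {
  tab <- matrix(
    c(sum(group1[[pw]] == 1), sum(group1[[pw]] == 0),
      sum(group2[[pw]] == 1), sum(group2[[pw]] == 0)),
    nrow = 2,
    dimnames = list(status = c("mutated", "not_mutated"),
                    group  = c("group1", "group2"))
  )
  fisher.test(tab)$p.value
})

# Adjust for testing several pathways (Benjamini-Hochberg, an assumed choice)
p_adjusted <- p.adjust(p_values, method = "BH")
print(data.frame(p_value = p_values, p_adjusted = p_adjusted))
```

Each table has mutation status in rows and group in columns, so every patient is counted exactly once per pathway.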