Question: edgeR or wilcoxon rank test? Which is right?
Asked 8 months ago by jivarajivaraj:

Hi,

I have histopathologic response data for neoadjuvant chemoradiation in 56 cancer samples. A total of 26 samples were classified as minor and 30 as major histopathologic responders (TRG1-2 and TRG4-5, respectively). I ran both edgeR and a Wilcoxon test to find the genes that differ between tumor samples from patients with major and minor response, as shown below.

group <- as.factor(c(rep("TRG1-2", 26), rep("TRG4-5", 30)))


> group
[1] TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2
[17] TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG1-2 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5
[33] TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5
[49] TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5 TRG4-5
Levels: TRG1-2 TRG4-5
> dim(df)
[1] 2560   56
> y <- DGEList(counts = df, group = group)
> y <- estimateDisp(y) 
Design matrix not provided. Switch to the classic mode.
> sqrt(y$common.dispersion)
[1] 0.6280918
> EdgeR <- exactTest(y) 
> topTags(EdgeR)
Comparison of groups:  TRG4-5-TRG1-2 
           logFC   logCPM       PValue          FDR
PPBP  -4.3340878 9.503884 3.564802e-11 9.125894e-08
CDK6  -1.5518198 8.712466 1.458599e-07 1.867006e-04
IL1B   1.7324695 9.178351 2.623373e-05 1.908504e-02
CXCL8  1.6455933 8.340310 3.129262e-05 1.908504e-02
EGR1   0.8468036 8.652308 4.432857e-05 1.908504e-02
IFIT2  0.8957873 7.535228 5.199642e-05 1.908504e-02
IL6    1.3926323 6.951407 5.218565e-05 1.908504e-02
BDNF   1.4176689 6.605966 7.471018e-05 2.134076e-02
PTGS2  1.4746062 8.352272 7.547266e-05 2.134076e-02
FOS    0.9891503 9.263358 8.336234e-05 2.134076e-02

And the Wilcoxon test as below:

> library(GSALightning)
> df1 <- cpm(df, log = TRUE)
> results <- wilcoxTest(df1, group, tests = "unpaired")
There were 48 warnings (use warnings() to see them)
> head(results[,1:4])
       p-value:up-regulated in TRG1-2 p-value:up-regulated in TRG4-5
ACTB                       0.02007199                      0.9799280
ATP5F1                     0.51624724                      0.4837528
DDX5                       0.87211880                      0.1278812
EEF1G                      0.76612743                      0.2338726
GAPDH                      0.12111916                      0.8788808
NCL                        0.44491768                      0.5550823
       q-value:up-regulated in TRG1-2 q-value:up-regulated in TRG4-5
ACTB                        0.9998235                      0.9822301
ATP5F1                      0.9998235                      0.6930090
DDX5                        0.9998235                      0.4650225
EEF1G                       0.9998235                      0.5331378
GAPDH                       0.9998235                      0.9138647
NCL                         0.9998235                      0.7347522
>

The lists of significant genes up-regulated in either TRG1-2 or TRG4-5 are 100% different from the edgeR results. Please help me understand which results are right and which are wrong.

Thank you for any suggestion

Answer: edgeR or wilcoxon rank test? Which is right?
Answered 8 months ago by Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia):

It is quite common for different DE tests to rank genes differently and, if there are only a few DE genes, then the top genes can easily be non-overlapping. This would be true even if neither of the DE tests is "wrong", but I'm not a big fan of the Wilcoxon test for this sort of data.

If this is sequencing data of some sort, then the Wilcoxon test would be wrong if applied to counts because it doesn't account for differences in sequencing depth between samples.
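A minimal sketch of the sequencing-depth point, using simulated data rather than the data from the question. The assumptions are mine: Poisson counts, no truly DE genes, and a second group that happens to be sequenced three times deeper. All object names are illustrative.

```r
set.seed(1)

n_genes     <- 200
n_per_group <- 10
group       <- factor(rep(c("A", "B"), each = n_per_group))

# Every gene has the same true proportion in every sample: nothing is DE.
prop     <- rep(1 / n_genes, n_genes)
# Assumption: group B was sequenced three times deeper than group A.
lib_size <- c(rep(1e5, n_per_group), rep(3e5, n_per_group))

# Genes in rows, samples in columns.
counts <- sapply(lib_size, function(L) rpois(n_genes, L * prop))

# Gene-wise two-sided Wilcoxon rank-sum test (normal approximation).
wilcox_p <- function(mat) {
  apply(mat, 1, function(x) {
    wilcox.test(x[group == "A"], x[group == "B"], exact = FALSE)$p.value
  })
}

# Raw counts: the depth difference alone separates the groups.
p_raw <- wilcox_p(counts)

# Counts per million: divide each column by its library size.
cpm_mat <- t(t(counts) / colSums(counts)) * 1e6
p_cpm   <- wilcox_p(cpm_mat)

frac_raw <- mean(p_raw < 0.05)  # fraction of genes called on raw counts
frac_cpm <- mean(p_cpm < 0.05)  # fraction of genes called on CPM
print(c(raw = frac_raw, cpm = frac_cpm))
```

On raw counts nearly every gene looks significant even though none is DE, which is why the test should not be applied to unnormalized counts.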

Even if you convert to CPMs, the observations would still not be identically distributed under the null hypothesis, which the Wilcoxon test assumes.
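To illustrate why CPM values are not identically distributed, here is a small simulation under simplifying assumptions of my own: a single gene with the same true proportion in every sample, pure Poisson counting noise, and two hypothetical library sizes treated as fixed. Real data would also carry biological variation.

```r
set.seed(2)

prop  <- 5e-5    # same true proportion of reads in every sample
n_sim <- 10000   # number of simulated samples per library size

# CPM for a shallow library (expected count 5) and a deep one (expected count 100).
cpm_small <- rpois(n_sim, 1e5 * prop) / 1e5 * 1e6
cpm_large <- rpois(n_sim, 2e6 * prop) / 2e6 * 1e6

mean_small <- mean(cpm_small)
mean_large <- mean(cpm_large)
var_small  <- var(cpm_small)
var_large  <- var(cpm_large)

# Same expected CPM, very different spread.
print(c(mean_small = mean_small, mean_large = mean_large))
print(c(var_small = var_small, var_large = var_large))
```

The two sets of CPM values share the same mean but have very different variances, so samples with different library sizes do not come from one common distribution even when nothing is DE.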

Another issue is that it is not correct to apply FDR correction to up and down p-values separately, which the wilcoxTest function seems to be doing.
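One way to avoid the separate up/down correction, sketched here with base R only: compute one two-sided p-value per gene, adjust all of them together with a single Benjamini-Hochberg call, and report the direction separately. The matrix `logcpm` is simulated stand-in data (not the data from the question), and the five shifted genes are an assumption made for illustration.

```r
set.seed(3)

n_genes <- 100
group   <- factor(rep(c("TRG1-2", "TRG4-5"), times = c(26, 30)))

# Simulated stand-in for a genes x samples matrix of log-CPM values.
logcpm <- matrix(rnorm(n_genes * length(group)), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Assumption for illustration: the first five genes are higher in TRG4-5.
is_major <- group == "TRG4-5"
logcpm[1:5, is_major] <- logcpm[1:5, is_major] + 2

# One two-sided p-value per gene.
p_two_sided <- apply(logcpm, 1, function(x) {
  wilcox.test(x[is_major], x[!is_major], exact = FALSE)$p.value
})

# A single BH adjustment across all genes, regardless of direction.
fdr <- p.adjust(p_two_sided, method = "BH")

# Direction is reported separately from the significance call.
direction <- ifelse(rowMeans(logcpm[, is_major]) > rowMeans(logcpm[, !is_major]),
                    "up in TRG4-5", "up in TRG1-2")

res <- data.frame(p = p_two_sided, FDR = fdr, direction = direction)
print(head(res[order(res$p), ]))
```

This only addresses the multiple-testing step; it does not fix the distributional concerns about using a rank test on this kind of data.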

I wonder what the warning messages are that wilcoxTest has generated.
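The warnings themselves can be listed by calling warnings() right after the call that produced them. As a guess only, not something confirmed in this thread, one common source of warnings from rank tests in R is tied values, which prevent an exact p-value from being computed. The sketch below uses base R's wilcox.test on made-up numbers to show how such warning messages can be captured and inspected; whether wilcoxTest raises the same warning is an assumption.

```r
# Made-up data with tied values within and across the two samples.
x <- c(1, 2, 2, 3, 5)
y <- c(2, 3, 4, 4, 6)

# Collect warning messages instead of letting them print at the end.
msgs <- character(0)
res <- withCallingHandlers(
  wilcox.test(x, y),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)         # the captured warning text
print(res$p.value)  # the test still returns a p-value (normal approximation)
```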


Thanks a lot. This is EdgeSeq, a sort of RNA-seq that does not need RNA extraction. However, I fed log-CPM values, computed with the cpm function in edgeR, into the Wilcoxon test, and used the same grouping as for edgeR. Is the Wilcoxon test still wrong even with normalized read counts?

I have seen people use the Mann-Whitney test for such data, so I am not sure what to do.

Thank you for any help.

I also tried a t-test on the normalized data, but got a message saying that no difference was detected.

Reply written 8 months ago by jivarajivaraj

This is very confusing. I don't recall seeing the cpm function in your original question.

Reply written 8 months ago by Gordon Smyth

Sorry, I just edited my post. I used log-CPM values for every t-test and non-parametric test.

Reply written 8 months ago by jivarajivaraj