How to apply KEGG enrichment analysis to the overlap of multiple contrasts in MArrayLM fit?
1
0
Entering edit mode
@antoinefelden-16088
Last seen 6.2 years ago

I ran a DGE analysis with 4 different contrasts with lmFit() + contrasts.fit() in limma, and I'm interested in the overlap of the four different contrasts. I identified the genes that are indeed differentially expressed in all four contrasts, and coded that in tfit$genes$test ("yes" if differentially expressed, "no" if not). The MArrayLM object is pasted below.

I did manage to run a Kegg pathway enrichment analysis for each of the contrasts individually, but I'm after a way to run a single analysis for the set of DE genes in all contrasts. Is that feasible?

An object of class "MArrayLM"
$coefficients
   Contrasts
         ARvsCA     ARvsEU      ARvsAU     ARvsNZ
  1 -0.18067896 -0.2044603 -0.22881771 -0.1833862
  2 -0.04079345 -1.1859285 -0.39206763 -0.3653143
  3 -0.12733594  0.1763288 -0.07934863 -0.1252855
  4  0.07648264  0.6875827  0.13266024  0.3442508
  5  0.09678434  0.4514540  0.13137207  0.2943875
10387 more rows ...

$stdev.unscaled
   Contrasts
       ARvsCA    ARvsEU    ARvsAU    ARvsNZ
  1 0.1820675 0.2300422 0.1936519 0.1780754
  2 0.1499697 0.1868217 0.1574127 0.1458230
  3 0.1480576 0.1898012 0.1567854 0.1442379
  4 0.1883755 0.2695961 0.2030905 0.1910940
  5 0.1406236 0.1774499 0.1479352 0.1374007
10387 more rows ...

$sigma
[1] 1.2656682 1.3338325 0.5922579 1.9853202 0.9379173
10387 more elements ...

$df.residual
[1] 21 21 21 21 21
10387 more elements ...

$cov.coefficients
         Contrasts
Contrasts    ARvsCA    ARvsEU ARvsAU    ARvsNZ
   ARvsCA 0.3666667 0.2000000    0.2 0.2000000
   ARvsEU 0.2000000 0.5333333    0.2 0.2000000
   ARvsAU 0.2000000 0.2000000    0.4 0.2000000
   ARvsNZ 0.2000000 0.2000000    0.2 0.3428571

$rank
[1] 5

$genes
    gene_id line test
1 gene12245    1   no
2 gene12244    2   no
3 gene12247    3   no
4 gene12246    4   no
5 gene12241    5   no
10387 more rows ...

$Amean
       1        2        3        4        5 
4.749884 8.665232 5.995541 4.329715 6.534481 
10387 more elements ...

$method
[1] "ls"

$design
  AR_heads AU_heads CA_heads EU_heads NZ_heads
1        1        0        0        0        0
2        1        0        0        0        0
3        1        0        0        0        0
4        1        0        0        0        0
5        1        0        0        0        0
21 more rows ...

$contrasts
          Contrasts
Levels     ARvsCA ARvsEU ARvsAU ARvsNZ
  AR_heads      1      1      1      1
  AU_heads      0      0     -1      0
  CA_heads     -1      0      0      0
  EU_heads      0     -1      0      0
  NZ_heads      0      0      0     -1

$df.prior
[1] 2.659777

$s2.prior
[1] 0.6719199

$s2.post
[1] 1.4973681 1.6546414 0.3868724 3.5739382 0.8563319
10387 more elements ...

$df.total
[1] 23.65978 23.65978 23.65978 23.65978 23.65978
10387 more elements ...

$t
   Contrasts
        ARvsCA     ARvsEU     ARvsAU     ARvsNZ
  1 -0.1937939 -0.2378606 -0.3853473 -0.2105622
  2  0.0000000 -4.3627269 -1.2572034 -1.2144966
  3  0.0000000  0.3288759  0.0000000  0.0000000
  4  0.0000000  1.0792898  0.0000000  0.5722939
  5  0.0000000  1.9118965  0.0000000  1.2338677
10387 more rows ...

$p.value
   Contrasts
       ARvsCA       ARvsEU    ARvsAU    ARvsNZ
  1 0.5071515 0.5252303689 0.4194142 0.4945336
  2 0.8719553 0.0001139932 0.1181080 0.1248538
  3 0.5476869 0.3794963643 0.7396859 0.5572903
  4 0.8441133 0.2050344009 0.7492352 0.3837871
  5 0.6637349 0.0347921880 0.5483615 0.1158894
10387 more rows ...

$treat.lfc
[1] 0.1375035
limma kegg • 1.1k views
ADD COMMENT
0
Entering edit mode

Yes, it's easy enough. But you need to have generally recognized gene IDs (usually Entrez Gene Ids) order to run kegga(). Your gene_ids don't seem to be Entrez Ids. Have you just anonymized them?

ADD REPLY
0
Entering edit mode

Hi Gordon,

Sorry I reported the wrong MArrayLM fit. Before using it as input for kegga(), I did change the gene_id into RefSeq identifiers as below:

$genes
    gene_id line test
1 105678280    1   no
2 105678292    2   no
3 105678279    3   no
4 105678278    4   no
5 105678296    5   no
10387 more rows ...

ADD REPLY
1
Entering edit mode
@gordon-smyth
Last seen 17 minutes ago
WEHI, Melbourne, Australia

If your data is mouse and fit$genes$gene_id contains Entrez Gene Ids, then you could proceed like this:

results <- decideTests(fit)
MyGeneSet <- fit$genes$gene_id[ rowSums( results != 0 ) == 4 ]
k <- kegga(MyGeneSet, universe=fit$genes$gene_id, species="Mm")
topKEGG(k)

 

 

 

ADD COMMENT
0
Entering edit mode

Thanks a lot, that works!

ADD REPLY

Login before adding your answer.

Traffic: 565 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6