getting a list of probe IDs from topTable (limma version 3.18.13)
Entering edit mode
rk489 • 0
Last seen 7.1 years ago
United Kingdom

I'm using limma to find differentially expressed genes in an affymetrix array. However, when I try to extract the probe ids after applying eBayes to lmfit using topTable (probe IDs are in the first column of the resulting data frame), it gives me the 2nd column of the data frame (which is logFC). If I try to index column 1, it gives me column 2 and so on. When I checked the output of the eBayes(lmfit) command, I found that $genes output was missing. I'm using top_table.rx2(1) to extract the probe IDs (since I'm using python). It would be great if someone could explain what I'm doing wrong. Here's a snippet of the code:


 import rpy2
 import rpy2.robjects as robjects
 import rpy2.rinterface as ri

 fit = lmFit(eset, design)
    fit = eBayes(fit)
    if exp_type == 'not_ctl':
        top_table = topTable(fit, coef='status', sort='none', n='Inf', **{'adjust.method':'BH'})

    elif exp_type == 'ctl':
        top_table = topTable(fit, coef='age', sort='none', n='Inf', **{'adjust.method':'BH'})

    print head(top_table)


                  logFC  AveExpr             t    P.Value adj.P.Val          B

1007_s_at  4.581859e-02 9.736476  2.826155e+00 0.01488350 0.9703936  -6.951610

1053_at   -1.670440e-02 2.867159 -2.561034e+00 0.02444413 0.9703936  -7.438039

117_at    -1.502654e-04 2.918910 -1.690970e-02 0.98677806 1.0000000 -10.282083

121_at     1.910957e-17 3.254613  3.204963e-13 1.00000000 1.0000000 -10.282237

1255_g_at  1.197955e-03 2.238973  6.973398e-01 0.49847404 1.0000000 -10.024492

1294_at   -5.849747e-04 2.281106 -3.429330e-01 0.73740026 1.0000000 -10.218988


limma rpy • 1.3k views
Entering edit mode
Aaron Lun ★ 28k
Last seen 10 hours ago
The city by the bay

This sounds like an issue with rpy2 rather than limma, so you'd be better off asking the rpy2 developers. I've never used rpy2 before, but nonetheless, here's some ideas:

  • Probe IDs are stored as the row names in the original data.frame, rather than as a separate column. If you were to do top_table[,1] in R, you would get the log-FCs. Instead, you would need to call rownames(top_table) to get the probe IDs.
  • Python uses zero-indexing, while R uses 1-indexing, so there may be some adjustment you need to do for the column indices when using rpy2. I can't say for sure, though.

Also, it wouldn't hurt to use the latest version of limma (3.26.2); you're several releases behind.

Entering edit mode

Thanks Aaron. rownames(top_table)  seems to work. However, I don't think its a problem of rpy2 because I'm using it at a lot of places in my code but it only gives error while indexing the top_table. Or maybe its a problem while indexing data frames? Thanks again! 


Login before adding your answer.

Traffic: 659 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6