getting a list of probe IDs from topTable (limma version 3.18.13)
1
0
Entering edit mode
rk489 • 0
@rk489-9202
Last seen 5.0 years ago
United Kingdom

I'm using limma to find differentially expressed genes in an affymetrix array. However, when I try to extract the probe ids after applying eBayes to lmfit using topTable (probe IDs are in the first column of the resulting data frame), it gives me the 2nd column of the data frame (which is logFC). If I try to index column 1, it gives me column 2 and so on. When I checked the output of the eBayes(lmfit) command, I found that \$genes output was missing. I'm using top_table.rx2(1) to extract the probe IDs (since I'm using python). It would be great if someone could explain what I'm doing wrong. Here's a snippet of the code:

 import rpy2
import rpy2.robjects as robjects
import rpy2.rinterface as ri

fit = lmFit(eset, design)
fit = eBayes(fit)

if exp_type == 'not_ctl':
top_table = topTable(fit, coef='status', sort='none', n='Inf', **{'adjust.method':'BH'})

elif exp_type == 'ctl':
top_table = topTable(fit, coef='age', sort='none', n='Inf', **{'adjust.method':'BH'})



                  logFC  AveExpr             t    P.Value adj.P.Val          B

1007_s_at  4.581859e-02 9.736476  2.826155e+00 0.01488350 0.9703936  -6.951610

1053_at   -1.670440e-02 2.867159 -2.561034e+00 0.02444413 0.9703936  -7.438039

117_at    -1.502654e-04 2.918910 -1.690970e-02 0.98677806 1.0000000 -10.282083

121_at     1.910957e-17 3.254613  3.204963e-13 1.00000000 1.0000000 -10.282237

1255_g_at  1.197955e-03 2.238973  6.973398e-01 0.49847404 1.0000000 -10.024492

1294_at   -5.849747e-04 2.281106 -3.429330e-01 0.73740026 1.0000000 -10.218988



limma rpy • 903 views
3
Entering edit mode
Aaron Lun ★ 27k
@alun
Last seen 1 hour ago
The city by the bay

This sounds like an issue with rpy2 rather than limma, so you'd be better off asking the rpy2 developers. I've never used rpy2 before, but nonetheless, here's some ideas:

• Probe IDs are stored as the row names in the original data.frame, rather than as a separate column. If you were to do top_table[,1] in R, you would get the log-FCs. Instead, you would need to call rownames(top_table) to get the probe IDs.
• Python uses zero-indexing, while R uses 1-indexing, so there may be some adjustment you need to do for the column indices when using rpy2. I can't say for sure, though.

Also, it wouldn't hurt to use the latest version of limma (3.26.2); you're several releases behind.

0
Entering edit mode

Thanks Aaron. rownames(top_table)  seems to work. However, I don't think its a problem of rpy2 because I'm using it at a lot of places in my code but it only gives error while indexing the top_table. Or maybe its a problem while indexing data frames? Thanks again!