limma results versus topTable
2
0
Entering edit mode
@anne-deslattes-mays-5991
Last seen 7.1 years ago
United States

Hello,

I'm trying to create a reduced matrix after running lmFit, contrasts.fit and eBayes from limma.   To do so, ideally the rownames of the results match the rownames of the original matrix.

For example, this is the case with topTable:

>CasevsCtrl.top100 = topTable(iefit,coef=1,n=100,sort="p",p=0.05)[,-1]
>head(CasevsCtrl.top100)
        AveExpr         t      P.Value    adj.P.Val        B
20876 0.8908066 -22.45433 2.598162e-93 2.307428e-89 201.9057
20881 0.8300441 -21.67655 5.092048e-88 2.261124e-84 189.7929
44967 0.9034543 -19.07309 6.212197e-71 1.839017e-67 150.6919
57087 0.9278437 -18.70029 1.411561e-68 3.134017e-65 145.3004
57078 0.9373613 -18.54277 1.373644e-67 2.439866e-64 143.0397
57079 0.9444422 -18.36227 1.839180e-66 2.722293e-63 140.4619

The rownames for this results allow subsetting from the original matrix.   However, if one would rather have sorted extremes from the top under and top over expressed -- I use the results table -- which does not preserve the original rownames from the original matrix, making subsetting a little trickier.

>results = decideTests(iefit,adjust="BH",p=0.05)

To get the relative under expressing results -- one looks for the results in the column of interest == -1.  To get the relative over expressing results -- one looks for the results in the column of interest == 1.  However, now the rownames are not preserved relative to the original matrix. See below:

> head(results[results[,1] == -1,])

    Contrasts
      CasevsCtrl 
  11             -1   
  15             -1   
  17             -1   
  27             -1   
  165            -1  
  167            -1  
> head(results[results[,1] == 1,])
    Contrasts
      CasevsCtrl 
  141             1   
  143             1   
  175             1   
  182             1   
  207             1  
  218             1  

 

 Typically I would like to use the rownames as a means to subset a matrix with the most significant differences from the original matrix -- but this is hampered if the rownames are not preserved.  My question is -- am I doing this wrong?  Is there another way to subset?

Thanks in advance,

Anne

> sessionInfo()

R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] circlize_0.4.1              ComplexHeatmap_1.12.0      
 [3] RTCGA.clinical_20151101.4.0 RTCGA_1.4.0                
 [5] edgeR_3.16.5                limma_3.30.13              
 [7] plyr_1.8.4                  rtracklayer_1.34.2         
 [9] GenomicRanges_1.26.4        GenomeInfoDb_1.10.3        
[11] AnnotationDbi_1.36.2        IRanges_2.8.2              
[13] S4Vectors_0.12.2            Biobase_2.34.0             
[15] AnnotationHub_2.6.5         BiocGenerics_0.20.0        
[17] readr_1.1.1                 stringr_1.2.0              
[19] tidyr_0.7.2                 reshape2_1.4.2             
[21] ggplot2_2.2.1               BiocInstaller_1.24.0       

loaded via a namespace (and not attached):
 [1] colorspace_1.3-2              rjson_0.2.15                 
 [3] class_7.3-14                  modeltools_0.2-21            
 [5] mclust_5.3                    XVector_0.14.1               
 [7] GlobalOptions_0.0.12          ggpubr_0.1.5                 
 [9] flexmix_2.3-14                bit64_0.9-7                  
[11] interactiveDisplayBase_1.12.0 mvtnorm_1.0-6                
[13] xml2_1.1.1                    splines_3.3.3                
[15] mnormt_1.5-5                  robustbase_0.92-7            
[17] knitr_1.17                    Rsamtools_1.26.2             
[19] broom_0.4.2                   km.ci_0.5-2                  
[21] kernlab_0.9-25                cluster_2.0.6                
[23] shiny_1.0.5                   httr_1.3.1                   
[25] assertthat_0.2.0              Matrix_1.2-11                
[27] lazyeval_0.2.0                htmltools_0.3.6              
[29] tools_3.3.3                   bindrcpp_0.2                 
[31] gtable_0.2.0                  glue_1.1.1                   
[33] dplyr_0.7.4                   ggthemes_3.4.0               
[35] Rcpp_0.12.13                  trimcluster_0.1-2            
[37] Biostrings_2.42.1             nlme_3.1-131                 
[39] fpc_2.1-10                    psych_1.7.8                  
[41] rvest_0.3.2                   mime_0.5                     
[43] XML_3.98-1.9                  dendextend_1.5.2             
[45] DEoptimR_1.0-8                zlibbioc_1.20.0              
[47] MASS_7.3-47                   zoo_1.8-0                    
[49] scales_0.5.0                  BSgenome_1.42.0              
[51] hms_0.3                       SummarizedExperiment_1.4.0   
[53] RColorBrewer_1.1-2            yaml_2.1.14                  
[55] curl_3.0                      memoise_1.1.0                
[57] gridExtra_2.3                 KMsurv_0.1-5                 
[59] stringi_1.1.5                 RSQLite_2.0                  
[61] BiocParallel_1.8.2            shape_1.4.3                  
[63] rlang_0.1.2                   pkgconfig_2.0.1              
[65] prabclus_2.2-6                bitops_1.0-6                 
[67] lattice_0.20-35               purrr_0.2.4                  
[69] bindr_0.1                     GenomicAlignments_1.10.1     
[71] cmprsk_2.2-7                  bit_1.1-12                   
[73] magrittr_1.5                  R6_2.2.2                     
[75] DBI_0.7                       whisker_0.3-2                
[77] foreign_0.8-69                survival_2.41-3              
[79] RCurl_1.95-4.8                nnet_7.3-12                  
[81] tibble_1.3.4                  survMisc_0.5.4               
[83] viridis_0.4.0                 GetoptLong_0.1.6             
[85] locfit_1.5-9.1                data.table_1.10.4-2          
[87] blob_1.1.0                    digest_0.6.12                
[89] diptest_0.75-7                xtable_1.8-2                 
[91] httpuv_1.3.5                  munsell_0.4.3                
[93] viridisLite_0.2.0             survminer_0.4.0   
limma • 1.1k views
ADD COMMENT
0
Entering edit mode

Okay -- I am wrong the rownames are in fact preserved.  However, I still do not get what I want -- what I want is to have the rows that are most significantly over and under expressed -- the results gives the breakdown, but they are not sorted by p-value -- so I imagine I can do this.   But advice as to the most elegant way to approach this is appreciated.

 

ADD REPLY
1
Entering edit mode
@james-w-macdonald-5106
Last seen 2 hours ago
United States

Using the example data from ?lmFit, we get an MArrayLM object called 'fit'.

> alldat <- topTable(fit,2,Inf)
> rslt <- decideTests(fit)
> ups <- row.names(rslt)[rslt[,2] > 0]
> dns <- row.names(rslt)[rslt[,2] < 0]
> alldat[row.names(alldat) %in% ups,]
       logFC AveExpr    t  P.Value adj.P.Val    B
Gene 1  2.07   0.818 9.19 9.40e-06   0.00094 4.04
Gene 2  1.79   1.328 7.12 6.73e-05   0.00336 1.97

There are no down-regulated genes, but you get the point.

ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 28 minutes ago
WEHI, Melbourne, Australia

James has given a good way using row names. But there are lots of ways to do this without using row names. Why not simply:

tab <- topTable(iefit,coef=1,n=Inf,sort="p",p=0.05)
tabUp <- tab[tab$logFC>0,]
tabDn <- tab[tab$logFC<0,]

Or else using decideTests:

tab <- topTable(iefit,coef=1,n=Inf,sort="none")
results = decideTests(iefit,adjust="BH",p=0.05)
tabUp <- tab[results[,1]<0,]
tabDn <- tab[results[,1]>0,]
ADD COMMENT

Login before adding your answer.

Traffic: 720 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6