SPIA package problem
@january-weiner-3999
Hello,

I am trying to use SPIA on some mouse results from a mgug4122a Agilent microarray. A summary of the problem is as follows.

I have two vectors, DE_gr_iii and ALL_gr_iii (created following the SPIA vignette, see below):

> class( DE_gr_iii )
[1] "numeric"
> class( ALL_gr_iii )
[1] "character"
> names( DE_gr_iii ) <- ALL_gr_iii
> DE_gr_iii[1:10]
      12808       78369       71897      241568      102075       27273
 0.15805260  0.75696349 -0.02208268 -0.53025986 -0.09489560  0.16656121
      20321       57435       18010       18010
-0.13020754 -0.19411325 -0.02297658 -0.03317089
> ALL_gr_iii[1:10]
 [1] "12808"  "78369"  "71897"  "241568" "102075" "27273"  "20321"  "57435"
 [9] "18010"  "18010"
> length( DE_gr_iii )
[1] 1918

Now when I run spia, I get the following error:

> res <- spia( de=DE_gr_iii, all=ALL_gr_iii, organism="mmu", nB = 2000, plots=F, beta=NULL )
Error in spia(de = DE_gr_iii, all = ALL_gr_iii, organism = "mmu", nB = 2000, :
  de must be a vector of log2 fold changes. The names of de should be included in the refference array!

DE_gr_iii is definitely the log2 fold change vector. I'm not sure what is meant by the reference array, since I don't see it in the SPIA vignette, but I assume that the reference is either the data file mmuSPIA.RData or the ALL vector.

I am not sure whether ALL should really contain all Entrez IDs from the microarray, but I think not. I have also tried with all Entrez IDs, and it did not work. I also used the Colorectal cancer data set from the SPIA package with only the first 100 values of the DE_Colorectal and ALL_Colorectal vectors, and it ran without problems.

The log fold changes were taken from a microarray experiment. I don't think there is a problem with that, because I also tried faking the values by taking them from the Colorectal cancer data provided with SPIA.

I don't think that there is a problem with the length of the data. I also tried another data set with 20,000 genes, and the error was the same.
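The conditions named in the error message can be checked directly on the two vectors. The following is only a sketch: `check_spia_input` is a hypothetical helper (not part of SPIA), the toy vectors are made up to mirror the structure of the output above, and the checks are not SPIA's internal validation. It reports whether every name of `de` occurs in `all`, and how many IDs are repeated (18010 appears twice in the first ten entries shown above).

```r
# Hypothetical toy data mirroring the structure of DE_gr_iii / ALL_gr_iii
# (made-up values, not the real experiment).
all_ids <- c("12808", "78369", "71897", "18010", "18010")
de <- c(0.158, 0.757, -0.022, -0.023, -0.033)
names(de) <- all_ids

# Hypothetical helper: basic sanity checks on the inputs passed to spia().
# These are generic checks, not a copy of SPIA's own validation code.
check_spia_input <- function(de, all) {
  list(
    de_is_numeric   = is.numeric(de),                # fold changes must be numeric
    de_has_names    = !is.null(names(de)),           # names carry the Entrez IDs
    any_na_names    = any(is.na(names(de))),         # unmapped probes left in?
    names_in_all    = all(names(de) %in% all),       # every DE ID in the reference
    dup_names_in_de = sum(duplicated(names(de))),    # repeated IDs in de
    dup_ids_in_all  = sum(duplicated(all))           # repeated IDs in all
  )
}

report <- check_spia_input(de, all_ids)
str(report)
```

If the duplicate counts are non-zero, keeping one entry per Entrez ID (for example with `!duplicated(...)` on the ID column) before calling `spia` would be one thing to try. Whether duplicated IDs are the cause of the error here is an assumption, not something established in this post.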
Furthermore, I tried to run the Colorectal data set using only the first 100 values, and there were no problems running that. The SPIA package seems to be correctly installed, because I can run the example from the vignette without any problems.

The Entrez IDs that I used were derived from the Agilent annotation package for this chip:

> a2sel$EID <- unlist( mget( as.character( a2sel$SCode ), mgug4122aENTREZID ) )

(a2sel is a data frame containing the fold changes, gene information etc.; Agilent identifiers are stored in the SCode column.)

I removed any identifiers that were not mapped to Entrez:

> length( which( is.na( a2sel$EID ) ) )
[1] 7022
> a2sel <- a2sel[ !is.na( a2sel$EID ),]
> length( which( is.na( a2sel$EID ) ) )
[1] 0

The last hypothesis was that for whatever reason there is a problem with the Entrez IDs (that they do not match the IDs from the mmuSPIA.RData file provided by the distribution). I tested this by using the identifiers found directly in the mmuSPIA.RData pathway info.

I loaded the pathway info from the mmuSPIA.RData file:

> load( file=paste( system.file( "extdata/mmuSPIA.RData", package="SPIA" ) ) )

I chose a pathway that contains several interactions of the type "activation" and used the colnames and rownames of the matrix for my ALL vector:

> all_ttt <- c( colnames( path.info[["04010"]]$activation ), rownames( path.info[["04010"]]$activation ) )
> length( all_ttt )
[1] 564

I generated some random fold changes:

> de_ttt <- runif( length( all_ttt ), -10, 10 )
> names( de_ttt ) <- all_ttt

The result was, again, an error:

> res <- spia( de=de_ttt, all=all_ttt, organism="mmu", nB = 2000, plots=F, beta=NULL )
Error in spia(de = de_ttt, all = all_ttt, organism = "mmu", nB = 2000, plots = F, :
  de must be a vector of log2 fold changes. The names of de should be included in the refference array!

I have no idea what the problem is. Thanks in advance for any help -- maybe I should use another package?
I have lost two days on this problem already.

j.

P.S.

> sessionInfo()
R version 2.10.1 (2009-12-14)
i486-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] mgug4122a.db_2.3.6   SPIA_1.4.0           org.Mm.eg.db_2.3.6
 [4] BioIDMapper_2.1      gWidgetsRGtk2_0.0-65 gWidgets_0.0-41
 [7] lattice_0.18-3       XML_3.1-0            RCurl_1.4-2
[10] bitops_1.0-4.1       hgu95av2.db_2.3.5    org.Hs.eg.db_2.3.6
[13] GO.db_2.3.5          annotate_1.24.1      GOstats_2.12.0
[16] RSQLite_0.8-3        DBI_0.2-5            graph_1.26.0
[19] Category_2.12.1      AnnotationDbi_1.8.2  Biobase_2.6.1

loaded via a namespace (and not attached):
 [1] genefilter_1.24.3 grid_2.10.1       GSEABase_1.8.0    RBGL_1.24.0
 [5] RGtk2_2.12.15     splines_2.10.1    survival_2.35-8   tcltk_2.10.1
 [9] tools_2.10.1      xtable_1.5-6

--
-------- Dr. January Weiner 3 --------------------------------------
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web : www.mpiib-berlin.mpg.de
Tel : +49-30-28460514