Question: Gene set enrichment analysis for GO categories and KEGG pathway
libya.tahani wrote (23 months ago):

Hello,

 

I am trying to run a gene set enrichment analysis for GO categories and KEGG pathways.
I got the probes that I considered interesting using this code:
> library(hgu95av2)
> allg<-get("hgu133aENTREZID")
> allg<-as.data.frame(unlist(as.list(allg)))
> myids<-unique(allg[rownames(dat.s),])
Up to here everything was OK. The actual test has to be run in three steps, since the GO hierarchy consists
of three distinct ontologies: biological process (BP), molecular
function (MF) and cellular component (CC). I got an error with the following code:

 

> params<-new("GOHyperGParams", geneIds=myids,
+  annotation=c("hgu95av2"), ontology="BP", pvalueCutoff=0.05,
+  conditional=FALSE, testDirection="over")
Loading required package: hgu95av2.db
Failed with error:  ‘hgu95av2PFAM is defunct. Please use select() if you need access to PFAM or PROSITE accessions.’
In addition: Warning messages:
1: In (function ()  :
  hgu95av2CHR is deprecated. Please use an appropriate TxDb object or package for this kind of data.
2: In (function ()  :
  hgu95av2CHRLENGTHS is deprecated. Please use an appropriate TxDb object or package for this kind of data.
3: In (function ()  :
  hgu95av2CHRLOC is deprecated. Please use an appropriate TxDb object or package for this kind of data.
Error in DatPkgFactory(annotation) : 
  annotation package 'hgu95av2.db' not available

 

I get the same error with ontology="MF" and ontology="CC".

I also get the same error for the KEGG pathway test:

> params<-new("KEGGHyperGParams", geneIds=myids,
+  annotation="hgu95av2", pvalueCutoff=0.05,
+  testDirection="over")
Loading required package: hgu95av2.db
Failed with error:  ‘hgu95av2PFAM is defunct. Please use select() if you need access to PFAM or PROSITE accessions.’
In addition: Warning messages:
1: In (function ()  :
  hgu95av2CHR is deprecated. Please use an appropriate TxDb object or package for this kind of data.
2: In (function ()  :
  hgu95av2CHRLENGTHS is deprecated. Please use an appropriate TxDb object or package for this kind of data.
3: In (function ()  :
  hgu95av2CHRLOC is deprecated. Please use an appropriate TxDb object or package for this kind of data.
Error in DatPkgFactory(annotation) : 
  annotation package 'hgu95av2.db' not available

Am I missing anything in the code above?

Tahani,

 


Any advice on this, please?

Tahani.

(Comment written 23 months ago by libya.tahani)
James W. MacDonald wrote (23 months ago):

You are doing some really random things here - did you find some code somewhere that you are following? For example, why are you using the hgu133aENTREZID BiMap, if your data are from the hgu95av2 array? Those are two completely different arrays!

Also, in general you use a set of interesting genes to do GO analysis, but you are using all the genes that are on the hgu133a array. That doesn't make any sense. In addition, you don't give output from your sessionInfo(), so it's not clear what version of R/BioC you are using.
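To make the difference between the interesting gene set and the gene universe concrete, here is a minimal base-R sketch of the one-sided hypergeometric calculation that an over-representation test of this kind is based on. All counts are hypothetical and are not taken from this thread. The last step shows the extreme case where the "interesting" set is the whole universe, in which case no category can come out as over-represented.

```r
# Hypothetical counts, for illustration only (not from the thread)
universe_size   <- 9000  # annotated genes in the universe
category_size   <- 50    # universe genes annotated to one GO term
selected_size   <- 200   # "interesting" genes
selected_in_cat <- 5     # interesting genes annotated to that term

# Overlap expected if the interesting genes were a random draw from the universe
expected_count <- selected_size * category_size / universe_size

# One-sided (over-representation) p-value: P(overlap >= selected_in_cat)
p_over <- phyper(selected_in_cat - 1,
                 m = category_size,
                 n = universe_size - category_size,
                 k = selected_size,
                 lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on the 2 x 2 table
tab <- matrix(c(selected_in_cat,
                selected_size - selected_in_cat,
                category_size - selected_in_cat,
                universe_size - category_size -
                  (selected_size - selected_in_cat)),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Extreme case: every gene in the universe is "interesting".
# The overlap is then always category_size, so the p-value is 1.
p_all <- phyper(category_size - 1,
                m = category_size,
                n = universe_size - category_size,
                k = universe_size,
                lower.tail = FALSE)

print(c(expected = expected_count, p_over = p_over,
        p_fisher = p_fisher, p_all = p_all))
```

The point of the sketch is only that the test compares a selected subset against a larger background, so the two sets have to be different for the result to mean anything.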

Anyway, this works for me:

> library(GOstats)
> library(hgu95av2.db)
> univ <- unique(select(hgu95av2.db, keys(hgu95av2.db), "ENTREZID")[,2])
'select()' returned 1:many mapping between keys and columns
## a fake set of significant genes, just for an example
> fake <- univ[sample(1:length(univ), 200)]
> fake
  [1] "23152"     "101927562" "4905"      "9856"      "7169"      "100008586"
  [7] "1072"      "57148"     "9743"      "10553"     "9659"      "100526664"
 [13] "6217"      "5500"      "10810"     "4150"      "7275"      "642236"   
 [19] "3746"      "5716"      "7399"      "8969"      "113"       "8034"     
 [25] "22846"     "2295"      "28513"     "100101112" "6161"      "55746"    
 [31] "873"       "10236"     "81550"     "23604"     "27335"     "90410"    
 [37] "4629"      "8314"      "2079"      "4035"      "1349"      "9149"     
 [43] "339166"    "2630"      "3008"      "5810"      "898"       "3233"     
 [49] "10007"     "474382"    "8653"      "3164"      "51032"     "6489"     
 [55] "6804"      "8544"      "22990"     "2289"      "8804"      "4793"     
 [61] "142684"    "8564"      "91851"     "3188"      "6294"      "443"      
 [67] "8518"      "92856"     "8692"      "721"       "6122"      "51304"    
 [73] "9295"      "23359"     "5209"      "3003"      "156"       "23235"    
 [79] "23316"     "100049076" "5831"      "2053"      "9723"      "8973"     
 [85] "85389"     "25759"     "8499"      "578"       "1501"      "2671"     
 [91] "2825"      "429"       "4340"      "3421"      "51326"     "56171"    
 [97] "645051"    "10794"     "81605"     "23063"     "5700"      "8317"     
[103] "2584"      "11016"     "146542"    "55186"     "3611"      "89"       
[109] "23478"     "4105"      "107"       "427"       "347344"    "5079"     
[115] "10227"     "22891"     "7073"      "23366"     "64795"     "51232"    
[121] "390502"    "10473"     "116987"    "10129"     "3600"      "100996761"
[127] "6015"      "6643"      "639"       "10489"     "740"       "1586"     
[133] "5027"      "22906"     "56061"     "29906"     "6787"      "2267"     
[139] "954"       "4726"      "5152"      "9831"      "2215"      "1783"     
[145] "81578"     "563"       "203"       "100134444" "6753"      "8720"     
[151] "395"       "5266"      "9394"      "5036"      "2208"      "2923"     
[157] "7251"      "23233"     "1845"      "29984"     "1755"      "4286"     
[163] "25849"     "6152"      "6584"      "5606"      "10574"     "440387"   
[169] "5315"      "53916"     "5203"      "26135"     "8289"      "7539"     
[175] "51382"     "2147"      "3352"      "940"       "26030"     "10562"    
[181] "4621"      "9252"      "4496"      "84159"     "23135"     "9403"     
[187] "6534"      "6541"      "1442"      "5547"      "4289"      "260294"   
[193] "10645"     "6005"      "400818"    "9710"      "11046"     "100532731"
[199] "5371"      "100505984"
> p <- new("GOHyperGParams", geneIds = fake, universeGeneIds = univ, ontology = "BP", annotation="hgu95av2.db")
> hyp <- hyperGTest(p)
> summary(hyp)
       GOBPID       Pvalue OddsRatio   ExpCount Count Size
1  GO:0014854 0.0009078414 20.880952  0.2045877     3   10
2  GO:0010761 0.0015275690  9.321503  0.5114693     4   25
3  GO:0061512 0.0016149335 16.236626  0.2455053     3   12
4  GO:0006862 0.0025927574 13.281145  0.2864228     3   14
5  GO:0019674 0.0028384845  5.710029  0.9820211     5   48
6  GO:0009135 0.0031991557  4.548621  1.4525728     6   71
7  GO:0009179 0.0031991557  4.548621  1.4525728     6   71
8  GO:0009185 0.0034321033  4.479131  1.4730316     6   72
9  GO:0007568 0.0040824259  2.735963  4.3372598    11  212
10 GO:0009132 0.0057756577  3.990821  1.6367018     6   80
11 GO:0014732 0.0059122160 24.220859  0.1227526     2    6
12 GO:0014870 0.0059122160 24.220859  0.1227526     2    6
13 GO:0060586 0.0059122160 24.220859  0.1227526     2    6
14 GO:0006734 0.0059750470  6.108696  0.7365158     4   36
15 GO:0046031 0.0069336748  4.540509  1.2070676     5   59
16 GO:0006165 0.0079752631  4.377232  1.2479851     5   61
17 GO:0009629 0.0081662746 19.374233  0.1432114     2    7
18 GO:0014891 0.0081662746 19.374233  0.1432114     2    7
19 GO:0044803 0.0081662746 19.374233  0.1432114     2    7
20 GO:0051709 0.0081662746 19.374233  0.1432114     2    7
21 GO:0009059 0.0089531826  1.482393 55.0954743    70 2693
22 GO:0051881 0.0094937049  5.279839  0.8388097     4   41
23 GO:0019362 0.0096150668  3.553990  1.8208308     6   89
24 GO:0046496 0.0096150668  3.553990  1.8208308     6   89
25 GO:0046939 0.0097343239  4.153072  1.3093614     5   64
                                                  Term
1                               response to inactivity
2                                 fibroblast migration
3                       protein localization to cilium
4                                 nucleotide transport
5                                NAD metabolic process
6      purine nucleoside diphosphate metabolic process
7  purine ribonucleoside diphosphate metabolic process
8         ribonucleoside diphosphate metabolic process
9                                                aging
10            nucleoside diphosphate metabolic process
11                             skeletal muscle atrophy
12                       response to muscle inactivity
13       multicellular organismal iron ion homeostasis
14                              NADH metabolic process
15                               ADP metabolic process
16              nucleoside diphosphate phosphorylation
17                                 response to gravity
18                             striated muscle atrophy
19                multi-organism membrane organization
20    regulation of killing of cells of other organism
21                  macromolecule biosynthetic process
22      regulation of mitochondrial membrane potential
23               pyridine nucleotide metabolic process
24           nicotinamide nucleotide metabolic process
25                          nucleotide phosphorylation

## try without specifying the universeGeneIds argument
> p <- new("GOHyperGParams", geneIds = fake, ontology = "BP", annotation="hgu95av2.db")

## try with "hgu95av2" rather than "hgu95av2.db"
> p <- new("GOHyperGParams", geneIds = fake, ontology = "BP", annotation="hgu95av2")

So I cannot replicate the error you get, using the current released version.

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] GOstats_2.36.0       graph_1.48.0         Category_2.36.0     
 [4] GO.db_3.2.2          Matrix_1.2-2         hgu95av2.db_3.2.2   
 [7] org.Hs.eg.db_3.2.3   RSQLite_1.0.0        DBI_0.3.1           
[10] AnnotationDbi_1.32.1 IRanges_2.4.4        S4Vectors_0.8.3     
[13] Biobase_2.30.0       BiocGenerics_0.16.1

loaded via a namespace (and not attached):
 [1] splines_3.2.2          xtable_1.8-0           lattice_0.20-33       
 [4] tools_3.2.2            grid_3.2.2             AnnotationForge_1.12.0
 [7] genefilter_1.52.0      survival_2.38-3        RBGL_1.46.0           
[10] GSEABase_1.32.0        compiler_3.2.2         XML_3.98-1.3          
[13] annotate_1.48.0      
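
The example above only uses ontology = "BP". Since the question asks about all three ontologies, the following is a rough sketch of how the same call could be repeated for BP, MF and CC. The helper run_go_tests() is a hypothetical name introduced here, not a GOstats function; the arguments passed to new("GOHyperGParams", ...) are the ones already used in this thread. The example part only runs if GOstats and hgu95av2.db are installed.

```r
# Sketch only: run the over-representation test once per GO ontology.
# run_go_tests() is a hypothetical helper, not part of GOstats.
ontologies <- c("BP", "MF", "CC")

run_go_tests <- function(gene_ids, universe_ids, annotation = "hgu95av2.db") {
  results <- lapply(ontologies, function(ont) {
    params <- new("GOHyperGParams",
                  geneIds = gene_ids,
                  universeGeneIds = universe_ids,
                  annotation = annotation,
                  ontology = ont,
                  pvalueCutoff = 0.05,
                  conditional = FALSE,
                  testDirection = "over")
    summary(hyperGTest(params))
  })
  names(results) <- ontologies
  results
}

# Only run the example when both packages are installed
if (requireNamespace("GOstats", quietly = TRUE) &&
    requireNamespace("hgu95av2.db", quietly = TRUE)) {
  library(GOstats)
  library(hgu95av2.db)
  univ <- unique(select(hgu95av2.db, keys(hgu95av2.db), "ENTREZID")[, 2])
  univ <- univ[!is.na(univ)]   # drop probes without an Entrez Gene ID
  set.seed(1)                  # reproducible fake gene set
  fake <- sample(univ, 200)    # random genes, for illustration only
  go_results <- run_go_tests(fake, univ)
  print(lapply(go_results, head, n = 3))
}
```

With a real analysis, fake would be replaced by the Entrez Gene IDs of the genes actually found to be interesting.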

libya.tahani wrote (23 months ago):

First, thanks for the reply. Yes, I am following code from a book and trying to analyse DNA microarray data step by step as it is written there, because I am still new to this field. When I ran into this error I posted it here to get some advice.

I'm using this:

 

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

 

locale:
[1] LC_COLLATE=Arabic_Libya.1256  LC_CTYPE=Arabic_Libya.1256    LC_MONETARY=Arabic_Libya.1256 LC_NUMERIC=C                 
[5] LC_TIME=Arabic_Libya.1256    

 

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

 

other attached packages:
 [1] annotate_1.46.1      XML_3.98-1.3         globaltest_5.22.0    KEGGREST_1.8.1       KEGG.db_3.1.2        xtable_1.8-0        
 [7] GOstats_2.34.0       graph_1.46.0         Category_2.34.2      GO.db_3.1.2          Matrix_1.2-3         org.Hs.eg.db_3.1.2  
[13] RSQLite_1.0.0        DBI_0.3.1            AnnotationDbi_1.30.1 GenomeInfoDb_1.4.3   IRanges_2.2.9        S4Vectors_0.6.6     
[19] hgu95av2_2.2.0       marray_1.46.0        limma_3.24.15        simpleaffy_2.44.0    gcrma_2.40.0         genefilter_1.50.0   
[25] hgu95av2cdf_2.16.0   affy_1.46.1          Biobase_2.28.0       BiocGenerics_0.14.0  BiocInstaller_1.18.5

 

loaded via a namespace (and not attached):
 [1] XVector_0.8.0          tools_3.2.2            zlibbioc_1.14.0        preprocessCore_1.30.0  lattice_0.20-33       
 [6] png_0.1-7              stringr_1.0.0          httr_1.0.0             hgu95av2.db_3.1.3      Biostrings_2.36.4     
[11] grid_3.2.2             GSEABase_1.30.2        R6_2.1.1               survival_2.38-3        RBGL_1.44.0           
[16] magrittr_1.5           splines_3.2.2          AnnotationForge_1.10.1 stringi_1.0-1          affyio_1.36.0      

When I try to run the code from your answer above, I get this:

> univ <- unique(select(hgu95av2.db, keys(hgu95av2.db), "ENTREZID")[,2])
Error in unique(select(hgu95av2.db, keys(hgu95av2.db), "ENTREZID")[, 2]) : 
  error in evaluating the argument 'x' in selecting a method for function 'unique': Error in select(hgu95av2.db, keys(hgu95av2.db), "ENTREZID") : 
  error in evaluating the argument 'x' in selecting a method for function 'select': Error: object 'hgu95av2.db' not found

What should I do about this?

Tahani.

 


If you are replying to something, click the 'ADD COMMENT' link below what they wrote, and then write in the box that pops up, instead of using the 'Add your answer' box that you see below. By definition you are not adding an answer if you are asking another question or commenting on what somebody else has said.

To answer your question: if you ever get an error like "Error: object 'hgu95av2.db' not found", this is R telling you that it can't find that thing. If the thing is a package, that's because you haven't loaded the package yet.
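
As a rough illustration of that check, here is a small base-R sketch that tests whether a package is attached and attaches it if it is installed. The helper is_attached() is a hypothetical name introduced here. Note that the sessionInfo() output above lists hgu95av2.db under "loaded via a namespace (and not attached)", which is consistent with the object not being visible at the prompt.

```r
# Hypothetical helper: is a package attached to the search path?
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

pkg <- "hgu95av2.db"
if (!is_attached(pkg)) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    # Installed but not attached: attach it so its objects become visible
    library(pkg, character.only = TRUE)
  } else {
    message(pkg, " is not installed; install it from Bioconductor first.")
  }
}
```

In an interactive session the same effect is usually achieved by simply calling library(hgu95av2.db) before using the object.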

While R and Bioconductor are free to use, and the support on this site is free as well, they do not come without a cost, and that cost is the time you have to spend to learn how to use R and Bioconductor tools. I realize you are a beginner, and R is tough to learn at first. But you should look at the help you get on this (or any support forum) as a precious resource that you don't want to use up.

Asking questions that could just as easily be answered yourself by a quick Google search (searching for 'R error object not found' came up with tons of useful links) is a dangerous thing to do, because people might start to think that you value your own time more than you value theirs.

You don't want that to happen! If you get an error you don't understand, use Google. Try to figure things out on your own. This will help you learn R much faster as well. If you are completely stuck, and cannot see a way forward, then ask a question.

(Reply written 23 months ago by James W. MacDonald)
libya.tahani wrote (23 months ago):

About hgu133aENTREZID: sorry, that was a typing mistake on my part. I meant:

 allg<-get("hgu95av2ENTREZID")
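
With the array name corrected, the lookup in the original question amounts to mapping the selected probe IDs to Entrez Gene IDs, dropping probes without a mapping, and removing duplicates. The following is a minimal base-R sketch of that logic. The lookup table and probe names are made up and only stand in for the hgu95av2ENTREZID mapping; they are not real annotation data.

```r
# Hypothetical lookup table: probe ID -> Entrez Gene ID (values are made up)
probe2entrez <- c(probe_a = "101", probe_b = "102",
                  probe_c = "102", probe_d = NA)

# Hypothetical probes flagged as interesting by an earlier analysis step
interesting_probes <- c("probe_b", "probe_c", "probe_d")

# Map probes to gene IDs, drop unmapped probes, remove duplicate genes
ids   <- unname(probe2entrez[interesting_probes])
myids <- unique(ids[!is.na(ids)])

# The universe is every gene the array can measure, not just the selected ones
universe <- unique(unname(probe2entrez[!is.na(probe2entrez)]))

print(myids)
print(universe)
```

Keeping myids as a subset of universe is what allows the enrichment test to compare the selected genes against a background.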

 
