Can I create a gene set using GAGE?
Emilia ▴ 40
@emiliabaffo

Hello! I want to run a GSEA on the list of genes that I got from a differential expression analysis. I want to include a pathway that we're working with but that doesn't exist as such in KEGG. Its genes are of course part of other KEGG pathways (TP53, SNAI1, SNAI2, PRKAA1/2, etc.). Is there a way I can specify this pathway of interest manually?

From what I've read, I understand that it is possible, but I don't quite understand how. I checked the gage vignette, which says "In addition, the users may derive other own gene sets using the kegg.gsets and go.gsets functions", so I looked at the kegg.gsets function here, but I honestly didn't understand the example given or any of its steps. I ran the example code to see if that shed any light on the matter, but I still don't get it: I don't understand why each step is done or what I am supposed to get from the objects created.

Sorry if it's a really dumb question, but can anyone help me understand how it's done? Or does anyone have a link to another resource where each step is explained in detail?

gage pathview gageData
@james-w-macdonald-5106

Here's an example:

> library(gage)
> z <- kegg.gsets()
## what is this?
> class(z)
[1] "list"

## what's in it?
> sapply(z, class)
   kg.sets sigmet.idx    sig.idx    met.idx   dise.idx 
    "list"  "integer"  "integer"  "integer"  "integer" 

## Note that the man page for kegg.gsets would tell you this as well

> z$kg.sets[1:3]
$`hsa00010 Glycolysis / Gluconeogenesis`
 [1] "10327"  "124"    "125"    "126"    "127"    "128"    "130"    "130589"
 [9] "131"    "160287" "1737"   "1738"   "2023"   "2026"   "2027"   "217"   
[17] "218"    "219"    "2203"   "221"    "222"    "223"    "224"    "226"   
[25] "229"    "230"    "2538"   "2597"   "26330"  "2645"   "2821"   "3098"  
[33] "3099"   "3101"   "387712" "3939"   "3945"   "3948"   "441531" "501"   
[41] "5105"   "5106"   "5160"   "5161"   "5162"   "5211"   "5213"   "5214"  
[49] "5223"   "5224"   "5230"   "5232"   "5236"   "5313"   "5315"   "55276" 
[57] "55902"  "57818"  "669"    "7167"   "80201"  "83440"  "84532"  "8789"  
[65] "92483"  "92579"  "9562"  

$`hsa00020 Citrate cycle (TCA cycle)`
 [1] "1431"  "1737"  "1738"  "1743"  "2271"  "3417"  "3418"  "3419"  "3420" 
[10] "3421"  "4190"  "4191"  "47"    "48"    "4967"  "50"    "5091"  "5105" 
[19] "5106"  "5160"  "5161"  "5162"  "55753" "6389"  "6390"  "6391"  "6392" 
[28] "8801"  "8802"  "8803" 

$`hsa00030 Pentose phosphate pathway`
 [1] "132158" "2203"   "221823" "226"    "229"    "22934"  "230"    "2539"  
 [9] "25796"  "2821"   "414328" "51071"  "5211"   "5213"   "5214"   "5226"  
[17] "5236"   "55276"  "5631"   "5634"   "6120"   "64080"  "6888"   "7086"  
[25] "729020" "8277"   "84076"  "8789"   "9104"   "9563"

So this thing we made is a list with five entries. The first entry is another list, whose names are the KEGG pathways and whose values are the NCBI Gene IDs of the genes in each pathway. The other entries are useful for some things, but you don't need to concern yourself with them.
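
For anyone who does end up using the other entries: they appear to be integer positions into kg.sets (for example, the signaling and metabolic pathways), so subsetting kg.sets by one of them will not pick up anything appended later. Here is a minimal base-R sketch of that behaviour. The object z_toy and all of its index values are made up for illustration; it only mimics the shape of what kegg.gsets() returns.

```r
# Toy stand-in for the list returned by kegg.gsets(); index values are hypothetical.
z_toy <- list(
  kg.sets = list(
    "hsa00010 Glycolysis / Gluconeogenesis" = c("10327", "124", "125"),
    "hsa00020 Citrate cycle (TCA cycle)" = c("1431", "1737", "1738"),
    "hsa05418 Fluid shear stress and atherosclerosis" = c("10000", "1003", "7157")
  ),
  sigmet.idx = 1:2,
  sig.idx = integer(0),
  met.idx = 1:2,
  dise.idx = 3L
)

# The index vectors select pathways by position.
metabolic <- z_toy$kg.sets[z_toy$met.idx]
names(metabolic)

# Append a custom set, then subset by an index vector: the custom set is left out.
z_toy$kg.sets <- c(z_toy$kg.sets, list(newpathway = c("7157", "6615")))
subset_only <- z_toy$kg.sets[z_toy$sigmet.idx]
"newpathway" %in% names(subset_only)   # FALSE

# To keep it, add its position (the last element) explicitly.
with_new <- z_toy$kg.sets[c(z_toy$sigmet.idx, length(z_toy$kg.sets))]
names(with_new)
```

So if you only ever use the full kg.sets list, the index vectors really can be ignored; they matter only when you subset.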

Assuming you want to do an analysis using all the KEGG pathways PLUS your new one, you just need to append your new one to the end. So you need to make a list with the right name and some NCBI Gene IDs. Here's a fake example:

> library(org.Hs.eg.db)
> mygenes <- c("TP53", "SNAI1", "SNAI2", "PRKAA1", "PRKAA2")
> mygeneids <- mapIds(org.Hs.eg.db, mygenes, "ENTREZID", "ALIAS")
'select()' returned 1:1 mapping between keys and columns
> mygeneids
  TP53  SNAI1  SNAI2 PRKAA1 PRKAA2 
"7157" "6615" "6591" "5562" "5563" 
> newpath <- list("newpathway" = mygeneids)
> newpath
$newpathway
  TP53  SNAI1  SNAI2 PRKAA1 PRKAA2 
"7157" "6615" "6591" "5562" "5563" 

> z[[1]] <- c(z[[1]], newpath)

> z[[1]][(length(z[[1]]) - 1):length(z[[1]])]

$`hsa05418 Fluid shear stress and atherosclerosis`
  [1] "10000"  "1003"   "10365"  "1147"   "119391" "1432"   "1499"   "1514"  
  [9] "1535"   "163688" "1728"   "1843"   "1906"   "207"    "208"    "221357"
 [17] "2353"   "25828"  "27035"  "2817"   "2938"   "2939"   "2940"   "2941"  
 [25] "2944"   "2946"   "2947"   "2948"   "2949"   "2950"   "2952"   "2953"  
 [33] "3162"   "3320"   "3326"   "3383"   "3458"   "3551"   "3552"   "3553"  
 [41] "3554"   "3674"   "3685"   "3690"   "3725"   "3791"   "387"    "387082"
 [49] "406902" "4205"   "4208"   "4217"   "4257"   "4258"   "4259"   "4313"  
 [57] "4318"   "445"    "4688"   "4780"   "4790"   "4846"   "4880"   "5154"  
 [65] "5155"   "51588"  "5175"   "51806"  "5290"   "5291"   "5293"   "5295"  
 [73] "5296"   "5327"   "5562"   "5563"   "5590"   "5598"   "5599"   "5600"  
 [81] "5601"   "5602"   "5603"   "5607"   "5608"   "5609"   "5747"   "5879"  
 [89] "5880"   "5881"   "59341"  "596"    "5970"   "60"     "6300"   "6347"  
 [97] "6382"   "6383"   "6385"   "6401"   "6416"   "652"    "653361" "653689"
[105] "657"    "658"    "659"    "6612"   "6613"   "6714"   "6885"   "7056"  
[113] "71"     "7124"   "7132"   "7157"   "7184"   "7295"   "7341"   "7412"  
[121] "7422"   "7850"   "801"    "805"    "808"    "810"    "8503"   "8517"  
[129] "857"    "858"    "859"    "8878"   "90"     "9181"   "91860"  "92"    
[137] "93"     "9446"   "9817"  

$newpathway
  TP53  SNAI1  SNAI2 PRKAA1 PRKAA2 
"7157" "6615" "6591" "5562" "5563"

And you can see that we have tacked the new pathway onto the end of the existing ones.
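
One thing to watch when you actually run the analysis: as far as I recall from the gage man page, gage() has a set.size argument that defaults to c(10, 500), and gene sets outside that range are skipped. A five-gene set like the fake one above would then be dropped without complaint. The sketch below is plain base R and only checks raw set sizes against that assumed default. The gsets list is a toy subset, and exprs, ref_idx and samp_idx in the commented call are hypothetical objects you would have to supply yourself.

```r
# Named list of Entrez ID vectors (toy subset of the session above).
gsets <- list(
  "hsa00020 Citrate cycle (TCA cycle)" = c("1431", "1737", "1738", "1743", "2271",
                                           "3417", "3418", "3419", "3420", "3421"),
  newpathway = c("7157", "6615", "6591", "5562", "5563")
)

# Assumed to match gage()'s documented default; check ?gage for your version.
set_size <- c(10, 500)

# Raw set sizes. gage itself counts only genes present in the expression data,
# so the effective sizes can be smaller than these.
sizes <- lengths(gsets)
kept <- sizes >= set_size[1] & sizes <= set_size[2]
kept

# With a real expression matrix whose row names are Entrez IDs, the lower bound
# could be relaxed so the small custom set is tested (hypothetical objects):
# res <- gage(exprs, gsets = z$kg.sets, ref = ref_idx, samp = samp_idx,
#             set.size = c(5, 500))
```

Keep in mind that tests on very small gene sets are not very robust statistically, so a more complete version of the custom pathway is preferable to just lowering the bound.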

If the code I show above is confusing to you, then you need to read An Introduction to R at the very least, and get up to speed with R in general. You cannot expect to do high-level analyses with a low-level understanding of the tool you intend to use.


That's very clear, thank you very much!
