Question

Pathview with non-KEGG organism


Christian De Santis ▴ 150

@christian-de-santis-6143

Last seen 9.6 years ago

Hi Weijun,

I am new to BIOC and the Pathview/GAGE packages. I am analysing microarray data from an experiment on Atlantic salmon and would like to visualize the results in Pathview, if possible.

Following up on a previous thread (https://stat.ethz.ch/pipermail/bioconductor/2013-August/054161.html), I have been trying to do something similar and I believe I have hit the same limitation. Like the previous user, I obtained KEGG Orthology annotation using KAAS. Briefly, the main steps of my workflow are:

    > DIET12_14_KO <- read.csv("DIET12_14_KO.csv",header=T, sep=",") # Upload the KEGG annotation file from KAAS
    > DIET12_14_KO[1:3,]
         ProbeName     KO
    1 Omy#AB024321 K04079
    2 Omy#BG360545 K13506
    3 Omy#BX072887 K00412
    > MAlist[1:3,1:6] # Visualize my expression list
                      DIET14    DIET14.1   DIET14.2   DIET14.3      DIET02    DIET02.1
    Omy#AB024321  0.06296557  0.08865075  0.1186315 -0.1847021 -0.41212414 -0.42385673
    Omy#BG360545 -0.50762181 -0.35763304 -0.4939668 -0.6973216 -0.11339368  0.15489712
    Omy#BX072887  0.23447458  0.22487856  0.3930821  0.1515031 -0.04694996 -0.04836203
    > dim(MAlist)
    [1] 7955   16
    > D2 <- as.matrix(DIET12_14_KO) # create the two column character matrix for id.map argument
    > D2[1:3,]
         ProbeName      KO
    [1,] "Omy#AB024321" "K04079"
    [2,] "Omy#BG360545" "K13506"
    [3,] "Omy#BX072887" "K00412"
    > gene.data <- mol.sum(MAlist, id.map = D2)
    > gene.data [1:3,1:6]
               DIET14     DIET14     DIET14     DIET14       DIET02     DIET02
    K00006  0.7170382  0.5351467  0.1207924  0.1782242  0.228860514 -0.5426538
    K00008 -0.8112601 -0.5910453 -0.7691811 -0.1919992 -0.003848065  0.1771637
    K00011  1.9645823  1.2305297  2.3335377  1.4813718  0.185036373 -1.2886788
    > dim(gene.data)
    [1] 2449   16

I am a bit stuck here. I should now have the data in the correct format for the pathview argument "gene.data", with genes as rows, samples as columns, and KO IDs as row names. As I understand it, to proceed I now need KO gene set data for a non-model species. Or could I use one from a closely related species such as zebrafish?

Also, one thing that is not clear to me is whether gene.data should include the expression values of all samples (i.e. biological replicates) or the average value per treatment.

Your help will be very much appreciated.

Regards,
Christian
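For readers following along, the probe-to-KO collapsing step in the workflow above can be illustrated in base R. This is only a sketch of the idea, not pathview's own code: it assumes that probes sharing a KO ID are summed (which, as far as I understand, is what mol.sum does by default) and that unmapped probes are dropped. The probe names, KO IDs and values below are made up for illustration.

```r
# Toy expression matrix: probes in rows, samples in columns (made-up values).
expr <- matrix(
  c(0.1, 0.2,
    0.3, 0.4,
    0.5, 0.6,
    0.7, 0.8),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("probeA", "probeB", "probeC", "probeD"), c("S1", "S2"))
)

# Two-column character map, same layout as D2: probe ID, then KO ID.
# probeD has no KO annotation, so it is absent from the map.
id.map <- cbind(ProbeName = c("probeA", "probeB", "probeC"),
                KO = c("K00001", "K00001", "K00002"))

# Look up each probe in the map and keep only the mapped ones.
idx <- match(rownames(expr), id.map[, "ProbeName"])
mapped <- !is.na(idx)

# Sum the rows that share a KO ID; row names of the result are KO IDs.
ko.data <- rowsum(expr[mapped, , drop = FALSE],
                  group = id.map[idx[mapped], "KO"])
ko.data
```

The result has one row per KO ID, which is the shape expected for the gene.data argument.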

Annotation pathview • 1.7k views

updated 10.6 years ago by Luo Weijun ★ 1.6k • written 10.6 years ago by Christian De Santis ▴ 150

score 0 · Answer 1 · 2013-09-16

Christian,

You've done the gene ID mapping to KO correctly. To proceed with the GAGE pathway analysis, you will need the KO gene set data (which I will send you next). The KO gene set data will also be provided in the next release of the gageData package.

To see whether KEGG includes your research species, you may check:

    library(pathview)
    data(korg)
    head(korg)

If it is included, you don't really have to map your gene IDs to KO, given that you can get the corresponding gene set data.

As you have multiple samples/replicates, you may choose to visualize the average gene expression of all samples together, or each individual sample separately, using Pathview. From the next devel release (version 1.17), Pathview will also be able to integrate/plot multiple states/samples on the same graph by splitting each node: http://bioconductor.org/packages/devel/bioc/html/pathview.html. So stay tuned.

HTH.
Weijun
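To make the "average per treatment" option concrete, here is a minimal base R sketch. The matrix values are made up, and treating DIET02 as the reference group is purely an assumption for illustration. The helper plot_ko_pathway is hypothetical; it assumes a pathview version that accepts species = "ko" for KEGG Orthology pathways, and it is only defined here, not run.

```r
# Toy KO-level matrix with replicate columns (made-up values).
ko.data <- matrix(
  c( 0.7,  0.5, 0.2, -0.5,
    -0.8, -0.6, 0.0,  0.2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("K00006", "K00008"),
                  c("DIET14", "DIET14", "DIET02", "DIET02"))
)

# One treatment label per column; here the column names are the labels.
treatment <- colnames(ko.data)

# Average the replicate columns within each treatment.
ko.mean <- sapply(unique(treatment), function(tr) {
  rowMeans(ko.data[, treatment == tr, drop = FALSE])
})

# Difference between group means (assumption: DIET02 is the reference).
ko.diff <- ko.mean[, "DIET14"] - ko.mean[, "DIET02"]

# Hypothetical helper: draw one KO pathway from a KO-named vector or matrix.
# Needs the pathview package and access to KEGG, so it is not called here.
plot_ko_pathway <- function(gene.data, pathway.id) {
  library(pathview)
  pathview(gene.data = gene.data, pathway.id = pathway.id,
           species = "ko", out.suffix = "ko.mean")
}
```

With pathview installed, something like plot_ko_pathway(ko.diff, "00010") would be the next step; passing the full replicate matrix instead gives one view per sample.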

score 0 · Answer 2 · 2013-09-16

KO gene set data attached. You may load it and proceed as described previously in: https://stat.ethz.ch/pipermail/bioconductor/2013-August/054161.html
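For anyone without the attachment, a rough sketch of how the GAGE step might look with KO gene sets follows. It assumes that the gageData package now ships the KO gene sets as kegg.sets.ko, and that DIET02 columns are the reference and DIET14 columns the treated samples; both are assumptions, so adjust to your own design. The helper run_ko_gage is hypothetical and is only defined here, not run.

```r
# Column labels as in the KO-level matrix (toy example with four columns).
sample.labels <- c("DIET14", "DIET14", "DIET02", "DIET02")

# Column indices for the two groups (assumption: DIET02 is the reference).
samp.idx <- which(sample.labels == "DIET14")
ref.idx  <- which(sample.labels == "DIET02")

# Hypothetical helper: run GAGE on a KO-named expression matrix.
# Needs the gage and gageData packages, so it is not called here.
run_ko_gage <- function(ko.data, ref.idx, samp.idx) {
  library(gage)
  # Load the KO gene sets into this function's environment.
  data(kegg.sets.ko, package = "gageData", envir = environment())
  gage(ko.data, gsets = kegg.sets.ko,
       ref = ref.idx, samp = samp.idx, compare = "unpaired")
}
```

The compare = "unpaired" setting is a guess for a design with independent replicates per diet; a paired design would use the default.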