Question

Argument gene.data for the correct implementation of function pathview of the pathview package


svlachavas ▴ 840

@svlachavas-7225

Last seen 20 days ago

Germany/Heidelberg/German Cancer Resear…

Dear ALL,

Following one of my previous questions about the function mroast (https://support.bioconductor.org/p/66191/#66209), I would like to ask about the correct use of the argument "gene.data" of the pathview function, in order to plot specific KEGG pathways that I found to be differentially expressed with mroast. My code so far is below.

The design matrix of my paired limma analysis:

   (Intercept) conditionCancer pairs2 pairs3 pairs4 pairs5 pairs6 pairs7 pairs8 pairs9 pairs10 pairs11 .....
1            1               0      0      0      0      0      0      0      0      0       0       0
2            1               1      0      0      0      0      0      0      0      0       0       0
3            1               0      1      0      0      0      0      0      0      0       0       0
4            1               1      1      0      0      0      0      0      0      0       0       0
5            1               0      0      1      0      0      0      0      0      0       0       0
6            1               1      0      1      0      0      0      0      0      0       0       0
7            1               0      0      0      1      0      0      0      0      0       0       0
8            1               1      0      0      1      0      0      0      0      0       0       0
9            1               0      0      0      0      1      0      0      0      0       0       0
10           1               1      0      0      0      1      0      0      0      0       0       0
11           1               0      0      0      0      0      1      0      0      0       0       0
12           1               1      0      0      0      0      1      0      0

library(limma)
library(hgu133a.db)

# Map KEGG pathway IDs to probe sets and test each pathway with mroast
x <- hgu133aPATH2PROBE
mapped_probes <- mappedkeys(x)
xx <- as.list(x[mapped_probes])
indices <- ids2indices(xx, rownames(data.trusted.eset))
res <- mroast(data.trusted.eset, indices, design, contrast=2)

library(pathview)

# Look up the Entrez ID of each probe set in fit2,
# where fit2 is the output of eBayes: fit2 <- eBayes(fit, trend=TRUE)
x <- hgu133aENTREZID
xx <- as.list(x)
entrezid <- sapply(rownames(fit2), function(x) xx[x], USE.NAMES=FALSE)

gene.data <- fit2$coefficients
rownames(gene.data) <- entrezid

path <- pathview(gene.data=gene.data....)

So, regarding the argument gene.data: should I pass all the coefficients, or only the specific coefficient I am interested in, which is "conditionCancer" (the one I selected with contrast=2 in the mroast call above)?

head(fit2$coefficients)
            (Intercept) conditionCancer      pairs2      pairs3     pairs4     pairs5      pairs6     pairs7
1007_s_at     10.215808      0.22343373  0.52147061  0.60510597  1.0420911  0.5031775  0.06009372  0.1640625
1438_at        5.911114      2.53652962  0.70450123 -1.01602942 -0.6130980 -1.1538897 -1.33425754 -2.3958104
1487_at        9.050908     -0.20654769 -0.06396680  0.28847187 -0.0528669  0.3469220  0.04466553  0.4973197
1598_g_at     10.279677     -1.11014791 -0.61752751  0.12946488  0.4559134  0.4962469 -0.05954733 -0.1744488
1729_at        7.941900     -0.02386093  0.69056733  0.05095071  0.7857444  0.8638121  0.16643800  0.2573498
200000_s_at    9.991276      0.04892901  0.08502522 -0.35774486  0.2756260 -0.3458185 -0.71372258 -0.1826813

mroast limma pathview pathway analysis • 1.9k views

updated 10.3 years ago by Luo Weijun ★ 1.6k • written 10.3 years ago by svlachavas ▴ 840

score 0 · Answer 1 · 2015-03-31


Kamil Slowikowski ▴ 30

@kamil-slowikowski-6901

Last seen 14 months ago

United States

If you wish to view the difference between cancer and control samples, then you should use the second column of your coefficients matrix. The pathview figure will show a network of genes that are colored by log2 fold-change between cancer and control.

From ?pathview:

gene.data

either vector (single sample) or a matrix-like data (multiple sample). Vector should be numeric with gene IDs as names or it may also be character of gene IDs. Character vector is treated as discrete or count data. Matrix-like data structure has genes as rows and samples as columns. Row names should be gene IDs. Here gene ID is a generic concepts, including multiple types of gene, transcript and protein uniquely mappable to KEGG gene IDs. KEGG ortholog IDs are also treated as gene IDs as to handle metagenomic data. Check details for mappable ID types. Default gene.data=NULL.
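
To make the column choice concrete, here is a minimal sketch in base R. It uses the first three probe sets and three of the columns from the `head(fit2$coefficients)` output in the question; the object `coefs` is only a stand-in for `fit2$coefficients`, not the real fit.

```r
# Toy stand-in for fit2$coefficients (values copied from the question's output)
coefs <- matrix(
  c(10.215808,  0.22343373,  0.52147061,
     5.911114,  2.53652962,  0.70450123,
     9.050908, -0.20654769, -0.06396680),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("1007_s_at", "1438_at", "1487_at"),
                  c("(Intercept)", "conditionCancer", "pairs2"))
)

# Selecting by column name is less fragile than selecting by position (column 2)
gene.data <- coefs[, "conditionCancer"]

# The result is a named numeric vector with one value per probe set
is.numeric(gene.data)
names(gene.data)
```

Because a single column is selected, the result is a vector rather than a matrix, which corresponds to the "single sample" case in the help text quoted above.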

written 10.3 years ago by Kamil Slowikowski ▴ 30

Dear Kamil,

thank you again for your reply. I also checked the function's documentation.

I first tried

gene.data <- fit2$coefficients[,2] # column 2 is the coefficient for cancer vs. control samples

but then, when I ran

rownames(gene.data) <- entrezid
Error in `rownames<-`(`*tmp*`, value = list(`1007_s_at` = NA, `1438_at` = "2049", :
attempt to set 'rownames' on an object with no dimensions

so I used gene.data <- as.matrix(gene.data)

and then rownames(gene.data) <- entrezid

and no error appeared. However, my main concern is that, because of entrezid, various row names are NA or duplicated. Do you think I should remove them from the entrezid object before assigning it to the row names?
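
For context on the error above: selecting one column with `[,2]` drops the matrix dimensions, so the result is a plain vector, which has `names()` rather than `rownames()`. The following is a minimal base R sketch of one way to attach the IDs and drop probe sets without an Entrez ID. The objects are toy stand-ins: the first two IDs are taken from the error message, while the third ID ("9999") is hypothetical.

```r
# Toy stand-ins; "9999" is a made-up ID used only for illustration
gene.data <- c("1007_s_at" = 0.22343373,
               "1438_at"   = 2.53652962,
               "1487_at"   = -0.20654769)
entrezid  <- list("1007_s_at" = NA, "1438_at" = "2049", "1487_at" = "9999")

# Flatten the list to a character vector, keeping the probe order
ids <- as.character(unlist(entrezid, use.names = FALSE))

# Keep only probe sets that have an Entrez ID, then rename by that ID;
# a vector takes names(), so as.matrix() is not needed
keep <- !is.na(ids)
gene.data <- setNames(gene.data[keep], ids[keep])
gene.data
```

This handles only the NA entries; duplicated IDs would still need to be combined or filtered separately.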

written 10.3 years ago by svlachavas ▴ 840

score 0 · Answer 2 · 2015-03-31

Pathview can visualize various kinds of data values: log2 fold changes, absolute expression levels, coefficients, even p-values. Make sure you keep track of which values you supplied when you interpret the graphs. It is recommended that you combine redundant gene or molecular IDs before passing the data to pathview. You can use the mol.sum function from pathview to do that; check ?mol.sum for details.
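
As a rough illustration of what combining redundant IDs means, the sketch below averages values that share an Entrez ID, using base R only. It is not `mol.sum` itself (see `?mol.sum` for that function's actual arguments); the IDs and values are made up, and `collapse_ids` is a hypothetical helper name.

```r
# Hypothetical data: two probe sets map to the same Entrez ID "2049"
gene.data <- c("2049" = 2.5, "2049" = 1.5, "7157" = -1.0)

# Hypothetical helper: combine values that share an ID with a summary function
collapse_ids <- function(x, fun = mean) {
  out <- tapply(x, names(x), fun)        # one value per unique ID
  setNames(as.numeric(out), names(out))  # back to a plain named vector
}

gene.vec <- collapse_ids(gene.data)
gene.vec
```

A vector like `gene.vec` could then be passed on, for example as `pathview(gene.data = gene.vec, pathway.id = "04110", species = "hsa", gene.idtype = "entrez")`, where the pathway ID is only an example. Whether the mean, the maximum, or another summary is appropriate depends on the data.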