Appropriate use of the function mroast to find KEGG pathways for differential expression
3
0
Entering edit mode
svlachavas ▴ 830
@svlachavas-7225
Last seen 13 months ago
Germany/Heidelberg/German Cancer Resear…

Dear All,

i have performed a limma paired analysis on an affymetrix expression dataset(13 cancer vs 13 control samples).

My initial non-specific filtered expressionSet which was used as input for the limma analysis: data.trusted.eset

and my design matrix looks like this:

(Intercept) conditionCancer pairs2 pairs3 pairs4 pairs5 pairs6 pairs7 pairs8 pairs9 pairs10 pairs11.....
1            1               0      0      0      0      0      0      0      0      0       0       0
2            1               1      0      0      0      0      0      0      0      0       0       0
3            1               0      1      0      0      0      0      0      0      0       0       0
4            1               1      1      0      0      0      0      0      0      0       0       0
5            1               0      0      1      0      0      0      0      0      0       0       0
6            1               1      0      1      0      0      0      0      0      0       0       0
7            1               0      0      0      1      0      0      0      0      0       0       0
8            1               1      0      0      1      0      0      0      0      0       0       0
9            1               0      0      0      0      1      0      0      0      0       0       0
10           1               1      0      0      0      1      0      0      0      0       0       0
11           1               0      0      0      0      0      1      0      0      0       0       0
12           1               1      0      0      0      0      1      0      0

 

    The specific coefficient im interested is "conditionCancer"

So i used the code below:

x <- hgu133aPATH2PROBE

mapped_probes <- mappedkeys(x)

xx <- as.list(x[mapped_probes])

indices <- ids2indices(xx, rownames(data.trusted.eset))

res <- mroast(data.trusted.eset, indices, design, contrast=2)

Firstly, my main question is if the implementation of the mroast fuction is correct regarding the argument contrast=2 in which i believe refers to the coefficient="conditionCancer" ?

Secondly, should i filter my data.trusted.eset with the output of the topTable function after removing possible duplicates and probesets mapping to NAs, in order to remove possible false results ?

For instance:

filtered <- data.trusted.eset[rownames(top3),] # where top3 is the final dataframe of topTable function after removing duplicates and NAs

and then use filtered instead of the initial expressionset ??

mroast limma pathway analysis differential gene expression affymetrix • 1.8k views
ADD COMMENT
1
Entering edit mode
@kamil-slowikowski-6901
Last seen 6 months ago
United States

Yes, contrast=2 is correct: it will use the second column of your design matrix.

You might consider reading the documentation for mroast with:

library(limma)
?mroast

When you perform multiple tests, such as differential expression for multiple genes, you should adjust for multiple testing. Read about the different adjustment methods, and then try using the decideTests function in the limma package:

?p.adjust
?decideTests
ADD COMMENT
0
Entering edit mode

Dear kslowikowski, thank you for your verification about the argument contrast. Moreover, i often use the function treat() or rank the probesets(genes) based on various criteria, but i have never considered decideTests, so i will see how it works

ADD REPLY
1
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

You should filter out non-expressed probes from the eset, in the same way you would do so before running lmFit() and eBayes(). However there is no need to remove duplicates or probe sets without good annotation.

ADD COMMENT
0
Entering edit mode
svlachavas ▴ 830
@svlachavas-7225
Last seen 13 months ago
Germany/Heidelberg/German Cancer Resear…

Dear Mr. Gordon Smyth, i have already performed non-specifici intensity filtering to remove any non-expressed probesets. Thus, if i dont have to remove duplicates or probesets without good annotation then my expressionset is ready to be used. Please excuse for asking one to more question, but it is also very important regarding the downstream process:

after the function mroast, i checked also a tutorial about an easy way to see the pathways, which needs a data matrix with ENTREZ Gene ID as the rownames. In detail:

library(pathview)

x <- hgu133aENTREZID

xx<- as.list(x)

entrezid <- sapply(rownames(fit2), function(x) xx[x], USE.NAMES=FALSE) # where fit2 is the output of ebayes function (fit2 <- eBayes(fit, trend=TRUE))

So my question refers to the next row:

gene.data <- fit2$coefficients

rownames(gene.data) <- entrezid

path <- pathview(gene.data=gene.data....)

So in " gene.data <- fit2$coefficients": should i use it just like this or i have to specify the coefficient im interested in, as is in contrast above ?

> head(fit2$coefficients) ??
                     (Intercept)    conditionCancer    pairs2      pairs3     pairs4     pairs5      pairs6     pairs7
1007_s_at     10.215808      0.22343373  0.52147061  0.60510597  1.0420911  0.5031775  0.06009372  0.1640625
1438_at        5.911114      2.53652962  0.70450123 -1.01602942 -0.6130980 -1.1538897 -1.33425754 -2.3958104
1487_at        9.050908     -0.20654769 -0.06396680  0.28847187 -0.0528669  0.3469220  0.04466553  0.4973197
1598_g_at     10.279677     -1.11014791 -0.61752751  0.12946488  0.4559134  0.4962469 -0.05954733 -0.1744488
1729_at        7.941900     -0.02386093  0.69056733  0.05095071  0.7857444  0.8638121  0.16643800  0.2573498
200000_s_at    9.991276      0.04892901  0.08502522 -0.35774486  0.2756260 -0.3458185 -0.71372258 -0.1826813

ADD COMMENT
0
Entering edit mode

This seems to be a question about the correct inputs into the pathview package, for which limma has no responsibility. You'd be better off putting this into a separate question, and tagging it appropriately so the pathview guys can see it.

ADD REPLY
0
Entering edit mode

Dear Aaron, thank you for your suggestion--i will also put this in a separate question but i have also used here because it depends of the coefficient that is included in limma, thats why i firstly asked it here 

ADD REPLY

Login before adding your answer.

Traffic: 916 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6