Question: GOseq package: Which genes are included in the "numDEInCat"?
10 months ago by
Xavier10
Xavier10 wrote:

Hi,

I've performed KEGG pathway analyses using the GOseq package, and now I'm wondering how I can find out which genes are counted in "numDEInCat". See the example below:

Category  Over_represented_pvalue  Under_represented_pvalue  numDEInCat  numInCat  pathway
01100     ...                      ...                       7           1179      ...
00785     ...                      ...                       1           3         ...
00450     ...                      ...                       1           17        ...
00230     ...                      ...                       2           172       ...

In this example output, category 01100 contains 7 DEGs. How can I find the Ensembl IDs of these genes? I can't find anything about this in the manual (http://bioconductor.org/packages/release/bioc/vignettes/goseq/inst/doc/goseq.pdf).

Does anyone have an idea?

Best,

Xavier

 

ADD COMMENTlink modified 10 months ago • written 10 months ago by Xavier10
10 months ago by
b.nota290
Netherlands
b.nota290 wrote:

You can try the following:

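# Gene -> KEGG pathway list (en2eg, eg2kegg and grepKEGG as in the goseq vignette)
kegg <- lapply(en2eg,grepKEGG,eg2kegg)
KEGG <- goseq(pwf, gene2cat=kegg)

# Pathways that are significant after Benjamini-Hochberg correction
enriched.kegg <- KEGG$category[p.adjust(KEGG$over_represented_pvalue, method="BH")<.05]

# One row per gene/pathway pair: values = pathway ID, ind = gene ID
allKEGGs <- stack(kegg)
allKEGG_sig <- allKEGGs[allKEGGs$values %in% enriched.kegg,]

# Keep only the pairs whose gene is differentially expressed
allKEGG_sig2 <- allKEGG_sig[allKEGG_sig$ind %in% sig_genes,]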

Where sig_genes are your significant genes.

Does this work?
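To make the logic of these commands easier to follow, here is a minimal sketch on toy data that runs in base R without goseq. The objects kegg_toy, enriched_toy and sig_genes_toy and all IDs in them are invented stand-ins for kegg, enriched.kegg and sig_genes; the final split() step is an extra convenience that is not part of the commands above.

```r
# Toy gene-to-pathway mapping, shaped like the `kegg` list above:
# names are gene IDs, values are KEGG pathway IDs. All IDs are invented.
kegg_toy <- list(
  geneA = c("01100", "00230"),
  geneB = "01100",
  geneC = "00450",
  geneD = c("00785", "01100")
)
enriched_toy  <- c("01100", "00785")  # pretend these pathways are enriched
sig_genes_toy <- c("geneA", "geneD")  # pretend these genes are DE

# stack() returns one row per gene/pathway pair:
# `values` holds the pathway ID, `ind` holds the gene ID (a factor)
all_pairs <- stack(kegg_toy)

# Keep pairs in enriched pathways, then keep only the DE genes
in_enriched    <- all_pairs[all_pairs$values %in% enriched_toy, ]
de_in_enriched <- in_enriched[in_enriched$ind %in% sig_genes_toy, ]

# Optional: list the DE genes per enriched pathway
de_by_pathway <- split(as.character(de_in_enriched$ind),
                       as.character(de_in_enriched$values))
print(de_by_pathway)
```

The number of genes listed for a pathway in de_by_pathway is the quantity that numDEInCat reports for that category, provided sig_genes holds the same DE genes that went into the pwf object.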

 

 

 

ADD COMMENTlink written 10 months ago by b.nota290
10 months ago by
Xavier10
Xavier10 wrote:

I see now; everything seems to work. Again, thank you so much for your help!

ADD COMMENTlink written 10 months ago by Xavier10
10 months ago by
Xavier10
Xavier10 wrote:

Unfortunately this doesn't work. The 5th command ("allKEGG_sig <- allKEGGs[allKEGGs$values %in% enriched.kegg,]") returns an empty data frame. Are there any other solutions?

ADD COMMENTlink written 10 months ago by Xavier10

Do you have enriched kegg pathways?

Try:

head(enriched.kegg)

You can also skip the enriched.kegg step of course, but you'll get pathways that are not significantly enriched...

kegg <- lapply(en2eg,grepKEGG,eg2kegg)
KEGG <- goseq(pwf, gene2cat=kegg)

allKEGGs <- stack(kegg)
allKEGG_sig <- allKEGGs[allKEGGs$values %in% KEGG$category,]

allKEGG_sig2 <- allKEGG_sig[allKEGG_sig$ind %in% sig_genes,]

Does this work? If not, please show some code and the errors given by R. It's hard to code blind here.

 

ADD REPLYlink modified 10 months ago • written 10 months ago by b.nota290
# This was the "original" code:
kegg <- lapply(en2eg, grepKEGG, eg2kegg)
KEGG=goseq(pwf, gene2cat=kegg)
KEGG.sig <- KEGG[KEGG$over_represented_pvalue<0.05,]

# This is the code adjusted based on your input:
kegg <- lapply(en2eg, grepKEGG, eg2kegg)
KEGG=goseq(pwf, gene2cat=kegg)
allKEGGs <- stack(kegg)
KEGG.sig <- KEGG[KEGG$over_represented_pvalue<0.05,]
allKEGG_sig <- allKEGGs[allKEGGs$values %in% enriched.kegg,]
allKEGG_sig2 <- allKEGG_sig[allKEGG_sig$ind %in% sig_genes,]

# These are the significant KEGGs (So yes, there are significant categories):

> KEGG.sig
   category over_represented_pvalue under_represented_pvalue numDEInCat numInCat
86    01100             0.004357014                0.9992979          7     1179
74    00785             0.005852647                0.9999894          1        3
35    00450             0.033955040                0.9994927          1       17
17    00230             0.048094216                0.9952265          2      172

# Up to command #4 (KEGG.sig) everything seems to work. However, the 5th command (allKEGG_sig) gives the following "error" when I check its result:

> allKEGG_sig
[1] values ind   
<0 rows> (or 0-length row.names)

Thank you so far for your help/efforts.

ADD REPLYlink written 10 months ago by Xavier10

All right, I see two problems...

First, you'll need to adjust your p-values for multiple testing. The p-values that you use now are not corrected for this yet.
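As a minimal sketch of what that correction does, the example below applies Benjamini-Hochberg adjustment with p.adjust() to a vector of raw p-values. The numbers are invented for illustration and are not your results. With real goseq output the adjustment should be run on the p-values of all tested categories (the full KEGG table), not only on the rows that already pass the raw cutoff.

```r
# Invented raw p-values, one per tested pathway (not the poster's data)
raw_p <- c(0.004, 0.006, 0.034, 0.048, 0.2, 0.5, 0.8)

# Benjamini-Hochberg adjustment over ALL tested pathways
adj_p <- p.adjust(raw_p, method = "BH")

# Count how many pathways pass each cutoff
n_raw <- sum(raw_p < 0.05)
n_adj <- sum(adj_p < 0.05)

print(data.frame(raw_p, adj_p))
cat(n_raw, "pathways pass at raw p < 0.05;", n_adj, "pass after BH adjustment\n")
```

Adjusted p-values are never smaller than the raw ones, so a filter on adj_p can only keep the same number of pathways or fewer.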

However, even if you ignore this (which is not wise), line 5 of your new script still refers to enriched.kegg, an object that your script never creates. Replace it with KEGG.sig$category.

ADD REPLYlink written 10 months ago by b.nota290
10 months ago by
Xavier10
Xavier10 wrote:

# Thanks, command #5 now works:
allKEGG_sig <- allKEGGs[allKEGGs$values %in% KEGG.sig$category,]

# However, the last command does not work:
allKEGG_sig2 <- allKEGG_sig[allKEGG_sig$ind %in% sig_genes,]

# It gives the following error:
> allKEGG_sig2 <- allKEGG_sig[allKEGG_sig$ind %in% sig_genes,]
Error in allKEGG_sig$ind %in% sig_genes : object 'sig_genes' not found

ADD COMMENTlink written 10 months ago by Xavier10

Yes, that's why I said in my first post:

"Where sig_genes are your significant genes."

I don't know what you named the list of significant genes that you used to make your pwf or gene.vector object.
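If you followed the goseq vignette, the input to nullp() was a named vector with 1 for each DE gene and 0 for every other gene, and sig_genes can be rebuilt from it. The sketch below assumes that layout; the object name gene.vector and the gene IDs are placeholders, so substitute whatever your own script uses.

```r
# Toy stand-in for the named 0/1 vector passed to nullp(); IDs are invented
gene.vector <- c(ENSG_toy1 = 1, ENSG_toy2 = 0, ENSG_toy3 = 1, ENSG_toy4 = 0)

# The DE genes are the names of the entries equal to 1
sig_genes <- names(gene.vector)[gene.vector == 1]
print(sig_genes)
```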

ADD REPLYlink written 10 months ago by b.nota290