Error while running piano's runGSA
2
0
Entering edit mode
rubi ▴ 110
@rubi-6462
Last seen 5.7 years ago

Hi,

I have gene set data for which I'm trying to use R piano package runGSA function to perform a gene set analysis.

I can't seem to be able to paste my data here but I hope the error message may be enough:

library(piano)

#I'm constructing the gsc object.

gsc <- cbind(ids,pathways)
colnames(gsc) <- c("g","s")
gsc <- loadGSC(gsc)

#run
sub_pathway.gsea <- runGSA(geneLevelStats=gene.level.stats,directions=effects,gsc=gsc)

It starts running:

Running gene set analysis:
Checking arguments...done!
Final gene/gene-set association: 350 genes and 61 gene sets
  Details:
  Calculating gene set statistics from 350 out of 350 gene-level statistics
  Using all 350 gene-level statistics for significance estimation
  Removed 0 genes from GSC due to lack of matching gene statistics
  Removed 0 gene sets containing no genes after gene removal
  Removed additionally 0 gene sets not matching the size limits
  Loaded additional information for 0 gene sets
Calculating gene set statistics...done!
Calculating gene set significance...

 

And then crashes with:

Error in sample.int(length(x), size, replace, prob) : 
  cannot take a sample larger than the population when 'replace = FALSE'

My gene.level.stats has many 1's and their corresponding effects (dirs) are 0, if that has any contribution to the error

Help would be highly appreciated,

rubi

 

 

piano runGSA • 1.6k views
ADD COMMENT
2
Entering edit mode
Leif Väremo ▴ 70
@leif-varemo-5897
Last seen 4.6 years ago
Sweden

I am not sure what causes the error unfortunately. If possible it would be useful if you could send the gene.level.stats, effects, and gsc objects to piano.rpkg@gmail.com

On another note, runGSA is useful for genome-wide analysis, but will be less useful when there is only a small list of genes (e.g. top significant genes or similar). You have 350 genes in your case which is quite a small number. Just be aware that the gene-set significances are calculated by permuting the gene-level statistics, and that typically requires that they represent a larger/representative population of both significant and non-significant genes.

If your 350 genes represent "interesting" genes, a hypergeometric test could be a better approach. You can perform that in many ways, the runGSAhyper in the piano package is one of them. Other alternatives could be submitting the gene list to DAVID or Enrichr.

ADD COMMENT
1
Entering edit mode
Aedin Culhane ▴ 510
@aedin-culhane-1526
Last seen 4.6 years ago
United States

Hi

Can you post a couple of rows/cols of gene.level.stats and also the first rows of gsc.  It looks like that the identifiers don't match.

Aedin

ADD COMMENT
0
Entering edit mode

The identifiers should match, since it prints: "Final gene/gene-set association: 350 genes and 61 gene sets". Otherwise there should be an error I believe.

ADD REPLY

Login before adding your answer.

Traffic: 740 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6