Question: Running Roast with Gene Symbols of Curated Genes
2.0 years ago by
lattalic0 wrote:

I am running gene set analysis on curated gene sets from the MSigDB. I was able to run roast successfully using the following code:

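library(edgeR)  # attaches limma too, which provides makeContrasts() and ids2indices()
# Read the count matrix; the first column supplies the row names
summarized.counts <- read.table("C:/Users/Alicia/Documents/summarized.counts.matrix", row.names=1, header=TRUE, sep="\t")
# Assumed step (not shown in the original post): build the DGEList used below
dge.edgeR <- DGEList(counts=summarized.counts)
# One design column per stage, three samples each
design_two <- model.matrix(~0 + factor(c(1,1,1,2,2,2,3,3,3)))
colnames(design_two) <- c("stage1", "stage2", "stage3")
con <- makeContrasts(stage3-stage1, levels=design_two)
gene_set <- c("CYSLTR2", "GPR17", "LTB4R", "LTB4R2", "GNB5", "GIP", "GNB2", "SCT", "VIP", "GNG8")
# Map the gene symbols in the set to row positions in the count matrix
ind <- ids2indices(gene_set, row.names(summarized.counts))
dge.edgeR <- estimateDisp(dge.edgeR, design_two, robust=TRUE)
rst <- mroast(dge.edgeR, index=ind, design=design_two, nrot=9999, contrast=con)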

I am concerned about my results though. 

     NGenes  PropDown    PropUp Direction PValue    FDR PValue.Mixed FDR.Mixed
Set1      3 0.3333333 0.6666667        Up 0.0626 0.0626        1e-04     1e-04

Why would the output report an NGenes value of 3 when I included 10 genes in my set?

(This is just practice data; my final gene sets will contain more genes.)

2.0 years ago by
James W. MacDonald46k wrote:

The ids2indices function simply takes the identifiers you supply (in your case, the row.names of your gene counts) and finds which of them are in your gene_set. Here only three of the genes in your gene_set were also present in the row.names of your gene counts, which is why NGenes is 3. You can check that yourself by running

sum(row.names(summarized.counts) %in% gene_set)
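
To see which genes were dropped rather than just how many matched, something along these lines may help. It is only a sketch on toy data: the identifiers in count_ids are made up for illustration (they stand in for row.names(summarized.counts)), and the guess that mismatches come from letter case or stray whitespace is an assumption, not something known about your file.

```r
# Toy stand-ins for the objects in the question (hypothetical values).
gene_set  <- c("CYSLTR2", "GPR17", "LTB4R", "LTB4R2", "GNB5")
count_ids <- c("GPR17", "ltb4r", "GNB5 ", "ENSG00000141510", "TP53")

# Genes in the set that have an exact match among the count row names.
matched <- intersect(gene_set, count_ids)

# Genes in the set with no exact match; these are the ones that do not
# contribute to NGenes.
missing <- setdiff(gene_set, count_ids)

# Retry after trimming whitespace and upper-casing the row names, to see
# whether formatting differences explain some of the misses.
clean_ids     <- toupper(trimws(count_ids))
still_missing <- setdiff(gene_set, clean_ids)

print(matched)
print(missing)
print(still_missing)
```

Anything left in still_missing is probably absent from the count matrix altogether, or stored under a different identifier type (for example Ensembl IDs instead of symbols), in which case the identifiers would need to be converted before calling ids2indices.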

Thanks, I figured that was the issue, but I'm not super confident with this stuff yet. I really appreciate your help and the command you provided.

written 2.0 years ago by lattalic0
