Camera gene set interpretation
2
0
Entering edit mode
@nataliapietrosemoli-10587
Last seen 7.1 years ago

Dear all,

After some careful reading, I've settled for CAMERA as my favaorite gene set enrichment test.

However, in one of my datasets (condition 3 vs condition 1; 3 replicates, microarray data), I get a very confusing error:

I get  virtually the  same enriched gene sets (and direction!) when I perform the test on the first or on the second condition.  When I look at the genes, they are also the same! How can this be possible?

Here's some snipets:

Mi data are in this data.frame:

colnames(final.table)

'data.frame':    22518 obs. of  9 variables:
 $ Probe.Set.ID: Factor w/ 22518 levels "1415676_a_at",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ SYMBOL      : Factor w/ 12142 levels "0610006L08Rik",..: 8523 3533 8506 1316 6878 1654 6973 8500 12091 6863 ...
 $ Entrez_IDs  : Factor w/ 12142 levels "100017","100019",..: 3274 7450 8495 2718 11955 1075 11391 3264 10576 11953 ...
 $ Cond1_1.CEL : num  11.57 8.29 10.63 9.87 8.32 ...
 $ Cond1_2.CEL : num  11.48 8.16 10.47 9.63 8.06 ...
 $ Cond1_3.CEL : num  11.39 8.41 11.01 10.18 8.63 ...
 $ Cond3_1.CEL : num  11.11 7.93 10.58 8.89 7.98 ...
 $ Cond3_2.CEL : num  11.13 8.16 10.51 9.07 7.67 ...
 $ Cond3_3.CEL : num  11.04 8.62 11.04 9.1 7.82 ...

And then I construct my test:

grp = c( "Cond1",  "Cond3")

Group=as.factor(c(rep(grp[1], length(grep(grp[1], colnames(final.table),ignore.case = F ))),
          rep(grp[2], length(grep(grp[2], colnames(final.table),ignore.case = F )))))
  des = model.matrix(~0+Group)
  colnames(des) <- levels(Group)
 
  contr <- makeContrasts(Cond3-Cond1, levels = des)

  str_split(rownames(final.table), pattern = "_") ->k
  unlist(lapply(k,function(x) x[1]))->k

ids2indices(iset,k)-> indices

camera(final.table, indices, des, contrast=2)-> current.cam.up.Cond3 
camera(final.table,, indices, des, contrast=1)-> current.cam.up.Cond1 

Here I'm showing the first 6 gene sets, but you can trust me that they are pretty much the same ones...

current.cam.up.Cond3 

                                   NGenes Direction       PValue          FDR
HALLMARK_MYC_TARGETS_V1               452        Up 3.867097e-12 1.933549e-10
HALLMARK_OXIDATIVE_PHOSPHORYLATION    379        Up 1.868544e-10 4.671361e-09
HALLMARK_UNFOLDED_PROTEIN_RESPONSE    262        Up 1.813248e-05 3.022080e-04
HALLMARK_MTORC1_SIGNALING             492        Up 4.983641e-05 6.229551e-04
HALLMARK_TNFA_SIGNALING_VIA_NFKB      419        Up 4.852641e-04 4.852641e-03
HALLMARK_MYC_TARGETS_V2               138        Up 7.818134e-04 6.515112e-03

> head(res.fil.up.A)
                                   NGenes Direction       PValue          FDR
HALLMARK_MYC_TARGETS_V1               452        Up 6.978126e-14 3.489063e-12
HALLMARK_OXIDATIVE_PHOSPHORYLATION    379        Up 1.145075e-10 2.862686e-09
HALLMARK_E2F_TARGETS                  509        Up 1.303369e-08 2.172282e-07
HALLMARK_G2M_CHECKPOINT               592        Up 4.696686e-05 5.870858e-04
HALLMARK_UNFOLDED_PROTEIN_RESPONSE    262        Up 6.785470e-05 6.785470e-04
HALLMARK_MTORC1_SIGNALING             492        Up 2.174479e-04 1.812066e-03
                                                                                    

I would expect that if the genes sets are significicant and the same in the two conditions, the direction should change..

 

Thanks a lot for helping solve this puzzle,

 

Natalia

 

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] mouse4302cdf_2.18.0   genefilter_1.54.2     affyPLM_1.48.0        preprocessCore_1.34.0
 [5] gcrma_2.44.0          affy_1.50.0           mouse4302.db_3.2.3    org.Mm.eg.db_3.3.0   
 [9] AnnotationDbi_1.34.4  rgl_0.97.0            dplyr_0.5.0           ggrepel_0.6.5        
[13] ggplot2_2.2.1         RColorBrewer_1.1-2    stringr_1.1.0         knitr_1.15.1         
[17] gplots_3.0.1          calibrate_1.7.2       MASS_7.3-45           FactoMineR_1.34      
[21] oligo_1.36.1          Biostrings_2.40.2     XVector_0.12.1        IRanges_2.6.1        
[25] S4Vectors_0.10.3      oligoClasses_1.34.0   limma_3.28.21         Biobase_2.32.0       
[29] BiocGenerics_0.18.0  

loaded via a namespace (and not attached):
 [1] jsonlite_1.2               splines_3.3.1              foreach_1.4.3              gtools_3.5.0              
 [5] shiny_1.0.0                assertthat_0.1             highr_0.6                  RSQLite_1.1-2             
 [9] lattice_0.20-34            digest_0.6.11              GenomicRanges_1.24.3       colorspace_1.3-2          
[13] Matrix_1.2-8               htmltools_0.3.5            httpuv_1.3.3               plyr_1.8.4                
[17] XML_3.98-1.5               affxparser_1.44.0          zlibbioc_1.18.0            xtable_1.8-2              
[21] scales_0.4.1               gdata_2.17.0               affyio_1.42.0              ff_2.2-13                 
[25] annotate_1.50.1            tibble_1.2                 SummarizedExperiment_1.2.3 lazyeval_0.2.0            
[29] survival_2.40-1            magrittr_1.5               mime_0.5                   evaluate_0.10             
[33] memoise_1.0.0              BiocInstaller_1.22.3       tools_3.3.1                munsell_0.4.3             
[37] cluster_2.0.5              flashClust_1.01-2          GenomeInfoDb_1.8.7         caTools_1.17.1            
[41] RCurl_1.95-4.8             grid_3.3.1                 iterators_1.0.8            htmlwidgets_0.8           
[45] leaps_3.0                  labeling_0.3               bitops_1.0-6               gtable_0.2.0              
[49] codetools_0.2-15           DBI_0.5-1                  R6_2.2.0                   bit_1.1-12                
[53] KernSmooth_2.23-15         stringi_1.1.2              Rcpp_0.12.9                scatterplot3d_0.3-38

 

 

limma camera gene set analysis gene set enrichment • 2.3k views
ADD COMMENT
0
Entering edit mode

I was just trying a shortcut to typing my whole code. Which is here. And it works, my question still remains the same.

runTest <- function(iset,name, final.table, grp){
  c(grep(grp[1], colnames(final.table),ignore.case = F ), grep(grp[2], colnames(final.table),ignore.case = F))->curr.ind
  exp.mat.0<-final.table[,curr.ind]
  dummy_ind <- seq(from=1, to =dim(exp.mat.0)[1])
  rownames(exp.mat.0)<- paste0(final.table$Entrez_IDs,"_",dummy_ind)  # because we have non-unique values
 
  # remove NA rows
  complete.cases(exp.mat.0)->a
  exp.mat.0 <- exp.mat.0[a,]
 
  Group=as.factor(c(rep(grp[1], length(grep(grp[1], colnames(final.table),ignore.case = F ))),
          rep(grp[2], length(grep(grp[2], colnames(final.table),ignore.case = F )))))
  des = model.matrix(~0+Group)
  colnames(des) <- levels(Group)
 
  contr <- makeContrasts(Cond3-Cond1, levels = des)
  str_split(rownames(exp.mat.0), pattern = "_") ->k
  unlist(lapply(k,function(x) x[1]))->k
 
  ids2indices(iset,k)-> indices
  camera(exp.mat.0, indices, des, contrast=2)-> current.cam  #.up.Q
  camera(exp.mat.0, indices, des, contrast=1)-> current.cam
 
  entrez.ids<- lapply(indices, function(x) k[x])
  symbol.ids <- lapply(entrez.ids, function(x) match(x, final.table$Entrez_IDs))
  col.id <- which(names(final.table) == "SYMBOL")
  symbol.names <- lapply(symbol.ids, function(x) paste(as.character(final.table[unlist(x),col.id]), collapse = ";"))
 
  genes.cam <-vector()
  match(rownames(current.cam),names(symbol.names))->b
  genes.cam <-unlist(symbol.names[b])
 
  cbind(current.cam, genes.cam)->current.camera.tot
 
  res <-list( "camera"= current.camera.tot)
 
  write.table(cbind(rownames(current.camera.tot),current.camera.tot), file=paste0(outdir,name,".",grp[2],"vs",grp[1],".camera.txt"), row.names = FALSE, sep = "\t", col.names =
                c("geneSet",colnames(current.camera.tot)))  
  return(res)
}

load(paste0(gene.sets.dir,"mouse_c2_v5p1.rdata"))
load(paste0(gene.sets.dir,"mouse_H_v5p1.rdata"))
isets <- list()
isets[[1]] <- Mm.H
isets[[2]] <- Mm.c2
names(isets) <- c("H","C2")

# get enrichments
enrichment.H <- list()
enrichment.C2 <- list()

for (i in 1:dim(comb.all)[2]){
 
  runTest(isets[[1]], "H", mat, c(comb.all[2*i-1],comb.all[2*i]))->enrichment.H[[i]]  
  #runTest(isets[[2]], "C2", mat, c(comb.all[2*i-1],comb.all[2*i]))->enrichment.C2[[i]]
 
  names(enrichment.H)[[i]] <-  paste0(comb.all[2*i],".vs.",comb.all[2*i-1])
  #names(enrichment.C2)[[i]] <-  paste0(comb.all[2*i-1],".vs.",comb.all[2*i])
 
}

ADD REPLY
0
Entering edit mode

My answer remains the same. Did you read it?

ADD REPLY
0
Entering edit mode

If I''ve previously defined my design:

> des
  Mock_3h Z_3h
1       1    0
2       1    0
3       1    0
4       0    1
5       0    1
6       0    1

wouldn't this instruction:

camera(exp.mat.0, indices, des, contrast=2)

specifiy that I run my test for my condition Z_3h?

this is how I interpret the contrast parameter.

"Can be an integer specifying a column of 'design'..."

 

ADD REPLY
0
Entering edit mode

What James is saying is that since your design does not have an intercept term (ie. you are using a "cell means" model), you need to test a contrast among the columns of your design matrix by specifying a vector of coefficients which is as long as the number of columns of your design matrix, and not just a single column (index) into our design matrix.

In your case, this call:

camera(exp.mat.0, indices, des, contrast=2)

is essentially testing something you don't really care to answer, ie. you are asking if the mean expression of the genes is different from zero in your "Z_3h" group.

You probably would rather ask if the mean expression is different between your two groups, in which case you need to pass the vector that specifies your contrast across the columns of your design (as mentioned in the docs), so something like:

camera(exp.mat.0, indices, des, contrast=c(-1, 1))

Which is testing the difference in expression calculated by your Z_3h - Mock_3h groups, given the design you specified in your comment above.

Putting it a third way: when James said camera knows nothing about your contrast matrix, he means that it doesn't know that the "2" in your contrast=2 is referencing the second column of a contrast matrix you built somewhere else (again, beating a dead horse: that contrast matrix has as many rows as you have columns in your design matrix). You need to pass the vector that is the second column of your contrast matrix directly.

Make sense?

ADD REPLY
0
Entering edit mode
@james-w-macdonald-5106
Last seen 4 hours ago
United States

It's probably more useful if you show the actual code you used, rather than just re-typing things. For example, this line:

camera(final.table,, indices, des, contrast=1)-> current.cam.up.Cond1 

won't actually run, due to the double commas after final.table. Plus, while the assignment operator is bi-directional, it's easier for people to decipher your code if you follow the convention of right to left assignment (<-) rather than randomly switching back and forth in your code.

Also do note that camera doesn't use (or even know about) your contrast matrix, so specifying contrast = 1 or contrast = 2 isn't a useful thing to do when you use a cell means model. In that situation the statistic you are using is testing that the expression of each gene is different from zero, rather than different between the two groups. This should be clear from the help page:

contrast: contrast of the linear model coefficients for which the test
          is required. Can be an integer specifying a column of
          'design', or else a numeric vector of same length as the
          number of columns of 'design'.
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 4 hours ago
WEHI, Melbourne, Australia

You could use

camera(final.table, indices, des, contrast=c(-1,1))

But why not make it even easier:

des <- model.matrix(~group)
camera(final.table, indices, des)
ADD COMMENT
0
Entering edit mode

Thank you guys for the answers, took some time, but finally got it!

 

ADD REPLY

Login before adding your answer.

Traffic: 627 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6