Search
Question: glmTreat Errors for multi coefficient and analysis of deviance situations.
0
gravatar for BharatAnanth
6 days ago by
BharatAnanth0 wrote:

Hi

  I am fitting some time course RNA-seq data using glmQLFit and I then wanted to use glmTreat to test either of two coefficients with 0.5 lfc. Here is the simplified version (with fewer samples).

set.seed(0)
T <- 24
t <- rep(seq(3,42, by=3), 11)
c <- cos(2*pi*t/T)  # a time course basis function
s <- sin(2*pi*t/T)
batch_effect <- factor(ceiling(runif(14*11)*11))
subjects <- sprintf("P%02d", rep(1:11, each = 14))
design1 <- model.matrix(~ batch_effect + c + s)

fit <- glmQLFit(y, design, robust=TRUE)

These work

treat1 <- glmTreat(fit, coef="c", lfc=0.5)
treat2 <- glmTreat(fit, coef=c("c", "s"), lfc=0)

but this gives an error

treat3 <- glmTreat(fit, coef=c("c","s"), lfc=0.5)

error: Error in .addCompressedMatrices(offset.old, offset.adj) :
  matrix dimensions are not consistent for non-repeating number of columns

Instead, if I use contrasts instead of specifying coefficients, I get a different error

treat4 <- glmTreat(fit, contrast = con, lfc=0.5)

error: Error in data.frame(logCPM = glmfit$AveLogCPM, PValue = p.value, row.names = rownames(glmfit)) :
  row names supplied are of the wrong length

 

Is glmTreat not design to be used for multi-coefficient or analysis of deviance situtations? The help page seems to suggested multiple coefficients at least are permitted.

Thank you for your help.

 

My sessionInfo is

R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bindrcpp_0.2               sva_3.24.4                 BiocParallel_1.10.1        genefilter_1.58.1          mgcv_1.8-22               
 [6] nlme_3.1-131               SummarizedExperiment_1.6.5 DelayedArray_0.2.7         matrixStats_0.52.2         Biobase_2.36.2            
[11] GenomicRanges_1.28.4       GenomeInfoDb_1.12.3        IRanges_2.10.5             S4Vectors_0.14.7           BiocGenerics_0.22.1       
[16] ggplot2_2.2.1              PoiClaClu_1.0.2            magrittr_1.5               edgeR_3.18.1               limma_3.32.5              

loaded via a namespace (and not attached):
 [1] tidyr_0.7.2             bit64_0.9-7             splines_3.4.2           Formula_1.2-2           assertthat_0.2.0        statmod_1.4.30         
 [7] latticeExtra_0.6-28     blob_1.1.0              cellranger_1.1.0        GenomeInfoDbData_0.99.0 yaml_2.1.15             RSQLite_2.0            
[13] backports_1.1.1         lattice_0.20-35         glue_1.2.0              digest_0.6.12           RColorBrewer_1.1-2      XVector_0.16.0         
[19] checkmate_1.8.5         colorspace_1.3-2        htmltools_0.3.6         Matrix_1.2-11           plyr_1.8.4              XML_3.98-1.9           
[25] pkgconfig_2.0.1         zlibbioc_1.22.0         purrr_0.2.4             xtable_1.8-2            scales_0.5.0            pracma_2.1.1           
[31] htmlTable_1.11.0        tibble_1.3.4            annotate_1.54.0         nnet_7.3-12             lazyeval_0.2.1          readxl_1.0.0           
[37] survival_2.41-3         memoise_1.1.0           foreign_0.8-69          tools_3.4.2             data.table_1.10.4-3     stringr_1.2.0          
[43] munsell_0.4.3           locfit_1.5-9.1          cluster_2.0.6           AnnotationDbi_1.38.2    compiler_3.4.2          rlang_0.1.2            
[49] grid_3.4.2              RCurl_1.95-4.8          rstudioapi_0.7          htmlwidgets_0.9         labeling_0.3            bitops_1.0-6           
[55] base64enc_0.1-3         gtable_0.2.0            DBI_0.7                 R6_2.2.2                gridExtra_2.3           knitr_1.17             
[61] dplyr_0.7.4             bit_1.1-12              bindr_0.1               Hmisc_4.0-3             stringi_1.1.6           Rcpp_0.12.12           
[67] geneplotter_1.54.0      rpart_4.1-11            acepack_1.4.1           tidyselect_0.2.3      

ADD COMMENTlink modified 6 days ago by Ryan C. Thompson6.2k • written 6 days ago by BharatAnanth0
0
gravatar for Ryan C. Thompson
6 days ago by
The Scripps Research Institute, La Jolla, CA
Ryan C. Thompson6.2k wrote:

As far as I know, the treat method is only defined for a single coefficient or contrast at a time. I'm not even sure what the null hypothesis would be for testing multiple coefficients at once.

ADD COMMENTlink written 6 days ago by Ryan C. Thompson6.2k

Normally the null hypothesis for the glm would be (in my case) H0: c=0 AND s =0.

What I want to test is the null hypothesis H0: |c| < lfc AND |s| <lfc using glmTreat. Is this inconceivable?
 

ADD REPLYlink written 6 days ago by BharatAnanth0

The problem with that is that the region defined by your null hypothesis is now dependent on how you've parametrized your design matrix in ways that are probably not intuitive. That is, tests using different design matrices that would be equivalent in glmLRT or glmQLFTest would give different results in glmTreat.

ADD REPLYlink written 6 days ago by Ryan C. Thompson6.2k

Just to give a few other possible ways to define the null hypothesis for multi-coefficient treat that make equal sense, consider: |c| + |s| < threshold, or c^2 + s^2 < threshold^2. Of course, all of these are still dependent on the parametrization.

ADD REPLYlink modified 5 days ago • written 6 days ago by Ryan C. Thompson6.2k

To elaborate on Ryan's example, say we have two contrasts, A - B and A - C. If we test A - B = 0 and A - C = 0 via an ANODEV, this is the same null hypothesis as A - B = 0 and B - C = 0. However, if we ask for |A - B| <= 1 and |A - C| <= 1, this is not equivalent to |A - B| <= 1 and |B - C| <= 1. (For example, the first version is true while the second is not when A = 0, B = 1 and C = -1.)

This isn't an insurmountable difficulty - it just requires some care when phrasing the contrasts - but it does highlight the issues with dealing with composite null hypotheses. Indeed, when you start stacking together contrasts to try to perform a log-fold-change-aware equivalent of an ANODEV, you will become increasingly conservative if you only perform testing at the lfc limit, like treat; or you will need to consider the correlation in the true effect sizes between contrasts to obtain an appropriate p-value, which does not seem possible on a gene-by-gene basis. Maybe something could be made to work by sharing information across genes with empirical Bayes, but this would require some thought.

ADD REPLYlink modified 5 days ago • written 5 days ago by Aaron Lun17k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 415 users visited in the last hour