MAST reported warning "Coefficients ... are never estimible and will be dropped."
shao ▴ 100
@shao-6241

Hi,

I ran into the warning below with MAST when fitting the hurdle model to scRNA-seq data. The data have been log2-transformed (log2(cpm + 1)).

The model is:

zlm.SingleCellAssay(~ FvF + batches + sex, sca)

After fitting, MAST reported "In .nextMethod(object = object, value = value) :
  Coefficients batchesb4, sexU are never estimible and will be dropped."

Is there something wrong with the model? How should I interpret the warning?

Here are the cross tables of the predictors:

> table(colData(sca)$FvF, colData(sca)$sex)
       
          F   M   U
  Fed     7 199   0
  Ch10    0 231   0
  Fast   13   7  31
  HFD     0 147   0
  Refed   0   0  55
> table(colData(sca)$FvF, colData(sca)$batches)
       
         b1  b2  b3  b4  b5  b6
  Fed    75  36  81   0   0  14
  Ch10    0   0   0 231   0   0
  Fast    0   0   0   0  31  20
  HFD     0   0   0 147   0   0
  Refed   0   0   0   0  55   0

> table(colData(sca)$sex, colData(sca)$batches)
   
     b1  b2  b3  b4  b5  b6
  F   0   0   0   0   0  20
  M  75  36  81 378   0  14
  U   0   0   0   0  86   0

Thanks!

 

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Manjaro Linux

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] MAST_1.0.5                 SummarizedExperiment_1.4.0
 [3] Biobase_2.34.0             GenomicRanges_1.26.4      
 [5] GenomeInfoDb_1.10.3        IRanges_2.8.2             
 [7] S4Vectors_0.12.2           BiocGenerics_0.20.0       
 [9] data.table_1.10.4          magrittr_1.5              
[11] RColorBrewer_1.1-2         cowplot_0.7.0             
[13] ggplot2_2.2.1              colorout_1.1-2            

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10     XVector_0.14.1   zlibbioc_1.20.0  munsell_0.4.3   
 [5] lattice_0.20-34  colorspace_1.3-2 stringr_1.2.0    plyr_1.8.4      
 [9] tools_3.3.3      gtable_0.2.0     abind_1.4-5      lazyeval_0.2.0  
[13] assertthat_0.1   tibble_1.2       Matrix_1.2-8     reshape2_1.4.2  
[17] bitops_1.0-6     RCurl_1.95-4.8   stringi_1.1.3    scales_0.4.1    

 

 

 

@andrew_mcdavid-11488

Hi ccshao,

The warning means that some levels of your factors never vary independently of each other: they are completely confounded. When FvF == "Refed", it's always the case that sex == "U", so you can't tell whether a change in expression is due to one, or the other, or both of those factors. One of those variables must be dropped in order for the model to have a well-defined solution. The same is true of batches == "b4" and FvF == "HFD" (batch b4 contains only the Ch10 and HFD cells).
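The confounding can be checked directly on the design matrix. The sketch below is base R only and does not call MAST; it rebuilds a stand-in for colData(sca) from the three cross tables in the question (the joint counts in `cells` are inferred from those tables, and the names `cells`, `cdat`, `X` and `dropped` are made up for this illustration). It then compares the rank of the design matrix with its number of columns.

```r
# Joint cell counts inferred from the cross tables in the question
# (assumption: one row per FvF / batch / sex combination).
cells <- data.frame(
  FvF     = c("Fed", "Fed", "Fed", "Fed", "Fed", "Ch10",
              "Fast", "Fast", "Fast", "HFD", "Refed"),
  batches = c("b1", "b2", "b3", "b6", "b6", "b4",
              "b5", "b6", "b6", "b4", "b5"),
  sex     = c("M", "M", "M", "F", "M", "M",
              "U", "F", "M", "M", "U"),
  n       = c(75, 36, 81, 7, 7, 231, 31, 13, 7, 147, 55)
)

# Expand to one row per cell, mimicking colData(sca).
cdat <- cells[rep(seq_len(nrow(cells)), cells$n), c("FvF", "batches", "sex")]
cdat$FvF     <- factor(cdat$FvF, levels = c("Fed", "Ch10", "Fast", "HFD", "Refed"))
cdat$batches <- factor(cdat$batches)
cdat$sex     <- factor(cdat$sex)

# Design matrix for the same formula as in the zlm call.
X <- model.matrix(~ FvF + batches + sex, data = cdat)

# A rank smaller than the number of columns means some coefficients
# cannot be estimated.
q <- qr(X)
c(columns = ncol(X), rank = q$rank)

# Columns that the pivoted QR decomposition moved past the rank.
dropped <- colnames(X)[q$pivot[-seq_len(q$rank)]]
dropped

# The exact linear dependencies behind the rank deficiency:
all(X[, "batchesb4"] == X[, "FvFCh10"] + X[, "FvFHFD"])  # b4 = Ch10 + HFD
all(X[, "sexU"] == X[, "batchesb5"])                     # sex U = batch b5
```

Running the same `model.matrix()` and `qr()` steps on `as.data.frame(colData(sca))` should show the dependencies in the real data. Which of two confounded columns gets dropped depends on the column order, so reordering the terms in the formula may change the names in the warning without changing the underlying problem.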

If you don't care about those particular levels of those factors, then there's nothing to worry about. If you do, then you may need to collect more data in order to identify the effect of interest, because the current data aren't informative about it.

 

-Andrew


Thanks! Based on your answer I examined the cross tables between the predictors. I think sex "U" is dropped not only because of the association between "U" and "Refed". The sex "U" samples are exactly the samples where batches is "b5" (31 in "Fast" and 55 in "Refed"), so one of the two is redundant and gets dropped. I checked the sample annotation and this is indeed the case. The same goes for "b4" and "M".
