Search
Question: MAST - not getting Hurdle p values
0
gravatar for jka8119
3 months ago by
jka81190
jka81190 wrote:

Hi,

I am trying to follow analysis steps as outlined in http://bioconductor.org/packages/release/bioc/vignettes/MAST/inst/doc/MAITAnalysis.html

I run into an issue when running

summaryDt <- summaryCond$datatable 

fcHurdle <- merge(summaryDt[contrast=='conditionStim' & component=='H',.(primerid, `Pr(>Chisq)`)], #hurdle P values summaryDt[contrast=='conditionStim' & component=='logFC', .(primerid, coef, ci.hi, ci.lo)], by='primerid') #logFC coefficients

My summaryCond$datatable only has 'C', 'D', 'S' and 'logFC' components but not 'H'. Because of this issue, I cannot perform subsequent MAST analysis steps. Not sure what I am doing incorrectly. I am comparing single cell RNA-seq data from two groups of cells.

Any help would be appreciated!

>sessionInfo()

R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] MAST_1.4.1                              SummarizedExperiment_1.8.1              DelayedArray_0.4.1                      matrixStats_0.53.1                     
 [5] RColorBrewer_1.1-2                      rsvd_0.9                                NMF_0.20.6                              cluster_2.0.6                          
 [9] rngtools_1.2.4                          pkgmaker_0.22                           registry_0.5                            stringr_1.3.0                          
[13] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.30.3                  GenomicRanges_1.30.2                    GenomeInfoDb_1.14.0                    
[17] knitr_1.20                              data.table_1.10.4-3                     reshape2_1.4.3                          limma_3.34.8                           
[21] GSEABase_1.40.1                         graph_1.56.0                            annotate_1.56.1                         XML_3.98-1.10                          
[25] AnnotationDbi_1.40.0                    IRanges_2.12.0                          S4Vectors_0.16.0                        Biobase_2.38.0                         
[29] BiocGenerics_0.24.0                     GGally_1.3.2                            ggplot2_2.2.1                          

loaded via a namespace (and not attached):
 [1] httr_1.3.1               RMySQL_0.10.13           bit64_0.9-7              foreach_1.4.4            assertthat_0.2.0         blob_1.1.0              
 [7] GenomeInfoDbData_1.0.0   Rsamtools_1.30.0         progress_1.1.2           pillar_1.1.0             RSQLite_2.0              lattice_0.20-35         
[13] digest_0.6.15            XVector_0.18.0           colorspace_1.3-2         Matrix_1.2-12            plyr_1.8.4               pkgconfig_2.0.1         
[19] biomaRt_2.34.2           zlibbioc_1.24.0          xtable_1.8-2             scales_0.5.0             BiocParallel_1.12.0      tibble_1.4.2            
[25] lazyeval_0.2.1           magrittr_1.5             memoise_1.1.0            doParallel_1.0.11        tools_3.4.1              prettyunits_1.0.2       
[31] gridBase_0.4-7           munsell_0.4.3            Biostrings_2.46.0        compiler_3.4.1           rlang_0.2.0              grid_3.4.1              
[37] RCurl_1.95-4.10          iterators_1.0.9          bitops_1.0-6             gtable_0.2.0             codetools_0.2-15         abind_1.4-5             
[43] DBI_0.7                  reshape_0.8.7            R6_2.2.2                 GenomicAlignments_1.14.1 rtracklayer_1.38.3       bit_1.1-12              
[49] stringi_1.1.6            Rcpp_0.12.15            

ADD COMMENTlink modified 3 months ago by Andrew_McDavid120 • written 3 months ago by jka81190

Are you directly cutting and pasting code from vignette and seeing this issue?

ADD REPLYlink written 3 months ago by Andrew_McDavid120

 

Correct. The interesting things is that MAST correctly identifies differentially expressed genes but does not give me the Hurdle p value. 

Joe

ADD REPLYlink written 3 months ago by jka81190

I am unable to replicate.  The vignette compiles fine and summaryCond$datatable contains all the expected components:

table(summaryCond$datatable$component)

C D H logFC S 6900 6900 2300 4600 6900

Are you trying to adapt the code in the vignette to your own data and it's not working?

ADD REPLYlink modified 3 months ago • written 3 months ago by Andrew_McDavid120

Yes. Please see below. My starting data is 'normmatrix', a numeric matrix with genes as rows and cells as columns.

>suppressPackageStartupMessages({
  library(ggplot2)
  library(GGally)
  library(GSEABase)
  library(limma)
  library(reshape2)
  library(data.table)
  library(knitr)
  library(TxDb.Hsapiens.UCSC.hg19.knownGene)
  library(stringr)
  library(NMF)
  library(rsvd)
  library(RColorBrewer)
  library(MAST)
})

>options(stringsAsFactors = F)
>genes <- as.data.frame(as.character(row.names(normmatrix)), stringsAsFactors = F)
>colnames(genes) <- "gene"
>rownames(genes) <- genes$gene
>colnames(cellidentities) <- "celltype"
>cellidentities$celltype <- as.character(cellidentities$celltype)

>sle <- normmatrix[,row.names(subset(cellidentities, cellidentities$celltype %in% c("Th_conv", "Plasma_cells")))]
>cellsubset <- subset(cellidentities, cellidentities$celltype %in% c("Th_conv", "Plasma_cells"))
>scaRaw <- FromMatrix(as.matrix(sle), cellsubset, subset(genes, genes$gene %in% row.names(sle)))

>cdr2 <-colSums(assay(scaRaw)>0)

>colData(scaRaw)$cngeneson <- scale(cdr2)

>cond<-factor(colData(scaRaw)$celltype)
>cond<-relevel(cond,"Th_conv")
>colData(scaRaw)$condition<-cond
>zlmCond <- zlm(~condition+cngeneson, scaRaw)

>summaryCond <- summary(zlmCond, doRLT = 'conditionPlasma_cells', logFC = TRUE, level = 0.95) 

>summaryDt <- summaryCond$datatable

 

ADD REPLYlink written 3 months ago by jka81190

What is the output of 

summaryCond <- summary(zlmCond, doRLT = TRUE, logFC = TRUE, level = 0.95)
table(summaryCond$datatable$component, summaryCond$datatable$contrast, exclude = NULL)

 

ADD REPLYlink written 3 months ago by Andrew_McDavid120

Please see below.

> table(summaryCond$datatable$component, summaryCond$datatable$contrast, exclude = NULL)
       
        (Intercept) cngeneson conditionPlasma_cells
  C           20922     20922                 20922
  D           20922     20922                 20922
  logFC           0     20922                 20922
  S           20922     20922                 20922

ADD REPLYlink written 3 months ago by jka81190
0
gravatar for Andrew_McDavid
3 months ago by
Andrew_McDavid120 wrote:

There is a typo in the arguments to `summary`: `doRLT` should read `doLRT`.  No error is triggered because the generic signature includes `...`.

Perhaps the function should check that all of the dot args match and throw an error manually?

ADD COMMENTlink written 3 months ago by Andrew_McDavid120

It was the typo. Sometimes it is the smallest things... I appreciate your help.

ADD REPLYlink written 3 months ago by jka81190
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 118 users visited in the last hour