Why does "robust=TRUE" not work as a third argument in estimateDisp of edgeR 3.34.0?
1
0
Entering edit mode
@ee488e5a
Last seen 9 months ago
United States

I am trying to do a classic test for significant differential gene expression in edgeR 3.34.0, as implemented as a bioconda package that I am invoking in a versioned environment. I have two different versions of the R script that I am running in batch mode. The first version, which lacks "robust=TRUE" as the third argument of estimateDisp, works perfectly well. The second version, which I am appending below and which has "robust=TRUE" as the third argument of estimateDisp, fails with the following error message:

> geno_comp <- estimateDisp(geno_comp,geno_design,robust=TRUE)
Error in fitFDistRobustly(var, df1 = df, covariate = covariate, winsor.tail.p = winsor.tail.p) : 
  statmod package required but is not installed
Calls: estimateDisp ... estimateDisp.default -> squeezeVar -> fitFDistRobustly
Execution halted

What am I doing wrong?

Here is the full R script that doesn't quite work, because of that one "robust=TRUE" in an otherwise functional batch file:

# Load edgeR:
library(edgeR)
# Import data:
setwd("/ocean/projects/mcb190015p/schwarze/Acey/2021.07.24/edgeR_09")
edgeR_input_data <- read.delim("/ocean/projects/mcb190015p/schwarze/Acey/2021.07.24/annots/Acey_salmon_data_2021.07.28.01.reads.subset_03.min10.tsv.txt",row.names="Gene")
genotypes <- factor(c("12dpi_noDEX","12dpi_noDEX","12dpi_noDEX","12dpi_DEX","12dpi_DEX","12dpi_DEX","intestines_19dpi_noDEX","intestines_19dpi_noDEX","intestines_19dpi_noDEX","intestines_19dpi_DEX","intestines_19dpi_DEX","intestines_19dpi_DEX","non.intestines_19dpi_noDEX","non.intestines_19dpi_noDEX","non.intestines_19dpi_noDEX","non.intestines_19dpi_DEX","non.intestines_19dpi_DEX","non.intestines_19dpi_DEX"))
geno_comp <- DGEList(counts=edgeR_input_data,group=genotypes)
# Check initial results:
geno_comp$samples
# Filter out weakly expressed genes using the following commands:
keep <- filterByExpr(geno_comp)
geno_comp <- geno_comp[keep, , keep.lib.sizes=FALSE]
geno_comp$samples
# Set up statistical data set:
geno_comp <- calcNormFactors(geno_comp)
geno_design <- model.matrix(~genotypes)
geno_comp <- estimateDisp(geno_comp,geno_design,robust=TRUE)
# 12dpi_noDEX vs. 12dpi_DEX:
comp_12dpi_noDEX_vs_12dpi_DEX <- exactTest(geno_comp, pair=c("12dpi_DEX","12dpi_noDEX"))
deg_12dpi_noDEX_vs_12dpi_DEX <- topTags(comp_12dpi_noDEX_vs_12dpi_DEX, n=Inf, p.value=1)
write.table(as.data.frame(deg_12dpi_noDEX_vs_12dpi_DEX), file="12dpi_noDEX.vs.12dpi_DEX_edgeR_exactTest_all.data_2021.08.03.01.orig.tsv.txt", sep="\t", col.names=NA)
# intestines_19dpi_noDEX vs. non.intestines_19dpi_noDEX:
comp_intestines_19dpi_noDEX_vs_non.intestines_19dpi_noDEX <- exactTest(geno_comp, pair=c("non.intestines_19dpi_noDEX","intestines_19dpi_noDEX"))
deg_intestines_19dpi_noDEX_vs_non.intestines_19dpi_noDEX <- topTags(comp_intestines_19dpi_noDEX_vs_non.intestines_19dpi_noDEX, n=Inf, p.value=1)
write.table(as.data.frame(deg_intestines_19dpi_noDEX_vs_non.intestines_19dpi_noDEX), file="intestines_19dpi_noDEX.vs.non.intestines_19dpi_noDEX_edgeR_exactTest_all.data_2021.08.03.01.orig.tsv.txt", sep="\t", col.names=NA)
# intestines_19dpi_noDEX vs. intestines_19dpi_DEX:
comp_intestines_19dpi_noDEX_vs_intestines_19dpi_DEX <- exactTest(geno_comp, pair=c("intestines_19dpi_DEX","intestines_19dpi_noDEX"))
deg_intestines_19dpi_noDEX_vs_intestines_19dpi_DEX <- topTags(comp_intestines_19dpi_noDEX_vs_intestines_19dpi_DEX, n=Inf, p.value=1)
write.table(as.data.frame(deg_intestines_19dpi_noDEX_vs_intestines_19dpi_DEX), file="intestines_19dpi_noDEX.vs.intestines_19dpi_DEX_edgeR_exactTest_all.data_2021.08.03.01.orig.tsv.txt", sep="\t", col.names=NA)
# non.intestines_19dpi_noDEX vs. non.intestines_19dpi_DEX:
comp_non.intestines_19dpi_noDEX_vs_non.intestines_19dpi_DEX <- exactTest(geno_comp, pair=c("non.intestines_19dpi_DEX","non.intestines_19dpi_noDEX"))
deg_non.intestines_19dpi_noDEX_vs_non.intestines_19dpi_DEX <- topTags(comp_non.intestines_19dpi_noDEX_vs_non.intestines_19dpi_DEX, n=Inf, p.value=1)
write.table(as.data.frame(deg_non.intestines_19dpi_noDEX_vs_non.intestines_19dpi_DEX), file="non.intestines_19dpi_noDEX.vs.non.intestines_19dpi_DEX_edgeR_exactTest_all.data_2021.08.03.01.orig.tsv.txt", sep="\t", col.names=NA)
# Sum up:
sessionInfo()
# end it:
q()

The sessionInfo() output text for the non-failing version of this script (i.e., the one lacking "robust=TRUE" as the third argument to estimateDisp) is as follows:

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)

Matrix products: default
BLAS/LAPACK: /ocean/projects/mcb190015p/schwarze/anaconda3/envs/edger_3.34.0/lib/libopenblasp-r0.3.17.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] edgeR_3.34.0 limma_3.48.0

loaded via a namespace (and not attached):
[1] compiler_4.1.0  Rcpp_1.0.7      splines_4.1.0   grid_4.1.0     
[5] locfit_1.5-9.4  lattice_0.20-44
> # end it:
> q()
> proc.time()
   user  system elapsed 
 12.502   0.115  12.670
edgeR • 343 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 41 minutes ago
WEHI, Melbourne, Australia

statmod package required but is not installed

The error message gives a complete explanation for the problem. To use this option in edgeR you need to install the statmod package.

This is standard operating procedure for the limma and edgeR packages. All the packages needed for core functionality are listed as dependencies but packages that are only optionally needed are in the "Suggested" list. This is done to minimize the default footprint of the packages. If you try to use a non-default option that needs an extra package that you don't have installed, then you get an informative error message like the one above.

ADD COMMENT
0
Entering edit mode

Yes, installing statmod does solve it! However, for the benefit of anybody (like me) who is still figuring out R ... here's how I got statmod installed within the bioconda environment that I am running.

conda activate edger_3.34.0 ;
R ;
> install.packages("statmod")
> q()
conda deactivate ;

And, having done that, here's the line I added to my R batch job script to get the "robust=TRUE" argument to work:

library(edgeR)
library(statmod)
ADD REPLY
0
Entering edit mode

The standard recommended procedure to install Bioconductor packages and their dependencies is

library(BiocManager)
install( whatever )

You can use the dependencies argument of install() to control whether you get just the specified package, or the package and its dependencies, or everything including optional suggested packages.

ADD REPLY

Login before adding your answer.

Traffic: 135 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6