(Sorry, the formatting got messed up. here is the same post again, with proper formatting.)
Dear DEXSeq Developers,
I got tripped by some errors (in estimateSizeFactors and estimateExonFoldChanges functions) when working with a DEXSeqDataset object that is obtained by subsetting the columns (samples) of another DEXSeqDataset object.
I thought I would let you know of the error I got and how I fixed it, so that:
1) you could tell me if this is indeed a proper way to subset columns? (for instance, I fix the error by subsetting object@modelFrameBM ; am I missing any other slots that need to be subsetted too?)
2) as a feature request, could a simpler way to subset columns be included in future versions of DEXSeq, by doing the necessary modelFrameBM subsetting and droplevels() when the DEXSeqDataSet object is being subsetted?
The full code is here and sessionInfo is in PS below.
> data(pasillaDEXSeqDataSet, package="pasilla")
> dxr = DEXSeq(dxd)
using supplied model matrix
using supplied model matrix
> dxdsub = dxd[,dxd$type=="paired-end"]; for (col in names(colData(dxdsub))) { if (is.factor(dxdsub[[col]])) { dxdsub[[col]] = droplevels(dxdsub[[col]]); } }
> dxrsub = DEXSeq(dxdsub)
Error in `$<-.data.frame`(`*tmp*`, "sizeFactor", value = c(0.943530200961566, :
replacement has 392 rows, data has 686
> dxdsub@modelFrameBM = { mf=dxdsub@modelFrameBM; droplevels(mf[mf$type=="paired-end",]) }
> dxrsub = DEXSeq(dxdsub)
using supplied model matrix
using supplied model matrix
Thanks,
Mani
PS: > sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] DEXSeq_1.12.2 BiocParallel_1.0.3 DESeq2_1.6.3 RcppArmadillo_0.6.200.2.0 Rcpp_0.12.1 GenomicRanges_1.18.4 GenomeInfoDb_1.2.5 IRanges_2.0.1
[9] S4Vectors_0.4.0 Biobase_2.26.0 BiocGenerics_0.12.1 BiocInstaller_1.16.5
loaded via a namespace (and not attached):
[1] acepack_1.3-3.3 annotate_1.44.0 AnnotationDbi_1.28.2 base64enc_0.1-3 BatchJobs_1.6 BBmisc_1.9 biomaRt_2.22.0 Biostrings_2.34.1 bitops_1.0-6 brew_1.0-6 checkmate_1.6.3
[12] cluster_1.15.3 codetools_0.2-9 colorspace_1.2-6 DBI_0.3.1 digest_0.6.8 fail_1.3 foreach_1.4.3 foreign_0.8-61 Formula_1.2-1 genefilter_1.48.1 geneplotter_1.44.0
[23] ggplot2_1.0.1 grid_3.1.2 gridExtra_2.0.0 gtable_0.1.2 Hmisc_3.17-0 hwriter_1.3.2 iterators_1.0.8 lattice_0.20-29 latticeExtra_0.6-26 locfit_1.5-9.1 magrittr_1.5
[34] MASS_7.3-35 munsell_0.4.2 nnet_7.3-8 plyr_1.8.3 proto_0.3-10 RColorBrewer_1.1-2 RCurl_1.95-4.7 reshape2_1.4.1 rpart_4.1-8 Rsamtools_1.18.3 RSQLite_1.0.0
[45] scales_0.3.0 sendmailR_1.2-1 splines_3.1.2 statmod_1.4.22 stringi_1.0-1 stringr_1.0.0 survival_2.37-7 tools_3.1.2 XML_3.98-1.3 xtable_1.8-0 XVector_0.6.0
[56] zlibbioc_1.12.0
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Hi,
A note to the DEXSeq authors/maintainers: it looks like the
colData
andmodelFrameBM
slots of a DEXSeqDataSet object are conceptually tightly coupled. However some operations in the API for these objects (e.g.[
, thecolData()
setter, and maybe others) are not keeping the 2 slots coupled, and as a result the modified object is not always valid. Adding some checks to the validity method for DEXSeqDataSet objects would help in avoiding these invalid objects. Also the"["
and"colData<-"
methods for SummarizedExperiment0 objects would probably need to be overridden by methods for DEXSeqDataSet objects that preserve validity.Cheers,
H.
Thanks Mani for reporting this!
As Herve mentions, it was a problem that I did not add a "[" method for the DEXSeqDataSet to couple modelFrameBM with the rest of the object. I have implemented such method and added it to the latest versions of DEXSeq (both devel and release branch). The example that you wrote with pasilla should now work.
Let me know!
Alejandro Reyes
Thanks Herve for your quick response, and Alejandro for promptly resolving this issue in the release! I will try it out when I get a chance to install the latest release.
Mani
I've tried my example in the latest release (DEXSeq 1.16.1) and subsetting columns is now extremely simple to use and works great. Thanks again Alejandro!