proper way to subset columns of a DEXSeqDataset object
1
0
Entering edit mode
@narayanan-manikandan-nihniaid-e-5829
Last seen 10.0 years ago
United States
Dear DEXSeq Developers, I got tripped by errors in estimateSizeFactors and estimateExonFoldChanges functions when working with a DEXSeqDataset object that is obtained by subsetting the columns (samples) of another DEXSeqDataset object. I thought I would let you know of the error I got and how I fixed it, so that: 1) you could tell me if this is indeed a proper way to subset columns? (for instance, I fix the error by subsetting object@modelFrameBM ; am I missing any other slots that need to be subsetted too?) 2) as a feature request, could a simpler way to subset columns be included in future versions of DEXSeq, by doing the necessary modelFrameBM subsetting and droplevels() when the DEXSeqDataSet is being subsetted? The full code is here and sessionInfo is in PS below. > data(pasillaDEXSeqDataSet, package="pasilla") > dxr = DEXSeq(dxd) using supplied model matrix using supplied model matrix > dxdsub = dxd[,dxd$type=="paired-end"]; for (col in names(colData(dxdsub))) { if (is.factor(dxdsub[[col]])) { dxdsub[[col]] = droplevels(dxdsub[[col]]); } } > dxrsub = DEXSeq(dxdsub) Error in `$<-.data.frame`(`*tmp*`, "sizeFactor", value = c(0.943530200961566, : replacement has 392 rows, data has 686 > dxdsub@modelFrameBM = { mf=dxdsub@modelFrameBM; droplevels(mf[mf$type=="paired-end",]) } > dxrsub = DEXSeq(dxdsub) using supplied model matrix using supplied model matrix Thanks, Mani > sessionInfo() R version 3.1.2 (2014-10-31) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] DEXSeq_1.12.2 BiocParallel_1.0.3 DESeq2_1.6.3 RcppArmadillo_0.6.200.2.0 Rcpp_0.12.1 GenomicRanges_1.18.4 GenomeInfoDb_1.2.5 IRanges_2.0.1 [9] S4Vectors_0.4.0 Biobase_2.26.0 BiocGenerics_0.12.1 BiocInstaller_1.16.5 loaded via a namespace (and not attached): [1] acepack_1.3-3.3 annotate_1.44.0 AnnotationDbi_1.28.2 base64enc_0.1-3 BatchJobs_1.6 BBmisc_1.9 biomaRt_2.22.0 Biostrings_2.34.1 bitops_1.0-6 brew_1.0-6 checkmate_1.6.3 [12] cluster_1.15.3 codetools_0.2-9 colorspace_1.2-6 DBI_0.3.1 digest_0.6.8 fail_1.3 foreach_1.4.3 foreign_0.8-61 Formula_1.2-1 genefilter_1.48.1 geneplotter_1.44.0 [23] ggplot2_1.0.1 grid_3.1.2 gridExtra_2.0.0 gtable_0.1.2 Hmisc_3.17-0 hwriter_1.3.2 iterators_1.0.8 lattice_0.20-29 latticeExtra_0.6-26 locfit_1.5-9.1 magrittr_1.5 [34] MASS_7.3-35 munsell_0.4.2 nnet_7.3-8 plyr_1.8.3 proto_0.3-10 RColorBrewer_1.1-2 RCurl_1.95-4.7 reshape2_1.4.1 rpart_4.1-8 Rsamtools_1.18.3 RSQLite_1.0.0 [45] scales_0.3.0 sendmailR_1.2-1 splines_3.1.2 statmod_1.4.22 stringi_1.0-1 stringr_1.0.0 survival_2.37-7 tools_3.1.2 XML_3.98-1.3 xtable_1.8-0 XVector_0.6.0 [56] zlibbioc_1.12.0
DEXSeq • 1.9k views
2
Entering edit mode
@narayanan-manikandan-nihniaid-e-5829
Last seen 10.0 years ago
United States

(Sorry, the formatting got messed up. here is the same post again, with proper formatting.)

Dear DEXSeq Developers,

I got tripped by some errors (in estimateSizeFactors and estimateExonFoldChanges functions) when working with a DEXSeqDataset object that is obtained by subsetting the columns (samples) of another DEXSeqDataset object.

I thought I would let you know of the error I got and how I fixed it, so that: 
1) you could tell me if this is indeed a proper way to subset columns? (for instance, I fix the error by subsetting object@modelFrameBM ; am I missing any other slots that need to be subsetted too?) 
2) as a feature request, could a simpler way to subset columns be included in future versions of DEXSeq, by doing the necessary modelFrameBM subsetting and droplevels() when the DEXSeqDataSet object is being subsetted? 

The full code is here and sessionInfo is in PS below.

> data(pasillaDEXSeqDataSet, package="pasilla")
> dxr = DEXSeq(dxd)
using supplied model matrix
using supplied model matrix
> dxdsub = dxd[,dxd$type=="paired-end"];  for (col in names(colData(dxdsub))) {  if (is.factor(dxdsub[[col]])) {  dxdsub[[col]] = droplevels(dxdsub[[col]]); }  }
> dxrsub = DEXSeq(dxdsub)
Error in `$<-.data.frame`(`*tmp*`, "sizeFactor", value = c(0.943530200961566,  : 
  replacement has 392 rows, data has 686
> dxdsub@modelFrameBM = {  mf=dxdsub@modelFrameBM; droplevels(mf[mf$type=="paired-end",])  }
> dxrsub = DEXSeq(dxdsub)
using supplied model matrix
using supplied model matrix

 

Thanks, 
Mani

 

 

PS: > sessionInfo()

R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DEXSeq_1.12.2             BiocParallel_1.0.3        DESeq2_1.6.3              RcppArmadillo_0.6.200.2.0 Rcpp_0.12.1               GenomicRanges_1.18.4      GenomeInfoDb_1.2.5        IRanges_2.0.1            
 [9] S4Vectors_0.4.0           Biobase_2.26.0            BiocGenerics_0.12.1       BiocInstaller_1.16.5     

loaded via a namespace (and not attached):
 [1] acepack_1.3-3.3      annotate_1.44.0      AnnotationDbi_1.28.2 base64enc_0.1-3      BatchJobs_1.6        BBmisc_1.9           biomaRt_2.22.0       Biostrings_2.34.1    bitops_1.0-6         brew_1.0-6           checkmate_1.6.3     
[12] cluster_1.15.3       codetools_0.2-9      colorspace_1.2-6     DBI_0.3.1            digest_0.6.8         fail_1.3             foreach_1.4.3        foreign_0.8-61       Formula_1.2-1        genefilter_1.48.1    geneplotter_1.44.0  
[23] ggplot2_1.0.1        grid_3.1.2           gridExtra_2.0.0      gtable_0.1.2         Hmisc_3.17-0         hwriter_1.3.2        iterators_1.0.8      lattice_0.20-29      latticeExtra_0.6-26  locfit_1.5-9.1       magrittr_1.5        
[34] MASS_7.3-35          munsell_0.4.2        nnet_7.3-8           plyr_1.8.3           proto_0.3-10         RColorBrewer_1.1-2   RCurl_1.95-4.7       reshape2_1.4.1       rpart_4.1-8          Rsamtools_1.18.3     RSQLite_1.0.0       
[45] scales_0.3.0         sendmailR_1.2-1      splines_3.1.2        statmod_1.4.22       stringi_1.0-1        stringr_1.0.0        survival_2.37-7      tools_3.1.2          XML_3.98-1.3         xtable_1.8-0         XVector_0.6.0       
[56] zlibbioc_1.12.0     

 

0
Entering edit mode

Hi,

A note to the DEXSeq authors/maintainers: it looks like the colData and modelFrameBM slots of a DEXSeqDataSet object are conceptually tightly coupled. However some operations in the API for these objects (e.g. [, the colData() setter, and maybe others) are not keeping the 2 slots coupled, and as a result the modified object is not always valid. Adding some checks to the validity method for DEXSeqDataSet objects would help in avoiding these invalid objects. Also the "[" and "colData<-" methods for SummarizedExperiment0 objects would probably need to be overridden by methods for DEXSeqDataSet objects that preserve validity.

Cheers,

H.

ADD REPLY
0
Entering edit mode

Thanks Mani for reporting this!

As Herve mentions, it was a problem that I did not add a "[" method for the DEXSeqDataSet to couple modelFrameBM with the rest of the object. I have implemented such method and added it to the latest versions of DEXSeq (both devel and release branch). The example that you wrote with pasilla should now work.

Let me know!

Alejandro Reyes

ADD REPLY
0
Entering edit mode

Thanks Herve for your quick response, and Alejandro for promptly resolving this issue in the release! I will try it out when I get a chance to install the latest release.

Mani 

0
Entering edit mode

I've tried my example in the latest release (DEXSeq 1.16.1) and subsetting columns is now extremely simple to use and works great. Thanks again Alejandro!

Login before adding your answer.

Traffic: 1048 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6