Question: SGSeq: class of "sample_info" argument of analyzeFeatures, data.frame or DataFrame!
3.8 years ago by
USA/Seattle/Fred Hutchinson Cancer Research Center
Chao-Jen Wong wrote:

This is a minor problem that most people won't encounter if they set up the "sample_info" argument as a data.frame instance.

From the man page of analyzeFeatures, I assumed the "sample_info" argument could be either a data.frame or a DataFrame. In fact it can be either, but the two have different constraints. If sample_info is a DataFrame, its rownames must be either NULL or the same as the "sample_name" column; otherwise, an error occurs when generating the SummarizedExperiment for the feature counts. If sample_info is a data.frame, on the other hand, it causes no problem whatever its rownames are, because the internal function converts the data.frame to a DataFrame and the rownames are automatically set to NULL. Below is a reproducible example.

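library(SGSeq)

# si (sample information) and gr (genomic region) are example data shipped with SGSeq
path <- system.file("extdata", package = "SGSeq")
si$file_bam <- file.path(path, "bams", si$file_bam)
# Convert to a DataFrame and give it rownames that differ from si$sample_name
si <- DataFrame(si)
rownames(si) <- 1:nrow(si)
sgfc <- analyzeFeatures(si[1,], gr)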

Predict features...
N1 complete.
Process features...
Obtain counts...
N1 complete.
Error in FUN(X[[i]], ...) :
  assay colnames() must be NULL or equal colData rownames()


I ran into this because I had created "sample_info" as a DataFrame and used it for other purposes. I know it can be fixed by converting it to a data.frame when using the SGSeq package. But it is kind of funny, right?
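The workaround mentioned above could look roughly like the sketch below. The helper name as_plain_sample_info is hypothetical and not part of SGSeq; it only assumes that as.data.frame() accepts the object (for a DataFrame this requires S4Vectors to be loaded, which happens when SGSeq is attached). The demo input is made up and uses base R only.

```r
# Hypothetical helper (not part of SGSeq): coerce sample information to a
# plain data.frame with default row names before calling analyzeFeatures().
as_plain_sample_info <- function(sample_info) {
  # Works for data.frame input; for S4Vectors::DataFrame input the
  # S4Vectors package must be loaded so its as.data.frame method is found.
  df <- as.data.frame(sample_info)
  # Assigning NULL resets the row names to the default "1", "2", ...
  rownames(df) <- NULL
  df
}

# Made-up sample information with custom row names (illustration only)
si_demo <- data.frame(
  sample_name = c("N1", "N2"),
  file_bam = c("N1.bam", "N2.bam"),
  row.names = c("a", "b"),
  stringsAsFactors = FALSE
)

si_plain <- as_plain_sample_info(si_demo)
print(rownames(si_plain))
```

With SGSeq attached, the call from the example would then presumably become analyzeFeatures(as_plain_sample_info(si[1,]), gr), while the original DataFrame stays available for other uses.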


> sessionInfo()
R Under development (unstable) (2016-02-27 r70236)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] SGSeq_1.5.5                 SummarizedExperiment_1.1.18
[3] Biobase_2.31.3              GenomicRanges_1.23.13
[5] GenomeInfoDb_1.7.6          IRanges_2.5.24
[7] S4Vectors_0.9.27            BiocGenerics_0.17.3
[9] BiocInstaller_1.21.3

loaded via a namespace (and not attached):
 [1] igraph_1.0.1             AnnotationDbi_1.33.7     XVector_0.11.4
 [4] magrittr_1.5             zlibbioc_1.17.0          GenomicAlignments_1.7.13
 [7] BiocParallel_1.5.16      tools_3.3.0              DBI_0.3.1
[10] rtracklayer_1.31.6       bitops_1.0-6             RUnit_0.4.31
[13] RCurl_1.95-4.7           biomaRt_2.27.2           RSQLite_1.0.0
[16] GenomicFeatures_1.23.22  Biostrings_2.39.7        Rsamtools_1.23.3
[19] XML_3.98-1.3

Answer: SGSeq: class of "sample_info" argument of analyzeFeatures, data.frame or DataFrame!
3.8 years ago by
United States
Leonard Goldstein wrote:

Thanks for reporting. Looks like this is caused by changes in the SummarizedExperiment package in bioc-devel. This is fixed in SGSeq 1.5.8 - available via svn now or biocLite() tomorrow.




Thanks a lot.

Reply written 3.8 years ago by Chao-Jen Wong