SGSeq: class of "sample_info" argument of analyzeFeatures, data.frame or DataFrame!
@chao-jen-wong-7035
USA/Seattle/Fred Hutchinson Cancer Rese…

This is a minor problem that most people won't encounter, as long as they pass the "sample_info" argument as a data.frame instance.

From reading the man page of analyzeFeatures, I understood that the "sample_info" argument could be either a data.frame or a DataFrame. In fact it can be either, but the two come with different constraints. If sample_info is a DataFrame, its rownames must be either NULL or identical to the "sample_name" column. Otherwise, an error occurs when generating the SummarizedExperiment for the feature counts. If sample_info is a data.frame, on the other hand, its rownames cause no problem whatever they are. The reason is that an internal function converts the data.frame to a DataFrame, and the rownames are automatically set to NULL in the process. Below is a reproducible example.

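library(SGSeq)
# si (sample information) and gr (target region) are example objects shipped with SGSeq
path <- system.file("extdata", package = "SGSeq")
si$file_bam <- file.path(path, "bams", si$file_bam)
rownames(si)                  # rownames of the original data.frame
si <- DataFrame(si)           # convert to an S4Vectors DataFrame
rownames(si)                  # rownames after the conversion
rownames(si) <- 1:nrow(si)    # rownames no longer match si$sample_name
sgfc <- analyzeFeatures(si[1,], gr)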


Predict features...
N1 complete.
Process features...
Obtain counts...
N1 complete.
Error in FUN(X[[i]], ...) :
  assay colnames() must be NULL or equal colData rownames()

 

I ran into this problem because I had built "sample_info" as a DataFrame and was using it for other purposes. I know it can be fixed by converting it to a data.frame before passing it to the SGSeq package. But it is kind of funny, right?
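For anyone hitting the same error on an affected SGSeq version, the workaround mentioned above could look roughly like the sketch below. The sample table here is a hypothetical stand-in (the sample names and BAM file names are made up for illustration), and the analyzeFeatures call is only indicated in a comment because it needs real BAM files.

```r
library(S4Vectors)

# Hypothetical sample table kept as a DataFrame (names and paths are illustrative only)
si_df <- DataFrame(sample_name = c("N1", "N2"),
                   file_bam = c("N1.bam", "N2.bam"))

# Rownames that differ from sample_name, as in the failing example
rownames(si_df) <- as.character(seq_len(nrow(si_df)))

# Option 1: keep the DataFrame class but drop the rownames
si_null <- si_df
rownames(si_null) <- NULL

# Option 2: convert to a base data.frame just for the SGSeq call
si_base <- as.data.frame(si_df)

# Either object would then be passed on, e.g. analyzeFeatures(si_base, gr)
```

Option 1 keeps the object usable wherever else a DataFrame is expected, while option 2 matches the conversion described in the post.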

Session information:

> sessionInfo()
R Under development (unstable) (2016-02-27 r70236)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] SGSeq_1.5.5                 SummarizedExperiment_1.1.18
[3] Biobase_2.31.3              GenomicRanges_1.23.13
[5] GenomeInfoDb_1.7.6          IRanges_2.5.24
[7] S4Vectors_0.9.27            BiocGenerics_0.17.3
[9] BiocInstaller_1.21.3

loaded via a namespace (and not attached):
 [1] igraph_1.0.1             AnnotationDbi_1.33.7     XVector_0.11.4
 [4] magrittr_1.5             zlibbioc_1.17.0          GenomicAlignments_1.7.13
 [7] BiocParallel_1.5.16      tools_3.3.0              DBI_0.3.1
[10] rtracklayer_1.31.6       bitops_1.0-6             RUnit_0.4.31
[13] RCurl_1.95-4.7           biomaRt_2.27.2           RSQLite_1.0.0
[16] GenomicFeatures_1.23.22  Biostrings_2.39.7        Rsamtools_1.23.3
[19] XML_3.98-1.3

@leonard-goldstein-6845
Australia

Thanks for reporting. It looks like this is caused by changes in the SummarizedExperiment package in bioc-devel. It is fixed in SGSeq 1.5.8, which is available via svn now and should be available through biocLite() tomorrow.

Leonard

 


Thanks a lot.
