Question: rtracklayer import.gff3 mangling scores
Tim Rayner wrote (7.0 years ago):
Hi,

I've just run into what I think is a bug in the rtracklayer import.gff3 function (v1.16.1). If I import a GFF3 containing scores while stringsAsFactors=TRUE, the resulting scores are mangled. I haven't confirmed it, but I suspect the values are being converted to a factor upon import and then coerced to numeric (giving the factor level, not the original value). If I use options(stringsAsFactors=FALSE) the values remain intact.

Best regards,

Tim Rayner

--
Bioinformatician
Smith Lab, CIMR
University of Cambridge
United Kingdom

Example GFF3 content:

    ##gff-version 3
    ##date 2012-07-13
    chr1 rtracklayer snp 189807684 189807684 0.20294398632582 * . ID=rs955894;name=rs955894
    chr1 rtracklayer snp 198484784 198484784 0.269327708380075 * . ID=rs16843226;name=rs16843226
    chr1 rtracklayer snp 237405093 237405093 0.379417274542624 * . ID=rs679735;name=rs679735
    chr1 rtracklayer snp 80235819 80235819 0.418346673826376 * . ID=rs12022561;name=rs12022561
    chr1 rtracklayer snp 84875173 84875173 0.302119655250906 * . ID=rs6576700;name=rs6576700
    chr1 rtracklayer snp 112793146 112793146 0.390270490589027 * . ID=rs11102440;name=rs11102440
    chr1 rtracklayer snp 244187847 244187847 0.249206080122631 * . ID=rs1000451;name=rs1000451
    chr1 rtracklayer snp 8612104 8612104 0.583436890885292 * . ID=rs6577499;name=rs6577499

    > sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
     [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] rtracklayer_1.16.1 GenomicRanges_1.8.6 IRanges_1.14.3
    [4] BiocGenerics_0.2.0

    loaded via a namespace (and not attached):
    [1] Biostrings_2.24.1 bitops_1.0-4.1 BSgenome_1.24.0 RCurl_1.91-1
    [5] Rsamtools_1.8.5   stats4_2.15.1  tools_2.15.1    XML_3.9-4
    [9] zlibbioc_1.2.0
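The mechanism suspected in the report (a factor coerced directly to numeric) can be illustrated in base R alone. The snippet below is only a sketch of general factor behaviour using the first five scores from the example file; it is not rtracklayer's import code, and the variable names are made up for illustration.

```r
# Scores as they would be read from the GFF3 text, i.e. as strings.
scores_chr <- c("0.20294398632582", "0.269327708380075",
                "0.379417274542624", "0.418346673826376",
                "0.302119655250906")

# With stringsAsFactors=TRUE, character columns are turned into factors.
scores_fac <- factor(scores_chr)

# Direct coercion returns the integer level codes, not the values.
mangled <- as.numeric(scores_fac)

# Converting to character first recovers the original numbers.
intact <- as.numeric(as.character(scores_fac))

print(mangled)  # 1 2 4 5 3
print(intact)
```

The level codes reflect the sorted order of the strings, which is why the "mangled" scores look like small integers unrelated to the input.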
Tags: snp, rtracklayer
Answer: rtracklayer import.gff3 mangling scores
Michael Lawrence (United States) wrote (7.0 years ago):
Hi Tim,

Good catch. Added a test to catch that in the future. Fixed in 1.16.3 (and devel).

Michael

On Mon, Jul 16, 2012 at 6:25 AM, Tim Rayner <tfrayner@gmail.com> wrote:
> [original message quoted in full; see the question above]
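To tell whether a given installation predates the fix reported above, the version string can be compared against 1.16.3 in base R. This is a minimal sketch: has_score_fix is a hypothetical helper, not an rtracklayer function, and in a live session one would presumably pass it packageVersion("rtracklayer").

```r
# Hypothetical helper: TRUE if `version` is at or after the release
# in which the score bug was reported fixed (1.16.3).
has_score_fix <- function(version) {
  package_version(as.character(version)) >= package_version("1.16.3")
}

print(has_score_fix("1.16.1"))  # version from the original report
print(has_score_fix("1.16.3"))  # first release reported as fixed

# On an affected version, the workaround from the report is
# options(stringsAsFactors = FALSE) before calling import.gff3.
```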