pdInfoBuilder fails on Affy's GeneChip Human Transcriptome Array 2.0
@guilherme-rocha-6354
Dear all,

I am trying to create the pdInfoBuilder package for Affy's GeneChip Human Transcriptome Array 2.0.

I am using the "original" pgf, clf, mps, and probeset.csv files from the library files on Affy's website (http://www.affymetrix.com/Auth/analysis/downloads/lf/hta/HTA-2_0/AGCC_library_installer_HTA-2_0.zip).

I was able to read the probeset.csv file using plain vanilla read.csv. Thus, it is likely that the solution given to a similar problem with Arabidopsis chips does not apply ("pdInfoBuilder fails on the new Arabidopsis Gene ST 1.0 & 1.1 arrays", https://stat.ethz.ch/pipermail/bioconductor/2012-March/044231.html).

Details are shown below.

Any help greatly appreciated.

Regards,

Guilherme Rocha

------------------------------------------------------------------------------------------------------------
R Code and output:

> library(pdInfoBuilder)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from 'package:stats':

    xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, as.vector, cbind, colnames, duplicated, eval, evalq,
    get, intersect, is.unsorted, lapply, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, rank, rbind, rep.int,
    rownames, sapply, setdiff, sort, table, tapply, union, unique,
    unlist

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: RSQLite
Loading required package: DBI
Loading required package: affxparser
Loading required package: oligo
Loading required package: oligoClasses
Welcome to oligoClasses version 1.24.0
================================================================================
Welcome to oligo version 1.26.0
================================================================================

Attaching package: 'oligo'

The following object is masked from 'package:BiocGenerics':

    normalize

>
> base_dir = "./"
>
> pgf = paste(base_dir, "/HTA-2_0.r1.pgf", sep="")
> clf = paste(base_dir, "/HTA-2_0.r1.clf", sep="")
> prob = paste(base_dir, "/HTA-2_0.na33.hg19.probeset.csv", sep="")
> core_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
> extended_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
> full_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
>
> test_csv = read.csv(paste(base_dir, "/HTA-2_0.na33.hg19.probeset.csv", sep=""), skip=14, header=T)
>
> seed = new("AffyExonPDInfoPkgSeed",
+ pgfFile = pgf,
+ clfFile = clf,
+ probeFile = prob,
+ coreMps = core_mps,
+ extendedMps = extended_mps,
+ fullMps = full_mps,
+ author = "GR",
+ email = "anemailadress@gmail.com",
+ biocViews = "AnnotationData",
+ genomebuild = "GRCh37",
+ organism = "Human",
+ species = "Homo sapiens",
+ url = "")
>
> makePdInfoPackage(seed, destDir=base_dir);
================================================================================
Building annotation package for Affymetrix Exon ST Array
PGF.........: HTA-2_0.r1.pgf
CLF.........: HTA-2_0.r1.clf
Probeset....: HTA-2_0.na33.hg19.probeset.csv
Transcript..: TheTranscriptFile
Core MPS....: HTA-2_0.r1.Psrs.mps
Full MPS....: HTA-2_0.r1.Psrs.mps
Extended MPS: HTA-2_0.r1.Psrs.mps
================================================================================
Parsing file: HTA-2_0.r1.pgf... OK
Parsing file: HTA-2_0.r1.clf... OK
Creating initial table for probes... OK
Creating dictionaries... OK
Parsing file: HTA-2_0.na33.hg19.probeset.csv... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Creating package in .//pd.hta.2.0
Inserting 850 rows into table chrom_dict... OK
Inserting 5 rows into table level_dict... OK
Inserting 11 rows into table type_dict... OK
Inserting 577432 rows into table core_mps... OK
Inserting 577432 rows into table full_mps... OK
Inserting 577432 rows into table extended_mps... OK
Inserting 1839617 rows into table featureSet... Error in sqliteExecStatement(con, statement, bind.data) :
  RS-DBI driver: (RS_SQLite_exec: could not execute: datatype mismatch)
>
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] pdInfoBuilder_1.26.0 oligo_1.26.0         oligoClasses_1.24.0
[4] affxparser_1.34.0    RSQLite_0.11.4       DBI_0.2-7
[7] Biobase_2.22.0       BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] BiocInstaller_1.12.0  Biostrings_2.30.0     GenomicRanges_1.14.1
 [4] IRanges_1.20.0        XVector_0.2.0         affyio_1.30.0
 [7] bit_1.1-10            codetools_0.2-8       ff_2.2-12
[10] foreach_1.4.1         iterators_1.0.6       preprocessCore_1.24.0
[13] splines_3.0.2         stats4_3.0.2          zlibbioc_1.8.0
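For anyone trying to narrow down the "datatype mismatch" above: SQLite raises that error mainly when a non-integer value is inserted into an INTEGER PRIMARY KEY column. Assuming the featureSet table is keyed on an integer ID (the schema was not checked here), a quick test is to look for ID values that are not whole numbers. The sketch below uses a toy data frame; the column name `probeset_id`, the sample values, and the helper `non_integer_values` are all hypothetical, not taken from the HTA-2_0 files or from pdInfoBuilder.

```r
# Toy stand-in for the parsed probeset annotation. The column names and
# values are hypothetical and only illustrate the check.
probesets <- data.frame(
  probeset_id = c("1001", "1002", "PSR-X", "1004"),
  chrom       = c("chr1", "chr1", "chr2", "chrX"),
  stringsAsFactors = FALSE
)

# Return the distinct non-missing values of `x` that are not written
# as plain whole numbers (digits only).
non_integer_values <- function(x) {
  x <- as.character(x)
  is_int <- grepl("^[0-9]+$", x)
  unique(x[!is_int & !is.na(x)])
}

bad_ids <- non_integer_values(probesets$probeset_id)
print(bad_ids)
```

On the real data, the same helper could be applied to the ID column of `test_csv` from the session above. If it returns anything, those values are candidates for the rows SQLite refuses, though this does not by itself confirm the cause.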
@james-w-macdonald-5106
This is an existing problem. See the email sent to the listserv just hours ago, asking for an update on progress:

https://stat.ethz.ch/pipermail/bioconductor/attachments/20140123/4847d433/attachment.pl

Best,

Jim

On 1/23/2014 1:34 PM, Guilherme Rocha wrote:
> [...]

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
@jason-hubbard-7740

In case someone comes here looking for a pre-built pdinfo package, you can find one for HTA-2_0 here:

http://www.bioconductor.org/packages/release/data/annotation/html/pd.hta.2.0.html
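A minimal sketch of installing that pre-built package from R, assuming a current R installation where BiocManager is the installer (the R 3.0.2 session in the question predates it and loaded BiocInstaller instead). The helper name `ensure_annotation` is made up for this example; nothing is downloaded until it is called.

```r
# Install a Bioconductor annotation package only if it is missing.
# `ensure_annotation` is a hypothetical helper, not part of any package.
ensure_annotation <- function(pkg = "pd.hta.2.0") {
  # Already installed: nothing to do.
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(TRUE))
  }
  # BiocManager itself comes from CRAN.
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkg)
  # Report whether the package is available after the install attempt.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# ensure_annotation()     # run once; needs network access
# library(pd.hta.2.0)
```

With the package installed, there should be no need to run makePdInfoPackage on the HTA-2_0 library files yourself.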
