pdInfoBuilder fails on Affy's GeneChip Human Transcriptome Array 2.0

Guilherme Rocha ▴ 40

Dear all,

I am trying to create the pdInfoBuilder packages for Affy's GeneChip Human Transcriptome Array 2.0.

I am using the "original" pgf, clf, mps, and probeset.csv files from the library files on Affy's website (http://www.affymetrix.com/Auth/analysis/downloads/lf/hta/HTA-2_0/AGCC_library_installer_HTA-2_0.zip).

I was able to read the probeset.csv file using plain vanilla read.csv. Thus, it is likely that the solution given to a similar problem with Arabidopsis chips does not apply ("pdInfoBuilder fails on the new Arabidopsis Gene ST 1.0 & 1.1 arrays", https://stat.ethz.ch/pipermail/bioconductor/2012-March/044231.html).

Details are shown below.

Any help greatly appreciated.

Regards,

Guilherme Rocha

--------------------------------------------------------------------------------
R Code and output:

> library(pdInfoBuilder)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from 'package:stats':

    xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, as.vector, cbind, colnames, duplicated, eval, evalq,
    get, intersect, is.unsorted, lapply, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, rank, rbind, rep.int,
    rownames, sapply, setdiff, sort, table, tapply, union, unique,
    unlist

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: RSQLite
Loading required package: DBI
Loading required package: affxparser
Loading required package: oligo
Loading required package: oligoClasses
Welcome to oligoClasses version 1.24.0
================================================================================
Welcome to oligo version 1.26.0
================================================================================

Attaching package: 'oligo'

The following object is masked from 'package:BiocGenerics':

    normalize

>
> base_dir = "./"
>
> pgf = paste(base_dir, "/HTA-2_0.r1.pgf", sep="")
> clf = paste(base_dir, "/HTA-2_0.r1.clf", sep="")
> prob = paste(base_dir, "/HTA-2_0.na33.hg19.probeset.csv", sep="")
> core_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
> extended_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
> full_mps = paste(base_dir, "/HTA-2_0.r1.Psrs.mps", sep="")
>
> test_csv = read.csv(paste(base_dir, "/HTA-2_0.na33.hg19.probeset.csv", sep=""), skip=14, header=T)
>
> seed = new("AffyExonPDInfoPkgSeed",
+ pgfFile = pgf,
+ clfFile = clf,
+ probeFile = prob,
+ coreMps = core_mps,
+ extendedMps = extended_mps,
+ fullMps = full_mps,
+ author = "GR",
+ email = "anemailadress@gmail.com",
+ biocViews = "AnnotationData",
+ genomebuild = "GRCh37",
+ organism = "Human",
+ species = "Homo sapiens",
+ url = "")
>
> makePdInfoPackage(seed, destDir=base_dir);
================================================================================
Building annotation package for Affymetrix Exon ST Array
PGF.........: HTA-2_0.r1.pgf
CLF.........: HTA-2_0.r1.clf
Probeset....: HTA-2_0.na33.hg19.probeset.csv
Transcript..: TheTranscriptFile
Core MPS....: HTA-2_0.r1.Psrs.mps
Full MPS....: HTA-2_0.r1.Psrs.mps
Extended MPS: HTA-2_0.r1.Psrs.mps
================================================================================
Parsing file: HTA-2_0.r1.pgf... OK
Parsing file: HTA-2_0.r1.clf... OK
Creating initial table for probes... OK
Creating dictionaries... OK
Parsing file: HTA-2_0.na33.hg19.probeset.csv... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Parsing file: HTA-2_0.r1.Psrs.mps... OK
Creating package in .//pd.hta.2.0
Inserting 850 rows into table chrom_dict... OK
Inserting 5 rows into table level_dict... OK
Inserting 11 rows into table type_dict... OK
Inserting 577432 rows into table core_mps... OK
Inserting 577432 rows into table full_mps... OK
Inserting 577432 rows into table extended_mps... OK
Inserting 1839617 rows into table featureSet... Error in sqliteExecStatement(con, statement, bind.data) :
  RS-DBI driver: (RS_SQLite_exec: could not execute: datatype mismatch)
>
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] pdInfoBuilder_1.26.0 oligo_1.26.0         oligoClasses_1.24.0
[4] affxparser_1.34.0    RSQLite_0.11.4       DBI_0.2-7
[7] Biobase_2.22.0       BiocGenerics_0.8.0

loaded via a namespace (and not attached):
 [1] BiocInstaller_1.12.0  Biostrings_2.30.0     GenomicRanges_1.14.1
 [4] IRanges_1.20.0        XVector_0.2.0         affyio_1.30.0
 [7] bit_1.1-10            codetools_0.2-8       ff_2.2-12
[10] foreach_1.4.1         iterators_1.0.6       preprocessCore_1.24.0
[13] splines_3.0.2         stats4_3.0.2          zlibbioc_1.8.0
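For readers hitting the same error: SQLite reports "datatype mismatch" when a value that cannot be converted to an integer is written to a column that must hold an integer key (an INTEGER PRIMARY KEY or rowid). The thread does not establish which column of the featureSet table is responsible, so the following is only a rough diagnostic sketch in base R. It scans each column of a table for values that do not look like whole numbers. The helper name `non_integer_values` and the toy data frame are hypothetical and are not part of pdInfoBuilder.

```r
# Hypothetical helper (not part of pdInfoBuilder): return the distinct values
# of x that cannot be read as whole numbers.
non_integer_values <- function(x) {
  x <- as.character(x)          # also handles factor columns
  x <- x[!is.na(x)]
  unique(x[!grepl("^[+-]?[0-9]+$", x)])
}

# Toy stand-in for an annotation table; column names and IDs are made up.
toy <- data.frame(
  id_numeric = c("1001", "1002", "1003"),
  id_mixed   = c("1001", "ABC1002.x.1", "1003"),
  stringsAsFactors = FALSE
)

# One entry per column; an empty character vector means every value is integer-like.
offenders <- lapply(toy, non_integer_values)
print(offenders)

# With the real file, the same call could be applied to the data frame read in
# the post, e.g. lapply(test_csv, non_integer_values), then inspected for the
# ID columns only, since many annotation columns are legitimately text.
```

If an ID column that ends up as an integer key contains non-numeric identifiers, that would be consistent with the error above, but this check alone does not prove it is the cause.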

Written 10.3 years ago by Guilherme Rocha ▴ 40 • updated 9.0 years ago by Jason Hubbard ▴ 20

James W. MacDonald 65k

This is an existing problem. See the email sent to the listserv just hours ago, asking for an update on progress:

https://stat.ethz.ch/pipermail/bioconductor/attachments/20140123/4847d433/attachment.pl

Best,

Jim

On 1/23/2014 1:34 PM, Guilherme Rocha wrote:
> Dear all,
>
> I am trying to create the pdInfoBuilder packages for Affy's GeneChip Human
> Transcriptome Array 2.0.
> [...]

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099

Written 10.3 years ago by James W. MacDonald 65k

Jason Hubbard ▴ 20

In case someone comes here looking for a pre-built pdInfo package, there is one for HTA-2_0 on Bioconductor:

http://www.bioconductor.org/packages/release/data/annotation/html/pd.hta.2.0.html
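As a rough sketch of how the pre-built package might be used instead of building one with makePdInfoPackage: the code below checks whether pd.hta.2.0 is installed and, if it is and oligo is available, reads CEL files and runs RMA. The directory name `cel_files` is a hypothetical placeholder, and the installation lines are left commented out because they need network access. BiocManager is the current Bioconductor installer and was not what was in use when this thread was written.

```r
# Name of the pre-built annotation package linked above.
pd_pkg <- "pd.hta.2.0"

# TRUE if the package is installed; requireNamespace() loads but does not attach it.
pd_available <- requireNamespace(pd_pkg, quietly = TRUE)

if (!pd_available) {
  # Installation needs network access, so it is only shown as a comment:
  # install.packages("BiocManager")
  # BiocManager::install(pd_pkg)
  message("Package ", pd_pkg, " is not installed yet.")
} else if (requireNamespace("oligo", quietly = TRUE)) {
  # Hypothetical directory holding HTA 2.0 CEL files.
  cel_dir <- "cel_files"
  cel_paths <- list.files(cel_dir, pattern = "\\.CEL$",
                          ignore.case = TRUE, full.names = TRUE)
  if (length(cel_paths) > 0) {
    raw  <- oligo::read.celfiles(cel_paths)  # looks up the matching pd package
    eset <- oligo::rma(raw)                  # summarised expression values
  } else {
    message("No CEL files found in ", cel_dir)
  }
}
```

This sidesteps the featureSet insertion error entirely, at the cost of depending on whatever annotation release the pre-built package was generated from.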

Written 9.0 years ago by Jason Hubbard ▴ 20
