makePdInfoPackage for Primeview arrays


Max Kauer ▴ 140

@max-kauer-2254

Last seen 7.5 years ago

Hi,

I am trying to make a pd.info package for the Affy Primeview array, but I get an error.

Thanks for any help!

Cheers,
Max

This is my code:

    library(pdInfoBuilder)
    cdf <- list.files(pathAnnotPr, pattern = ".cdf", full.names = TRUE)
    cel <- list.files(pathC, pattern = ".CEL", full.names = TRUE)[1]  # take first array
    tab <- list.files(pathAnnotPr, pattern = "_tab", full.names = TRUE)

    seed <- new("AffyExpressionPDInfoPkgSeed",
                cdfFile = cdf, celFile = cel,
                tabSeqFile = tab, author = "xx",
                email = "xx",
                biocViews = "AnnotationData",
                genomebuild = "hg19",
                organism = "Human", species = "Homo Sapiens",
                url = "xx")
    makePdInfoPackage(seed, destDir = ".")

Which produces this output/error (although a pd.primeview directory is created):

    ================================================================================
    Building annotation package for Affymetrix Expression array
    CDF...............: PrimeView.cdf
    CEL...............: MJ_05042013_TAS_10_PrimeView.CEL
    Sequence TAB-Delim: PrimeView.probe_tab
    ================================================================================
    Parsing file: PrimeView.cdf... OK
    Parsing file: MJ_05042013_TAS_10_PrimeView.CEL... OK
    Parsing file: PrimeView.probe_tab... OK
    Getting information for featureSet table... OK
    Getting information for pm/mm feature tables... OK
    Combining probe information with sequence information... OK
    Getting PM probes and sequences... OK
    Done parsing.
    Creating package in ./pd.primeview
    Inserting 49395 rows into table featureSet... OK
    Inserting 609663 rows into table pmfeature... Error in sqliteExecStatement(con, statement, bind.data) :
      RS-DBI driver: (RS_SQLite_exec: could not execute: PRIMARY KEY must be unique)
    In addition: Warning messages:
    1: In parseCdfCelProbe(object@cdfFile, object@celFile, object@tabSeqFile, :
      Probe sequences were not found for all PM probes. These probes will be removed from the pmSequence object.
    2: In parseCdfCelProbe(object@cdfFile, object@celFile, object@tabSeqFile, :
      Probe sequences were not found for all MM probes. These probes will be removed from the mmSequence object.

    > sessionInfo()
    R version 3.0.0 (2013-04-03)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
    [1] pdInfoBuilder_1.24.0 oligo_1.24.0         oligoClasses_1.22.0
    [4] affxparser_1.32.1    RSQLite_0.11.4       DBI_0.2-7
    [7] Biobase_2.20.0       BiocGenerics_0.6.0

    loaded via a namespace (and not attached):
     [1] affyio_1.28.0         BiocInstaller_1.10.2  Biostrings_2.28.0
     [4] bit_1.1-10            codetools_0.2-8       ff_2.2-11
     [7] foreach_1.4.1         GenomicRanges_1.12.4  IRanges_1.18.1
    [10] iterators_1.0.6       preprocessCore_1.22.0 splines_3.0.0
    [13] stats4_3.0.0          zlibbioc_1.6.0

Max Kauer
CHILDREN'S CANCER RESEARCH INSTITUTE

BiocViews Annotation Cancer Organism cdf probe affy biocViews BiocViews BiocViews Cancer • 2.6k views

updated 10.8 years ago by cstrato ★ 3.9k • written 10.8 years ago by Max Kauer ▴ 140


Benilton Carvalho ★ 4.3k

@benilton-carvalho-1375

Last seen 4.1 years ago

Brazil/Campinas/UNICAMP

Unfortunately, I cannot provide a quick fix for this.

The reason is that pdInfoBuilder, for expression arrays, relies on the assumption that each probe belongs to exactly one probeset, and this is not true for PrimeView chips. For example, the probe with chip coordinates X=135 and Y=147 is shared by two probesets (11715100_at and 11715102_x_at), and the same happens for thousands of other probesets.

Before changing the code, I want to make sure I fully understand the background for this chip and why the duplication happens, and this may take a while.

I will get back to the list once I have news on this front.

b
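To make the situation described in this reply concrete, here is a minimal base-R sketch on a toy table. Only the coordinates X=135, Y=147 and the two probeset names come from the thread; the other rows, the column names, and the `key` column are made up for illustration. This is not pdInfoBuilder's code. It only shows how a probe listed under two probesets produces a repeated identifier, which is the kind of repeat a uniqueness constraint on the probe would reject.

```r
# Toy probe table: one row per (probe, probeset) assignment.
# Only the first two rows come from the thread; the rest are invented.
probes <- data.frame(
  x = c(135, 135, 10, 11),
  y = c(147, 147, 20, 20),
  probeset = c("11715100_at", "11715102_x_at", "toy_a_at", "toy_b_at"),
  stringsAsFactors = FALSE
)

# Identify each physical probe by its chip coordinates (hypothetical key).
probes$key <- paste(probes$x, probes$y, sep = ":")

# Keys that occur more than once cannot serve as a unique identifier.
n_sets <- table(probes$key)
shared <- names(n_sets)[n_sets > 1]

# List the probesets that share each such probe.
is_shared <- probes$key %in% shared
shared_sets <- split(probes$probeset[is_shared], probes$key[is_shared])
print(shared_sets)
```

Running the same kind of check on the real probe-to-probeset table (for example, one read from the CDF) would show how many probes are affected before attempting the package build.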

10.8 years ago Benilton Carvalho ★ 4.3k


cstrato ★ 3.9k

@cstrato-908

Last seen 5.5 years ago

Austria

Dear Max,

In principle you could also use package xps, which can handle PrimeView arrays. To create a root 'scheme' file (see vignette xps.pdf) you simply need to do:

    ### new R session: load library xps
    library(xps)

    ### define directories:
    # directory containing Affymetrix library files
    libdir <- "/Volumes/GigaDrive/Affy/libraryfiles"
    # directory containing Affymetrix annotation files
    anndir <- "/Volumes/GigaDrive/Affy/Annotation"
    # directory to store ROOT scheme files
    scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"

    ### create scheme file:
    scheme.primeview <- import.expr.scheme("primeview",
        filedir    = file.path(scmdir, "na33"),
        schemefile = file.path(libdir, "PrimeView.CDF"),
        probefile  = file.path(libdir, "PrimeView.probe.tab"),
        annotfile  = file.path(anndir, "Version12Nov", "PrimeView.na33.annot.csv"))

For more information and examples see also the example scripts in xps/examples/script4schemes.R and xps/examples/script4xps.R

Best regards,
Christian

_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
e.m.a.i.l: cstrato at aon.at
_._._._._._._._._._._._._._._._._._

10.8 years ago cstrato ★ 3.9k


Thanks everybody for your replies!

Maybe a short add-on: the reason I wanted to make this pd.info file was that I wanted to try SCAN.UPC on these arrays (which needs that file). Apart from that, I had already tried rma with the probe and cdf files from the Bioconductor site, and this worked just fine. At least it seemed fine to me. Now, with probes mapping to multiple probesets, I wonder whether that could do something funny to the analysis.

Best,
Max

10.8 years ago Max Kauer ▴ 140


Hi Max,

It seems that the PrimeView array is in a sort of no-man's land. The cdf file that I have, and use to build the cdf package, is a text cdf. Why this matters is covered here:

https://stat.ethz.ch/pipermail/bioc-devel/2007-October/001403.html

The upshot is that the software used to produce the cdf packages for use with the affy package will use a multiply-mapped probe only once (i.e., if a probe is used in two or more probesets, it will be mapped to only one probeset when producing the cdf package).

I just checked, and the current version of the cdf is still text. So if you use the cdf that you automatically get from BioC, then you are explicitly excluding some probes from some probesets. And this is how things will remain, as we are just offering a converted version of what we get from the Affy website.

But there are things you can do if you want to do things differently. First, you can use the affxparser package to convert the cdf file from Affy, which you can get here:

http://www.affymetrix.com/support/downloads/library_files/primeview_libraryfile.zip

Unzip it, open CD_PrimeView_rev01/Full/PrimeView/LibFiles, and put PrimeView.cdf somewhere useful. You can then use convertCdf() to make a binary-format cdf, and then use makecdfenv to make a cdf package that will have all of the multiply-mapped probes in each probeset.

Alternatively, you can use one of the MBNI remapped cdfs. You will want to go here:

http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download.asp

then choose the mapping you like, and figure out what the cdf package name is. You can then use biocLite() to get the correct cdf for your version of R. So if you like Entrez Gene mappings, you would want

    biocLite("primeviewhsentrezgcdf")

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
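The first route described above (convert the text cdf to binary, then build a cdf package) might look roughly like the sketch below. It assumes the Bioconductor packages affxparser and makecdfenv are installed and that PrimeView.cdf has been extracted from the Affymetrix archive. The wrapper function `build_primeview_cdf`, the output file name, and the package name `primeviewfullcdf` are hypothetical choices for illustration, and the argument names should be checked against the installed versions of both packages before relying on them.

```r
# Sketch only: wraps the convert-then-build steps in one function.
# The function name, output file name, and package name are hypothetical.
build_primeview_cdf <- function(text_cdf, out_dir = tempdir()) {
  stopifnot(file.exists(text_cdf))
  if (!requireNamespace("affxparser", quietly = TRUE) ||
      !requireNamespace("makecdfenv", quietly = TRUE)) {
    stop("affxparser and makecdfenv must be installed")
  }

  # Step 1: write a binary-format copy of the text cdf.
  binary_cdf <- file.path(out_dir, "PrimeView_binary.cdf")
  affxparser::convertCdf(text_cdf, binary_cdf, force = TRUE)

  # Step 2: build a cdf package source directory from the binary cdf.
  makecdfenv::make.cdf.package(
    filename     = basename(binary_cdf),
    cdf.path     = dirname(binary_cdf),
    packagename  = "primeviewfullcdf",
    package.path = out_dir,
    species      = "Homo_sapiens"
  )

  # Return the location of the generated package source.
  invisible(file.path(out_dir, "primeviewfullcdf"))
}
```

Calling `build_primeview_cdf("path/to/PrimeView.cdf")` would leave a package source directory in `out_dir`, which still has to be installed (for example with `R CMD INSTALL`) before the affy package can find it.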

10.8 years ago James W. MacDonald 65k
