Filtering pmSequence based on probe target level for HTA 2.0 arrays
@stephen-piccolo-6761
List members,

I am working with some Affymetrix HTA 2.0 arrays. I have installed the draft annotation package described here:
http://grokbase.com/t/r/bioconductor/1428394w2d/bioc-draft-support-for-hta-2-0-with-oligo

I am using the following commands from the oligo package to extract intensity values and PM sequences. However, I am running into a problem because oligo::pmSequence() doesn't allow me to specify a target probe type for these arrays. By default oligo::pm() uses the "core" probes, whereas oligo::pmSequence() only allows me to use the "probeset" probes. In contrast, for the ST arrays, I am able to do this.

    affyExpressionFS <- read.celfiles(celFilePath)
    pint = oligo::pm(affyExpressionFS, target="core")
    pmSeq = oligo::pmSequence(affyExpressionFS, target="core")

Below is the error message I get.

    Loading required package: pd.hta.2.0
    Loading required package: RSQLite
    Loading required package: DBI
    Platform design info loaded.
    Reading in : testInputData/HTA2.CEL.gz
    Error in { : task 1 failed - "unused argument (target = "probeset")"

Below is my session info. Any help would be appreciated.

    R version 3.1.0 (2014-04-10)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  methods   stats     graphics  grDevices utils     datasets
    [8] base

    other attached packages:
     [1] pd.hta.2.0_3.8.0    RSQLite_0.11.4      DBI_0.2-7
     [4] GEOquery_2.30.1     sva_3.10.0          mgcv_1.8-2
     [7] nlme_3.1-117        corpcor_1.6.6       foreach_1.4.2
    [10] oligo_1.28.2        Biostrings_2.32.1   XVector_0.4.0
    [13] IRanges_1.22.10     Biobase_2.24.0      oligoClasses_1.26.0
    [16] BiocGenerics_0.10.0

    loaded via a namespace (and not attached):
     [1] affxparser_1.36.0     affyio_1.32.0         BiocInstaller_1.14.2
     [4] bit_1.1-12            codetools_0.2-8       compiler_3.1.0
     [7] ff_2.2-13             GenomeInfoDb_1.0.2    GenomicRanges_1.16.4
    [10] grid_3.1.0            iterators_1.0.7       lattice_0.20-29
    [13] Matrix_1.1-4          preprocessCore_1.26.1 RCurl_1.95-4.3
    [16] splines_3.1.0         stats4_3.1.0          XML_3.98-1.1
    [19] zlibbioc_1.10.0

Regards,
-Steve

Stephen Piccolo, Ph.D.
Postdoctoral Research Associate
Affiliations:
Department of Pharmacology and Toxicology, University of Utah
Division of Computational Biomedicine, Boston University School of Medicine
Annotation probe oligo
@james-w-macdonald-5106
Hi Steve,

It looks like pmSequence() for HTAFeatureSet objects dispatches on the FeatureSet class:

    > showMethods(pmSequence, class="FeatureSet", includeDefs = TRUE)
    Function: pmSequence (package oligo)
    object="FeatureSet"
    function (object, ...)
    {
        .local <- function (object)
        {
            pmSequence(getPlatformDesign(object))
        }
        .local(object, ...)
    }

which doesn't allow for a target argument. I haven't looked closer to see why the dispatch is off. But it appears it should use the stArrayDBPDInfo class:

    > showMethods(pmSequence)
    Function: pmSequence (package oligo)
    object="AffyGenePDInfo"
    object="AffyHTAPDInfo"
        (inherited from: object="stArrayDBPDInfo")
    object="AffySNPPDInfo"
    object="DBPDInfo"
    object="ExonFeatureSet"
    object="FeatureSet"
    object="GeneFeatureSet"
    object="HTAFeatureSet"
        (inherited from: object="FeatureSet")
    object="stArrayDBPDInfo"

Which we can force by doing something like

    z <- pmSequence(getPD(dat), target="probeset")

where 'dat' is an HTAFeatureSet. But we still get more probe sequences than I would expect:

    > pmid1 <- pmindex(dat, target="core")
    > pmid2 <- pmindex(dat, target="probeset")
    > length(pmid1)
    [1] 6058440
    > length(pmid2)
    [1] 7576209

But since both pmid1 and pmid2 are ordered, I think you should be able to get the pmSequences for just the probes that will be summarized at the 'core' level by subsetting:

    > z.core <- z[pmid2 %in% pmid1,]
    > z.core
      A DNAStringSet instance of length 6056075
              width seq
          [1]    25 GATTAATCTTAAATCAGGATGATCC
          [2]    25 CAAAATCTAAACCCGGACTGTACCT
          [3]    25 CACACTATTCACACCCGCACCGAAG
          [4]    25 CCGTACCTTTCAAGGTCGGCCAAGC
          [5]    25 ACCCCTTGACTAAGGACGGTTGTTG
          ...   ... ...
    [6056071]    25 TCACCGTGTGTCGACGCCGGACACA
    [6056072]    25 AGGTTCCTGGGACCTCGTGAGTACA
    [6056073]    25 GACCCAGAGTGTAGCTCGACGACCT
    [6056074]    25 ACCACAGGTACGACACTACTAAGGA
    [6056075]    25 TGGCCTTCCGTGCATATCTGCACCT

Best,
Jim

On Wed, Aug 20, 2014 at 10:55 AM, Steve Piccolo <stephen.piccolo at hsc.utah.edu> wrote:
> [...]

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
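
A note on the counts in the reply above: length(pmid1) is 6058440, but z.core has 6056075 sequences, 2365 fewer. The thread does not say why. Two possibilities are that some 'core' indices occur more than once in pmid1, or that some of them are missing from pmid2. The sketch below shows one way to check, using small made-up vectors in base R. The names pmid_core, pmid_probeset and seq_probeset are hypothetical stand-ins for pmid1, pmid2 and z, and it assumes the sequences are in the same order as the probeset-level indices, with no duplicates among those indices.

```r
# Toy stand-ins (hypothetical values), not real HTA 2.0 data.
pmid_probeset <- c(2L, 5L, 7L, 9L, 12L, 15L)  # plays the role of pmid2
pmid_core     <- c(5L, 5L, 9L, 12L, 20L)      # plays the role of pmid1
# Plays the role of z; assumed to be in the same order as pmid_probeset.
seq_probeset  <- c("ACGT", "CCGA", "TTAG", "GGCA", "ATAT", "CGCG")

# The accounting below assumes probeset-level indices are unique.
stopifnot(!anyDuplicated(pmid_probeset))

# Subsetting as in the reply: keep sequences whose index is a core index.
keep     <- pmid_probeset %in% pmid_core
seq_core <- seq_probeset[keep]

# Account for the gap between length(pmid_core) and length(seq_core).
n_dup     <- sum(duplicated(pmid_core))                  # repeated core indices
n_missing <- sum(!unique(pmid_core) %in% pmid_probeset)  # core indices absent at probeset level
stopifnot(length(pmid_core) - length(seq_core) == n_dup + n_missing)

# Alternative: one entry per core index, in pmid_core order (NA if absent).
seq_aligned <- seq_probeset[match(pmid_core, pmid_probeset)]
```

With real data, the same two counts computed on pmid1 and pmid2 would show whether duplicates or missing indices account for the 2365. The match() form gives one sequence per element of pmid1, which is convenient if the sequences need to line up with another object ordered like pmid1. Any NA positions would have to be handled before indexing a DNAStringSet this way.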