Exon Arrays - GC & how many pm features?
Hi all,

I'm a little confused by Exon Array analysis in oligo. Specifically, I've been wondering whether I should be correcting for GC content. The plots I get from following the vignette indicate that there is significant GC bias in my data, but the vignette doesn't appear to perform a GC correction later.

However, in investigating this, I've encountered another problem: I don't understand why I get two different answers when counting the number of perfect match features associated with the "core" meta-probeset (mps). Here's an example of what I'm doing, applied to the example data; any help would be greatly appreciated:

library(oligo)
library(pd.huex.1.0.st.v2)
library(oligoData)
data(affyExonFS)

nrow(exprs(affyExonFS))
## == 6553600 total features

length(pmSequence(affyExonFS, target="core"))
## == 893078 core-associated features

##get the feature sets associated with core mps
library(AnnotationDbi)
conn <- db(affyExonFS)
sql <- "SELECT fsetid from core_mps"
fsets <- dbGetQuery(conn, sql)$fsetid
sql <- "SELECT fsetid from pmfeature"
pmseq <- dbGetQuery(conn, sql)$fsetid
sum(pmseq %in% fsets)
## == 891084 core-associated features?
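To make the question concrete, here is a toy base-R sketch of one way two such counts could diverge. This is only an assumption on my part; I have not verified it against the pd.huex.1.0.st.v2 schema or against how pmSequence() selects features. The point is that `%in%` counts each pmfeature row at most once, whereas a join against a table like core_mps returns one row per matching pair, so a probeset listed under more than one meta-probeset would be counted repeatedly. The tables, column names and IDs below are made up for illustration.

```r
# Toy stand-ins for the pmfeature and core_mps tables (made-up IDs, not real annotation)
pmfeature <- data.frame(fid = 1:5, fsetid = c(10, 10, 20, 30, 40))
core_mps  <- data.frame(meta_fsetid = c(100, 101, 102),
                        fsetid      = c(10, 20, 20))  # fsetid 20 sits in two meta-probesets

# Membership test: each feature is counted at most once
n_in <- sum(pmfeature$fsetid %in% core_mps$fsetid)

# Join: one row per (feature, meta-probeset) pair
joined <- merge(pmfeature, core_mps, by = "fsetid")
n_join <- nrow(joined)

# Features that appear more than once in the join account for the difference
n_dup <- sum(duplicated(joined$fid))

cat("membership:", n_in, " join:", n_join, " duplicated:", n_dup, "\n")
```

If something like this is going on in the real annotation, the difference between the two counts should equal the number of repeated features in the joined result.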
---------------
Thanks,
Jonathan

> sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] AnnotationDbi_1.20.7    oligoData_1.8.0         pd.huex.1.0.st.v2_3.8.0
[4] RSQLite_0.11.2          DBI_0.2-5               oligo_1.22.0
[7] Biobase_2.18.0          oligoClasses_1.20.0     BiocGenerics_0.4.0

loaded via a namespace (and not attached):
 [1] affxparser_1.30.2     affyio_1.26.0         BiocInstaller_1.8.3
 [4] Biostrings_2.26.3     bit_1.1-10            codetools_0.2-8
 [7] ff_2.2-11             foreach_1.4.0         GenomicRanges_1.10.7
[10] IRanges_1.16.6        iterators_1.0.6       parallel_2.15.3
[13] preprocessCore_1.20.0 splines_2.15.3        stats4_2.15.3
[16] tools_2.15.3          zlibbioc_1.4.0
