Jonathan Cairns
Hi all,
I'm a little confused by Exon Array analysis in oligo. Specifically,
I've been wondering if I should be correcting for GC content. The
plots I get from following the vignette indicate that there is
significant GC bias in my data, but the vignette doesn't appear to
perform a GC correction later.
However, in investigating this, I've encountered another problem: I
get two different answers when counting the number of perfect match
features associated with the "core" meta-probeset (mps), and I don't
understand why. Here's an example of what I'm doing, applied to the
example data; any help would be greatly appreciated:
library(oligo)
library(pd.huex.1.0.st.v2)
library(oligoData)
data(affyExonFS)
nrow(exprs(affyExonFS))
## == 6553600 total features
length(pmSequence(affyExonFS, target="core"))
## == 893078 core-associated features
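## Get the feature sets (fsetid) associated with the core mps
library(AnnotationDbi)
library(DBI)  # provides dbGetQuery()
conn <- db(affyExonFS)
sql <- "SELECT fsetid FROM core_mps"
core_fsets <- dbGetQuery(conn, sql)$fsetid
## Get the fsetid of every perfect match feature
sql <- "SELECT fsetid FROM pmfeature"
pm_fsets <- dbGetQuery(conn, sql)$fsetid
## Count the PM features whose feature set is in the core mps
sum(pm_fsets %in% core_fsets)
## == 891084 core-associated features?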
---------------
Thanks,
Jonathan
> sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] AnnotationDbi_1.20.7 oligoData_1.8.0 pd.huex.1.0.st.v2_3.8.0
[4] RSQLite_0.11.2 DBI_0.2-5 oligo_1.22.0
[7] Biobase_2.18.0 oligoClasses_1.20.0 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] affxparser_1.30.2 affyio_1.26.0 BiocInstaller_1.8.3
[4] Biostrings_2.26.3 bit_1.1-10 codetools_0.2-8
[7] ff_2.2-11 foreach_1.4.0 GenomicRanges_1.10.7
[10] IRanges_1.16.6 iterators_1.0.6 parallel_2.15.3
[13] preprocessCore_1.20.0 splines_2.15.3 stats4_2.15.3
[16] tools_2.15.3 zlibbioc_1.4.0