Hi,
I'm likely to be working with some Clariom D assays and I'm starting to do some initial explorations.
There are some example CEL files provided on the Thermo Fisher website and I'm using those as my test set.
I'm interested in being able to select specific sets of probes from the total set of probes on the assay. Having run the code below for a single CEL file, I'm confused as to how I can use the fid of a probe to select it from the associated expression set.
I can extract the probe fids using getProbeInfo but I'm stuck after that. I notice that featureData is set to "none" for the CEL data read in.
Is a probe's fid its index in the expression set matrix?
Thanks
Alistair
> library(oligo)
> library(affycoretools)
> d_file <- "Clariom/Clariom_D/sample_data/WTPlus_Liver_Rep1_ClariomD.CEL"
> d_data <- read.celfiles(d_file)
Platform design info loaded.
Reading in : /Clariom/Clariom_D/sample_data/WTPlus_Liver_Rep1_ClariomD.CEL
> d_data
HTAFeatureSet (storageMode: lockedEnvironment)
assayData: 6892960 features, 1 samples
  element names: exprs
protocolData
  rowNames: WTPlus_Liver_Rep1_ClariomD.CEL
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: WTPlus_Liver_Rep1_ClariomD.CEL
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.clariom.d.human
> d_probes <- getProbeInfo(d_data, field = c('fid', 'fsetid', 'type', 'x', 'y'))
> head(d_probes)
  fid         man_fsetid   fsetid  x y      type
1   6 PSR1700199794.hg.1 24403702  5 0 main->psr
2   8           24657315 24657315  7 0      <NA>
3   8 PSR1300152110.hg.1 24258776  7 0 main->psr
4   9 PSR0200224250.hg.1 23827198  8 0 main->psr
5  11           24587906 24587906 10 0      <NA>
6  11 PSR0300183028.hg.1 23858357 10 0 main->psr
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=C
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] pd.clariom.d.human_3.14.1 DBI_1.0.0
 [3] RSQLite_2.1.1             affycoretools_1.50.6
 [5] oligo_1.42.0              Biostrings_2.46.0
 [7] XVector_0.18.0            IRanges_2.12.0
 [9] S4Vectors_0.16.0          Biobase_2.38.0
[11] oligoClasses_1.40.0       BiocGenerics_0.24.0
Thanks James.
So something like the following, for the small example above, would give me what I expect:
wish_list_probe_fids <- c(6, 8, 9, 11)
eset_filtered <- exprs(d_data)[wish_list_probe_fids, , drop = FALSE]
If what you expect is to get rows 6,8,9 and 11, then yes.
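For anyone wanting to try the row selection without loading a CEL file, here is a minimal base R sketch. It assumes, as the reply above indicates, that a probe's fid is its row position in the matrix returned by exprs(). The objects toy_exprs and toy_probes are made-up stand-ins for exprs(d_data) and the getProbeInfo() output, and the intensity values are random, not real data.

```r
# Toy stand-in for exprs(d_data): one row per feature, one column per sample.
# The real matrix has 6892960 rows; 12 rows are used here for illustration only.
set.seed(1)
toy_exprs <- matrix(round(runif(12, 50, 500)), ncol = 1,
                    dimnames = list(as.character(1:12), "sample1"))

# Toy stand-in for the first six rows of the getProbeInfo() output above.
toy_probes <- data.frame(
  fid    = c(6, 8, 8, 9, 11, 11),
  fsetid = c(24403702, 24657315, 24258776, 23827198, 24587906, 23858357),
  type   = c("main->psr", NA, "main->psr", "main->psr", NA, "main->psr"),
  stringsAsFactors = FALSE
)

# fids 8 and 11 each appear twice (under different fsetid values),
# so keep each fid once. which() drops the rows where type is NA.
is_psr <- which(toy_probes$type == "main->psr")
wish_list_probe_fids <- unique(toy_probes$fid[is_psr])

# Select rows by fid; drop = FALSE keeps a matrix even with a single sample.
toy_filtered <- toy_exprs[wish_list_probe_fids, , drop = FALSE]
print(toy_filtered)
```

The same indexing pattern should carry over to the real object by replacing toy_exprs with exprs(d_data) and toy_probes with d_probes, provided the fid-as-row-index assumption holds for the platform in use.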