hello,
I have a problem with the oligo package and Affymetrix miRNA 4.0 GeneChips.
I load CEL files (Affymetrix, miRNA 4.0 GeneChips) and want to extract the raw probe-level values.
dat1 <- read.celfiles(files)
Using probeNames(dat1) and getProbeInfo(dat1) I get 346085 probes.
But the exprs slot (dat1@assayData$exprs), or the matrix returned by exprs(dat1), has only 292681 rows, with rownames from 1 to 292681. Which probes are missing? Or, more importantly, how can I annotate these 292681 rows with the correct probe names or fids (feature identifiers, I assume)?
--- SESSION INFO ---
R version 3.2.2 (2015-08-14)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Scientific Linux release 6.7 (Carbon)
locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
 [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
 [1] limma_3.24.15 pd.mirna.4.0_3.12.0 RSQLite_1.0.0
 [4] DBI_0.3.1 oligo_1.32.0 Biostrings_2.36.4
 [7] XVector_0.8.0 IRanges_2.2.9 S4Vectors_0.6.6
[10] Biobase_2.28.0 oligoClasses_1.30.0 BiocGenerics_0.14.0
[13] rj_2.0.3-1
loaded via a namespace (and not attached):
 [1] affxparser_1.40.0 GenomicRanges_1.20.8 splines_3.2.2
 [4] zlibbioc_1.14.0 bit_1.1-12 rj.gd_1.1.3-1
 [7] foreach_1.4.3 GenomeInfoDb_1.4.3 tools_3.2.2
[10] ff_2.2-13 iterators_1.0.8 preprocessCore_1.30.0
[13] affyio_1.36.0 codetools_0.2-14 BiocInstaller_1.18.5
When making a comment, please use the 'ADD COMMENT' link rather than 'Add your answer'.
I think you probably want to query the underlying SQLite database directly.
And if you want to know which are controls or main probes
Thank you again James,
It is good to know how to access the db directly, but I already have this information. I only need the IDs for the raw exprs matrix in the ExpressionFeatureSet. Perhaps it is so easy that I am simply missing something...
See below. I cannot assign the probe info (n=346085) to the rows of the exprs matrix (n=292681), which only has an index from 1 to 292681.
Unfortunately, in dat1 (ExpressionFeatureSet) there is no annotation for these 292681 rows of the exprs matrix.
I suppose it is obvious. The arrays are read by the scanner, by row, and when you read in the celfile it's in the same order. That is the fid.
The count is zero based, so the first cell read in is at (0,0). The first cell that is used for anything is in the first row (0 on the y axis) and the sixth column (5 on the x axis). So that probe is in the sixth row of the matrix you get from doing exprs(dat1). So your pINFO data.frame tells you the row index (fid), the probeset that particular probe goes into, and what type of probeset it is. That should be sufficient, no?
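To make the fid-as-row-index idea concrete, here is a minimal base R sketch on a toy 4 x 4 "array". Everything in it is a stand-in: raw plays the role of exprs(dat1), p_info plays the role of the probe info table, and xy_to_fid is a hypothetical helper, not an oligo function. The 541-column figure is an assumption based on 541 x 541 = 292681 matching the number of rows reported above. The duplicated fid shows one possible reason a probe table can be longer than the matrix (a cell listed under more than one probeset); that has not been checked against pd.mirna.4.0.

```r
# Toy illustration only: a 4 x 4 grid instead of the real chip.
n_col <- 4L
n_row <- 4L

# Hypothetical helper: fid for a zero-based scanner coordinate (x, y),
# counting cells row by row and starting the fid at 1.
xy_to_fid <- function(x, y, n_col) y * n_col + x + 1L

# Assuming a 541-column chip, the cell at x = 5, y = 0 gets fid 6,
# i.e. the sixth row of the expression matrix.
first_used <- xy_to_fid(5L, 0L, 541L)

# Stand-in for exprs(dat1): one row per cell, rownames "1".."16", two samples.
raw <- matrix(seq_len(n_row * n_col * 2L), ncol = 2L,
              dimnames = list(as.character(seq_len(n_row * n_col)),
                              c("a.CEL", "b.CEL")))

# Stand-in for the probe info table (column names are assumptions).
# fid 6 is listed twice, under two probesets; cells that belong to no
# probeset do not appear at all.
p_info <- data.frame(fid = c(6L, 6L, 7L, 12L),
                     man_fsetid = c("setA", "setB", "setA", "ctrl1"),
                     stringsAsFactors = FALSE)

# Because fid is the row index, indexing the matrix by fid lines the
# intensities up with the probe info, one row per probe-table entry.
vals <- raw[p_info$fid, , drop = FALSE]
rownames(vals) <- NULL   # drop the repeated rownames before binding
annotated <- cbind(p_info, vals)

# Rows of the matrix that no probeset refers to.
unused <- setdiff(seq_len(nrow(raw)), p_info$fid)
```

With real data the same indexing would read exprs(dat1)[pINFO$fid, ], assuming pINFO has a fid column as described above.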
As a side note, there is no profit in accessing slots directly (e.g., dat1@assayData$exprs). The exprs function does the expected thing, and if Benilton changes the underlying structure, the exprs function will continue to do the expected thing, but direct queries may not.