Hi,
I'm analyzing >30 Affymetrix miRNA 4.0 microarrays. As the corresponding miRNA 4.0 CDF is not available from BioC, I downloaded it from the Affymetrix website. I then created a CDF environment using make.cdf.env (makecdfenv package) and read in the data as:
rawData <- ReadAffy( )
rawData@cdfName <- 'mirna40'
The returned AffyBatch object seems perfectly fine to me. It has meaningful row.names and runs smoothly through rma (affy).
However, I would like to perform extensive QC on these data before proceeding with differential expression analysis. To this end, I understand that the QCReport (affyQCReport package) and/or the qc (simpleaffy package) functions are valuable options. Unfortunately, a call to either function currently raises the error below:
QCReport( rawData, file = 'QC.pdf' )
Error in ans[[i]][, i.probes] : subscript out of bounds
qc( rawData )
Error in ans[[i]][, i.probes] : subscript out of bounds
Debugging suggests that the error is generated by the signalDist function, but I was unable to go further.
I would really appreciate your help on this. Thanks a lot in advance.
Federico
sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] hgu95av2cdf_2.15.0   affydata_1.13.1      affyQCReport_1.44.0
 [4] lattice_0.20-29      BiocInstaller_1.16.1 makecdfenv_1.42.0
 [7] affyio_1.34.0        simpleaffy_2.42.0    gcrma_2.38.0
[10] genefilter_1.48.1    affy_1.44.0          Biobase_2.26.0
[13] BiocGenerics_0.12.1

loaded via a namespace (and not attached):
 [1] affyPLM_1.42.0        annotate_1.44.0       AnnotationDbi_1.28.1
 [4] Biostrings_2.34.1     DBI_0.3.1             GenomeInfoDb_1.2.4
 [7] grid_3.1.2            IRanges_2.0.1         preprocessCore_1.28.0
[10] RColorBrewer_1.1-2    RSQLite_1.0.0         S4Vectors_0.4.0
[13] splines_3.1.2         stats4_3.1.2          survival_2.37-7
[16] tools_3.1.2           XML_3.98-1.1          xtable_1.7-4
[19] XVector_0.6.0         zlibbioc_1.12.0

I have never used the affy package and the (unsupported) CDF file for these arrays. I use oligo instead, which is much better suited.
> dat1 <- read.celfiles(filenames = samps$File[1:6])
Loading required package: pd.mirna.4.0
Loading required package: RSQLite
Loading required package: DBI
Platform design info loaded.
Reading in : ../CEL/A12258.CEL
Reading in : ../CEL/A10033.CEL
Reading in : ../CEL/Z08140.CEL
Reading in : ../CEL/Z08062.CEL
Reading in : ../CEL/A12263.CEL
Reading in : ../CEL/A10016.CEL
> boxplot(dat1)
Warning message:
'isIdCurrent' is deprecated.
Use 'dbIsValid' instead.
See help("Deprecated")
## the above warnings have to do with changes to the RSQLite package, and will not affect the analysis, and will go away in the next release
> dat1
ExpressionFeatureSet (storageMode: lockedEnvironment)
assayData: 292681 features, 6 samples
  element names: exprs
protocolData
  rowNames: A12258.CEL A10033.CEL ... A10016.CEL (6 total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: A12258.CEL A10033.CEL ... A10016.CEL (6 total)
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.mirna.4.0
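For the QC step asked about, something along these lines may serve as a starting point with oligo. It is only a rough sketch, not a replacement for QCReport or qc: the function name oligo_qc_sketch, its arguments, and the output file name are made up for illustration. It assumes oligo and pd.mirna.4.0 are installed, and that fitProbeLevelModel accepts the ExpressionFeatureSet from these arrays, which I have not checked here.

```r
# Sketch only: oligo_qc_sketch, cel_dir and pdf_file are illustrative names,
# not something from the original posts. Assumes oligo and pd.mirna.4.0 are
# installed and that cel_dir contains the miRNA 4.0 CEL files.
oligo_qc_sketch <- function(cel_dir, pdf_file = "QC_oligo.pdf") {
  # Attach oligo so its methods for boxplot(), hist(), RLE() and NUSE()
  # are found when called on oligo objects.
  suppressPackageStartupMessages(library(oligo))

  # Collect the CEL files and read them into an ExpressionFeatureSet.
  cel_files <- list.celfiles(cel_dir, full.names = TRUE)
  raw <- read.celfiles(filenames = cel_files)

  # Send all plots to a single PDF and close the device on exit.
  pdf(pdf_file)
  on.exit(dev.off(), add = TRUE)

  # Raw intensity distributions per array.
  boxplot(raw)
  hist(raw)

  # Probe-level model diagnostics (assumed to work for this array type).
  plm <- fitProbeLevelModel(raw)
  RLE(plm)
  NUSE(plm)

  # Return RMA-summarized expression values for downstream analysis.
  rma(raw)
}
```

A call would then look like eset <- oligo_qc_sketch("../CEL"), with the directory adjusted to wherever the CEL files live.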