Question: QC for Affymetrix miRNA 4.0 arrays: Error from qc/QCReport
4.8 years ago by
Switzerland
federico.comoglio wrote:

Hi,

I'm analyzing >30 Affymetrix miRNA 4.0 microarrays. As the corresponding miRNA 4.0 CDF is not available from BioC, I downloaded it from the Affymetrix website. I then created a CDF environment using make.cdf.env (makecdfenv package) and read in the data as:

rawData <- ReadAffy( )
rawData@cdfName <- 'mirna40'

The returned AffyBatch object seems perfectly fine to me. It has meaningful row.names and runs smoothly through rma (affy).

However, I would like to perform extensive QC on these data before proceeding with differential expression analysis. To this end, I understand that the QCReport (affyQCReport package) and/or qc (simpleaffy package) functions are valuable options. Unfortunately, a call to either function currently raises the error below:

QCReport( rawData, file = 'QC.pdf' )

Error in ans[[i]][, i.probes] : subscript out of bounds

qc( rawData )
Error in ans[[i]][, i.probes] : subscript out of bounds

Debugging suggests that the error is generated by the signalDist function, but I was unable to go further.

I would really appreciate your help on this. Thanks a lot in advance.

Federico

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] hgu95av2cdf_2.15.0   affydata_1.13.1      affyQCReport_1.44.0
[4] lattice_0.20-29      BiocInstaller_1.16.1 makecdfenv_1.42.0
[7] affyio_1.34.0        simpleaffy_2.42.0    gcrma_2.38.0
[10] genefilter_1.48.1    affy_1.44.0          Biobase_2.26.0
[13] BiocGenerics_0.12.1

loaded via a namespace (and not attached):
[1] affyPLM_1.42.0        annotate_1.44.0       AnnotationDbi_1.28.1
[4] Biostrings_2.34.1     DBI_0.3.1             GenomeInfoDb_1.2.4
[7] grid_3.1.2            IRanges_2.0.1         preprocessCore_1.28.0
[10] RColorBrewer_1.1-2    RSQLite_1.0.0         S4Vectors_0.4.0
[13] splines_3.1.2         stats4_3.1.2          survival_2.37-7
[16] tools_3.1.2           XML_3.98-1.1          xtable_1.7-4
[19] XVector_0.6.0         zlibbioc_1.12.0
Answer: QC for Affymetrix miRNA 4.0 arrays: Error from qc/QCReport
4.8 years ago by
United States
James W. MacDonald wrote:

Both the simpleaffy and affyQCReport packages were designed with the original 3'-biased arrays in mind. The miRNA arrays don't have the same content, so both of these packages will tend to fail because the miRNA arrays do not fulfill the expectations that particular probesets will exist on the array.

In addition, the miRNA arrays are difficult to QC because in general most of the transcripts are either expressed at relatively low concentrations or not at all. And there is content on the array for any number of different species (and Affy may or may not re-use the same probes for different species, depending on conservation).

Add in the fact that miRNA transcripts are usually 21-23 nt long, and the Affy probes are 25 nt long (so each probe is usually longer than the transcript being measured, and the probeset is made up of the same probe, just distributed across the array), and things like the affyRNADeg() plot no longer make sense.

Long story short, you are pretty much on your own with these arrays.
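
That said, basic array-level checks that do not depend on any particular probesets only need the matrix of raw intensities. The following is a minimal sketch under stated assumptions, not an established procedure for these arrays: `intens` is a simulated matrix standing in for something like `exprs(rawData)`, and the helper name `flag_outlier_arrays` and the cutoff of 3 MADs are arbitrary choices made for illustration.

```r
# Simulated raw intensities: 1000 probes x 6 arrays (stand-in for real data)
set.seed(1)
intens <- matrix(2^rnorm(6000, mean = 7, sd = 1), nrow = 1000,
                 dimnames = list(NULL, paste0("array", 1:6)))
intens[, 6] <- intens[, 6] * 8   # make the last array artificially bright

# Flag arrays whose median log2 intensity lies more than k MADs away from
# the median across arrays (hypothetical helper, arbitrary cutoff)
flag_outlier_arrays <- function(x, k = 3) {
  med <- apply(log2(x), 2, median)
  dev <- abs(med - median(med)) / mad(med)
  dev > k
}

flagged <- flag_outlier_arrays(intens)
names(flagged)[flagged]          # names of the arrays that were flagged

# Per-array distribution of log2 raw intensities
boxplot(as.data.frame(log2(intens)), las = 2, ylab = "log2 raw intensity")
```

With only a handful of arrays the MAD is a rough scale estimate, so a flag from this kind of check is a prompt to look at the array more closely rather than a reason to discard it.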

Answer: QC for Affymetrix miRNA 4.0 arrays: Error from qc/QCReport
4.8 years ago by
Switzerland
federico.comoglio wrote:

Hi Jim,

thank you for your insightful answer. I agree that QC steps such as RNA degradation plots do not make sense for these arrays. However, spike-in controls should still be meaningful. In addition, even a simple boxplot of the raw intensity values fails in a call to

boxplot( rawData )

raising the same error as above.

I have never used the affy package or the (unsupported) CDF file for these arrays; I use oligo instead, which is much better suited.

> dat1 <- read.celfiles(filenames = samps$File[1:6])
> boxplot(dat1)
Warning message:
'isIdCurrent' is deprecated.
See help("Deprecated")

## the above warning has to do with changes to the RSQLite package; it will not affect the analysis and will go away in the next release

> dat1
ExpressionFeatureSet (storageMode: lockedEnvironment)
assayData: 292681 features, 6 samples
element names: exprs
protocolData
rowNames: A12258.CEL A10033.CEL ... A10016.CEL (6 total)
varLabels: exprs dates
phenoData
rowNames: A12258.CEL A10033.CEL ... A10016.CEL (6 total)
varLabels: index
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.mirna.4.0
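
Continuing from the `dat1` object above, a few further checks could look like the sketch below. It assumes that the oligo and pd.mirna.4.0 packages are installed and that `dat1` was read in as shown; the object names `eset` and `d` are illustrative only, and the choice of plots is just one possibility.

```r
library(oligo)

# dat1 is the ExpressionFeatureSet created by read.celfiles() above
hist(dat1)                      # density of log2 raw intensities, one curve per array

eset <- rma(dat1)               # background-correct, quantile-normalize, summarize
boxplot(exprs(eset), las = 2)   # per-array distribution after normalization

# Sample-to-sample clustering on the normalized values
d <- dist(t(exprs(eset)))
plot(hclust(d))
```

Because the array carries probesets for many species, it may also be worth restricting the normalized matrix to the species of interest before clustering; how best to do that depends on the annotation being used.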
