Inappropriate Statistical Models for Agilent microRNA Microarray Data
AgiMicroRna has two modes of probe summarisation:

1. Total Gene Signal (TGS), as also done by Agilent's software.
2. Robust Multi-array Average (RMA).

The microRNA probes on Agilent arrays are unusual in that the probes for a given microRNA share an almost identical design. For example, the four probes annotated to hsa-miR-300 are:

A_25_P00012884: AGAGAGAGTCTGC
A_25_P00012885: AGAGAGAGTCTGCC
A_25_P00012886: AGAGAGAGTCTGCCC
A_25_P00012887: AGAGAGAGTCTGCCCT

Each probe has the same sequence as the next-shorter probe, extended by one nucleotide. According to the product documentation for these arrays, the reason is that some probes hybridise to multiple microRNAs, and the shorter versions have greater specificity for the particular microRNA of interest.
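The nesting can be checked directly from the four hsa-miR-300 sequences listed above. This is only an illustrative sketch in Python; the helper name is_nested_series is made up for this post and is not part of AgiMicroRna or any Agilent software.

```python
# Probe sequences copied from the hsa-miR-300 list above.
probes = {
    "A_25_P00012884": "AGAGAGAGTCTGC",
    "A_25_P00012885": "AGAGAGAGTCTGCC",
    "A_25_P00012886": "AGAGAGAGTCTGCCC",
    "A_25_P00012887": "AGAGAGAGTCTGCCCT",
}


def is_nested_series(seqs):
    """Hypothetical helper: True if, ordered by length, every sequence
    equals the next-longer sequence with its last nucleotide removed."""
    ordered = sorted(seqs, key=len)
    return all(
        longer[:-1] == shorter
        for shorter, longer in zip(ordered, ordered[1:])
    )


print(is_nested_series(probes.values()))
```

For these four probes the check prints True, so the probes form a single nested series rather than independent measurements of the same target.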

Both the TGS and RMA algorithms seem inappropriate in this context. Some kind of mixture model approach, which partitions the signal of each probe into miR-specific and non-specific components and then sums the specific components, would be desirable.
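To make the final summing step concrete, here is a toy sketch in Python. It assumes the hard part, estimating what fraction of each probe's signal is miR-specific, has already been done by some model. The signal values and fractions are invented purely for illustration, and specific_signal_sum is a hypothetical name, not a function from any existing package.

```python
def specific_signal_sum(signals, specific_fractions):
    """Sum the miR-specific part of each probe's signal.

    signals            -- total signal per probe (made-up values below)
    specific_fractions -- assumed fraction of each signal, in [0, 1],
                          attributed to the microRNA of interest
    """
    if len(signals) != len(specific_fractions):
        raise ValueError("one fraction is needed per probe")
    if any(not 0.0 <= f <= 1.0 for f in specific_fractions):
        raise ValueError("fractions must lie between 0 and 1")
    return sum(s * f for s, f in zip(signals, specific_fractions))


# Invented numbers: shortest probe first, with longer probes assumed
# to pick up progressively more non-specific signal.
signals = [100.0, 120.0, 150.0, 200.0]
fractions = [1.0, 0.75, 0.5, 0.25]

print(sum(signals))                             # plain sum of all probe signals
print(specific_signal_sum(signals, fractions))  # sum of specific components only
```

With these invented inputs the plain sum is 570.0 and the specific-only sum is 315.0. The gap between the two is the non-specific signal that a summary ignoring the nested probe design would attribute to the microRNA.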

