Question: Affymetrix miRNA 4.0 Normalization and Analysis
3.1 years ago by
Belgium

Hello!

I have to analyze data obtained from an Affymetrix miRNA 4.0 microarray. It is the first time I have analyzed this kind of data, and I want to be sure that I am doing it right. My main concerns are the normalization of the probe intensities and the filtering in limma.

# Normalization
library(oligo)
library(pd.mirna.4.0)
celFiles <- list.celfiles("~/Desktop/Affymetrix_miRNA/RawData_miRNA", full.names=TRUE)
rawData <- read.celfiles(celFiles)  # read the CEL files into a feature set
eset <- rma(rawData)                # background correct, quantile normalize, summarize

# I made some plots and everything looks really good after normalization.

# Limma
# I use the eset data directly to compute the differentially expressed miRNAs.

Questions:

1. Normalization: Should I do something else for the normalization, or is applying rma usually enough? Should I do something with the information from the spike-in probes?

2. Filtering: I have seen that some of the probes on this microarray are irrelevant for my analysis (snoRNA, spike-in, ...). Should I remove such probes before computing the statistics in limma?

3. More generally, I found many tutorials about normalizing gene expression microarrays but none for miRNA arrays. Are there differences in the processing of these two types of microarrays that I should know about?

4. From what I have seen, some miRNAs are represented by several probes on the chip. They seem to be grouped during the normalization step and the creation of the expression dataset. Can someone explain how this is done, or point me to an article or post that explains it?

mirna microarray limma R
modified 3.1 years ago by James W. MacDonald • written 3.1 years ago by Radek
Answer: Affymetrix miRNA 4.0 Normalization and Analysis
3.1 years ago by
United States
James W. MacDonald wrote:
1. No, that's about it. You could hypothetically use the spike-ins for something, but I have yet to find a reasonable use for them.
2. Probably. I generally remove all the extra cruft Affy puts on the arrays before fitting the model.
3. Well, the miRNA arrays are sort of different from most other Affy arrays. First, most miRNA transcripts are shorter than the Affy probes (21-23 nt vs 25 nt), so Affy tiles down some number (four, I believe) of the same exact probe on different regions of the array. You then quantile normalize, which is reasonable, but then you fit a medianpolish model, which is probably not necessary - you could probably just take the average of the probes and go from there. But I don't think the medianpolish is going to do anything wrong per se, and it's easier to do than trying to figure out how to compute the averages.
4. Yeah, especially with the miRNA 4.0 array. If a given miRNA is conserved across various species, instead of putting down individual probes for each species, Affy just puts down one probeset and re-labels it a bunch of times. It's like more efficient and stuff. Or are you talking about the fact that multiple probes go into a single probeset? In that case, then yeah, multiple probes are summarized to make a single probeset (and hence a single measure of expression for the thing being measured). If you google 'Irizarry affymetrix medianpolish' you will get a bunch of links that will explain things further.
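
To make the summarization step in points 3 and 4 concrete, here is a minimal sketch in base R that compares a median polish summary with a plain average for one probeset. The intensities are made up for illustration, and this uses `medpolish` from the stats package rather than the internal code that `rma` runs, so treat it as an approximation of the idea and not as the oligo implementation.

```r
# Toy log2 intensities for one probeset: 4 replicate probes (rows)
# measured on 3 arrays (columns). The numbers are invented.
probes <- matrix(
  c(8.1, 9.0, 7.9,
    8.3, 9.2, 8.0,
    7.9, 8.8, 7.7,
    8.1, 9.0, 8.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probe", 1:4), paste0("array", 1:3))
)

# Median polish decomposes each value into
# overall + probe (row) effect + array (column) effect + residual.
fit <- medpolish(probes, trace.iter = FALSE)

# RMA-style summary: overall + array effect gives one value per array.
mp_summary <- fit$overall + fit$col

# The simpler alternative: average the replicate probes per array.
mean_summary <- colMeans(probes)

# Show the two summaries side by side.
round(rbind(medianpolish = mp_summary, mean = mean_summary), 2)
```

When the replicate probes agree closely, as they do here, the two summaries should be nearly the same, which fits the point that the median polish is probably not doing any harm.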

Thanks a lot!

For the last question, I was asking about multiple probes being combined into a single probeset, so you answered my question.

I have another quick question related to the previous one.

I see that when a particular miRNA is differentially expressed, it usually shows up for most of the species represented on the array, but not all. I suppose this is due to the conservation of those miRNAs across species, but I was wondering whether people generally restrict their differential expression analysis to the species they are interested in. For example, if I am working with human samples, is prefiltering the data to keep only the human probes better than using all of the information?

Thanks!

That sort of depends on what you are after. In general I restrict to the species under consideration, but you could make the argument that some of the other miRNA transcripts (non-human) for which there isn't a human transcript on the array are actually expressed in humans, and we just don't know about it yet. In that case, a differentially expressed non-human miRNA may indicate that the miRNA is expressed in humans, and differentially so in your experiment.

As usual, there are trade-offs, and you have to decide what trade-offs you want to make.
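
If you do decide to restrict to one species and drop the non-miRNA content, the filtering can be sketched as below. This is only an illustration on toy data: the `annot` table, its column names (`probeset`, `type`, `species`) and the category labels are hypothetical, so they need to be matched to whatever the real annotation file for the array provides.

```r
set.seed(1)

# Toy expression matrix standing in for exprs(eset); values are random.
expr <- matrix(
  rnorm(5 * 4), nrow = 5,
  dimnames = list(paste0("ps", 1:5), paste0("sample", 1:4))
)

# Hypothetical annotation table. Column names and labels are assumptions,
# not the actual layout of the Affymetrix annotation file.
annot <- data.frame(
  probeset = rownames(expr),
  type     = c("miRNA", "miRNA", "snoRNA", "spike-in", "miRNA"),
  species  = c("Homo sapiens", "Mus musculus", "Homo sapiens", NA, "Homo sapiens"),
  stringsAsFactors = FALSE
)

# Keep only miRNA probesets annotated to the species of interest.
keep <- annot$type == "miRNA" &
  !is.na(annot$species) &
  annot$species == "Homo sapiens"

# Subset by probeset ID so the matrix and the annotation stay in sync.
expr_filtered <- expr[annot$probeset[keep], , drop = FALSE]

rownames(expr_filtered)
```

The same logical vector can be used to subset the ExpressionSet itself, for example `eset[keep, ]`, provided the annotation rows are in the same order as the probesets, and the filtered object is then what goes into `lmFit` in limma.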

Thanks, this is what I thought!