Question: Array Normalization on focussed array in Limma using R
reubenmcgregor88 wrote, 10 months ago:

I have been analysing protein array data with hundreds and thousands of proteins using Limma in R.

For normalisation I have been using the following:

y <- normalizeBetweenArrays(log2(exprs), method="quantile")

I follow this with box plots and density plots for QC, and then fit models for differential expression analysis in Limma.

However, we then chose the 35 most promising proteins and had a "focussed" array synthesised. These were the 35 proteins that were highest in patients vs controls, and we ran them on many more patients and their controls. When we got the data back, I thought about the analysis: normalising between arrays may be fine when there are many random proteins to bring between-array intensities to similar levels.

However, it seems to me (I am relatively new to array analysis, so I may be wrong) that if we have specifically chosen proteins based on low expression in some samples and high expression in others, this normalisation would not be valid, as it assumes that genes are expected to have low variation. Is this correct?

If so what kind of normalisation is more appropriate for this type of analysis?

Any guidance much appreciated.

EDIT: I have been toying with the idea of using:

y <- normalizeBetweenArrays(log2(exprs), method="cyclicloess")

Which may be more appropriate?

EDIT2: The array was a Protoarray and the analysis has actually already been done by someone from the service provider. However, I managed to repeat their analysis, getting the same values in Limma with the quantile normalisation mentioned above. The issue is that I am questioning whether they simply ran a standard analysis pipeline for large arrays without putting much thought into the different design of this array.

Note: all values are Log2 transformations of the fluorescence data

Boxplots of data pre normalisation: [image not included]

Boxplots of data post quantile normalisation: [image not included]

Boxplots of data post cyclicloess normalisation: [image not included]

Answer: Array Normalization on focussed array in Limma using R
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 10 months ago:

You are right to recognise that there is a problem here, because the focussed array design is entirely confounded with DE for patients vs controls.

When we have made focused arrays in the past, we included control probes in order to permit normalization, see

https://genomebiology.biomedcentral.com/articles/10.1186/gb-2007-8-1-r2

Without control probes, there is frankly not much you can do. Switching to cyclic loess normalization will be slightly better because it is more tolerant of DE all in one direction than quantile is. Other than that, you just have to proceed and recognize that the patient vs control log fold changes will be under-estimated (less positive or more negative) because the changes will be partly normalized out.
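To make the under-estimation concrete, here is a minimal simulation sketch. It is not the poster's data: the array size, group sizes, effect sizes and noise level are all invented, and `quantile_normalize` is a simplified base-R stand-in for limma's `method="quantile"` so that the example runs without extra packages. It builds a 35-protein array in which every protein is truly higher in patients, then compares patient vs control log fold changes before and after normalisation.

```r
set.seed(42)

# Simplified quantile normalisation in base R (a stand-in for limma's
# method = "quantile"; assumes a numeric matrix with no missing values).
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  sorted_means <- rowMeans(apply(x, 2, sort))
  apply(ranks, 2, function(r) sorted_means[r])
}

# Hypothetical focused array: 35 proteins, 10 controls and 10 patients,
# with every protein truly higher in patients (all values are made up).
n_prot <- 35
group <- rep(c("control", "patient"), each = 10)
baseline <- runif(n_prot, min = 6, max = 14)   # log2 baseline intensities
true_lfc <- runif(n_prot, min = 0.5, max = 2)  # true log2 fold changes

y <- matrix(rnorm(n_prot * length(group), mean = baseline, sd = 0.2),
            nrow = n_prot)
is_patient <- group == "patient"
y[, is_patient] <- y[, is_patient] + true_lfc  # add the DE signal row by row

# Per-protein patient minus control difference of mean log2 intensities.
log_fc <- function(x) rowMeans(x[, is_patient]) - rowMeans(x[, !is_patient])

lfc_raw <- log_fc(y)
lfc_norm <- log_fc(quantile_normalize(y))

print(round(c(true = mean(true_lfc),
              raw = mean(lfc_raw),
              normalized = mean(lfc_norm)), 3))
```

In this setup the average log fold change after quantile normalisation is forced to zero, because every normalised array contains exactly the same set of values. Individual proteins therefore end up less positive than the truth, and some turn negative. Real data also carry array-to-array technical effects, which this sketch deliberately leaves out.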

The boxplots that you've done don't really help. They don't allow you to see the problem. An MA or MD plot comparing patient samples to control samples might show an increasing trend, which would be symptomatic of the problem.
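A rough sketch of such a plot in base R follows. The matrix `y`, the `group` factor and the helper `md_values` are hypothetical names, and the data are simulated placeholders; with real data, `y` would be the log2 intensity matrix and `group` the sample labels. The plot shows the patient minus control difference (M) against the average log2 intensity (A) for each protein, with a lowess curve to make any trend easier to see.

```r
set.seed(1)

# Simulated stand-in for the real data: 35 proteins x 12 samples, log2 scale.
n_prot <- 35
group <- factor(rep(c("control", "patient"), each = 6))
baseline <- runif(n_prot, min = 6, max = 14)
y <- matrix(rnorm(n_prot * length(group), mean = baseline, sd = 0.3),
            nrow = n_prot,
            dimnames = list(paste0("protein", seq_len(n_prot)), NULL))

# Hypothetical helper: per-protein mean difference (M) and average (A)
# between the two groups of samples.
md_values <- function(y, group, case = "patient", ref = "control") {
  case_mean <- rowMeans(y[, group == case, drop = FALSE])
  ref_mean <- rowMeans(y[, group == ref, drop = FALSE])
  data.frame(A = (case_mean + ref_mean) / 2, M = case_mean - ref_mean)
}

md <- md_values(y, group)

plot(md$A, md$M,
     xlab = "A: average log2 intensity",
     ylab = "M: patient - control (log2)",
     main = "Patients vs controls")
abline(h = 0, lty = 2)                        # reference line at no difference
lines(lowess(md$A, md$M), col = "red")        # smoothed trend through the points
```

Running this on the data before and after each normalisation method makes it easier to see whether the trend changes, which the per-array boxplots cannot show.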


Thank you Gordon,

Yes I had read that paper before posting and had an inkling that in hindsight control proteins would have been a good idea.

(reply written 10 months ago by reubenmcgregor88)