Question

Analysing two-channel Agilent microarray data

joanna.goscik • 0

Hi All,

I've come across an ambiguity concerning reading and analysing two-channel Agilent microarray data.

I read the data in with read.maimages, using the default columns (gMedianSignal, rMedianSignal, gBGMedianSignal, rBGMedianSignal) except for the annotation columns.

The analysis gave me a list of genes, but for some of them I see a discrepancy: in the raw data files these genes have LogRatio = 0 and a corresponding p-value of 1, yet my limma analysis lists them as significantly differentially expressed.

I really have no idea where the problem lies. I assume that Agilent's Feature Extraction software uses different columns to calculate LogRatio. Should I read the data in a different way (e.g. using gProcessedSignal etc.)?

I would really appreciate your help,

Joanna

limma microarray agilent • 2.1k views

updated 9.5 years ago by Gordon Smyth 50k • written 9.5 years ago by joanna.goscik • 0

score 2 · Accepted Answer · 2014-11-12

There is no problem. limma has processed your data correctly.

limma reads and uses the raw intensity data, rather than using Feature Extraction's summary measures that are also included in the data file.

The p-value column in the Feature Extraction file has to do with calling a probe as expressed relative to background. The limma authors do not believe that this is the best approach to background correction. In any case, this p-value has nothing to do with the p-values returned by a limma differential expression analysis. It is for a completely different purpose.

limma computes log-ratios (M-values) slightly differently from Feature Extraction. limma does not throw out information unnecessarily by setting some of the log-ratios to zero. Setting log-ratios artificially to zero, when the actual measurements were not equal in the two channels, would interfere with the differential expression procedures. Feature Extraction only does this because it assumes you are going to interpret one array in isolation; it doesn't know that you are going to do a proper differential expression analysis with biological replicates.
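To make the point about zeroed log-ratios concrete, here is a small illustrative sketch in Python (not limma code). The intensities are made up, the rule for which arrays get zeroed is arbitrary and is not Feature Extraction's actual rule, and an ordinary one-sample t-statistic stands in for limma's moderated statistics. It only shows how zeroing per-array log-ratios changes the evidence across replicates.

```python
import math
from statistics import mean, stdev

def m_value(red, green):
    """Log-ratio M = log2(red) - log2(green) for one spot."""
    return math.log2(red) - math.log2(green)

def one_sample_t(values):
    """Ordinary one-sample t-statistic against a mean of zero."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))

# Hypothetical (red, green) intensities for one probe on four replicate arrays.
replicates = [(640.0, 400.0), (600.0, 400.0), (660.0, 400.0), (620.0, 400.0)]

# Log-ratios computed directly from the measured intensities.
m_raw = [m_value(r, g) for r, g in replicates]

# Mimic a per-array rule that sets some log-ratios to zero
# (arbitrarily arrays 2 and 4 here; an assumption for illustration only).
m_zeroed = [m if i in (0, 2) else 0.0 for i, m in enumerate(m_raw)]

print("raw M-values:   ", [round(m, 3) for m in m_raw])
print("zeroed M-values:", [round(m, 3) for m in m_zeroed])
print("t (raw):   ", round(one_sample_t(m_raw), 2))
print("t (zeroed):", round(one_sample_t(m_zeroed), 2))
```

With the unmodified log-ratios the four replicates agree closely, so the t-statistic is large. Zeroing two of them halves the mean and inflates the variance, so the same probe looks far less convincing. The reverse also holds: a probe that a single array reports as LogRatio = 0 can still be consistently changed across replicates.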

The limma approach to background correction and log-ratios is explained in this publication: http://bioinformatics.oxfordjournals.org/content/23/20/2700
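As a rough illustration of why the choice of background correction matters for log-ratios, the Python sketch below compares plain background subtraction with subtraction plus a positive offset. This is a simplification: it is not the normexp method discussed in the paper, the intensities are hypothetical, and the offset of 50 is only an example value.

```python
import math

def log_ratio_subtract(r, rb, g, gb):
    """M-value after plain background subtraction; None if a channel is <= 0."""
    r_corr, g_corr = r - rb, g - gb
    if r_corr <= 0 or g_corr <= 0:
        return None  # log-ratio is undefined for non-positive intensities
    return math.log2(r_corr / g_corr)

def log_ratio_offset(r, rb, g, gb, offset=50.0):
    """M-value after subtraction with a positive offset added to both channels."""
    r_corr, g_corr = r - rb + offset, g - gb + offset
    if r_corr <= 0 or g_corr <= 0:
        return None
    return math.log2(r_corr / g_corr)

# Hypothetical low-intensity spots: (red, red background, green, green background).
spots = [(62.0, 60.0, 68.0, 60.0), (58.0, 60.0, 68.0, 60.0)]

for r, rb, g, gb in spots:
    print("subtract:", log_ratio_subtract(r, rb, g, gb),
          "| offset:", log_ratio_offset(r, rb, g, gb))
```

For spots near background, plain subtraction gives either an extreme log-ratio or none at all, while the offset version keeps every spot and pulls the log-ratio towards zero. Summary values in the data file and values computed from the raw intensities can therefore differ without either being a reading error.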

Please do not change the way the data is read.