Question

limma normalization and arrayWeights fold change

merella.stefania • 0

Dear all,

I am using the limma package to analyze Illumina iScan microarray data. I am following the manual, but I have some questions regarding the normalization method and the arrayWeights function. My design is quite simple: the samples come from human patients, and I have 3 experimental groups plus a normal group. Each group has at least 3 replicates. The data were exported without background correction and without normalization.

I read the data with the read.ilmn function and normalize with the neqc function. I do not have the control probe file but, as explained in the manual page, this should not be a problem because the negative control probes are inferred from the detection p-values.

fileNameProbe = "SampleProbe_noNorm_noSubBack_30-10-12.txt"
x <- read.ilmn(files=fileNameProbe,other.columns = "Detection")
y <- neqc(x)

Is this correct or is it better to use normalizeWithinArrays?

As suggested in the manual for human samples and for samples of varying array quality, I am using the arrayWeights function. As explained in this post:

Limma, arrayWeights and fold change

the arrayWeights function affects the logFC values. So my question is: how should the array-weighted logFC be interpreted? In what way is it different from the standard logFC?

Any suggestions are really appreciated.

Thanks, Stefania

sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] gplots_3.0.1 limma_3.26.9

loaded via a namespace (and not attached):
[1] tools_3.2.4        KernSmooth_2.23-15 gdata_2.17.0       caTools_1.17.1     bitops_1.0-6       gtools_3.5.0

limma arrayweights logfoldchange neqc normalization • 1.3k views

score 1 · Answer 1 · 2016-05-30

Dear Stefania,

For your first question: I think that, even without the control probes, using neqc() is just fine. I think normalizeWithinArrays() is irrelevant here, unless you are dealing with two-colour spotted arrays?

Regarding your second question, about arrayWeights: put very "naively", arrayWeights helps to identify and down-weight samples that are more variable on average, but without knowing or inferring whether the source of this variability is technical or biological. It affects both the log-fold changes and the estimates of variability in your experiment. The log-FCs are calculated from weighted averages, using the weights estimated by the algorithm, where a weight of 1 is the reference value for an array of average quality. More precisely, consider the following example (and the specialists of the group, please correct me if I describe it wrongly):

Consider one specific gene in your normal group, which has 4 replicate samples. Suppose this gene has log2-expression values 8, 10, 10 and 11 in the 4 replicates, with calculated array weights 0.5, 0.5, 1 and 1 respectively. Then the weighted average for the normal group for this specific gene would be:

(8*0.5 + 10*0.5 + 10*1 + 11*1)/(0.5+0.5+1+1)...
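To make the arithmetic concrete, here is a minimal base-R sketch of the calculation above. The variable names are only illustrative, and it assumes a simple group-means design in which each group estimate is the weighted mean of its replicates:

```r
# Log2-expression values and array weights from the example above
expr    <- c(8, 10, 10, 11)
weights <- c(0.5, 0.5, 1, 1)

# Weighted average: the two down-weighted arrays contribute less
weighted_avg   <- sum(expr * weights) / sum(weights)

# Ordinary (unweighted) average, for comparison
unweighted_avg <- mean(expr)

# Base R's weighted.mean() computes the same quantity
stopifnot(isTRUE(all.equal(weighted_avg, weighted.mean(expr, weights))))

print(c(weighted = weighted_avg, unweighted = unweighted_avg))
```

The weighted average is 10, compared with 9.75 for the plain mean, because the low value 8 comes from a down-weighted array. A log-FC between two groups is then the difference of two such weighted averages.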

Weights above 1 indicate arrays that are less variable than average (relatively "good quality"), whereas weights below 1, especially those near zero, indicate more variable ("low quality") arrays. After running arrayWeights you should plot the weights, for example as a bar plot, to inspect whether the variability is roughly consistent between the samples or whether there are marked differences, which would suggest that arrayWeights might be beneficial for your analysis by increasing the power to detect DE genes. For more information, the publication below would be very helpful:
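As a rough sketch of how this could look in practice, the code below estimates array weights, plots them, and passes them to the linear model. It runs on simulated data as a stand-in for your normalized values; the group labels, the matrix E and the deliberately noisy second array are all assumptions for illustration, and in your analysis you would use the object returned by neqc() and your own design:

```r
library(limma)

set.seed(1)
# Hypothetical layout: 3 experimental groups plus normal, 3 replicates each
group <- factor(rep(c("Normal", "A", "B", "C"), each = 3),
                levels = c("Normal", "A", "B", "C"))

# Simulated log2-expression matrix (1000 probes x 12 arrays), standing in
# for the normalized data from neqc()
E <- matrix(rnorm(1000 * 12, mean = 8), nrow = 1000,
            dimnames = list(paste0("probe", 1:1000), paste0("s", 1:12)))
E[, 2] <- E[, 2] + rnorm(1000, sd = 2)  # make one array deliberately noisier

# Group-means design: one coefficient per group
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Estimate one quality weight per array and inspect them
aw <- arrayWeights(E, design)
barplot(aw, names.arg = colnames(E), xlab = "Array", ylab = "Weight")
abline(h = 1, lty = 2)  # reference line for an array of average quality

# Use the weights in the linear model, then test one contrast
fit  <- lmFit(E, design, weights = aw)
cont <- makeContrasts(AvsNormal = A - Normal, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont))
topTable(fit2, coef = "AvsNormal", number = 5)
```

In this simulation the noisy array should receive a weight clearly below 1. If all weights in your real data are close to 1, the weighted and unweighted log-FCs will be nearly identical.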

http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-7-261

Hope that helps,

Efstathios