Question: How to control p-value inflation with limma
2.7 years ago by
stephane.cauchi wrote:

Dear all,

Using the limma package, I compared mRNA expression profiles between small groups (N=5) of isogenic mice (BALB/c) under different conditions (Agilent one-color microarrays). I observed strong p-value inflation (as measured by the inflation factor lambda) in most comparisons, despite the FDR correction. One factor known from genome-wide association (GWA) studies to cause p-value inflation is population stratification, such as relatedness among individuals. In all of these comparisons, the mice are supposed to have the same genetic background. Therefore, I have several questions:

2) Because one-color microarrays were used for this experiment, I have been told that within-array normalization may not be feasible. So far, it has not been applied to the dataset. What do you think?

3) Do you think that additional packages such as BACON may be necessary to fix this issue?

Thank you very much for your help

Answer: How to control p-value inflation with limma
2.7 years ago by
Cambridge, United Kingdom
Aaron Lun wrote:

The genomic inflation factor is not applicable here. GWAS analyses involve different models and different assumptions, and you can't just take a diagnostic from those analyses and expect it to be useful in limma. In particular, diagnostics based on goodness-of-fit statistics are not relevant to linear models, because any deviation of the observations from the fitted values will be modelled by an increase in the residual variance. This means that you won't be able to distinguish between an incorrect model that's missing some terms, and a correct model fitted to highly variable data. Now, to answer your specific questions:

1. Just because your mice have the same genetic background doesn't mean that there aren't underlying correlations. Were some of the mice raised at the same time? Are they littermates? Even if all samples were collected at the same time, were some samples processed differently? These factors may affect expression and cause the residuals for some samples to be positively correlated; this is usually anticonservative as the amount of information in the data is overstated. If you have known factors of variation in your data set, you should block on them in the design matrix or via duplicateCorrelation.
2. Of course you can't do within-array normalization, that's for two-colour arrays. You need to do between-array normalisation.
3. What issue? I'm yet to be convinced that limma is not working.
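To illustrate point 1, here is a minimal simulation sketch in plain Python. It is not limma code: it uses an ordinary pooled-variance t-test rather than limma's moderated t-statistic, and the setup is hypothetical (no true group difference, and in the correlated case all mice in a group share one random "litter" effect per gene). Under those assumptions it shows how positively correlated samples push the false positive rate above the nominal 5%.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 8 df (2 groups of 5).
# This constant is only valid for n_per_group = 5.
T_CRIT_DF8 = 2.306


def two_group_t(a, b):
    """Pooled-variance two-sample t-statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def false_positive_rate(litter_sd, n_genes=2000, n_per_group=5, seed=1):
    """Fraction of null genes called significant at the 5% level."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        # Hypothetical shared effect: one random shift per group and gene,
        # which makes samples within a group positively correlated.
        shift_a = rng.gauss(0, litter_sd)
        shift_b = rng.gauss(0, litter_sd)
        a = [shift_a + rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [shift_b + rng.gauss(0, 1) for _ in range(n_per_group)]
        if abs(two_group_t(a, b)) > T_CRIT_DF8:
            hits += 1
    return hits / n_genes


independent = false_positive_rate(litter_sd=0.0)
correlated = false_positive_rate(litter_sd=1.0)
print(f"independent samples:  {independent:.3f}")
print(f"shared litter effect: {correlated:.3f}")
```

In a real analysis the shared factor would be handled in the model itself, as described in point 1, rather than diagnosed by simulation.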
Answer: How to control p-value inflation with limma
2.7 years ago by
Netherlands
m.van_iterson wrote:

Dear Stephane,

I wouldn't use bacon for this. bacon is really meant for association studies with sample size n > 100. I also wouldn't use the GWAS inflation factor; it is really meant for GWAS data.
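For reference, a minimal sketch of how the inflation factor is usually computed, in plain Python (not bacon or limma code). The function name and the toy p-values are assumptions made for illustration. The statistic assumes that almost all tests are null, which is reasonable for SNPs in a GWAS but often not for genes in an expression experiment: in the hypothetical second example, 40% of the genes truly change and lambda ends up well above 1 even though nothing is wrong with the p-values.

```python
from statistics import NormalDist, median

MEDIAN_CHISQ_1DF = 0.4549364  # median of a chi-squared distribution with 1 df


def inflation_factor(pvalues):
    """Median 1-df chi-squared statistic divided by its median under the null."""
    norm = NormalDist()
    # A two-sided p-value p corresponds to the chi-squared(1) statistic z**2,
    # where z is the upper p/2 quantile of the standard normal distribution.
    chisq = [norm.inv_cdf(1 - p / 2) ** 2 for p in pvalues]
    return median(chisq) / MEDIAN_CHISQ_1DF


n = 1000
# Evenly spaced p-values stand in for a well-behaved, all-null experiment.
null_p = [(i + 0.5) / n for i in range(n)]
# Hypothetical experiment: 2 genes in every 5 are truly changed, mimicked
# here by shrinking their p-values by a factor of 100.
mixed_p = [p * 0.01 if i % 5 < 2 else p for i, p in enumerate(null_p)]

lambda_null = inflation_factor(null_p)
lambda_mixed = inflation_factor(mixed_p)
print(f"all null:        lambda = {lambda_null:.2f}")
print(f"40% true signal: lambda = {lambda_mixed:.2f}")
```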

Did you check the limma user guide? There is a whole chapter devoted to single-channel arrays: "Single-Channel Experimental Design".
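For single-channel data, the between-array normalisation recommended in the other answer is often done by quantile normalisation (limma's normalizeBetweenArrays offers a "quantile" method). Below is a minimal sketch of the idea in plain Python. It is not limma's implementation: the function name and toy intensities are made up for illustration, and ties are simply broken by position instead of being averaged.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length intensity lists, one list per array."""
    n_probes = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean intensity across arrays at each rank.
    reference = [
        sum(col[rank] for col in sorted_cols) / len(columns)
        for rank in range(n_probes)
    ]
    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity on this array.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy log-intensities: three arrays, four probes each (made-up numbers).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(arrays)
for col in normalized:
    print([round(x, 2) for x in col])
```

After normalisation every array has the same set of values; only the assignment of values to probes differs between arrays.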

If you think there might be unobserved confounding factors, you could look at sva, combat, ruv or cate. These are all methods for handling unobserved confounding factors and have an R/Bioconductor implementation.

Cheers,

Maarten