Question: Advice on normalization with metagenomics data
David wrote (18 months ago):

Hello,

I'm using the metagenomeSeq package to normalize my 16S data for one experiment with several samples. Below are the log2 boxplots of the data after normalization (I normalized at the genus level). It looks like the normalization has worked pretty well. I also checked the data with qqnorm (2nd graph below). In the qqnorm graph the data do not look very normal. I still see a lot of values close to zero (>0); I assume these are basically singletons.

I'm wondering whether this is what you would expect from such metagenomics data, and whether I can apply tests that assume normality (such as ANOVA, for example) to compare my groups, or whether I should stick to the methods suggested in metagenomeSeq for group comparisons. How can I check that my data have been properly normalized? Thanks for your advice.

 

James W. MacDonald (United States) wrote (18 months ago):

Are you thinking that 'normalize' should make your data normally distributed? If so, that's not the case. All a normalization is intended to do is remove as much technical variability between samples as possible, so you can then compare between samples without picking up uninteresting things about how the data were processed.

In other words, the Q-Q plot that you show is to be expected. Count data are not normally distributed, and ecological count data tend to be zero inflated (meaning you get lots of zeros, which may indicate that the species in question wasn't there, or maybe that it was there, but you just didn't count it). The statistics that metagenomeSeq uses are intended to work correctly, given those limitations of the data, whereas a 'regular' linear model is not.
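To make the distinction concrete, here is a minimal Python sketch on simulated counts (not the data from this thread). Every name and parameter in it is made up for illustration, and the simple scaling to a common total is only a stand-in for a real normalization, not what metagenomeSeq does internally. It shows that scaling removes the sequencing-depth difference between two samples, while the pile of zeros (and hence the non-normal shape) is left untouched.

```python
import math
import random


def simulate_counts(n_taxa, depth_factor, zero_prob, rng):
    """Toy zero-inflated counts: a taxon is absent with probability zero_prob,
    otherwise its count is a rounded log-normal draw scaled by depth_factor."""
    counts = []
    for _ in range(n_taxa):
        if rng.random() < zero_prob:
            counts.append(0)
        else:
            counts.append(int(round(depth_factor * rng.lognormvariate(3.0, 1.5))))
    return counts


def scale_to_common_total(counts, target=10000):
    """Rescale one sample so its counts sum to `target` (illustrative stand-in)."""
    total = sum(counts)
    return [c * target / total for c in counts]


def log2_plus_one(values):
    """log2(x + 1), so that zeros stay at exactly 0 after the transform."""
    return [math.log2(v + 1.0) for v in values]


def fraction_tied_at_minimum(values):
    """Share of values equal to the smallest one. For continuous,
    normally distributed data this would be about 1 / len(values)."""
    lowest = min(values)
    return sum(1 for v in values if v == lowest) / len(values)


rng = random.Random(1)
shallow = simulate_counts(500, depth_factor=1.0, zero_prob=0.6, rng=rng)
deep = simulate_counts(500, depth_factor=5.0, zero_prob=0.6, rng=rng)

scaled_shallow = scale_to_common_total(shallow)
scaled_deep = scale_to_common_total(deep)
norm_shallow = log2_plus_one(scaled_shallow)
norm_deep = log2_plus_one(scaled_deep)

print("raw totals:", sum(shallow), sum(deep))
print("scaled totals:", round(sum(scaled_shallow)), round(sum(scaled_deep)))
print("share of values tied at the minimum:",
      fraction_tied_at_minimum(norm_shallow),
      fraction_tied_at_minimum(norm_deep))
```

The two samples end up with the same total, which is the kind of technical difference a normalization is meant to remove, but well over half of the values are still stacked at zero, which is why a Q-Q plot against the normal distribution will not follow a straight line.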
 

David wrote (18 months ago):

Thanks James,

Thanks so much for the clarification. I think I understand the meaning of zero-inflated now. It was there all along; I just needed it clarified. I guess that methods that do not assume normality should be used going forward, starting with the methods that metagenomeSeq provides.

How do you know whether the normalization has worked properly?
