Question

Low quantile estimate using CSS normalization in metagenomeSeq


noelle.noyes ▴ 30

@noellenoyes-7241

United States

When using both cumNormStat and cumNormStatFast, I receive the following warning:

Warning message:
In cumNormStat(obj) :
  Low quantile estimate. Default value being used.

I understand the default value is 0.5. The metagenomeSeq paper cites a source that suggests using 0.75 for RNA-seq data, but our data are shotgun metagenomic. My question is whether there is an "agreed upon" value for metagenomic data, or whether there are standard methods for determining the best value.

Thanks!

metagenomeSeq microbiome • 3.8k views

updated 11.0 years ago by Joseph Nathaniel Paulson ▴ 280 • written 11.0 years ago by noelle.noyes ▴ 30

score 1 · Answer 1 · 2015-01-22

Upper quantile normalization is fully described here: http://bioinformatics.oxfordjournals.org/content/19/2/185.

The method scales each sample's count distribution by that sample's 75th percentile (the upper quartile). Our normalization procedure, cumulative-sum scaling (CSS), instead finds the quantile up to which the count distributions of the samples should be roughly equivalent and independent of each other, under the assumption that counts in this range are derived from a common distribution.

With CSS, raw counts are divided by the cumulative sum of counts up to a percentile chosen by the data-driven approach implemented in cumNormStat / cumNormStatFast. The method and its rationale are described in: http://www.nature.com/nmeth/journal/v10/n12/full/nmeth.2658.html

If the estimated value is less than 0.5, we recommend keeping it at 0.5, for reasons similar to those given in the paper. I should mention that this is an area of current research, which we plan to address publicly soon.
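
To make the arithmetic concrete, here is a minimal sketch in plain Python of the scaling described above. It is not the metagenomeSeq code and may differ from it in detail. The function names, the `scale` parameter, the choice to take the quantile over nonzero counts only, and the linear-interpolation quantile are all assumptions made for illustration.

```python
def quantile(values, p):
    """Quantile with linear interpolation between order statistics (assumed convention)."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def floor_percentile(estimate, floor=0.5):
    """Keep the percentile at 0.5 when the data-driven estimate falls below it."""
    return max(estimate, floor)


def css_scaling_factor(counts, p=0.5):
    """Cumulative sum of one sample's counts up to its p-th quantile.

    Assumption: the quantile is taken over nonzero counts only.
    """
    nonzero = [c for c in counts if c > 0]
    q = quantile(nonzero, p)
    return sum(c for c in nonzero if c <= q)


def css_normalize(counts, p=0.5, scale=1000):
    """Divide raw counts by the scaling factor; `scale` is a hypothetical constant."""
    factor = css_scaling_factor(counts, p)
    return [c * scale / factor for c in counts]


# One toy sample: a single dominant feature (100) barely affects the factor.
sample = [0, 1, 2, 3, 4, 100]
p = floor_percentile(0.3)           # low estimate, so 0.5 is used
factor = css_scaling_factor(sample, p)
normalized = css_normalize(sample, p)
```

In this toy sample the scaling factor sums only the counts at or below the chosen quantile, so the highly abundant feature is left out of the denominator, whereas dividing by the total count would let it dominate.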

Thanks!