What MetaPhlAn3 database was used in cMD 3?


Thank you for putting together a nice package. I'm looking forward to the v4 release.

In the meantime, I'd like to confirm exactly which MetaPhlAn database was used in cMD 3. It looks like cMD 3 used MetaPhlAn v3.0, and based on this GitHub page, I assume this means mpa_v30_CHOCOPhlAn_201901. Can you please confirm?

Thank you in advance,


Hi Lev, you are correct. Here are additional details about the pre-processing of curatedMetagenomicData v3, in case they are useful:

  • The MetaPhlAn version was 3.0. The HUMAnN 3 version was v3.0.0.alpha.3.
  • We did not run any preprocessing ourselves; we simply downloaded the data from NCBI. Data that originated in our lab were preprocessed with a pipeline similar to KneadData, but no single preprocessing method was applied across the included studies as a whole (we rely on the original authors for this, ourselves included).
  • We always ran MetaPhlAn and HUMAnN with default settings. HUMAnN was given the MetaPhlAn profile as its taxonomic profile, with the added parameter --metaphlan-options -t rel_ab --index mpa_v30_CHOCOPhlAn_201901. The protein database was uniref90_201901 and the nucleotide database was ChocoPhlAn version 201901. Both HUMAnN and MetaPhlAn were called from their conda environments, so HUMAnN used DIAMOND version 2.0.4 and Bowtie 2 version 2.4.1.
  • MetaPhlAn was first run specifying --index mpa_v30_CHOCOPhlAn_201901, and bowtie2 version
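
For anyone trying to reproduce these settings, the two calls described in the list above can be sketched roughly as follows. This is only an illustration, not the exact commands used for cMD 3: the file names are hypothetical, and --input_type, -o, --input, --output and --taxonomic-profile are the usual MetaPhlAn/HUMAnN command-line options as I understand them rather than something stated in this thread. Database locations are left out because they depend on the local installation. The script only builds and prints the commands; it does not run either tool.

```python
import shlex

# Index named in this thread for cMD 3.
INDEX = "mpa_v30_CHOCOPhlAn_201901"


def metaphlan_cmd(fastq, profile_out):
    """Build a MetaPhlAn call that pins the marker database index."""
    return [
        "metaphlan", fastq,
        "--input_type", "fastq",  # assumption: raw reads are the input
        "--index", INDEX,
        "-o", profile_out,        # assumption: profile written to this file
    ]


def humann_cmd(fastq, profile, out_dir):
    """Build a HUMAnN call that reuses an existing MetaPhlAn profile."""
    return [
        "humann",
        "--input", fastq,
        "--output", out_dir,
        "--taxonomic-profile", profile,
        # The MetaPhlAn options are passed through as one single argument.
        "--metaphlan-options", f"-t rel_ab --index {INDEX}",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(metaphlan_cmd("sample.fastq", "sample_profile.tsv")))
    print(shlex.join(humann_cmd("sample.fastq", "sample_profile.tsv", "humann_out")))
```

Printing the commands first makes it easy to check that the index name matches before launching a long run.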

