Processing of curatedMetagenomicData
Edward • 0

The package collects data from many published studies, which are curated and quality-checked manually. I found some .csv files under curatedMetagenomicData/inst/extdata on GitHub that appear to hold the metadata for each included study. As far as I can tell, the raw FASTQ files are hosted on NCBI, were collected by hand, and were then processed into the different data types the package provides for different applications. I have tried several ways to locate the original FASTQ files on NCBI; my questions are below.

  1. How can I trace a study back to its raw sequence data on NCBI using the information in the metadata? (For example, if I want the exact raw sequence data for AsnicarF_2017, how do I find the original source on NCBI?)

  2. Where can I find more detailed information, beyond the published paper, about the processing steps that turn raw sequence data into usable data (e.g. relative abundance, gene counts, pathways)? I don't think the paper describes every detail of how the raw data are processed into the six output data types in curatedMetagenomicData (relative abundance, marker abundance, marker presence, pathway abundance, pathway coverage, and gene families). I also suspect there are additional quality-control requirements during processing, such as excluding certain samples or reads.


curatedMetagenomicData • 77 views
