Question: help with analysis of genotyping data from Illumina HumanOmni5-4v1_B chip
5.8 years ago by
United States
Abhishek Pratap170 wrote:
Hi guys,

We have recently obtained precalled genotype data from our collaborators, generated on the Illumina Human Omni5 array chip (HumanOmni5-4v1_B). The genotypes have already been called using Illumina's GenomeStudio. Being new to array-based genotyping data (I come from the sequencing arena), I would like to know the following:

1. What QC can be done on these genotype data files (200 samples) to ascertain their quality and filter out the low-quality calls?
2. Does Bioconductor have a package for annotation of this chip (HumanOmni5-4v1_B)? I was able to find "humanomni5quadv1bCrlmm" but am not sure if that would give me the annotation on loci / SNPs.
3. Is there any existing slick way to create VCF files from these 200 genotype files? Our goal is to summarize the information in a single VCF across all the samples, tagging the low-quality ones.

Many thanks!
-Abhi
snp annotation • 1.4k views
Answer: help with analysis of genotyping data from Illumina HumanOmni5-4v1_B chip
5.8 years ago by
University of Washington
Stephanie M. Gogarten720 wrote:
Hi Abhi,

1. The GWASTools package was designed for QC of precalled array data. See the "Data Cleaning" vignette for a recommended workflow. You might also want to look at Laurie et al. 2010 in Genetic Epidemiology (10.1002/gepi.20516), as the vignette implements the QC methods described therein.
2. I usually get the annotation file from Illumina (it would probably be called HumanOmni5-4v1_B.csv). Your collaborators may have this file, or you could register on Illumina's website to download it. It has rsID, chromosome, position, alleles, and probe sequences.
3. I don't know of a good way at the moment, but "export GWASTools objects as VCF" is going on my to-do list. I recently used the un-slick way of PLINK file -> load in PLINK/SEQ -> export VCF. You might also try creating a VCF object from your data with the VariantAnnotation package and using its writeVcf method.

Stephanie
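To make the goal in point 3 concrete (one VCF across all samples, with low-quality SNPs tagged), here is a minimal Python sketch of the idea. It is not GWASTools, PLINK/SEQ, or VariantAnnotation code. The input layout, the function names, the `LowCallRate` filter label, and the 0.98 threshold are all assumptions made for illustration, and the sketch assumes the REF/ALT alleles have already been resolved to the reference strand, which the Illumina annotation file does not do for you.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory layout (not a GenomeStudio or GWASTools format):
Annotation = Dict[str, Tuple[str, int, str, str]]  # snp_id -> (chrom, pos, ref, alt)
Calls = Dict[str, Dict[str, str]]                  # sample -> snp_id -> call, e.g. "AG"; "--" = no call


def encode_gt(call: str, ref: str, alt: str) -> str:
    """Convert a two-allele call such as 'AG' to a VCF GT string.

    Returns './.' for no-calls or alleles that match neither REF nor ALT.
    """
    if len(call) != 2 or any(a not in (ref, alt) for a in call):
        return "./."
    idx = sorted(0 if a == ref else 1 for a in call)
    return f"{idx[0]}/{idx[1]}"


def call_rate(gts: List[str]) -> float:
    """Fraction of samples with a non-missing genotype at one SNP."""
    return sum(gt != "./." for gt in gts) / len(gts) if gts else 0.0


def build_vcf(annot: Annotation, calls: Calls, min_call_rate: float = 0.98) -> str:
    """Build a multi-sample VCF as text, flagging SNPs below min_call_rate.

    min_call_rate is an illustrative threshold, not a recommended value.
    """
    samples = sorted(calls)
    lines = [
        "##fileformat=VCFv4.2",
        f'##FILTER=<ID=LowCallRate,Description="SNP call rate below {min_call_rate}">',
        '##INFO=<ID=CR,Number=1,Type=Float,Description="SNP call rate across samples">',
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL",
                   "FILTER", "INFO", "FORMAT"] + samples),
    ]
    # Sorted by chromosome name (as a string) and then position; real data
    # would need a proper chromosome ordering.
    ordered = sorted(annot.items(), key=lambda kv: (kv[1][0], kv[1][1]))
    for snp_id, (chrom, pos, ref, alt) in ordered:
        gts = [encode_gt(calls[s].get(snp_id, "--"), ref, alt) for s in samples]
        cr = call_rate(gts)
        flt = "PASS" if cr >= min_call_rate else "LowCallRate"
        lines.append("\t".join([chrom, str(pos), snp_id, ref, alt, ".",
                                flt, f"CR={cr:.3f}", "GT"] + gts))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy data: two SNPs, two samples, one no-call.
    annot = {"rs1": ("1", 1000, "A", "G"), "rs2": ("1", 2000, "C", "T")}
    calls = {"S1": {"rs1": "AG", "rs2": "--"}, "S2": {"rs1": "GG", "rs2": "CT"}}
    print(build_vcf(annot, calls, min_call_rate=0.9))
```

With 200 samples and an Omni5-sized array the dictionaries above would be too slow and memory-hungry, so treat this only as a description of the output you are aiming for. The per-SNP call rate is just one of the QC metrics covered in the GWASTools vignette and Laurie et al.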
Thanks a lot, Stephanie, for your quick response. This was very useful info. I will follow up with package-specific questions, if any.

Cheers!
-Abhi