Dear all, talking about the vcf files in R/BioC:
could someone please tell me more about the differences between a "collapsed VCF" and an "expanded VCF" file, and which functions/methods apply to collapsed vs. expanded VCF files?
thank you,
-- bogdan
Dear Valerie, thank you for your time and the information. When you have a minute, it would be really helpful to have a concrete example of how a collapsed or expanded VCF file influences the data processing, or the use of functions or methods on the data. Sorry for being slow, and many thanks!
The ExpandedVCF class is a flat-ish form of the CollapsedVCF. The expansion is centered around the ALT field. Often there can be more than one ALT value per variant and in the CollapsedVCF all ALT values for a single variant are presented in a single row. While there is one row per genomic position, the row actually represents multiple REF / ALT pairs which creates a somewhat nested view of the data. To flatten this out, or have one REF / ALT pair per row, you can expand() a CollapsedVCF object or call readVcf(..., collapsed=FALSE). In this expanded form the 'AD' genotype field is also expanded into REF/ALT pairs and all other fields are simply replicated out.
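To make the difference visible, here is a minimal sketch. It assumes the VariantAnnotation package is installed and uses the small example file "ex2.vcf" that ships with it (my choice for illustration, some of its records carry more than one ALT allele); the genome label "hg19" is just a tag for the example.

```r
library(VariantAnnotation)

# Example file bundled with VariantAnnotation (assumed to be installed).
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")

# readVcf() returns a CollapsedVCF: one row per VCF record.
vcf <- readVcf(fl, genome = "hg19")
class(vcf)
alt(vcf)                 # list-like: one element per record, possibly several alleles
elementNROWS(alt(vcf))   # number of ALT alleles in each record

# expand() returns an ExpandedVCF: one row per REF/ALT pair.
vcf_long <- expand(vcf)
class(vcf_long)
alt(vcf_long)            # flat: exactly one ALT allele per row
dim(vcf)
dim(vcf_long)            # records with several ALT alleles add extra rows
```

In practice, code that expects exactly one alternate allele per row (for example, comparing ref() and alt() element by element) is simpler to write against the expanded object, while the collapsed object stays closer to the layout of the file on disk.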
Another flat form of variant data is the VRanges class. This is a GRanges with the info and geno fields as metadata columns instead of separate slots. The class is less complex than the VCF class and may be more useful for analysis.
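A rough sketch of getting to a VRanges, again assuming the bundled "ex2.vcf" example file (my choice, not something specific to your data) and the as(..., "VRanges") coercion described on the VRanges man page:

```r
library(VariantAnnotation)

# Example file bundled with VariantAnnotation (assumed to be installed).
fl <- system.file("extdata", "ex2.vcf", package = "VariantAnnotation")
vcf <- readVcf(fl, genome = "hg19")

# Coerce the VCF object to VRanges, a GRanges-like flat representation.
vr <- as(vcf, "VRanges")
class(vr)

ref(vr)           # reference allele for each element
alt(vr)           # alternate allele for each element
sampleNames(vr)   # sample each element belongs to
mcols(vr)         # remaining fields carried along as metadata columns
```

Because a VRanges is a GRanges, the usual range tools (subsetting, overlaps, and so on) should apply to it directly, which is part of why it can be more convenient for analysis.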
As for concrete examples of how collapsed or expanded VCF files influence data processing, there are a number of examples on the man page under 'Collapsed and Expanded VCF', e.g.,
If you have a specific analysis in mind it may be more productive to post that as a question. Then others can contribute what they have tried, what classes and methods worked for them, etc.
Valerie
Dear Valerie, thank you for your quick reply and very comprehensive answer: I will go slowly through the examples that you offered, and will let you know should I have any comments or additional small questions. Have a happy and fruitful week (hope not too cold ;) !