Hello,
I would like to use the VariantFiltering package from Bioconductor to reduce the number of SNPs identified for our phenotype of interest. We are in the fortunate situation of analyzing related individuals, but we have several different phenotypes (more than just absent and present). I would like to use a pedigree file to describe the relationships for use in VariantFiltering.
I tried to understand the example file CEUtrio.ped. In my download this file does not have a header:
$head CEUtrio.ped
FX-1800 NA12891 0 0 2 1
FX-1800 NA12892 0 0 1 1
FX-1800 NA12878 NA12892 NA12891 2 2
It seems clear that the first column is the family ID, the second the ID of the individual, the third the ID of the father, the fourth the ID of the mother, and the fifth the sex (I guess 1 = male, 2 = female). The sixth column probably describes the phenotype.
However, I am not clear on the coding. I found different phenotypes for this CEU trio with Google. In some examples (like the one above?) the daughter seems to have a phenotype; in other examples she did not.
Could someone please explain the coding used to describe the phenotype?
I hope I did not miss any relevant hits in my search; if I did, I apologize for this post.
Looking forward to a helpful reply.
Thanks & best wishes
Claus
Dear Claus,
you're probably not aware, but your question now is written in the space for answers to your first question. you should have written it using the 'ADD COMMENT' link below my answer. this helps to keep some structure in the conversation and quickly identify relevant answers.
regarding your specific question, although this 6th column is commonly known as the phenotype column, it just encodes the so-called "affection status", which means that you only code whether the individual is "affected" or not by the disease. you may use -9 or 0 when this affection status is missing. if you do a google search for "PED file format" you may find pointers to documentation about this format, one of them being this one.
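to make the coding concrete, here is a minimal sketch in Python (not part of VariantFiltering) that decodes the three lines you posted. it assumes the common PLINK-style convention for PED files, where sex is 1 = male, 2 = female and affection status is 1 = unaffected, 2 = affected, 0 or -9 = missing. the function name parse_ped_line and the lookup tables are hypothetical, introduced only for illustration.

```python
# Sketch: decode the six whitespace-separated columns of a PED line.
# Assumption: PLINK-style coding (sex 1/2, affection 1/2, 0 or -9 missing).
SEX = {"1": "male", "2": "female"}
AFFECTION = {"1": "unaffected", "2": "affected", "0": "missing", "-9": "missing"}


def parse_ped_line(line):
    """Split one PED line into named fields and translate the codes."""
    family, individual, father, mother, sex, status = line.split()
    return {
        "family": family,
        "individual": individual,
        # A parent ID of 0 means the parent is not in the pedigree (founder).
        "father": None if father == "0" else father,
        "mother": None if mother == "0" else mother,
        "sex": SEX.get(sex, "unknown"),
        "affection": AFFECTION.get(status, "unrecognized"),
    }


# The three lines quoted in the question, copied as they were posted.
ped_text = """\
FX-1800 NA12891 0 0 2 1
FX-1800 NA12892 0 0 1 1
FX-1800 NA12878 NA12892 NA12891 2 2
"""

records = [parse_ped_line(l) for l in ped_text.splitlines() if l.strip()]
for r in records:
    print(r["individual"], r["sex"], r["affection"])
```

under that assumed coding, the two founders NA12891 and NA12892 are read as unaffected and the child NA12878 as affected. since the column only distinguishes affected, unaffected and missing, a trait with more than two levels would have to be recoded into that form before using it here.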
if you want to understand more deeply how VariantFiltering filters with the different inheritance models, which is when this phenotype column comes into play, you may consult the corresponding unit tests from the source code in
cheers,
robert.