GWASTools: quasi-/perfect linear separation
Guest User ★ 13k
@guest-user-4897
This is not really a question, more a warning to other users.

I performed a regression analysis using the assocTestRegression function under three different models (dominant, recessive, additive). My data set contains ~3 million markers, filtered so that only SNPs with MAF >= 10% are included. Please note that this filter was applied to cases and controls together as one data set (i.e. I did not filter cases and controls separately).

When I examined the association results under the recessive model, I noticed very large beta estimates (8-9). Looking at the genotype counts, I realised this was because some SNPs show perfect linear separation. In other words, the AA genotype has a count of 0 in cases and a count of 170 in controls, which leads to inflated estimates.

I was surprised to find that the function neither throws a warning for this nor drops the analysis for SNPs where it occurs.

Regards,
Danica
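To illustrate why an empty cell inflates the estimate: with a single binary genotype term and no covariates, the fitted beta is the log odds ratio of the 2x2 table, and that quantity is infinite when a cell is zero. An iterative fit simply stops at some large finite value. The sketch below is plain Python, not GWASTools code; the helper `log_odds_ratio` is hypothetical, and only the AA counts (0 cases, 170 controls) come from the post. The non-AA counts are made up for illustration.

```python
import math

def log_odds_ratio(case_with, case_without, ctrl_with, ctrl_without):
    """Log odds ratio of a 2x2 genotype-by-status table.

    Returns +/- infinity when one diagonal product is zero, which is the
    value the logistic regression estimate diverges towards.
    """
    num = case_with * ctrl_without
    den = case_without * ctrl_with
    if num == 0 and den == 0:
        return math.nan       # both products are zero: undefined
    if den == 0:
        return math.inf
    if num == 0:
        return -math.inf
    return math.log(num / den)

# AA counts from the post: 0 in cases, 170 in controls.
# The non-AA counts (500 cases, 330 controls) are hypothetical.
print(log_odds_ratio(0, 500, 170, 330))
```

The sign of the divergence depends on which genotype class is coded as the reference, so a large positive beta and a large negative beta can describe the same table.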
Regression • 996 views
@stephanie-m-gogarten-5121
University of Washington
Hi Danica,

assocTestRegression will return an error code for SNPs that are monomorphic in either cases or controls, but it seems that you have found a case that we did not test for. I consulted with Matt Conomos, who wrote this function, and he said the following:

Since AA has a count of 0 in cases in the example given, and an error was not returned, I would assume that both AB and BB are non-zero in cases, but it would be nice to confirm this. Also, it would be nice to know which allele is the minor allele (the function returns this), since a recessive model is being fit. If the A allele is the minor allele, then the recessive model collapses the AB and BB classes, and this could lead to the separability issue. I may need to add in a check for this when fitting dominant or recessive models.

Could you please provide the full output of assocTestRegression for the SNPs where you see this problem? Also, include the output of sessionInfo() so we know which version of GWASTools you are using.

Stephanie
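Until such a check exists in the package, something along these lines could be run on the genotype counts before trusting a dominant or recessive result. This is only a sketch in plain Python, not GWASTools code and not how assocTestRegression works internally. The functions `collapse` and `has_empty_cell` are hypothetical names, and all counts except AA (0 cases, 170 controls) are invented. It assumes a model without covariates; with covariates, separation can also arise in ways that genotype counts alone will not reveal.

```python
def collapse(counts, model, minor="A"):
    """Collapse genotype counts {'AA', 'AB', 'BB'} into two classes.

    recessive: minor-allele homozygotes vs. everyone else
    dominant:  carriers of the minor allele vs. major-allele homozygotes
    """
    hom_minor = counts["AA"] if minor == "A" else counts["BB"]
    hom_major = counts["BB"] if minor == "A" else counts["AA"]
    het = counts["AB"]
    if model == "recessive":
        return hom_minor, het + hom_major
    if model == "dominant":
        return hom_minor + het, hom_major
    raise ValueError("model must be 'recessive' or 'dominant'")

def has_empty_cell(cases, controls, model, minor="A"):
    """True if the collapsed 2x2 table (class x status) has a zero cell."""
    table = collapse(cases, model, minor) + collapse(controls, model, minor)
    return any(n == 0 for n in table)

# AA counts are from the post; AB and BB counts are hypothetical.
cases = {"AA": 0, "AB": 300, "BB": 200}
controls = {"AA": 170, "AB": 200, "BB": 130}

for model in ("recessive", "dominant"):
    print(model, has_empty_cell(cases, controls, model, minor="A"))
```

With these illustrative counts and A as the minor allele, the recessive table has an empty cell while the dominant one does not, which matches the pattern described above. SNPs flagged this way could be set to missing or refit with a method that tolerates separation.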