9 weeks ago by
GENOMIC_region0 wrote:

I'm running into an issue with sporadic missingness when using the admixMapMM function from GENESIS.

Column rsids joining factor and character vector, coercing into character vector
Reading in Phenotype and Covariate Data...
Fitting Model with 7485 Samples
Computing Variance Component Estimates using AIREML Procedure...
Running analysis with 7485 Samples and 43221 SNPs
Beginning Calculations...
Block 1 of 9 Completed - 1.645 mins
Block 2 of 9 Completed - 1.128 mins
Block 3 of 9 Completed - 38.08 secs
Block 4 of 9 Completed - 47.14 secs
Block 5 of 9 Completed - 42.99 secs
Block 6 of 9 Completed - 1.203 mins
genoData has sporadic missingness in block size > 1
Execution halted


I'm left with one chromosome to wrap up the analysis.
Is it possible to identify the region where the function is running into missingness? Is the error due to a large number of SNPs, a low number of SNPs, or high linkage disequilibrium?

SNPRelate_1.12.2    argparse_2.0.1      GENESIS_2.8.1     dplyr_0.8.0.1       gdsfmt_1.14.1       GWASTools_1.24.1     Biobase_2.38.0      BiocGenerics_0.24.0
R version 3.4.2 (2017-09-28)


9 weeks ago by
University of Washington
Stephanie M. Gogarten wrote:

admixMapMM cannot handle missing genotype values in the input data. The fastest way to identify which SNPs have missing values is to use the SNPRelate function snpgdsSNPRateFreq, which returns missing call rate as well as allele frequency. You could then exclude those SNPs from your analysis.
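As a rough sketch of that filtering step: the helper below (`flag_missing_snps` is a hypothetical name, not part of SNPRelate or GENESIS) splits SNP IDs by whether their missing call rate is above zero. The toy IDs and rates are made up for illustration. The commented lines show how the real inputs might be obtained, assuming the data are in a GDS file that `snpgdsOpen` can read; the file name is a placeholder.

```r
# Split SNP IDs by whether any genotype call is missing.
# `missing.rate` is the per-SNP missing call rate, for example the
# MissingRate element of the list returned by snpgdsSNPRateFreq().
flag_missing_snps <- function(snp.id, missing.rate) {
  stopifnot(length(snp.id) == length(missing.rate))
  has.missing <- is.na(missing.rate) | missing.rate > 0
  list(exclude = snp.id[has.missing], keep = snp.id[!has.missing])
}

# Toy illustration with made-up values (not real data).
toy <- flag_missing_snps(snp.id = c("rs1", "rs2", "rs3"),
                         missing.rate = c(0, 0.002, 0))
toy$exclude
toy$keep

# With a real file this might look roughly like the following
# (placeholder file name; assumes a SNPRelate-readable GDS file):
# library(SNPRelate)
# gds   <- snpgdsOpen("my_genotypes.gds")
# rates <- snpgdsSNPRateFreq(gds)   # list with AlleleFreq, MinorFreq, MissingRate
# ids   <- read.gdsn(index.gdsn(gds, "snp.id"))
# res   <- flag_missing_snps(ids, rates$MissingRate)
# snpgdsClose(gds)
```

If the analysis uses only a subset of samples, the missing rate should be computed on that same subset (the `sample.id` argument of `snpgdsSNPRateFreq`), since a SNP can be complete in the subset but missing elsewhere.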

Also, your versions of R and Bioconductor are two years old, so I would recommend upgrading to the current versions. In GENESIS 2.14.3, admixMapMM has been replaced by a new function, admixMap, which will give you a warning (instead of an error) and return NA for any blocks that contain missing values.

I have a small concern. I run the analysis with a genetic relationship matrix (GRM) as a random effect. The data set has about 10K individuals. The most time-consuming step is fitting the null model, which is repeated for each chromosome because I submit the analysis separately per chromosome.

Now my model needs to adjust for a second matrix, so I have two matrices as random effects. Is it possible to store the fitted null model in an R object on disk (.rd or such) so that it can be read back in and reused when running the analysis? This would avoid repeating the most time-intensive step every time and speed up the analysis. I hope my question isn't confusing. I'm using libraries tied to R v3.5 and the admixMap function.


Yes, this is exactly what we do in our own analyses.

script 1:

nullmod <- fitNullModel(...)
save(nullmod, file="my_null_model.RData")


script 2:

load("my_null_model.RData")
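An alternative to `save`/`load`, shown here only as a sketch, is base R's `saveRDS`/`readRDS`, which stores a single object and lets the reading script choose its own variable name. The `nullmod` list below is a stand-in for the real fitted model (the actual `fitNullModel(...)` arguments are not shown in this thread), and the temporary path is a placeholder.

```r
# Stand-in for the fitted null model; in practice this would be the
# object returned by fitNullModel(...).
nullmod <- list(note = "placeholder for a fitted null model")

# Script 1: write the single object to disk (temporary path used here).
rds.file <- file.path(tempdir(), "my_null_model.rds")
saveRDS(nullmod, file = rds.file)

# Script 2 (for example, one job per chromosome): read it back under any name.
nullmod.chr <- readRDS(rds.file)
identical(nullmod, nullmod.chr)
```

Either approach works for reusing one null model across per-chromosome jobs; `readRDS` simply avoids silently overwriting an existing object named `nullmod` in the session.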