University of Washington - Department of Statistics
The paradox of missing heritability refers to a common finding in complex genetic traits: although methods such as twin studies estimate high heritability, only a small fraction of the population variance is explained by the few single nucleotide polymorphism (SNP) markers found to be individually significantly associated with the trait. Human height, with heritability estimates as high as 80% that remain largely unexplained by individual SNPs, is the canonical example of such a trait. One possible explanation is the polygenic view: height may be affected by a large number of genes, each with an effect too weak to be identified individually. This view calls for a general quantitative genetic trait model that estimates all gene effects simultaneously. Such a model has significantly more parameters than observations and is therefore highly sensitive to prior or regularization assumptions.
Using the infinite allele model from population genetics and a novel application of methods for classifying Identity By Descent (IBD) states, we motivate a structured prior with a limited number of hyperparameters. Unlike existing Whole Genome Prediction and Genome Assisted Selection methods, which emerged from the animal and plant breeding literature and have recently been applied to human genetics, the model captures both additive and dominance effects and can be applied to full-sequence DNA data where available. The model can be applied to the detection of individual gene effects, heritability estimation and decomposition by locus, breeding value prediction, and genotypic value prediction. It is demonstrated with out-of-sample prediction of height from genotype on 990 human SNP profiles.
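
The sensitivity to regularization noted above, for models with more parameters than observations, can be illustrated with a minimal sketch. It is not the structured prior described in this abstract: it substitutes plain ridge regression (an independent Gaussian prior on additive effects only) and uses simulated genotypes. The sample sizes, allele frequency, effect sizes, and the function names `simulate_genotypes` and `ridge_fit` are all assumptions made for illustration.

```python
import numpy as np

def simulate_genotypes(n, p, rng):
    # Hypothetical data: minor-allele counts (0, 1, 2) with allele
    # frequency 0.3 at every marker, drawn independently.
    return rng.binomial(2, 0.3, size=(n, p)).astype(float)

def ridge_fit(X, y, lam):
    # Dual form of ridge regression: beta = X^T (X X^T + lam I)^{-1} y.
    # Only an n x n system is solved, which is convenient when p >> n.
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

rng = np.random.default_rng(0)
n, p = 50, 500                      # far more markers than individuals
X = simulate_genotypes(n, p, rng)
X -= X.mean(axis=0)                 # center each marker column
beta_true = rng.normal(0.0, 0.1, size=p)   # many small additive effects
y = X @ beta_true + rng.normal(0.0, 1.0, size=n)

# Two regularization strengths, i.e. two different prior variances.
beta_weak = ridge_fit(X, y, lam=1.0)
beta_strong = ridge_fit(X, y, lam=1000.0)

print("norm of estimated effects (weak penalty):  ", np.linalg.norm(beta_weak))
print("norm of estimated effects (strong penalty):", np.linalg.norm(beta_strong))
print("training residual norm (weak penalty):     ", np.linalg.norm(y - X @ beta_weak))
print("training residual norm (strong penalty):   ", np.linalg.norm(y - X @ beta_strong))
```

Because the data alone cannot determine 500 effects from 50 observations, the estimated effects and the fit to the training data are driven largely by the penalty, which is why the choice of prior matters so much in this setting.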