University of Washington - Department of Statistics
Haplotypes are specific combinations of alleles on the same chromosome, and various methods exist for analyzing haplotype data from unrelated individuals. However, humans are diploid, and studies of genetic variation may instead consist of unphased genotype data, in which only an unordered pair of alleles is observed at each locus. There is an emerging need for less computationally intensive models that can be applied directly to unphased genotype data from thousands of individuals at thousands of loci. In this talk, we present such a model for genetic variation. We apply the model to data from real populations and assess its accuracy in imputing missing genotypes, comparing the results to those obtained from existing methods based on models for haplotypes. We discuss model extensions and several computational issues, and conclude with further applications of the model, which will be the subject of future work.