Antibodies must recognize a great diversity of antigens to protect us from infectious disease. The binding properties of antibodies are determined by the sequences of their corresponding B cell receptors (BCRs). These BCR sequences are created in "draft" form by VDJ recombination, which randomly selects V, D, and J genes, deletes nucleotides from their ends, and then joins them together with additional random nucleotides. If they pass initial screening and bind an antigen, these sequences then undergo an evolutionary process of mutation and selection, "revising" the BCR to improve binding to its cognate antigen. It has recently become possible to sequence the BCRs resulting from this process in high throughput. Although these sequences implicitly contain a wealth of information about both antigen exposure and the process by which we learn to resist pathogens, this information can be extracted only with computational methods.
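To make the "draft" step concrete, here is a minimal toy simulation of VDJ recombination as described above: pick one gene of each type, delete from the ends that will be joined, and insert random nucleotides at the two junctions. The gene names and sequences, the uniform distributions, and the `max_del` and `max_ins` limits are all hypothetical choices for illustration, not real germline data or empirically fitted parameters.

```python
import random

# Hypothetical toy gene segments; real germline genes are much longer.
V_GENES = {"V1": "CAGGTGCAGCTGGTG", "V2": "GAGGTGCAGCTGTTG"}
D_GENES = {"D1": "GGTATAGTGGGAGC", "D2": "AGGATATTGTAGTG"}
J_GENES = {"J1": "ACTACTTTGACTACTGG", "J2": "ACAACTGGTTCGACCCC"}


def trim(seq, five_prime, three_prime):
    """Delete the given number of nucleotides from each end of seq."""
    return seq[five_prime:len(seq) - three_prime]


def random_nucleotides(rng, n):
    """Draw n nucleotides uniformly at random (an assumed distribution)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def recombine(rng, max_del=4, max_ins=5):
    """Simulate one toy rearrangement and return the event and sequence."""
    v_gene = rng.choice(sorted(V_GENES))
    d_gene = rng.choice(sorted(D_GENES))
    j_gene = rng.choice(sorted(J_GENES))

    # Deletions act on the ends that get joined: the 3' end of V,
    # both ends of D, and the 5' end of J.
    v_part = trim(V_GENES[v_gene], 0, rng.randint(0, max_del))
    d_part = trim(D_GENES[d_gene], rng.randint(0, max_del), rng.randint(0, max_del))
    j_part = trim(J_GENES[j_gene], rng.randint(0, max_del), 0)

    # Random nucleotides are inserted at the V-D and D-J junctions.
    vd_insertion = random_nucleotides(rng, rng.randint(0, max_ins))
    dj_insertion = random_nucleotides(rng, rng.randint(0, max_ins))

    return {
        "v_gene": v_gene, "d_gene": d_gene, "j_gene": j_gene,
        "v_part": v_part, "d_part": d_part, "j_part": j_part,
        "vd_insertion": vd_insertion, "dj_insertion": dj_insertion,
        "sequence": v_part + vd_insertion + d_part + dj_insertion + j_part,
    }


if __name__ == "__main__":
    event = recombine(random.Random(0))
    print(event["v_gene"], event["d_gene"], event["j_gene"])
    print(event["sequence"])
```

Only the final `sequence` is observed in sequencing data. The gene choices, deletion lengths, and insertions are hidden, which is why reconstructing rearrangement events is an inference problem.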
In this talk, I will describe three recent projects to develop model-based inferential tools for analyzing BCR sequences: first, a hidden Markov model (HMM) framework to reconstruct BCR rearrangement events and determine which BCRs derive from the same rearrangement event; second, a Bayesian hierarchical model to infer a rich collection of parameters describing BCR nucleotide substitution; and third, a method for assessing natural selection on BCRs that sidesteps the difficulty of distinguishing per-site selection from mutation.
This work is joint with Trevor Bedford (Fred Hutch), Vladimir Minin (UW Statistics), and Duncan Ralph (Fred Hutch).