Hungarian Academy of Sciences & Eotvos Lorand University
Relational models generalize log-linear models for multivariate categorical data in three respects: the sample space does not have to be a Cartesian product of the ranges of the variables, the effects allowed in the model do not have to be associated with cylinder sets, and the existence of an overall effect present in every cell is not assumed. After discussing examples that motivate these generalizations, the talk will consider estimation and testing in relational models. When the overall effect is not present, the usual equivalence of the maximum likelihood estimates (MLEs) under multinomial and Poisson sampling does not hold. The MLEs may be obtained by iterated Bregman projections. In the multinomial case they reproduce the observed subset sums only up to a constant of proportionality, and in the Poisson case they do not reproduce the observed total. When the data contain zeros, then depending on the pattern of the zeros, the MLE may exist only in the closure of the original model with respect to the Bregman divergence, which coincides with the set of pointwise limits of the distributions in the relational model. For relational models, the Pearson chi-squared statistic has an asymptotic chi-squared distribution, as does a generalization of the likelihood ratio statistic based on the Bregman divergence. The material presented is joint work with Anna Klimova.
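The difference between the two sampling schemes can be checked numerically on a very small case. The sketch below is an illustration only and is not taken from the talk. It assumes a hypothetical three-cell sample space (feature A only, feature B only, both; the cell "neither" is not observable) and a model without an overall effect in which the cell parameters are (t1, t2, t1*t2). The counts are made up. For this toy model the likelihood equations have closed-form solutions, so the code does not implement iterated Bregman projections; all function names are illustrative.

```python
import math

# Hypothetical toy setting (not from the talk): cells are
# (A only, B only, both); the model has cell parameters (t1, t2, t1*t2).
# The two subsets carrying an effect are S1 = {A only, both} and
# S2 = {B only, both}. No effect is present in every cell.

def subset_sums(v):
    """Sums of a 3-cell vector over the subsets S1 and S2."""
    return (v[0] + v[2], v[1] + v[2])

def poisson_mle(y):
    """Poisson MLE: solves l1*(1 + l2) = s1 and l2*(1 + l1) = s2.

    Assumes both observed subset sums are positive.
    """
    s1, s2 = subset_sums(y)
    d = s2 - s1  # subtracting the two equations gives l2 - l1 = d
    l1 = (-(1 + d) + math.sqrt((1 + d) ** 2 + 4 * s1)) / 2
    l2 = l1 + d
    return (l1, l2, l1 * l2)

def multinomial_mle(y):
    """Multinomial MLE under the constraint t1 + t2 + t1*t2 = 1.

    Assumes both observed subset sums are positive.
    """
    s1, s2 = subset_sums(y)
    r = math.hypot(s1, s2)
    t1 = (r - s2) / s1
    t2 = (r - s1) / s2
    return (t1, t2, t1 * t2)

y = (10, 20, 30)  # made-up counts, for illustration only
n = sum(y)
q = tuple(c / n for c in y)  # observed proportions

lam = poisson_mle(y)
p = multinomial_mle(y)

# Ratio of fitted to observed subset sums in the multinomial case.
gamma = [a / b for a, b in zip(subset_sums(p), subset_sums(q))]

print("Poisson MLE:", lam, "fitted total:", sum(lam), "observed total:", n)
print("Multinomial MLE:", p, "sum:", sum(p))
print("Multinomial subset-sum ratios:", gamma)
```

In this toy case the Poisson fit matches the observed subset sums, but its total differs from the observed total. The multinomial fit sums to one, and its two subset sums differ from the observed ones by a common factor that is not equal to one. Normalizing the Poisson fit does not give the multinomial fit, which is the loss of equivalence described above.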