Jan 22

3:30 pm

## Hazard and Density Estimation for Bivariate Survival Data

### Charles Kooperberg

Seminar

University of Washington - Department of Statistics

Correlated failure time data is a hot topic in biostatistics. This type of data arises, for example, in twin studies, where the age at which one twin gets a disease may be correlated with the age at which the other twin gets it. One or both of the twins may never get the disease at all, so different types of censoring are possible. The dependence between the survival times of the two twins may give us information about genetic or environmental influences on the disease.

In the literature, attention seems to focus on the estimation of dependence parameters and on (nonparametric) estimation of the survival function. Our approach will be somewhat different. We will formulate spline models to estimate bivariate hazard functions (in the univariate censoring case) and bivariate density functions (in the bivariate censoring case). For the bivariate hazard model we have obtained some L2 convergence rate results, but the corresponding methodology still awaits implementation. For the bivariate density model we do not have any theoretical results, but we do have a working methodology.

In particular, our implementation estimates the bivariate density function using piecewise linear splines, after transforming the data to the unit square (using hazard estimation with flexible tails, or HEFT). The combined procedure yields an estimate of the bivariate density, which may provide insights into the dependence structure, and several of the standard dependence measures can be computed directly from it. There are some interesting extensions to tests for independence, which will be highlighted during the examples.
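As a rough illustration of how a dependence measure might be read off a density estimate on the unit square, the sketch below approximates Spearman's rho by numerical integration, using the identity rho = 12 E[UV] - 3, which holds when both margins are uniform on [0, 1]. It is not the speaker's implementation: the function names are hypothetical, and the Farlie-Gumbel-Morgenstern density is only an assumed stand-in for a fitted piecewise linear spline estimate.

```python
import numpy as np


def spearman_rho_from_density(density, n=200):
    """Approximate Spearman's rho for a density on the unit square.

    Assumes both margins are uniform on [0, 1], so rho = 12 * E[U V] - 3.
    The double integral is approximated with a midpoint rule on an n x n grid.
    """
    # Midpoints of n equal-width cells in [0, 1]
    t = (np.arange(n) + 0.5) / n
    u, v = np.meshgrid(t, t, indexing="ij")
    c = density(u, v)
    # Midpoint-rule approximation of the integral of u * v * c(u, v)
    e_uv = np.sum(u * v * c) / n**2
    return 12.0 * e_uv - 3.0


def independence_density(u, v):
    # Density of two independent uniform variables (hypothetical test case)
    return np.ones_like(u)


def fgm_density(u, v, theta=0.5):
    # Farlie-Gumbel-Morgenstern density, used here only as a stand-in
    # for a fitted estimate; its Spearman's rho is theta / 3.
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


rho_indep = spearman_rho_from_density(independence_density)
rho_fgm = spearman_rho_from_density(fgm_density)
```

Under independence the value is zero up to rounding, and for the stand-in density it should be close to theta / 3; a fitted density would simply replace the `density` argument.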