Seminar Details


Jun 3

3:30 pm

Composite Models for the Distribution of Species

David Newman


Boeing Information and Support Services (Retired) - Senior Principal Scientist, Applied Statistics

In a project in which I participated at Boeing, we examined pattern classification data from a neural network used to detect change in a variety of real and simulated engineering data sets (see Background, below). After testing a variety of "obvious" change point detection statistics (rate of formation of new classes, frequency of activation of "signature" classes), it became clear that a statistical model of the classification process was needed.

Initial investigation of the literature started with a review paper by Bunge and Fitzpatrick (JASA, 1993). There are a variety of approaches to estimating the total number of classes, which is usually treated as a fixed, unknown parameter. Many of these are based on estimates of the coverage, which is the proportion of the target population represented by classes observed in the sample.
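To make the notion of coverage concrete, here is a minimal sketch of one classical coverage-based approach. It is an illustration only, not necessarily one of the estimators tested in this project. It assumes the Good-Turing coverage estimate, 1 - f1/n, where f1 is the number of classes seen exactly once and n is the sample size, and then divides the observed class count by that estimate, which is reasonable only if classes are roughly equally likely. The function names are hypothetical.

```python
from collections import Counter


def good_turing_coverage(labels):
    """Estimate sample coverage as 1 - f1/n (Good-Turing).

    labels: sequence of class labels, one per observed pattern.
    f1 is the number of classes observed exactly once.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one observation")
    class_sizes = Counter(labels)
    f1 = sum(1 for size in class_sizes.values() if size == 1)
    return 1.0 - f1 / n


def coverage_based_class_count(labels):
    """Estimate the total number of classes as (observed classes) / (coverage).

    Assumption: classes are roughly equally likely. Many rare classes
    violate this and make the estimate unreliable.
    """
    coverage = good_turing_coverage(labels)
    if coverage == 0.0:
        raise ValueError("every class is a singleton; estimate is undefined")
    return len(set(labels)) / coverage
```

When most classes are singletons, the estimated coverage approaches zero and the class-count estimate becomes unstable, which is one way a large number of rare classes can undermine estimators of this kind.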

A number of these estimators were tested, with disappointing results. The difficulties are due to the large number of relatively unimportant or "incidental" classes containing only a few individual patterns. As noted by Keener, Rothman and Starr (Annals of Statistics, 1987), similar difficulties arise in estimation problems in genetics and ecology, where neutral alleles correspond to the incidental classes encountered in this application. I will present composite models that separate "well-defined" from incidental classes, along with an approach to estimation and model selection for this class of models.


Background

Detecting abnormal or "novel" behavior in dynamic mechanical systems in real time is of critical importance in the aerospace industry. The availability of new, inexpensive sensors creates an opportunity to monitor performance in real time in order to improve system control, maintenance and safety. Traditional approaches based on dynamic modeling, system identification or elementary statistics are ineffective in this new context due to the volume of generated data and the complexity of modern systems. Engineers have turned to neural networks as a tool to rapidly classify patterns extracted from system monitoring data, but the resulting classifications are difficult to interpret.