Trinity College, Dublin - Department of Statistics
An authentic food is one that is what it purports to be. Food processors and consumers need to be assured that when they pay for a specific product or ingredient, they are getting exactly what they pay for. Classification methods are an important tool in food authenticity studies, where they are used to assign food samples of unknown type to one of a set of known types.
In this work, a classification method is developed in which the classification rule is estimated using both the labelled and the unlabelled data, in contrast to many classical methods, which use only the labelled data for estimation.
This methodology models the data as arising from a Gaussian mixture model with a parsimonious covariance structure, as is done in model-based clustering (Fraley and Raftery, 2002). A missing-data formulation of the mixture model is used, in which the unknown group memberships of the unlabelled samples are treated as missing data, and the models are fitted using the EM and CEM algorithms.
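The fitting idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a diagonal covariance matrix for each group as one simple parsimonious structure, a fixed number of iterations in place of a convergence check, and at least one labelled sample per group. All function and variable names (fit, e_step, m_step, harden) are hypothetical. Labelled memberships stay fixed throughout, while unlabelled memberships are re-estimated at each iteration: softly for EM, or as hard assignments for CEM.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (assumed structure)."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def m_step(data, z, n_groups):
    """Weighted estimates of mixing proportions, means and variances."""
    dim = len(data[0])
    props, means, variances = [], [], []
    for g in range(n_groups):
        n_g = sum(zi[g] for zi in z)
        mean = [sum(zi[g] * x[d] for zi, x in zip(z, data)) / n_g
                for d in range(dim)]
        # A small floor keeps the variances strictly positive.
        var = [max(sum(zi[g] * (x[d] - mean[d]) ** 2
                       for zi, x in zip(z, data)) / n_g, 1e-6)
               for d in range(dim)]
        props.append(n_g / len(data))
        means.append(mean)
        variances.append(var)
    return props, means, variances


def e_step(x, props, means, variances):
    """Posterior group membership probabilities for one unlabelled sample."""
    logs = [math.log(p) + log_gauss_diag(x, m, v)
            for p, m, v in zip(props, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


def harden(probs):
    """Classification step of CEM: put all weight on the most probable group."""
    best = probs.index(max(probs))
    return [1.0 if g == best else 0.0 for g in range(len(probs))]


def fit(x_lab, y_lab, x_unl, n_groups, n_iter=20, hard=False):
    """Fit with labelled and unlabelled data; hard=True gives the CEM variant."""
    # Known memberships of the labelled samples are never updated.
    z_lab = [[1.0 if y == g else 0.0 for g in range(n_groups)] for y in y_lab]
    # Initialise the parameters from the labelled data alone.
    params = m_step(x_lab, z_lab, n_groups)
    z_unl = []
    for _ in range(n_iter):
        z_unl = [e_step(x, *params) for x in x_unl]
        if hard:
            z_unl = [harden(p) for p in z_unl]
        params = m_step(x_lab + x_unl, z_lab + z_unl, n_groups)
    # Classify each unlabelled sample by its most probable group.
    labels = [p.index(max(p)) for p in z_unl]
    return labels, params


# Toy data (made up for illustration): two well-separated groups.
x_lab = [[0.0, 0.2], [0.3, -0.1], [5.0, 5.1], [4.8, 5.3]]
y_lab = [0, 0, 1, 1]
x_unl = [[0.1, 0.0], [-0.2, 0.3], [5.2, 4.9], [4.9, 5.2], [0.4, 0.1]]

em_labels, em_params = fit(x_lab, y_lab, x_unl, n_groups=2)
cem_labels, cem_params = fit(x_lab, y_lab, x_unl, n_groups=2, hard=True)
```

Initialising from the labelled data alone mirrors a discriminant analysis fit; the later iterations then let the unlabelled samples refine the parameter estimates, which is the point of difference described above.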
The methods are applied to the analysis of spectra of foodstuffs recorded over the visible and near-infrared wavelength range in food authenticity studies. The performance of the proposed classification method is compared with that of model-based discriminant analysis. The proposed method is shown to yield low misclassification rates: its correct classification rate was observed to be as much as 15% higher than that of model-based discriminant analysis.