Oregon State University - Department of Computer Science
Since its beginning, supervised learning research has focused primarily on symmetric loss functions: 0/1 loss, log loss, and squared loss. Many cutting-edge applications of data mining and machine learning involve asymmetric loss functions where, for example, false positive errors are more costly than false negatives. In addition, in some applications, notably in medicine, there is also a cost to measuring each predictor variable ("feature"). This talk will describe our recent work on problems that involve both asymmetric loss functions and feature measurement costs. The goal of learning in these problems is to find a "diagnostic policy" with minimum expected total cost. A diagnostic policy is a closed-loop procedure for choosing which predictor variable to measure next and for deciding when to stop measuring and make a classification decision. A diagnostic policy takes the form of a classification tree.
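To make the objective concrete, here is a minimal sketch of how the expected cost of such a policy tree can be evaluated. The node layout, loss matrix, feature costs, and probabilities are illustrative assumptions, not the formulation used in the talk.

```python
# Toy sketch: expected total cost of a diagnostic policy represented as
# a classification tree. Internal nodes measure a feature (paying its
# cost); leaves classify (paying an expected misclassification cost).

def expected_cost(node, loss):
    """Recursively compute a policy tree's expected total cost."""
    if node["type"] == "classify":
        # Expected misclassification cost of predicting node["label"]
        # under the class distribution at this leaf.
        k = node["label"]
        return sum(p * loss[(k, y)] for y, p in node["p_class"].items())
    # Measurement node: pay the feature's cost, then average the
    # subtree costs weighted by the outcome probabilities.
    return node["cost"] + sum(
        p * expected_cost(child, loss) for p, child in node["outcomes"]
    )

# Asymmetric loss: a false negative is 10x costlier than a false positive.
LOSS = {("pos", "pos"): 0.0, ("neg", "neg"): 0.0,
        ("pos", "neg"): 1.0, ("neg", "pos"): 10.0}

# Measure one feature (cost 1.0), then classify according to its outcome.
policy = {
    "type": "measure", "cost": 1.0,
    "outcomes": [
        (0.5, {"type": "classify", "label": "pos",
               "p_class": {"pos": 0.9, "neg": 0.1}}),
        (0.5, {"type": "classify", "label": "neg",
               "p_class": {"pos": 0.1, "neg": 0.9}}),
    ],
}

print(expected_cost(policy, LOSS))  # 1.0 + 0.05 + 0.5 = approx. 1.55
```

Learning then amounts to searching over tree structures (and stopping points) for the policy minimizing this quantity.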
To solve this problem, we are pursuing the somewhat outrageous idea of performing a systematic search of the space of all possible classification trees. We will describe how the AO* heuristic search algorithm can be applied to this problem, and how we can introduce an admissible (that is, a provably safe) heuristic that can prune large parts of the search space. This admissible heuristic works well in problems with high measurement costs. To handle problems with low measurement costs, we introduce an unsafe statistical pruning heuristic that works well when the amount of data available for model-fitting is small (and which also helps reduce overfitting). We will present experimental results on simulated and real-world data.
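One simple admissible lower bound consistent with this setup can be sketched as follows; the heuristic actually used in the work may be tighter, and the names and numbers here are illustrative assumptions. The safety argument: from any search state, an optimal policy must either classify immediately (paying at least the best immediate expected loss) or measure at least one more feature (paying at least the cheapest remaining measurement cost, with non-negative losses thereafter), so the minimum of those two floors never overestimates the true optimal cost.

```python
# Sketch of an admissible (provably safe) heuristic for AO*-style
# search over diagnostic policies, under the assumptions stated above.

def admissible_h(p_class, loss, remaining_costs):
    """Lower bound on the minimum expected cost from a search state.

    p_class: class-probability dict at this state
    loss: dict (predicted, true) -> non-negative misclassification cost
    remaining_costs: dict feature -> cost, for still-unmeasured features
    """
    labels = {k for k, _ in loss}
    # Cost of stopping now and making the best classification decision.
    classify_now = min(
        sum(p * loss[(k, y)] for y, p in p_class.items()) for k in labels
    )
    if not remaining_costs:
        return classify_now
    # Otherwise the policy pays at least the cheapest remaining
    # measurement; either branch bounds the optimal cost from below.
    return min(classify_now, min(remaining_costs.values()))

h = admissible_h({"pos": 0.5, "neg": 0.5},
                 {("pos", "pos"): 0.0, ("neg", "neg"): 0.0,
                  ("pos", "neg"): 1.0, ("neg", "pos"): 10.0},
                 {"blood_test": 0.2, "x_ray": 3.0})
print(h)  # 0.2: cheaper than classifying now (expected loss 0.5)
```

In AO*, any partial policy whose bound already exceeds the best complete policy found so far can be pruned without risk, which is why such a heuristic helps most when measurement costs are high.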