SEMINAR: Jack Dunn

14 December 2016

3pm

Venue: G10 (Ground Floor Seminar Room), 70 Symonds Street

Optimal Classification Trees

JACK DUNN
Operations Research Center, Massachusetts Institute of Technology

Classification and Regression Trees (CART), introduced by Breiman et al. in 1984, is one of the most widely used methods in machine learning. As an indication of its impact, CART has attracted approximately 32,000 citations on Google Scholar. CART constructs a decision tree using a greedy, one-step-optimal algorithm (a minimal sketch follows the quotation below), which can lead to suboptimal trees, as Breiman noted in 1984:

"Finally, another problem frequently mentioned (by others, not by us) is that the tree procedure is only one-step optimal and not overall optimal. ... If one could search all possible partitions ... the two results might be quite different. This issue is analogous to the familiar question in linear regression of how well the stepwise procedures do as compared with `best subsets' procedures. We do not address this problem. At this stage of computer technology, an overall optimal tree growing procedure does not appear feasible for any reasonably sized dataset."

Motivated by the astonishing speedups in Mixed-Integer Optimization (MIO) solvers over the past 25 years, we present Optimal Classification Trees, a novel formulation of the decision tree problem using modern MIO techniques that yields the optimal decision tree for axis-aligned splits. We also show the richness of this MIO formulation by adapting it to give Optimal Classification Trees with Hyperplanes, which generates optimal decision trees with multivariate splits. We develop high-performance heuristics for these formulations that offer significant improvements over traditional greedy approaches while running in comparable times.
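For a rough sense of what an MIO formulation of this problem can look like, the following deliberately simplified version (written for this announcement; it is not the formulation from the talk) chooses a single optimal axis-aligned split for a two-class problem with features scaled to [0,1]:

\[
\begin{aligned}
\min_{a,\,b,\,c_L,\,c_R,\,z,\,e} \;\; & \sum_{i=1}^{n} e_i \\
\text{s.t.} \;\; & \sum_{j=1}^{p} a_j = 1, \\
& a^{\top} x_i \le b + z_i, & i = 1,\dots,n, \\
& a^{\top} x_i \ge b + \epsilon - (1+\epsilon)(1 - z_i), & i = 1,\dots,n, \\
& e_i \ge y_i (1 - c_L) + (1 - y_i)\, c_L - z_i, & i = 1,\dots,n, \\
& e_i \ge y_i (1 - c_R) + (1 - y_i)\, c_R - (1 - z_i), & i = 1,\dots,n, \\
& a_j,\, z_i,\, e_i,\, c_L,\, c_R \in \{0,1\}, \quad b \in [0,1].
\end{aligned}
\]

Here a selects the split feature, b is the threshold, z_i = 1 sends point x_i to the right leaf, c_L and c_R are the classes predicted at the two leaves, e_i flags a misclassified point, and \epsilon is a small constant enforcing the strict side of the split. The objective counts misclassified points; the formulations discussed in the talk go well beyond this sketch, handling deeper trees and hyperplane splits.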

We show on a large and diverse collection of instances that these optimal trees improve substantially over CART and perform comparably to state-of-the-art methods such as random forests and boosted trees, while maintaining the interpretability of a single decision tree.