
Decision Tree Classification

A decision tree or a classification tree is a tree in which each internal (nonleaf) node is labeled with an input feature. The arcs coming from a node labeled with a feature are labeled with each of the possible values of the feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes.
A decision tree reaches its decision by performing a sequence of tests. Each internal node in the tree corresponds to a test of the value of one of the input attributes, Ai, and the branches from the node are labeled with the possible values of the attribute, Ai= vik
Each leaf node in the tree specifies a value to be returned by the function. The decision tree representation is natural for humans; indeed, many "How To" manuals (e.g., for car repair) are written entirely as a single decision tree stretching over hundreds of pages.
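The node-and-branch structure just described can be sketched directly in code. Below is a minimal Python sketch (not from the source): an internal node is a pair of an attribute name and a mapping from attribute values to subtrees, and a leaf is simply the value to return. The attribute names and the tiny tree are illustrative assumptions.

```python
# Minimal sketch of the decision-tree representation described above.
# An internal node is a (attribute, branches) pair, where branches maps
# each possible attribute value to a subtree; a leaf is just the value
# the function returns.

def classify(tree, example):
    """Follow the branches matching the example's attribute values to a leaf."""
    while isinstance(tree, tuple):      # internal node: (attribute, branches)
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree                         # leaf: the value to be returned

# A tiny illustrative tree: test attribute "A"; one branch leads straight
# to a leaf, the other to a further test of attribute "B".
tiny = ("A", {1: "yes", 2: ("B", {True: "yes", False: "no"})})
print(classify(tiny, {"A": 2, "B": False}))  # -> no
```

Each step of the loop performs one test of an input attribute, exactly as the sequence-of-tests description above says.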
As an example, we will build a decision tree to decide whether to wait for a table at a restaurant. The aim here is to learn a definition for the goal predicate WillWait. First we list the attributes that we will consider as part of the input:
  • Alternate : whether there is a suitable alternative restaurant nearby. 
  • Bar : whether the restaurant has a comfortable bar area to wait in. 
  • Fri/Sat : true on Fridays and Saturdays. 
  • Hungry : whether we are hungry.
  • Patrons : how many people are in the restaurant (values are None, Some, and Full). 
  • Price : the restaurant's price range ($, $$, $$$). 
  • Raining : whether it is raining outside. 
  • Reservation : whether we made a reservation. 
  • Type : the kind of restaurant (French, Italian, Thai, or burger). 
  • WaitEstimate: the wait estimated by the host (0-10 minutes, 10-30, 30-60, or >60).
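With the attributes listed above, one restaurant situation can be encoded as a simple mapping from attribute name to value. The particular values below are an illustrative assumption, not taken from any dataset in the source:

```python
# One input example for the WillWait problem, encoded as a mapping from
# attribute to value. Attribute names follow the list above; the specific
# values are illustrative only.
example = {
    "Alternate": True,
    "Bar": False,
    "Fri/Sat": False,
    "Hungry": True,
    "Patrons": "Full",        # one of: "None", "Some", "Full"
    "Price": "$",             # one of: "$", "$$", "$$$"
    "Raining": False,
    "Reservation": False,
    "Type": "Thai",           # one of: French, Italian, Thai, burger
    "WaitEstimate": "0-10",   # one of: "0-10", "10-30", "30-60", ">60"
}
```

Note that every value is drawn from a small discrete set, which is what lets each tree node enumerate one branch per possible value.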
A decision tree is made up of connected decision points, starting from a single root decision. For each decision, beginning at the root, one of a set of outgoing options is chosen. Note that every variable has a small set of possible values; the value of WaitEstimate, for example, is not an integer, but one of the four discrete values 0-10, 10-30, 30-60, or >60.

The decision tree usually used by one of us (SR) for this domain is shown in Figure Tr.01. Notice that the tree ignores the Price and Type attributes. Examples are processed by the tree starting at the root and following the appropriate branch until a leaf is reached. For instance, an example with Patrons = Full and WaitEstimate = 0-10 will be classified as positive (i.e., yes, we will wait for a table).
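The root-to-leaf processing can be sketched as a partial version of the tree in Figure Tr.01. Only what the paragraph states directly is certain here: the root tests Patrons, the Full branch goes on to test WaitEstimate, and a 0-10 minute wait leads to a positive leaf. The remaining leaves are assumptions added for illustration:

```python
# Partial sketch of the Figure Tr.01 tree. The Patrons = Full,
# WaitEstimate = 0-10 path is stated in the text; the other leaves
# are illustrative assumptions.
tree = ("Patrons", {
    "None": "no",            # assumed leaf, for illustration
    "Some": "yes",           # assumed leaf, for illustration
    "Full": ("WaitEstimate", {
        "0-10":  "yes",      # stated in the text: yes, we will wait
        "10-30": "yes",      # assumed leaf
        "30-60": "no",       # assumed leaf
        ">60":   "no",       # assumed leaf
    }),
})

def classify(tree, example):
    """Start at the root and follow branches until a leaf is reached."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

print(classify(tree, {"Patrons": "Full", "WaitEstimate": "0-10"}))  # -> yes
```

Because the tree never tests Price or Type, the classifier above never reads those attributes, which is exactly what "the tree ignores the Price and Type attributes" means in practice.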

Figure Tr.01 A decision tree for deciding whether to wait for a table.



