Category:Decision Trees: Creating DTs

Package C50
C5.0( formula, dataset ) returns a decision tree fitted to the dataset according to the given formula. Optional parameters select a subset of the rows (subset) and assign a weight to each training case (weights).

predict( model, dataset ) classifies each instance in 'dataset' using the tree 'model' and returns the predicted class labels, one per instance.
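
The two calls above can be sketched end to end as follows. This is a minimal example that assumes the C50 package is installed and uses R's built-in iris data; the train/test split and the variable names (train, test, model, pred) are illustrative choices, not part of the package.

```r
library(C50)

# Hold out 50 random rows of iris for testing (an arbitrary split for illustration).
set.seed(1)
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Fit a single tree that predicts Species from all other columns.
model <- C5.0(Species ~ ., data = train)

# Classify the held-out rows; the result is a factor with one label per row.
pred <- predict(model, newdata = test)

# Cross-tabulate predicted against actual classes.
table(predicted = pred, actual = test$Species)
```

summary(model) prints the fitted tree and its training error, which is a quick way to inspect what was learned.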

Adaptive Boosting in C5.0: Set the 'trials' argument to an integer giving the number of boosting iterations. Each iteration builds a separate decision tree, and the trees are combined into a single boosted classifier.

C5.0(training_data, target, trials = 10)  # Runs up to 10 boosting iterations, one decision tree per iteration

C50 Documentation
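
As a concrete sketch of boosting, the example below uses the x/y form of C5.0 on the built-in iris data, assuming the C50 package is installed. The variable names are illustrative. The fitted object records both the requested and the actually used number of iterations, since C5.0 may stop boosting early.

```r
library(C50)

# Predictors and class labels passed separately (x/y interface).
boosted <- C5.0(x = iris[, 1:4], y = iris$Species, trials = 10)

# Requested versus actually used number of boosting iterations.
boosted$trials

# Predictions combine the votes of all trees that were built.
pred <- predict(boosted, newdata = iris[, 1:4])

# Accuracy on the training data (optimistic; use held-out data for a fair estimate).
mean(pred == iris$Species)
```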

Package Rpart
In addition to C50, rpart is a useful package for producing decision trees. Its algorithm differs slightly from that of C5.0: it chooses splits by 'risk reduction' (for classification, by default the decrease in Gini impurity) instead of entropy reduction. Both algorithms, however, build the tree recursively.
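
The splitting criterion used by rpart for classification can be switched through the parms argument. The sketch below, on the built-in iris data, fits one tree with the default Gini criterion and one with information gain, which is closer to the entropy-based splitting described for C5.0. The variable names are illustrative.

```r
library(rpart)

# Default for classification: splits chosen by Gini impurity.
gini_tree <- rpart(Species ~ ., data = iris, method = "class")

# Same data, but splits chosen by information gain (entropy).
info_tree <- rpart(Species ~ ., data = iris, method = "class",
                   parms = list(split = "information"))

# Print both trees as text to compare the chosen splits.
print(gini_tree)
print(info_tree)
```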

To build a tree using rpart, the syntax is:

rpart_tree <- rpart(formula, data)  # See ?formula in R for how to write a formula.

The rpart package also provides functions for visualizing a tree. To draw an rpart tree:

plot(rpart_tree)
text(rpart_tree)  # plot() draws only the branches; text() adds the split and leaf labels
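
A complete version of these steps might look like the following sketch, which uses the built-in iris data. The object name and the plotting options (margin, use.n) are illustrative choices.

```r
library(rpart)

# Fit a classification tree predicting Species from all other columns.
rpart_tree <- rpart(Species ~ ., data = iris, method = "class")

# Draw the tree skeleton, leaving a margin so the labels are not clipped.
plot(rpart_tree, margin = 0.1)

# Add split rules and leaf labels; use.n = TRUE also shows class counts per leaf.
text(rpart_tree, use.n = TRUE)

# Predict class labels (type = "class" returns a factor rather than probabilities).
pred <- predict(rpart_tree, newdata = iris, type = "class")
```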

Rpart Documentation

Package randomForest

A package for creating and examining random forests.

The function randomForest accepts a formula and a dataset, in the same way as C5.0.
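
A minimal sketch of fitting and examining a forest, assuming the randomForest package is installed and using the built-in iris data. The seed, the number of trees, and the variable names are illustrative choices.

```r
library(randomForest)

# Random forests are randomized, so fix the seed for reproducible results.
set.seed(42)

# Grow 500 trees and record variable importance while fitting.
forest <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

# Print the out-of-bag error estimate and confusion matrix.
print(forest)

# Show how much each predictor contributes to the forest.
importance(forest)

# Classify rows by majority vote over all trees.
pred <- predict(forest, newdata = iris)
```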

http://cran.r-project.org/web/packages/randomForest/randomForest.pdf