Description:
Strictly speaking, C4.5 is not a single algorithm but a suite of algorithms. It provides several capabilities, each implemented by its own algorithm, and together these make up C4.5. The core framework grows a tree from the root node by recursive divide and conquer. The root node represents the entire training set. At each node the algorithm tests one attribute and uses the outcome to split that node's data set into smaller data sets. The subtree under a node therefore corresponds to the part of the original data set that satisfies a particular attribute test. This recursion continues until every sample in the data set for a node's subtree belongs to the same class.
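The recursive partitioning described above can be illustrated with a minimal sketch. It is not the full C4.5 suite: it assumes categorical attributes only, does no pruning, and does not handle missing or continuous values. The text above does not say how the tested attribute is chosen, so the sketch assumes the gain ratio criterion commonly associated with C4.5. The function names (`build_tree`, `gain_ratio`, `partition`, `classify`), the majority-class fallback, and the toy data are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def partition(rows, attr):
    """Group rows by their value of the tested attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row)
    return groups


def gain_ratio(rows, attr, target):
    """Information gain of splitting on attr, divided by the split information."""
    total = len(rows)
    remainder = sum(
        len(group) / total * entropy([r[target] for r in group])
        for group in partition(rows, attr).values()
    )
    gain = entropy([r[target] for r in rows]) - remainder
    split_info = entropy([r[attr] for r in rows])
    return gain / split_info if split_info > 0 else 0.0


def build_tree(rows, attrs, target):
    """Recursively split rows until each node holds a single class."""
    labels = [r[target] for r in rows]
    # Stopping condition from the description: all samples share one class.
    if len(set(labels)) == 1:
        return labels[0]
    # Assumed fallback (not in the description): no attributes left to test,
    # so return the majority class.
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Assumed selection rule: test the attribute with the highest gain ratio.
    best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    remaining = [a for a in attrs if a != best]
    # Each attribute value gets a subtree built from the matching subset.
    return {
        best: {
            value: build_tree(subset, remaining, target)
            for value, subset in partition(rows, best).items()
        }
    }


def classify(tree, row):
    """Follow attribute tests from the root until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][row[attr]]
    return tree


# Toy training set (invented for illustration).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
print(tree)
```

Internal nodes are represented as nested dictionaries keyed by the tested attribute, and leaves are plain class labels, which keeps the correspondence between a subtree and its subset of the data easy to see. `classify` raises a `KeyError` for an attribute value that never appeared in training; a fuller implementation would need a policy for that case.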