Description:
1. Chinese text is segmented into words with the Chinese Academy of Sciences word segmentation software.
2. For text clustering, term frequencies are counted to build a document-term matrix.
3. The matrix is weighted with TF-IDF and then clustered with k-means.
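The pipeline described here (term counts, then TF-IDF weighting, then k-means) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the input is assumed to be already segmented into space-separated words (the segmentation software itself is not called), the sample sentences are made up, the function names `term_document_matrix`, `tfidf`, and `kmeans` are hypothetical, the idf formula `log(N / df)` is only one common variant, and the farthest-point initialisation is chosen here just to keep the result deterministic.

```python
import math
from collections import Counter

def term_document_matrix(docs):
    """Count term frequencies; each doc is a list of already-segmented tokens."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: j for j, tok in enumerate(vocab)}
    matrix = [[0.0] * len(vocab) for _ in docs]
    for i, doc in enumerate(docs):
        for tok, count in Counter(doc).items():
            matrix[i][index[tok]] = float(count)
    return vocab, matrix

def tfidf(matrix):
    """Weight counts by idf = log(N / df), then L2-normalise each row."""
    n_docs, n_terms = len(matrix), len(matrix[0])
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    weighted = []
    for row in matrix:
        vec = [tf * w for tf, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        weighted.append([v / norm for v in vec])
    return weighted

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; assumed farthest-point start instead of random seeds."""
    centroids = [points[0][:]]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(far[:])
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up sample input, assumed to be the segmenter's space-separated output.
segmented = [
    "我 喜欢 足球 比赛",
    "足球 比赛 很 精彩",
    "股票 市场 今天 上涨",
    "股票 市场 下跌",
]
docs = [line.split() for line in segmented]
vocab, counts = term_document_matrix(docs)
labels = kmeans(tfidf(counts), k=2)
print(labels)  # one cluster label per document
```

On this toy input the two football sentences land in one cluster and the two stock-market sentences in the other. For real corpora, a library implementation with sparse matrices and randomised restarts would likely be a better fit than this sketch.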