
Chinese Short-text Classification Method Based on LSTM-CNN
Cite this article: DU Xue-yan, WANG Qiu-shi, WANG Bin-jun. Chinese Short-text Classification Method Based on LSTM-CNN[J]. Journal of Jiangsu Police Officer College, 2020, 35(1): 103-107.
Authors: DU Xue-yan (杜雪嫣), WANG Qiu-shi (王秋实), WANG Bin-jun (王斌君)
Affiliations: School of Police Information Engineering and Cyber Security, People's Public Security University of China, Beijing 100240; First Research Institute of the Ministry of Public Security; School of Police Information Engineering and Cyber Security, People's Public Security University of China
Abstract: Traditional text classification methods cope poorly with short texts, which come in large volumes, are short in length, and have sparse features, and the Chinese language adds characteristics of its own. To address this, the paper constructs a Chinese short-text classification model based on LSTM-CNN. First, word2vec preprocesses the text to be classified and yields character- and word-level vectors. Second, the vectors are fed into an LSTM layer to extract semantic features. Third, a convolutional layer extracts local features. Fourth, max pooling produces the feature vector. Finally, the feature vector is passed to a softmax classifier to obtain the classification result. Compared with the experimental results of SVM, KNN, CapsNet, and Labeled-LDA, the model effectively improves the accuracy of Chinese short-text classification.

Keywords: Chinese short text; text classification; deep learning
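
The pipeline described in the abstract (vectors, LSTM, convolution, max pooling, softmax) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. It assumes the convolution is applied to the LSTM outputs, that the input holds at least `width` tokens, and it omits bias terms. All names (`make_params`, `classify`), all layer sizes, and the random untrained weights are hypothetical, and random vectors stand in for real word2vec embeddings.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def lstm_forward(xs, W, hidden):
    # One LSTM layer; W maps [x; h] to four stacked gates (i, f, o, g).
    h = [0.0] * hidden
    c = [0.0] * hidden
    outputs = []
    for x in xs:
        z = matvec(W, x + h)
        i = [sigmoid(v) for v in z[:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]
        g = [math.tanh(v) for v in z[3 * hidden:]]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        outputs.append(h)
    return outputs

def conv1d_relu(seq, filters, width):
    # Slide each filter over windows of `width` consecutive time steps.
    feature_maps = []
    for filt in filters:
        fmap = []
        for t in range(len(seq) - width + 1):
            window = [v for step in seq[t:t + width] for v in step]
            fmap.append(max(0.0, sum(w * v for w, v in zip(filt, window))))
        feature_maps.append(fmap)
    return feature_maps

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def make_params(embed_dim=8, hidden=6, n_filters=4, width=3, n_classes=5, seed=0):
    # Hypothetical sizes and random (untrained) weights, for illustration only.
    rng = random.Random(seed)
    return {
        "hidden": hidden,
        "width": width,
        "lstm": rand_matrix(4 * hidden, embed_dim + hidden, rng),
        "filters": rand_matrix(n_filters, width * hidden, rng),
        "out": rand_matrix(n_classes, n_filters, rng),
    }

def classify(embedded, params):
    # embedded: one vector per token (stand-in for word2vec output).
    hs = lstm_forward(embedded, params["lstm"], params["hidden"])   # semantic features
    fmaps = conv1d_relu(hs, params["filters"], params["width"])     # local features
    pooled = [max(fmap) for fmap in fmaps]                          # max pooling over time
    return softmax(matvec(params["out"], pooled))                   # class probabilities

if __name__ == "__main__":
    rng = random.Random(1)
    sentence = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(10)]
    print(classify(sentence, make_params()))
```

Max pooling over time reduces each feature map to one value, so the classifier input has a fixed size regardless of how many tokens the short text contains.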
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.