Research on Automatic Text Classification Based on TI-LSTM
First published: 2022-05-13
Abstract: To address Chinese text classification and improve its accuracy, this paper proposes TI-LSTM, an automatic text classification algorithm that combines TF-IDF with a neural network. The algorithm first extracts features according to the semantic context and quantifies them. A long short-term memory (LSTM) network is then trained on the quantified features and assigns them weights. Finally, Chinese text is evaluated on the basis of these feature weights. TI-LSTM extracts features accurately while preserving the semantics of the original text. The algorithm was applied to classifying the poverty levels of students at our school. Compared with traditional KNN, logistic regression, naive Bayes, and LSTM classifiers, both training and test accuracy improved substantially, with an average accuracy above 86%.
Keywords: computational mathematics; neural network; text classification; feature extraction; text quantification; poverty-stricken students
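The TF-IDF step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant the authors use, so the code below assumes the textbook form (relative term frequency multiplied by an unsmoothed logarithmic inverse document frequency). The function name `tf_idf` and the toy token lists are hypothetical and are not taken from the paper; real Chinese text would first need word segmentation, and the LSTM stage is not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (already segmented into words).
    Assumed weighting: tf = count / document length,
    idf = log(N / document frequency), with no smoothing.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = [
    ["family", "income", "low"],
    ["family", "income", "stable"],
    ["scholarship", "application"],
]
weights = tf_idf(docs)
print(weights[0])
```

In this toy corpus "low" occurs in only one document, so it receives a higher weight than "family", which occurs in two. In a pipeline like the one the abstract describes, such per-term weights would serve as the quantified features passed on to the LSTM.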