Research on Web Text Topic Mining Based on HTML Structure

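Cite this article:	Sui Liping, Xu Chengtao, Li Ruifang. Research on Web Text Topic Mining Based on HTML Structure[J]. Journal of Xi'an International University, 2007, 0(1)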

Authors:	Sui Liping, Xu Chengtao, Li Ruifang

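Affiliations:	School of Computer Science, Shenyang Institute of Chemical Technology, Shenyang, Liaoning 110142 (Sui Liping, Li Ruifang); School of Information, Shenyang Institute of Chemical Technology, Shenyang, Liaoning 110142 (Xu Chengtao)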

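Abstract:	As data mining applications deepen, the mining of unstructured and semi-structured data will become the next hot spot for data mining. Text mining is a typical form of unstructured data mining, while Web mining is a typical form of semi-structured data mining. Applying information processing techniques such as Chinese word segmentation, information extraction, and semantic analysis to Web mining is therefore an inevitable trend in the field's technical development. In response to the characteristics of Web mining, this paper proposes using data extraction to convert the data structure and applying semantic analysis during the extraction process, so that data extraction becomes more accurate.
Keywords:	data mining; Web; automatic word segmentation; data extraction; semantic analysis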
Research of Web Text Mining Based on HTML Modal

Sui Liping, Xu Chengtao, Li Ruifang. Research of Web Text Mining Based on HTML Modal[J]. Journal of Xi'an International University, 2007, 0(1)

Authors:	Sui Liping, Xu Chengtao, Li Ruifang

Affiliation:	Sui Liping (1), Xu Chengtao (2), Li Ruifang (1)

Abstract:	This paper focuses on key technologies in Chinese information processing and applies them to Web text mining. Applying Chinese word segmentation, information extraction, and semantic analysis to Web mining is an inevitable trend in the field's technical development. In response to the characteristics of Web mining, the paper uses data extraction to convert data structures and applies semantic analysis during the extraction process, making data extraction more accurate.

Keywords:	data mining; Web; Chinese word segmentation; data extraction; semantic analysis
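
The abstract's idea of using data extraction to convert semi-structured HTML into a structured form can be illustrated with a minimal sketch. This is not the authors' method: the tag set, the class name StructureExtractor, and the function extract_records are all hypothetical, and the word segmentation and semantic analysis steps mentioned in the abstract are not implemented here. The sketch only shows the structure conversion step, using Python's standard html.parser module.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as structurally meaningful; the paper
# does not specify which tags it uses.
STRUCTURAL_TAGS = {"title", "h1", "h2", "h3", "p", "li", "a"}
SKIPPED_TAGS = {"script", "style"}


class StructureExtractor(HTMLParser):
    """Collect (tag, text) records from an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.records = []      # extracted (tag, text) pairs
        self._stack = []       # structural tags that are currently open
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in STRUCTURAL_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self._stack:
            # Pop up to and including the matching tag, which tolerates
            # unclosed inner tags.
            while self._stack and self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and self._skip_depth == 0 and self._stack:
            # Label the text with the innermost open structural tag.
            self.records.append((self._stack[-1], text))


def extract_records(html):
    """Return a list of (tag, text) records for the given HTML string."""
    parser = StructureExtractor()
    parser.feed(html)
    parser.close()
    return parser.records
```

Calling extract_records on a page returns a flat list of records such as ("title", ...) or ("p", ...), which a later stage could pass to a word segmenter or a semantic analysis step. Text that is not inside any of the chosen tags is dropped, which is a simplification made for this sketch.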
This article is indexed in CNKI and other databases.