The Research and Application in Intelligent Document Retrieval Based on Text Quantification and Subject Mapping

Authors
Wang, Qin [1]
Qu, Shouning [1]
Du, Tao [1]
Zhang, Mingjing [2]
Affiliations
[1] Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Jinan, Peoples R China
[2] Univ Jinan, Sch Informat Sci & Engn, Jinan, Peoples R China
Source
ADVANCED DESIGNS AND RESEARCHES FOR MANUFACTURING, PTS 1-3 | 2013 / Vol. 605-607
Keywords
quantification; subject mapping; intelligent retrieval; feature extraction
DOI
10.4028/www.scientific.net/AMR.605-607.2561
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Document retrieval is an important means of academic exchange and of acquiring new knowledge. The traditional retrieval method selects a database category and matches the input keywords against it. This returns a large number of documents, which makes it hard for users to find the most relevant one. The paper proposes a text quantification method that mines the features of each element in a document: word concept, position-function weight, improved weight characteristic value, text-distribution-function weight, and text element length. Each word's contribution to the document is then obtained by combining these five features, and every document in the database is stored numerically as the contributions of its elements. The paper also designs a subject mapping scheme. A similarity calculation method based on contribution and association rules is defined first; the documents in the database are clustered using this method, and feature extraction is then applied to find the subject of each class. At search time, the description entered by the user is quantified and automatically mapped to a class by subject mapping, and a ranked sequence of documents is retrieved by computing the similarity between the description and the features of the documents in that class. Experiments show that the scheme is intelligent and accurate and that it improves retrieval speed.
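The pipeline described in the abstract (quantify each document, derive a subject per class, map the query to a class, rank documents within it) can be sketched roughly as follows. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the contribution score is a simple relative frequency doubled for title words, standing in for the paper's five-feature combination; cosine similarity stands in for the contribution and association rule similarity; class labels are supplied by hand rather than produced by clustering; and all function names and sample texts are hypothetical.

```python
import math
from collections import Counter, defaultdict


def quantify(text, title=""):
    """Map each word to a stand-in contribution score (assumed, not the paper's formula)."""
    title_words = set(title.lower().split())
    counts = Counter(text.lower().split())
    total = sum(counts.values()) or 1
    # Assumed rule: relative frequency, doubled when the word also occurs in the title.
    return {w: (c / total) * (2.0 if w in title_words else 1.0)
            for w, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse word->score vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def class_subjects(vectors, labels, top_k=3):
    """Sum contributions per class and keep the top_k words as that class's subject."""
    sums = defaultdict(Counter)
    for vec, label in zip(vectors, labels):
        sums[label].update(vec)  # adds the scores word by word
    return {label: dict(c.most_common(top_k)) for label, c in sums.items()}


def retrieve(query, vectors, labels, subjects):
    """Map the query to the closest class subject, then rank that class's documents."""
    q = quantify(query)
    best = max(subjects, key=lambda label: cosine(q, subjects[label]))
    members = [i for i, label in enumerate(labels) if label == best]
    ranked = sorted(members, key=lambda i: cosine(q, vectors[i]), reverse=True)
    return best, ranked


# Hypothetical toy corpus: (title, body) pairs with hand-assigned class labels.
docs = [
    ("text clustering", "clustering groups similar text documents"),
    ("feature extraction", "feature extraction selects informative text features"),
    ("steel welding", "welding joins steel parts with heat"),
]
labels = ["ir", "ir", "mfg"]

vectors = [quantify(body, title) for title, body in docs]
subjects = class_subjects(vectors, labels)
best, ranked = retrieve("clustering text documents", vectors, labels, subjects)
print(best, ranked)
```

Restricting the similarity ranking to the documents of one class is what would account for the speed gain the abstract claims, since the query is compared against a few class subjects and then only against that class's members rather than the whole database.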
Pages: 2561 / +
Page count: 3