Information filtering with extracted index words using ICA

Authors
Tokyo Metropolitan College of Industrial Technology, Monozukuri Department Electronics and Information Engineering Course, 1-10-40, Higashi-Oi, Shinagawa-ku, 140-0011, Japan [1]
Unknown [2]
Unknown [3]
Unknown [4]
Affiliations
[1] Tokyo Metropolitan College of Industrial Technology, Monozukuri Department Electronics and Information Engineering Course, 1-10-40, Higashi-Oi, Shinagawa-ku, 140-0011
[2] Osaka Prefecture University, Engineer Department Computer Science and Intelligent Systems, 1-1, Gakuen-cho, Naka-ku, Sakai, 599-8531
Keywords
Independent component analysis; Index words selection; Information filtering; User profile
DOI
10.1541/ieejeiss.127.1468
Abstract
We propose an information filtering system that uses index words extracted by Independent Component Analysis (ICA). The elements of a document vector are the weights of index words, so the dimension of the vector grows as the number of documents increases. The dimension must therefore be reduced to limit processing time and memory use. The proposed method reduces the dimension by selecting index words based on the topics contained in the corpus, where the topics are obtained by applying ICA to the documents. To confirm the effectiveness of the method, we carried out filtering by relevance feedback on document vectors reconstructed from the selected index words.
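The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the ICA step has already produced a topic-by-word weight matrix, it assumes a simple selection rule (keep the `top_k` words with the largest absolute weight in each topic), and it assumes a Rocchio-style update as the relevance feedback rule. All function names, the toy vocabulary, the weights, and the `alpha`, `beta`, `gamma` values are hypothetical.

```python
import math

def select_index_words(topic_word, top_k):
    """Return sorted column indices of the top_k largest-|weight| words per topic."""
    keep = set()
    for row in topic_word:
        ranked = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
        keep.update(ranked[:top_k])
    return sorted(keep)

def reduce_vector(vec, keep):
    """Rebuild a document vector using only the selected index words."""
    return [vec[j] for j in keep]

def mean_vector(vectors, dim):
    """Component-wise mean; a zero vector if there are no documents."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def rocchio_update(profile, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style feedback: move the profile toward relevant documents
    and away from non-relevant ones (parameter values are assumptions)."""
    dim = len(profile)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(non_relevant, dim)
    return [alpha * p + beta * r - gamma * n
            for p, r, n in zip(profile, pos, neg)]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical toy data: 6 index words, 2 topics assumed to come from ICA.
vocab = ["ica", "signal", "market", "stock", "the", "of"]
topic_word = [
    [0.90, 0.80, 0.10, 0.00, 0.05, 0.02],    # topic 0
    [0.00, -0.10, 0.85, -0.90, 0.03, 0.04],  # topic 1
]

keep = select_index_words(topic_word, top_k=2)
selected_words = [vocab[j] for j in keep]

# Term-weight vectors over the full vocabulary, reduced to the selected words.
relevant = [reduce_vector([2, 1, 0, 0, 5, 3], keep)]
non_relevant = [reduce_vector([0, 0, 3, 2, 4, 4], keep)]

profile = rocchio_update([0.0] * len(keep), relevant, non_relevant)

# Score two unseen documents against the updated user profile.
score_on_topic = cosine(profile, reduce_vector([1, 1, 0, 0, 2, 2], keep))
score_off_topic = cosine(profile, reduce_vector([0, 0, 1, 1, 2, 2], keep))
print(selected_words, score_on_topic, score_off_topic)
```

In this toy setup the two frequent but topic-neutral words are dropped, so the document vectors shrink from six dimensions to four before the profile is updated and new documents are scored.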
Pages: 1468-1473+24