A formal analysis of why heuristic functions work

被引:13
作者
Oommen, BJ
Rueda, LG
机构
[1] Carleton Univ, Sch Comp Sci, Ottawa, ON K1S 5B6, Canada
[2] Univ Windsor, Sch Comp Sci, Windsor, ON N9B 3P4, Canada
基金
加拿大自然科学与工程研究理事会;
关键词
A* algorithms; heuristic algorithms; pattern recognition; optimization;
D O I
10.1016/j.artint.2002.02.001
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Many optimization problems in computer science have been proven to be NP-hard, and it is unlikely that polynomial-time algorithms that solve these problems exist unless P = NP. Alternatively, they are solved using heuristics algorithms, which provide a sub-optimal solution that, hopefully, is arbitrarily close to the optimal. Such problems are found in a wide range of applications, including artificial intelligence, game theory, graph partitioning, database query optimization, etc. Consider a heuristic algorithm, A. Suppose that A could invoke one of two possible heuristic functions. The question of determining which heuristic function is superior, has typically demanded a yes/no answer-one which is often substantiated by empirical evidence. In this paper, by using Pattern Classification Techniques (PCT), we propose a formal, rigorous theoretical model that provides a stochastic answer to this problem. We prove that given a heuristic algorithm, A, that could utilize either of two heuristic functions H-1 or H-2 used to find the solution to a particular problem, if the accuracy of evaluating the cost of the optimal solution by using H-1 is greater than the accuracy of evaluating the cost using H-2, then H-1 has a higher probability than H-2 of leading to the optimal solution. This unproven conjecture has been the basis for designing numerous algorithms such as the A* algorithm, and its variants. Apart from formally proving the result, we also address the corresponding database query optimization problem that has been open for at least two decades. To validate our proofs, we report empirical results on database query optimization techniques involving a few well-known histogram estimation methods. (c) 2005 Elsevier B.V. All rights reserved.
引用
收藏
页码:1 / 22
页数:22
相关论文
共 28 条
  • [1] Aarts E., 1989, Wiley-Interscience Series in Discrete Mathematics and Optimization
  • [2] [Anonymous], 1997, Tabu Search
  • [3] [Anonymous], 1979, Computers and Intractablity: A Guide to the Theoryof NP-Completeness
  • [4] [Anonymous], 1982, CSRG136 U TOR
  • [5] DEEBA E, 1997, INTERACTIVE LINEAR A
  • [6] Duda R. O., 2000, PATTERN CLASSIFICATI
  • [7] Fukunaga K., 1990, INTRO STAT PATTERN R
  • [8] IOANNIDIS Y, 1991, P ACM SIGMOD C, P268
  • [9] KOOI RP, 1980, THESIS CASE RESERVE
  • [10] Kreher D. L., 1998, Combinatorial Algorithms: Generation, Enumeration, and Search