"DATA TEMPERATURE" IN MINIMUM FREE ENERGIES FOR PARAMETER LEARNING OF BAYESIAN NETWORKS

Cited by: 11
Authors
Isozaki, Takashi [1, 2]
Kato, Noriji [3]
Ueno, Maomi [1]
Affiliations
[1] Univ Electrocommun, Grad Sch Informat Syst, Chofu, Tokyo 1828585, Japan
[2] Fuji Xerox Co Ltd, Res & Technol Grp, Minato Ku, Tokyo 1060032, Japan
[3] Fuji Xerox Co Ltd, Res & Technol Grp, Nakai, Kanagawa 2590157, Japan
Keywords
Bayesian networks; parameter learning; minimum free energy principle;
DOI
10.1142/S0218213009000342
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The maximum likelihood method for estimating the parameters of Bayesian networks (BNs) is efficient and accurate for large samples. However, it suffers from overfitting when the sample size is small. Bayesian methods are effective at avoiding overfitting, but when no prior knowledge is available it is difficult to determine hyperparameters of the prior distributions that balance theoretical and practical considerations well. In this paper, we propose an alternative method for estimating the parameters of BNs. The method uses a principle rooted in thermodynamics: minimizing free energy (MFE). We define internal energies, entropies, and a temperature, which together constitute the free energies. For the temperature in particular, we propose a "data temperature" assumption and some explicit models. This approach treats the maximum likelihood principle and the maximum entropy principle in a unified manner under the MFE principle. In assessments of classification accuracy, our method shows higher accuracy than the Bayesian method with normally recommended hyperparameters. Moreover, our method is robust to the choice of the introduced hyperparameters.
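As a rough illustration of how a free-energy objective can interpolate between maximum likelihood and maximum entropy, the sketch below works through a single multinomial parameter vector. The abstract does not give the paper's definitions, so everything here is an assumption made for illustration: the internal energy is taken as the cross-entropy U = -Σ p log p̂ against the empirical frequencies p̂, the entropy as H = -Σ p log p, and the free energy as F = U - T·H with inverse temperature β = 1/T. Under those assumptions the minimizer is p ∝ p̂^β. The function names and the `data_beta` schedule (β rising toward 1 as the sample size grows, with a hypothetical scale `gamma`) are likewise not taken from the paper.

```python
import numpy as np

def mfe_estimate(counts, beta):
    """Minimizer of the assumed free energy: p proportional to p_hat**beta."""
    counts = np.asarray(counts, dtype=float)
    p_hat = counts / counts.sum()          # maximum likelihood estimate
    weights = p_hat ** beta                # beta = 1/T tempers the estimate
    return weights / weights.sum()

def free_energy(p, p_hat, beta):
    """F = U - T*H with T = 1/beta (assumed definitions, strictly positive inputs)."""
    p = np.asarray(p, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    internal_energy = -np.sum(p * np.log(p_hat))   # cross-entropy with the data
    entropy = -np.sum(p * np.log(p))
    return internal_energy - entropy / beta

def data_beta(n_samples, gamma=50.0):
    """Hypothetical 'data temperature' schedule: beta -> 1 as n_samples grows."""
    return 1.0 - np.exp(-n_samples / gamma)

counts = [3, 1]
p_ml = mfe_estimate(counts, beta=1.0)      # beta = 1 recovers maximum likelihood
p_maxent = mfe_estimate(counts, beta=0.0)  # beta = 0 gives the uniform distribution
p_small_n = mfe_estimate(counts, beta=data_beta(sum(counts)))
```

With few observations the schedule yields a small β, so the estimate is pulled toward the uniform distribution; with many observations β approaches 1 and the estimate approaches the empirical frequencies.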
Pages: 653-671
Number of pages: 19