Mining for the Most Certain Predictions from Dyadic Data

Cited by: 0
Authors
Deodhar, Meghana [1]
Ghosh, Joydeep [1]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
Source
KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2009
Keywords
Ranking predictions; dyadic data; co-clustering; regression; ERROR;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In several applications involving regression or classification, along with making predictions, it is important to assess how accurate or reliable individual predictions are. This is particularly important in cases where, due to finite resources or domain requirements, one wants to make decisions based only on the most reliable predictions rather than on the entire set. This paper introduces novel and effective ways of ranking predictions by their accuracy for problems involving large-scale, heterogeneous data with a dyadic structure, i.e., where the independent variables can be naturally decomposed into three groups associated with two sets of elements and their combination. These approaches are based on modeling the data by a collection of localized models learnt while simultaneously partitioning (co-clustering) the data. For regression, this leads to the concept of "certainty lift". We also develop a robust predictive modeling technique that identifies and models only the most coherent regions of the data to give high predictive accuracy on the selected subset of response values. Extensive experimentation on real-life datasets highlights the utility of our proposed approaches.
Pages: 249-257
Number of pages: 9