A UML profile for the conceptual modelling of structurally complex data: Easing human effort in the KDD process

Cited by: 3
Authors
Lara, Juan A. [1]
Lizcano, David [1]
Martinez, Maria A. [1]
Pazos, Juan [2]
Riera, Teresa [3]
Affiliations
[1] Open Univ Madrid, UDIMA, Fac Ensenanzas Tecn, Madrid 28400, Spain
[2] Tech Univ Madrid, Sch Comp Sci, Madrid 28660, Spain
[3] Univ Islas Baleares, Dept Matemat & Informat, Palma De Mallorca 07122, Spain
Keywords
KDD; Data mining; Conceptual modelling; Structurally complex data; Time series; UML profiles; ENTITY-RELATIONSHIP MODEL; DATA WAREHOUSES; TIME-SERIES; SCHEMAS; DESIGN; METHODOLOGY
DOI
10.1016/j.infsof.2013.11.005
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Domains where data have a complex structure requiring new approaches for knowledge discovery from data are on the increase. In such domains, the information related to each object under analysis may be composed of a very broad set of interrelated data instead of being represented by a simple attribute table. This further complicates their analysis.
Objective: It is becoming more and more necessary to model data before analysis in order to assure that they are properly understood, stored and later processed. On this ground, we have proposed a UML extension that is able to represent any set of structurally complex hierarchically ordered data. Conceptually modelled data are human comprehensible and constitute the starting point for automating other data analysis tasks, such as comparing items or generating reference models.
Method: The proposed notation has been applied to structurally complex data from the stabilometry field. Stabilometry is a medical discipline concerned with human balance. We have organized the model data through an implementation based on XML syntax.
Results: We have applied data mining techniques to the resulting structured data for knowledge discovery. The sound results of modelling a domain with such complex and wide-ranging data confirm the utility of the approach.
Conclusion: The conceptual modelling and the analysis of non-conventional data are important challenges. We have proposed a UML profile that has been tested on data from a medical domain, obtaining very satisfactory results. The notation is useful for understanding domain data and automating knowledge discovery tasks.
© 2013 Elsevier B.V. All rights reserved.
Pages: 335-351
Page count: 17