Web usage and content mining to extract knowledge for modelling the users of the Bidasoa Turismo website and to adapt it

Cited by: 27
Authors
Arbelaitz, Olatz [1]
Gurrutxaga, Ibai [1]
Lojo, Aizea [1]
Muguerza, Javier [1]
Maria Perez, Jesus [1]
Perona, Inigo [1]
Affiliations
[1] Univ Basque Country UPV EHU, Comp Architecture & Technol Dept, Donostia San Sebastian 20018, Gipuzkoa, Spain
Keywords
Bidasoa tourism website; Web usage mining; Web content mining; Web user profiling; Clustering; Frequent pattern mining; Topic modelling; ALGORITHM;
DOI
10.1016/j.eswa.2013.07.040
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The tourism industry has experienced a shift from offline to online travellers, and this has made the use of intelligent systems in the tourism sector crucial. These information systems should provide tourism consumers and service providers with the most relevant information, more decision support, greater mobility and the most enjoyable travel experiences. As a consequence, Destination Marketing Organizations (DMOs) have to respond not only by adopting new technologies, but also by interpreting and using the knowledge created by the use of these techniques. This work presents the design of a general and non-invasive web mining system, built using the minimum information stored in a web server (the content of the website and the information from the log files stored in Common Log Format (CLF)), and its application to the Bidasoa Turismo (BTw) website. The proposed system combines web usage and content mining techniques with the following three main objectives: generating user navigation profiles to be used for link prediction; enriching the profiles with semantic information to diversify them, which provides the DMO with a tool to introduce links that will match the users' tastes; and obtaining global and language-dependent user interest profiles, which provide the DMO staff with important information for future web designs and allow them to design future marketing campaigns for specific targets. The system performed successfully, obtaining profiles which fit the real user navigation sequences in more than 60% of cases and the user interests in more than 90% of cases. Moreover, the automatically extracted semantic structure of the website and the interest profiles were validated by the BTw DMO staff, who found the knowledge provided to be very useful for the future. (C) 2013 Elsevier Ltd. All rights reserved.
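As an illustration of the kind of input the abstract refers to, the following minimal Python sketch parses Common Log Format lines and groups the successful requests of each host into navigation sequences. It is not the authors' implementation: the sample log lines, the URL paths, the function names and the 30-minute inactivity timeout are all assumptions made for this example, and the abstract does not state how sessions were actually built.

```python
import re
from datetime import datetime, timedelta

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf_line(line):
    """Return a dict for one CLF line, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    parts = match.group("request").split()
    url = parts[1] if len(parts) >= 2 else ""
    return {
        "host": match.group("host"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "url": url,
        "status": int(match.group("status")),
    }


def build_sessions(lines, timeout=timedelta(minutes=30)):
    """Group successful requests into per-host URL sequences.

    A new session starts when the host changes or when the gap between two
    consecutive requests exceeds `timeout` (an assumed heuristic).
    """
    records = [r for r in map(parse_clf_line, lines) if r and r["status"] == 200]
    records.sort(key=lambda r: (r["host"], r["time"]))
    sessions = []
    last_host, last_time = None, None
    for record in records:
        if record["host"] != last_host or record["time"] - last_time > timeout:
            sessions.append([])
        sessions[-1].append(record["url"])
        last_host, last_time = record["host"], record["time"]
    return sessions


# Hypothetical log lines; hosts and paths are invented for the example.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2012:13:55:36 +0200] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2012:13:57:02 +0200] "GET /hotels.html HTTP/1.1" 200 1200',
    '10.0.0.1 - - [10/Oct/2012:15:10:00 +0200] "GET /events.html HTTP/1.1" 200 900',
    '10.0.0.2 - - [10/Oct/2012:14:00:00 +0200] "GET /index.html HTTP/1.1" 404 -',
]

if __name__ == "__main__":
    for session in build_sessions(SAMPLE):
        print(session)
```

Sequences of this kind are the usual starting point for the clustering and frequent pattern mining steps named in the keywords; identifying users by host alone is a simplification, since CLF carries no cookie or user-agent information.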
Pages: 7478-7491
Page count: 14