Storing and analysing voice of the market data in the corporate data warehouse

Cited by: 18
Authors
Garcia-Moya, Lisette [1]
Kudama, Shahad [1]
Jose Aramburu, Maria [1]
Berlanga, Rafael [1]
Affiliation
[1] Univ Jaume 1, Temporal Knowledge Bases Grp, Castellon de La Plana, Spain
Keywords
Sentiment analysis; Data warehouses; OLAP; Text processing
DOI
10.1007/s10796-012-9400-y
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Web opinion feeds have become one of the most popular information sources users consult before buying products or contracting services. Negative opinions about a product can have a high impact on its sales figures. As a consequence, companies are increasingly concerned about how to integrate opinion data into their business intelligence models so that they can predict sales figures or define new strategic goals. After analysing the requirements of this new application, this paper proposes a multidimensional data model to integrate sentiment data extracted from opinion posts into a traditional corporate data warehouse. Then, a new sentiment data extraction method that applies semantic annotation as a means to facilitate the integration of both types of data is presented. In this method, Wikipedia is used as the main knowledge resource, together with some well-known lexicons of opinion words and other corporate data and metadata stores describing the company products, such as technical specifications and user manuals. The resulting information system allows users to perform new analysis tasks by using the traditional OLAP-based data warehouse operators. We have developed a case study over a set of real opinions about digital devices offered by a wholesale dealer. Over this case study, the quality of the extracted sentiment data is evaluated, and some query examples that illustrate the potential uses of the integrated model are provided.
Pages: 331-349
Number of pages: 19