Moving from data-constrained to data-enabled research: Experiences and challenges in collecting, validating and analyzing large-scale e-commerce data

Cited by: 15
Authors
Bapna, Ravi [1]
Goes, Paulo [1]
Gopal, Ram [1]
Marsden, James R. [1]
Affiliation
[1] Univ Connecticut, Sch Business, Storrs, CT 06269 USA
Keywords
large-scale; Internet data; web crawling agents; online auctions; music file sharing
DOI
10.1214/088342306000000231
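Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract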
Widespread e-commerce activity on the Internet has led to new opportunities to collect vast amounts of micro-level market and nonmarket data. In this paper we share our experiences in collecting, validating, storing and analyzing large Internet-based data sets in the areas of online auctions, music file sharing and online retailer pricing. We demonstrate how such data can advance knowledge by facilitating sharper and more extensive tests of existing theories and by offering observational underpinnings for the development of new theories. Just as experimental economics pushed the frontiers of economic thought by enabling the testing of numerous theories of economic behavior in the environment of a controlled laboratory, we believe that observing, often over extended periods of time, real-world agents participating in market and nonmarket activity on the Internet can lead us to develop and test a variety of new theories. Internet data gathering is not controlled experimentation. We cannot randomly assign participants to treatments or determine event orderings. Internet data gathering does, however, offer potentially large data sets with repeated observation of individual choices and actions. In addition, automated data collection holds promise for greatly reduced cost per observation. Our methods rely on technological advances in automated data collection agents. Significant challenges remain in developing appropriate sampling techniques, integrating data from heterogeneous sources in a variety of formats, constructing generalizable processes and understanding legal constraints. Despite these challenges, the early evidence from those who have harvested and analyzed large amounts of e-commerce data points toward a significant leap in our ability to understand the functioning of electronic commerce.
Pages: 116-130
Number of pages: 15