Efficient data sampling in heterogeneous peer-to-peer networks

Authors
Arai, Benjamin [1]
Lin, Song [1]
Gunopulos, Dimitrios [1]
Affiliations
[1] Univ Calif Riverside, Dept Comp Sci & Engn, Riverside, CA 92521 USA
Source
ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING | 2007
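Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes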
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Performing data-mining tasks such as clustering, classification, and prediction on large datasets is arduous and often infeasible given current hardware limitations. The distributed nature of peer-to-peer databases further complicates this issue by introducing an access overhead cost in addition to the cost of sending individual tuples over the network. We propose a two-level sampling approach for peer-to-peer databases that maximizes sample quality given a user-defined communication budget. Given that individual peers may have varying cardinality, we propose an algorithm for determining the optimal sample rate (the percentage of tuples to sample from a peer) for each peer. We do this by analyzing the variance of individual peers, ultimately minimizing the total variance of the entire sample. By performing local optimization of individual peer sample rates, we maximize the approximation accuracy of the samples. We also offer several techniques for sampling in peer-to-peer databases given various amounts of known and unknown information about the network and its peers.
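The abstract does not give the allocation formula, so the following is only an illustrative sketch, not the authors' algorithm. It uses the textbook cost-constrained optimal allocation from stratified sampling, treating each peer as a stratum: after paying a fixed access overhead per peer, the remaining budget is spread so that the number of tuples drawn from a peer is proportional to N_h * S_h / sqrt(c_h), which minimizes the variance of the stratified mean when finite-population corrections are ignored. The function name `allocate_sample_rates` and all parameters (per-peer sizes, standard deviations, per-tuple costs, access costs) are hypothetical and assumed to be known in advance.

```python
import math


def allocate_sample_rates(sizes, stddevs, tuple_costs, access_costs, budget):
    """Return one sample rate (fraction of tuples) per peer.

    Assumption: classical stratified-sampling allocation under a linear
    cost model, not the method from the paper.
    """
    # Every peer is contacted once, so its access overhead is paid up front.
    remaining = budget - sum(access_costs)
    if remaining <= 0:
        raise ValueError("budget does not cover the per-peer access overhead")

    peers = list(zip(sizes, stddevs, tuple_costs))
    # Tuples drawn from peer h are proportional to N_h * S_h / sqrt(c_h).
    weights = [n * s / math.sqrt(c) for n, s, c in peers]
    # Normalizer chosen so that sum(c_h * n_h) equals the remaining budget.
    denom = sum(n * s * math.sqrt(c) for n, s, c in peers)
    if denom == 0:
        raise ValueError("all peers have zero variance; any allocation is optimal")

    counts = [remaining * w / denom for w in weights]
    # A peer cannot supply more tuples than it stores, so cap the rate at 1.0.
    # Budget freed by capping is left unspent in this simple sketch.
    return [min(1.0, k / n) for k, n in zip(counts, sizes)]


if __name__ == "__main__":
    rates = allocate_sample_rates(
        sizes=[1000, 1000],
        stddevs=[1.0, 3.0],
        tuple_costs=[1.0, 1.0],
        access_costs=[10.0, 10.0],
        budget=220.0,
    )
    print(rates)
```

With equal peer sizes and equal per-tuple costs, as in the usage example, a peer whose values are three times as spread out receives three times the sample rate. The sketch returns fractional tuple counts; a real deployment would need to round them and decide how to reuse any budget left over by the cap.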
Pages: 23-32
Page count: 10