Large amounts of data generated by Internet and enterprise applications are stored in the form of graphs, and graph processing systems are widely used in enterprises to process such data. With the rapid growth of mobile and social applications and the complicated connections among Internet websites, massive numbers of concurrent operations need to be handled. At the same time, the intrinsic structure and size of real-world graphs make distributed graph processing more challenging. Distributed graph analytics frameworks should offer well-balanced communication and computation, low preprocessing overhead, a low memory footprint, and scalability. Moreover, the effects of network factors such as bandwidth and traffic, the monetary cost of processing such large-scale graphs, and the mutual impact of these elements have been little studied. To address these issues, we propose two dynamic repartitioning algorithms that consider the network factors affecting public cloud environments in order to decrease the monetary cost of processing. We also introduce a new classification of graph algorithms and processing, which is used to choose the best processing strategy for each operation. We plugged these algorithms into our extended graph processing system (iGiraph) and compared them with those supported in other graph processing systems, such as Giraph and Surfer, on the Australian National Cloud Infrastructure. Compared with a framework such as the popular Giraph, our algorithms achieve up to 30% faster execution time, up to a 50% decline in network traffic, and more than a 50% cost reduction.