Automatic blockchain whitepapers analysis via heterogeneous graph neural network

Cited by: 6
Authors
Liu, Lin [1, 2, 3]
Tsai, Wei-Tek [1, 2, 3, 4, 5, 6, 7]
Bhuiyan, Md Zakirul Alam [8]
Yang, Dong [1, 2, 3]
Affiliations
[1] Beihang Univ, State Key Lab Software Environm, Beijing, Peoples R China
[2] Beihang Univ, Digital Soc & Blockchain Lab, Beijing, Peoples R China
[3] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China
[4] Arizona State Univ, Tempe, AZ 85287 USA
[5] Beijing Tiande Technol, Beijing, Peoples R China
[6] Andrew Int Sandbox Inst, Qingdao, Peoples R China
[7] Natl BigData Comprehens Expt Area, IOB Lab, Guiyang, Guizhou, Peoples R China
[8] Fordham Univ, Dept Comp & Informat Sci, Bronx, NY 10458 USA
Funding
National Natural Science Foundation of China
Keywords
Blockchain; Heterogeneous graph neural network; Classification; Clustering; Heterogeneous information networks
DOI
10.1016/j.jpdc.2020.05.014
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The blockchain whitepaper contains detailed technical and business information, so its analysis is important for blockchain text mining. Previous works focus on analyzing homogeneous objects and relations. The main problem, however, is that these works do not take into account the heterogeneity of information. This paper presents a new methodology for whitepaper analysis by designing a heterogeneous graph neural network, named S-HGNN. In detail, this paper first builds a Heterogeneous Information Network (HIN) using heterogeneous objects and relationships extracted from the whitepaper to obtain similarity measures, then uses a Graph Convolutional Network (GCN) and a Graph Attention Network (GAT) to integrate both structural information and internal semantics into the whitepaper embedding. Compared with previous models, this model improves the F1-score by 0.96% to 33.34% on the classification task and purity by 4.94% to 14.14% on the clustering task, and obtains stable results across different tasks. The results show the effectiveness and robustness of this model for whitepaper analysis. © 2020 Elsevier Inc. All rights reserved.
Pages: 1-12
Number of pages: 12