VIS30K: A Collection of Figures and Tables From IEEE Visualization Conference Publications

Cited by: 30
Authors
Chen, Jian [1]
Ling, Meng [1]
Li, Rui [1]
Isenberg, Petra [2]
Isenberg, Tobias [2]
Sedlmair, Michael [3]
Moeller, Torsten [4]
Laramee, Robert S. [5]
Shen, Han-Wei [1]
Wuensche, Katharina [4]
Wang, Qiru [5]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] Univ Paris Saclay, CNRS, INRIA, LISN, F-91190 St Aubin, France
[3] Univ Stuttgart, D-70174 Stuttgart, Germany
[4] Univ Vienna, A-1010 Vienna, Austria
[5] Univ Nottingham, Nottingham NG7 2RD, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Data visualization; Visualization; Conferences; Metadata; Tools; Data mining; Electronic mail; IEEE VIS; InfoVis; SciVis; VAST; dataset; bibliometrics; images; figures; tables; OF-THE-ART; INFORMATION; EXPLORATION; EXTRACTION;
DOI
10.1109/TVCG.2021.3054916
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
We present the VIS30K dataset, a collection of 29,689 images that represents 30 years of figures and tables from each track of the IEEE Visualization conference series (Vis, SciVis, InfoVis, VAST). VIS30K's comprehensive coverage of the scientific literature in visualization not only reflects the progress of the field but also enables researchers to study the evolution of the state-of-the-art and to find relevant work based on graphical content. We describe the dataset and our semi-automatic collection process, which couples convolutional neural networks (CNN) with curation. Extracting figures and tables semi-automatically allows us to verify that no images are overlooked or extracted erroneously. To improve quality further, we engaged in a peer-search process for high-quality figures from early IEEE Visualization papers. With the resulting data, we also contribute VISImageNavigator (VIN, visimagenavigator.github.io), a web-based tool that facilitates searching and exploring VIS30K by author names, paper keywords, title and abstract, and years.
Pages: 3826-3833
Page count: 8