A Hybrid Method for Extracting Deep Web Information

Cited by: 0

Authors:

Zhang, Yuanpeng [1]

Wang, Li [1]

Jiang, Kui [1]

Qian, Danmin [1]

Dong, Jiancheng [1]

Affiliation:

[1] Nantong Univ, Sch Med, Dept Med Informat, Nantong 226001, Jiangsu, Peoples R China

Source:

PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTOMATION, MECHANICAL CONTROL AND COMPUTATIONAL ENGINEERING | 2015 / Vol. 124

Keywords:

information extraction; clinic expert information; domain model; block importance model; SVM;

DOI:

Not available

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Previous work suggests that more than 60% of the information available on the Web resides in Deep Web databases, which search engines cannot index directly. This paper proposes a hybrid method, composed of a domain model and a block importance model, for extracting information from the Deep Web. The domain model classifies forms and identifies whether a form is a WQI. The block importance model filters noisy information from response pages. Both models are compared with a rule-based method. The experimental results indicate that the domain model yields a precision 6.44% higher than that of the rule-based method, whereas the block importance model yields an F1 measure 10.5% higher than that of the XPath method.

Pages: 777-782

Page count: 6

