A Bespoked secure framework for an ontology-based data-extraction system

Cited by: 0
Authors
Indumathi J. [1]
Uma G.V. [1]
Affiliation
[1] Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai 600 025, Tamilnadu
Source
Journal of Software Engineering | 2010 / Vol. 4 / No. 2
Keywords
Data extraction; Natural language processing; Ontology; Semi-structured or unstructured documents; Web pages
DOI
10.3923/jse.2008.10.22
Abstract
In this study, A Bespoked Secure Framework for an Ontology-Based Data-Extraction System, we report on the implementation of an existing generalized framework with an alternate technology. The implementation uses natural language processing instead of a heuristic-based method. Heuristic methods rest on assumptions that are simply left unspecified and, as a consequence, not understood. If, for a given secure data-extraction limitation problem, the realization of model-based solutions appears too complicated or too costly to carry out, heuristic approaches need to be accompanied by a meticulous analysis that checks the extent to which the approach formalizes rational agency preference structures and/or data-user behaviors. Our secure data-extraction system will allow new algorithms and ideas to be incorporated into a data-extraction system. Extracting information from semi-structured or unstructured documents, such as web pages, is a useful yet complex task. Ontologies can achieve a high degree of accuracy and privacy in a data-extraction system while maintaining resiliency in the face of document changes. Ontologies do not, however, diminish the complexity of a data-extraction system. As research in the field progresses, the need for a modular data-extraction system that decouples the associated processes continues to grow. © 2010 Academic Journals Inc.
Pages: 156-168
Page count: 12
Related papers
8 items in total
  • [1] Wessman A., Liddle S.W., Embley D.W., A generalized framework for an ontology-based data-extraction, The Proceedings of the International Conference On Information Systems Technology and its Application, 63, pp. 239-253, (2005)
  • [2] Ashish N., Knoblock C., Wrapper generation for semi-structured Internet sources, SIGMOD Rec., 26, 4, pp. 8-15, (1997)
  • [3] Crescenzi V., Mecca G., Merialdo P., RoadRunner: Towards automatic data extraction from large Web sites, Proceedings of the 27th International Conference on Very Large Data Bases, pp. 109-118, (2001)
  • [4] Embley D.W., Programming with data frames for everyday data items, AFIPS '80 Proceedings, pp. 301-305, (1980)
  • [5] Hammer J., Garcia-Molina H., Nestorov S., Yemeni R., Breunig M., Vassalos V., Template-based wrappers in the Tsimmis system, Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 532-535, (1997)
  • [6] Kushmerick N., Weld D., Doorenbos R., Wrapper induction for information extraction, Proceedings of the International Joint Conference on Artificial Intelligence, pp. 729-737, (1997)
  • [7] Laender A.H.F., Ribeiro-Neto B.A., da Silva A.S., Teixeira J.S., A brief survey of Web data extraction tools, SIGMOD Rec., 31, 2, pp. 84-93, (2002)
  • [8] Liddle S.W., Embley D.W., Woodfield S.N., An Active, Object-Oriented, Model-Equivalent Programming Language, Advances in Object-Oriented Data Modeling, pp. 333-361, (2000)