Visual relationship extraction in images and a semantic interpretation with ontologies

Cited by: 1
Authors
Zga A. [1, 2]
Nini B. [3]
Affiliations
[1] Department of Mathematics and Computer Science, University of Larbi Ben M’hidi, Oum El Bouaghi
[2] Department of Computer Science and Information Technologies, University Kasdi Merbah Ouargla (UKMO), Ghardaia Road, BP. 511
[3] Research Laboratory on Computer Science’s Complex Systems (RELA(CS)2), Department of Mathematics and Computer Science, Oum El-Bouaghi University
Keywords
deep learning; human-object interaction; large intra-class divergence; ontologies; semantic gap
DOI
10.1504/IJIIDS.2022.121931
Abstract
Building a robust model that extracts and semantically interprets the relationships between objects in images requires addressing three challenges: the long-tail problem, large intra-class divergence, and semantic dependency (the semantic gap). To overcome these challenges, we make three main contributions: 1) an ontological semantic model that filters false negatives and false positives using a statistical ranking module; 2) a combination of a semantic ontological module and a visual relationship module, both of which take the results of the statistical ranking module as input and output a classification of <human-predicate-object> triplets; 3) a semantic model for the visual relationship module that ranks each predicted relation class by transferring the spatial relationship onto a high-dimensional spatial feature. We use the HCVRD dataset, which highlights two important practical problems: the long-tail distribution issue and the zero-shot problem. Experimental results on HCVRD demonstrate the superior performance of the proposed approach. Copyright © 2022 Inderscience Enterprises Ltd.
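As a rough illustration of the filter-then-rank idea described in the abstract, the sketch below drops predicates that a toy ontology disallows for a given object and ranks the rest by visual score weighted by a smoothed training-frequency prior. It is not the authors' implementation: the `ALLOWED` table, the `rank_predicates` function, the Laplace smoothing, and the multiplicative scoring are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy "ontology": predicates considered valid for each object class.
ALLOWED = {
    "horse": {"ride", "feed"},
    "phone": {"hold", "talk_on"},
}

def rank_predicates(obj, scores, train_pairs):
    """Filter predicates with the toy ontology, then rank the survivors.

    obj         -- object class name, e.g. "horse"
    scores      -- dict mapping predicate -> visual confidence (assumed given)
    train_pairs -- list of (object, predicate) pairs seen in training
    """
    allowed = ALLOWED.get(obj, set())
    counts = Counter(p for o, p in train_pairs if o == obj)
    total = sum(counts.values())
    ranked = []
    for pred, score in scores.items():
        if pred not in allowed:
            continue  # ontology filter: drop semantically invalid predicates
        # Laplace-smoothed frequency prior, so rare (long-tail) predicates
        # keep a non-zero weight.
        prior = (counts[pred] + 1) / (total + len(allowed))
        ranked.append((pred, score * prior))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

train = [("horse", "ride"), ("horse", "ride"), ("horse", "feed")]
visual = {"ride": 0.5, "talk_on": 0.9, "feed": 0.4}
result = rank_predicates("horse", visual, train)
print(result)
```

In this toy run, `talk_on` is discarded despite its high visual score because the ontology does not allow it for `horse`, and `ride` ranks above `feed`.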
Pages: 223-247
Page count: 24