Domain Heuristic Fusion of Multi-Word Embeddings for Nutrient Value Prediction

Cited by: 3
Authors
Ispirova, Gordana [1, 2]
Eftimov, Tome [1]
Korousic Seljak, Barbara [1]
Affiliations
[1] Jozef Stefan Inst, Comp Syst Dept, Ljubljana 1000, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Ljubljana 1000, Slovenia
Funding
European Union Horizon 2020;
Keywords
domain-specific embeddings; domain knowledge; machine learning; data mining; macronutrient prediction; representation learning; word embeddings; paragraph embeddings;
DOI
10.3390/math9161941
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
Being both a poison and a cure for many lifestyle and non-communicable diseases, food is becoming a prime focus of precision medicine. Monitoring a few groups of nutrients is crucial for some patients, and methods that ease their calculation are emerging. Our proposed machine learning pipeline predicts nutrient values from learned vector representations of short texts, namely recipe names. In this study, we explored how the prediction results change when, instead of the vector representation of the recipe description, we use the embeddings of the list of ingredients. The nutrient content of a food depends on its ingredients; therefore, the text of the ingredients contains more relevant information. We define a domain-specific heuristic for merging the ingredient embeddings that incorporates the quantity of each ingredient, so that the merged embeddings can be used as features in machine learning models for nutrient prediction. The experimental results indicate that predictions improve when the domain-specific heuristic is used. The models for protein prediction were highly effective, with accuracies up to 97.98%. Implementing a domain-specific heuristic for combining multi-word embeddings yields better results than using conventional merging heuristics, with up to 60% more accuracy in some cases.
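The abstract does not give the formula behind the domain-specific heuristic, so the following is only a minimal sketch under one plausible assumption: that ingredient embeddings are merged as an average weighted by each ingredient's quantity, contrasted with a conventional unweighted average. The function names, the two-dimensional toy embeddings, and the gram quantities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def average_merge(vectors: Sequence[Vector]) -> Vector:
    """Conventional heuristic: unweighted mean of the ingredient vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def quantity_weighted_merge(vectors: Sequence[Vector],
                            quantities: Sequence[float]) -> Vector:
    """Assumed domain heuristic: mean weighted by ingredient quantity."""
    total = sum(quantities)
    if total <= 0:
        raise ValueError("total quantity must be positive")
    return [
        sum(q * x for q, x in zip(quantities, column)) / total
        for column in zip(*vectors)
    ]


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "chicken breast": [1.0, 0.0],
    "olive oil": [0.0, 1.0],
}

# Hypothetical recipe: (ingredient, quantity in grams).
recipe = [("chicken breast", 300.0), ("olive oil", 100.0)]

vectors = [toy_embeddings[name] for name, _ in recipe]
grams = [quantity for _, quantity in recipe]

plain = average_merge(vectors)                    # both ingredients count equally
weighted = quantity_weighted_merge(vectors, grams)  # chicken dominates

print(plain)
print(weighted)
```

In this toy case the unweighted average treats 100 g of oil the same as 300 g of chicken, while the weighted version shifts the recipe vector toward the dominant ingredient. Either merged vector would then serve as the feature vector for a downstream nutrient prediction model.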
Pages: 15