Attention-Enhanced Multimodal Learning for Conceptual Design Evaluations

Cited by: 11
Authors
Song, Binyang [1]
Miller, Scarlett [2]
Ahmed, Faez [1]
Affiliations
[1] MIT, Dept Mech Engn, Cambridge, MA 02139 USA
[2] Penn State Univ, Sch Engn Design & Innovat, State Coll, PA 16802 USA
Keywords
conceptual design; creativity and concept generation; design evaluation; machine learning; multimodal learning; creativity; novelty
DOI
10.1115/1.4056669
Chinese Library Classification (CLC) number
TH [Machinery and Instrument Industry]
Discipline classification code
0802
Abstract
Conceptual design evaluation is an indispensable component of innovation in the early stage of engineering design. Properly assessing the effectiveness of conceptual design requires a rigorous evaluation of the outputs. Traditional methods to evaluate conceptual designs are slow, expensive, and difficult to scale because they rely on human expert input. An alternative approach is to use computational methods to evaluate design concepts. However, most existing methods have limited utility because they are constrained to unimodal design representations (e.g., texts or sketches). To overcome these limitations, we propose an attention-enhanced multimodal learning (AEMML)-based machine learning (ML) model to predict five design metrics: drawing quality, uniqueness, elegance, usefulness, and creativity. The proposed model utilizes knowledge from large external datasets through transfer learning (TL), simultaneously processes text and sketch data from early-phase concepts, and effectively fuses the multimodal information through a mutual cross-attention mechanism. To study the efficacy of multimodal learning (MML) and attention-based information fusion, we compare (1) a baseline MML model and the unimodal models and (2) the attention-enhanced models with baseline models in terms of their explanatory power for the variability of the design metrics. The results show that MML improves the model explanatory power by 0.05-0.12 and the mutual cross-attention mechanism further increases the explanatory power of the approach by 0.05-0.09, leading to the highest explanatory power of 0.44 for drawing quality, 0.60 for uniqueness, 0.45 for elegance, 0.43 for usefulness, and 0.32 for creativity. Our findings highlight the benefit of using multimodal representations for design metric assessment.
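The mutual cross-attention fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not specify the architecture, so the function names (`cross_attention`, `mutual_cross_attention_fuse`), the toy feature vectors, the absence of learned projection matrices, and the mean-pooling step are all assumptions made for illustration. The idea shown is only that text features attend over sketch features and sketch features attend over text features, after which the two attended views are combined.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, context):
    """Scaled dot-product attention: each query vector attends over the
    context vectors and returns a weighted average of them.
    Assumption: no learned query/key/value projections (a real model has them)."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, c) / scale for c in context])
        attended.append([
            sum(w * c[j] for w, c in zip(weights, context))
            for j in range(len(context[0]))
        ])
    return attended

def mean_pool(rows):
    """Average a list of vectors into one vector (hypothetical pooling choice)."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def mutual_cross_attention_fuse(text_feats, sketch_feats):
    """Fuse two modalities: text attends to sketch, sketch attends to text,
    then the pooled views are concatenated."""
    text_view = cross_attention(text_feats, sketch_feats)
    sketch_view = cross_attention(sketch_feats, text_feats)
    return mean_pool(text_view) + mean_pool(sketch_view)

# Toy features standing in for pretrained text and sketch embeddings.
text_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sketch_feats = [[0.5, 0.5], [1.0, -1.0]]
fused = mutual_cross_attention_fuse(text_feats, sketch_feats)
```

In a setup like the one the abstract describes, a fused vector of this kind would presumably feed a regression head per design metric; that step is omitted here because the record gives no details about it.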
Pages: 12