
Contrastive Learning-Based Cross-Modal Fusion for Product Form Imagery Recognition: A Case Study on New Energy Vehicle Front-End Design

Authors

Listed:
  • Yutong Zhang

    (School of Arts and Design, Yanshan University, Qinhuangdao 066000, China)

  • Jiantao Wu

    (School of Arts and Design, Yanshan University, Qinhuangdao 066000, China)

  • Li Sun

    (School of Arts and Design, Yanshan University, Qinhuangdao 066000, China
    Coastal Area Port Industry Development Collaborative Innovation Center, Yanshan University, Qinhuangdao 066000, China)

  • Guoan Yang

    (School of Arts and Design, Yanshan University, Qinhuangdao 066000, China)

Abstract

Fine-grained feature extraction and affective semantic mapping remain significant challenges in product form analysis. To address these issues, this study proposes a contrastive learning-based cross-modal fusion approach for product form imagery recognition, using the front-end design of new energy vehicles (NEVs) as a case study. The proposed method first employs the Biterm Topic Model (BTM) and the Analytic Hierarchy Process (AHP) to extract thematic patterns and compute weight distributions from consumer review texts, thereby identifying key imagery style labels. These labels are then used to annotate images, facilitating the construction of a multimodal dataset. Next, ResNet-50 and Transformer architectures serve as the image and text encoders, respectively, to extract and represent multimodal features. To align textual and visual representations in a shared embedding space and fuse them deeply, a contrastive learning mechanism is introduced that maximizes the cosine similarity of matched (positive) image-text pairs while minimizing that of mismatched (negative) pairs. Finally, a fully connected multilayer network is appended to the output of the Transformer and ResNet with Contrastive Learning (TRCL) model to enhance classification accuracy and reliability. Comparative experiments against several deep convolutional neural networks (DCNNs) demonstrate that the TRCL model effectively integrates semantic and visual information, significantly improving the accuracy and robustness of complex product form imagery recognition. These findings suggest that the proposed method holds substantial potential for large-scale product appearance evaluation and affective cognition research. Moreover, this data-driven fusion underpins sustainable product form design by streamlining evaluation and optimizing resource use.
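As a reading aid for the architecture the abstract describes, the following is a minimal, hypothetical PyTorch sketch of a TRCL-style dual encoder: a ResNet-50 image branch and a Transformer text branch projected into a shared embedding space, trained with a symmetric in-batch contrastive (InfoNCE-style) loss over cosine similarities, plus a fully connected head on the fused embeddings. All names (TRCLSketch), layer sizes, the mean-pooled text branch, and the loss weighting are illustrative assumptions, not the authors' released implementation.

    # Hypothetical sketch of a TRCL-style model; sizes and names are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision.models import resnet50

    class TRCLSketch(nn.Module):
        """Dual encoder with a symmetric contrastive loss and a fusion classifier."""
        def __init__(self, vocab_size=30522, embed_dim=256, num_classes=8):
            super().__init__()
            # Image branch: ResNet-50 backbone; its final fc becomes a projection.
            self.image_encoder = resnet50(weights=None)
            self.image_encoder.fc = nn.Linear(2048, embed_dim)
            # Text branch: token embeddings fed through a small Transformer encoder.
            self.token_embed = nn.Embedding(vocab_size, 512)
            layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
            self.text_encoder = nn.TransformerEncoder(layer, num_layers=4)
            self.text_proj = nn.Linear(512, embed_dim)
            # Learnable temperature, initialized to log(1/0.07) as in CLIP-style training.
            self.logit_scale = nn.Parameter(torch.tensor(2.6593))
            # Fully connected multilayer head over the concatenated (fused) embeddings.
            self.classifier = nn.Sequential(
                nn.Linear(embed_dim * 2, 128), nn.ReLU(), nn.Linear(128, num_classes))

        def forward(self, images, token_ids):
            # L2-normalize both modalities so dot products are cosine similarities.
            img = F.normalize(self.image_encoder(images), dim=-1)
            txt = self.text_encoder(self.token_embed(token_ids)).mean(dim=1)
            txt = F.normalize(self.text_proj(txt), dim=-1)
            # Matched image-text pairs sit on the diagonal of the similarity matrix;
            # off-diagonal entries act as in-batch negatives for the symmetric loss.
            logits = self.logit_scale.exp() * img @ txt.t()
            targets = torch.arange(images.size(0), device=images.device)
            contrastive_loss = (F.cross_entropy(logits, targets)
                                + F.cross_entropy(logits.t(), targets)) / 2
            class_logits = self.classifier(torch.cat([img, txt], dim=-1))
            return contrastive_loss, class_logits

    # Smoke test with random inputs: a batch of 4 images and 32-token reviews.
    model = TRCLSketch()
    loss, preds = model(torch.randn(4, 3, 224, 224),
                        torch.randint(0, 30522, (4, 32)))

In the paper's pipeline, num_classes would correspond to the BTM/AHP-derived imagery style labels, and the contrastive and classification losses would be combined during training; how the two terms are weighted is not stated in the abstract and is left here as a tuning choice.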

Suggested Citation

  • Yutong Zhang & Jiantao Wu & Li Sun & Guoan Yang, 2025. "Contrastive Learning-Based Cross-Modal Fusion for Product Form Imagery Recognition: A Case Study on New Energy Vehicle Front-End Design," Sustainability, MDPI, vol. 17(10), pages 1-28, May.
  • Handle: RePEc:gam:jsusta:v:17:y:2025:i:10:p:4432-:d:1654886

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/17/10/4432/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/17/10/4432/
    Download Restriction: no

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:gam:jsusta:v:17:y:2025:i:10:p:4432-:d:1654886. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item and to accept potential citations to this item that we are uncertain about.

We have no bibliographic references for this item. You can help add them by using this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: MDPI Indexing Manager (email available below). General contact details of provider: https://www.mdpi.com .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.