Author
Listed:
- Song, Zhenzhen
- Liu, Ziwei
- Li, Hongji
Abstract
To address the challenges of cross-modal feature fusion, low computational efficiency in long patent-text modeling, and insufficient hierarchical semantic coherence in patent text semantic mining, this study proposes a novel deep learning framework termed HGM-Net. The framework integrates Hierarchical Contrastive Learning (HCL), a Multi-modal Graph Attention Network (M-GAT), and Multi-Granularity Sparse Attention (MSA) to achieve robust, efficient, and semantically consistent patent representation learning. Specifically, HCL applies dynamic masking, contrastive learning, and cross-structural similarity constraints across word-, sentence-, and paragraph-level hierarchies, enabling the model to jointly capture fine-grained local semantics and high-level thematic consistency. Contrastive and cross-structural similarity constraints are enforced in particular at the word and paragraph levels, enhancing semantic discrimination and global coherence within complex patent documents. Furthermore, M-GAT models patent classification codes, citation relationships, and textual semantics as heterogeneous graph structures, and employs cross-modal gated attention mechanisms to dynamically fuse multi-source, multi-modal features, thereby improving representation completeness and robustness. To reduce the high computational cost of long-text processing, MSA adopts a hierarchical sparse attention strategy that selectively allocates attention across multiple granularities, including words, phrases, sentences, and paragraphs, significantly reducing computational overhead while preserving critical semantic information. Extensive experiments on patent classification and similarity matching tasks demonstrate that HGM-Net consistently outperforms existing state-of-the-art deep learning approaches.
The results validate the effectiveness and generalization capability of the proposed framework, highlighting its theoretical innovation and practical value in improving patent examination efficiency and enabling large-scale technology relevance mining.
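The hierarchical sparse attention idea behind MSA can be illustrated with a minimal sketch. This pure-Python example shows one common sparse pattern (a local sliding window plus a few global anchor tokens, e.g. sentence-initial tokens); the abstract does not specify MSA's actual granularity-selection rule, and the function names here are hypothetical illustrations, not the authors' implementation.

```python
def sparse_attention_mask(n, window, global_tokens):
    """Build a boolean attention mask for n tokens.

    mask[i][j] is True if token i may attend to token j. A token attends
    to neighbors within `window` positions, and global anchor tokens
    (e.g. sentence-initial positions) attend to and are attended by all.
    """
    anchors = set(global_tokens)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if abs(i - j) <= window or i in anchors or j in anchors:
                mask[i][j] = True
    return mask


def density(mask):
    """Fraction of attention pairs kept, relative to full n*n attention."""
    n = len(mask)
    kept = sum(sum(row) for row in mask)
    return kept / (n * n)
```

For a sequence of n tokens, full attention scores n² pairs, while this pattern keeps roughly O(n·window) pairs plus the rows and columns of the anchor tokens; that gap is the source of the computational savings that sparse attention strategies like MSA aim for on long patent documents.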
Suggested Citation
Song, Zhenzhen & Liu, Ziwei & Li, Hongji, 2026.
"Research on Feature Fusion and Multimodal Patent Text Based on Graph Attention Network,"
Journal of Computer, Signal, and System Research, George Brown Press, vol. 3(1), pages 93-100.
Handle:
RePEc:dbb:jcssra:v:3:y:2026:i:1:p:93-100