Author
Listed:
- Meilong Zhu
- Mingda Li
- Kangwei Hou
- Zhaohui Wang
- Xianjun Long
Abstract
The technology disclosure document of a patent application is one of the main bases for judging patent novelty and uniqueness. Automated evaluation can address the long turnaround time and strong subjectivity of human evaluation. However, text similarity evaluation algorithms based on corpora and deep learning suffer from problems such as insufficient cross-library training data and insufficient emphasis on core content when judging the similarity of patent application technology disclosure documents, which limits their performance and practical application. In this paper, we propose a similarity evaluation method for patent application technology disclosure documents based on a multi-dimensional fusion strategy. First, in the text preprocessing stage, we propose word segmentation reconstruction and similarity evaluation optimization strategies based on the weighted fusion of word frequency and part-of-speech scores. Then, we propose a similarity calculation method based on two new mapping spaces, dot matrix and image, to achieve a more diversified comprehensive evaluation. The algorithm was evaluated on four published text similarity matching datasets (with 0–5 or 0/1 labels) and a set of patent application technology disclosure documents. Experimental results show that, on the published text similarity matching datasets, the proposed method improves discrimination accuracy by about 10% compared with the traditional vector semantic model, and matches the discriminative ability of lightweight deep learning models without requiring training.
At the same time, the discrimination accuracy of the proposed method on the sample dataset of patent application technology disclosure documents is superior to that of the traditional vector semantic model (20%) and of various deep learning models (1%-8%), and its precision and recall are relatively balanced. Visual analysis on the dataset of patent application technology disclosure documents also supports the effectiveness and reliability of the proposed similarity calculation method in the dot matrix and image spaces, which provides a new idea and method for evaluating the similarity between patent application technology disclosure documents.
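As a rough illustration of what a weighted fusion of word frequency and part-of-speech scores could look like, the sketch below scales each term's frequency by a weight for its part of speech and compares two documents by cosine similarity. It is not the authors' method: the abstract gives no formulas, so the weight table `POS_WEIGHTS`, the tag names, and the helper functions `weighted_vector` and `cosine_similarity` are all assumptions made for illustration, and the input is assumed to be already segmented and tagged.

```python
import math
from collections import Counter

# Hypothetical part-of-speech weights; the abstract does not state the actual scores.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.5, "OTHER": 0.2}


def weighted_vector(tagged_tokens):
    """Map (token, pos) pairs to a dict of token -> frequency * POS weight.

    If a token occurs with several tags, the last tag seen is used.
    """
    counts = Counter(token for token, _ in tagged_tokens)
    pos_of = {token: pos for token, pos in tagged_tokens}
    return {
        token: count * POS_WEIGHTS.get(pos_of[token], POS_WEIGHTS["OTHER"])
        for token, count in counts.items()
    }


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(token, 0.0) for token, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


# Toy input: tokens are assumed to be segmented and tagged beforehand.
doc_a = [("sensor", "NOUN"), ("measures", "VERB"), ("temperature", "NOUN")]
doc_b = [("sensor", "NOUN"), ("records", "VERB"), ("temperature", "NOUN")]
similarity = cosine_similarity(weighted_vector(doc_a), weighted_vector(doc_b))
```

Weighting nouns above other tags is one simple way to push the score toward core technical content, which is the concern the abstract raises. The dot matrix and image mappings it mentions are not covered by this sketch.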
Suggested Citation
Meilong Zhu & Mingda Li & Kangwei Hou & Zhaohui Wang & Xianjun Long, 2023.
"A multi-dimensional fusion strategy similarity measure method for patent application technology disclosure document,"
PLOS ONE, Public Library of Science, vol. 18(10), pages 1-17, October.
Handle:
RePEc:plo:pone00:0293091
DOI: 10.1371/journal.pone.0293091