Author
Listed:
- Yusen Zhao
- Liang Tian
- Yonggang Wang
Abstract
Automated anomaly detection is vital to industrial quality control, yet conventional deep learning detectors often struggle with scalability. These models, typically following a rigid “one-model-per-task” paradigm, require separate systems for each product line, increasing operational complexity and cost in diverse manufacturing environments. To address this limitation, we propose a unified defect detection framework based on a Multimodal Large Language Model (MLLM). Our approach utilizes a two-stage fine-tuning strategy: Supervised Fine-Tuning (SFT) to impart domain-specific knowledge, followed by a novel Reinforcement Fine-Tuning (RFT) process that refines visual reasoning. This RFT stage is guided by a multi-faceted verifiable reward function designed to optimize localization accuracy, classification correctness, and output structure. On a challenging real-world glove manufacturing dataset, our RFT-enhanced MLLM achieves a mean Average Precision (mAP) of 0.63, which is comparable to a highly specialized YOLO baseline (0.62). More importantly, a single, unified MLLM trained on a mixed-product dataset maintains competitive performance (mAP 0.61), demonstrating its ability to dynamically handle different products and defect types via natural language prompts. This study validates the feasibility of using a single, flexible MLLM to replace multiple rigid models in complex industrial inspection, offering a scalable and cost-effective paradigm for future intelligent quality control systems. The open-source code will be released at https://github.com/GloamXun/Glove-MLLM.
Suggested Citation
Yusen Zhao & Liang Tian & Yonggang Wang, 2026.
"A unified vision-language model for cross-product defect detection in glove manufacturing,"
PLOS ONE, Public Library of Science, vol. 21(2), pages 1-13, February.
Handle:
RePEc:plo:pone00:0339867
DOI: 10.1371/journal.pone.0339867
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0339867. See general information about how to correct material in RePEc.
If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.
We have no bibliographic references for this item. You can help add them by using this form.
If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.
For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosone (email available below). General contact details of provider: https://journals.plos.org/plosone/ .
Please note that corrections may take a couple of weeks to filter through the various RePEc services.