Human-Machine Collaborative Enhanced Interpretable Distillation Model for High-Precision Online Defect Detection
Abstract
Online vision-based defect detection is highly preferred in smart manufacturing for its ability to provide immediate feedback and enable timely correction. However, effective human-machine collaboration in practical deployment faces significant challenges: existing models often lack interpretability, hindering operators from understanding the rationale behind model decisions, effectively intervening in critical judgments, or optimizing the process, thus limiting the system's reliability and efficiency. Concurrently, online detection imposes stringent demands on real-time performance. To address these dual challenges, this research proposes a Human-Machine Collaborative Enhanced Interpretable Knowledge Distillation strategy. It aims to boost the real-time performance of detection models while guaranteeing high accuracy and, crucially, interpretability, thereby effectively supporting human-machine collaboration. Firstly, a CNN-Transformer hybrid network is designed, leveraging the strengths of self-attention for global receptive fields and convolution operations for local receptive fields, to robustly extract features of tiny and irregularly shaped defects. Secondly, an innovative explainable knowledge quantization method is devised to quantize defect and texture features into interpretable knowledge units, explicitly characterizing the model's capability in feature extraction and providing a transparent basis for human interaction. Finally, an explainable knowledge alignment loss function is proposed. It utilizes the superior defect feature extraction capability of the teacher model as a key learning objective for the student model, enabling the student to achieve more precise defect detection with a simpler network architecture. Experimental results demonstrate that the proposed CNN-Transformer hybrid network achieves over 95% accuracy and recall. Visualization experiments confirm that the method better focuses on defect features. 
More importantly, the explainable knowledge distillation strategy significantly outperforms other lightweight methods. It satisfies the stringent accuracy and real-time requirements of online defect detection, and its inherent interpretability directly supports human-machine collaboration: operators can comprehend, trust, and effectively use the model's outputs, which improves the overall performance of the detection system.
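The idea of using the teacher's defect feature extraction as the student's learning objective can be illustrated with a small sketch. The abstract does not define the explainable knowledge alignment loss, so the masked mean squared error below is an assumed stand-in, and the function name `masked_alignment_loss`, the `defect_mask` argument, and the toy feature values are all hypothetical. It only shows the general shape of a feature alignment term that counts student-teacher disagreement on defect positions and ignores background texture.

```python
from typing import Sequence


def masked_alignment_loss(
    student: Sequence[float],
    teacher: Sequence[float],
    defect_mask: Sequence[float],
    eps: float = 1e-8,
) -> float:
    """Mean squared student-teacher feature gap, weighted by a defect mask.

    Hypothetical illustration only: this masked MSE is an assumption, not
    the loss proposed in the paper.
    """
    if not (len(student) == len(teacher) == len(defect_mask)):
        raise ValueError("all inputs must have the same length")
    # Only positions with a non-zero mask contribute to the numerator.
    weighted = sum(
        m * (s - t) ** 2 for s, t, m in zip(student, teacher, defect_mask)
    )
    # Normalize by the mask total; eps avoids division by zero for an empty mask.
    return weighted / (sum(defect_mask) + eps)


# Toy flattened feature maps: positions 1 and 2 are marked as defect positions.
teacher_features = [0.0, 1.0, 1.0, 0.0]
student_features = [0.5, 0.5, 1.0, 0.0]
mask = [0.0, 1.0, 1.0, 0.0]

loss = masked_alignment_loss(student_features, teacher_features, mask)
```

In this toy case the student's error at position 0 is ignored because the mask marks it as background, so only the gap at position 1 contributes. In a real training setup a term like this would typically be added to the student's ordinary detection loss with a weighting factor, which is again an assumption about usage and not something the abstract states.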