Author
Listed:
- Sheng Ran
- Tao Huang
- Wuyue Yang
Abstract
Knowledge Distillation (KD) is one of the most effective and widely used methods for compressing large models, and its success owes much to the careful design of distillation losses. However, most state-of-the-art KD losses are manually crafted and task-specific, raising questions about how much they actually contribute to distillation efficacy. This paper introduces Learnable Knowledge Distillation (LKD), a novel approach that autonomously learns adaptive, performance-driven distillation losses. LKD recasts KD as a bi-level, iterative optimization that differentiably learns distillation losses aligned with the student's validation loss. Building on our proposed generic loss networks for logits and intermediate features, we derive a dynamic optimization strategy that adjusts the losses to the student model's changing state, improving performance and adaptability. For a more robust loss, we further introduce uniform sampling over diverse previously trained student models, exposing the loss to predictions at various stages of convergence. With this more universally adaptable distillation framework, we conduct experiments on datasets such as CIFAR and ImageNet, demonstrating superior performance without task-specific adjustments. For example, LKD achieves 73.62% accuracy with MobileNet on ImageNet, significantly surpassing our KD baseline by 2.94%.
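A minimal sketch of the bi-level loss-learning idea described in the abstract, written in PyTorch for illustration. The loss-network architecture, the one-step unrolled inner update, and all names and hyperparameters (LogitLossNet, bilevel_step, inner_lr) are assumptions made for this example, not the authors' exact method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LogitLossNet(nn.Module):
    """Hypothetical learnable distillation loss over (student, teacher) logits."""

    def __init__(self, num_classes: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * num_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s_logits, t_logits):
        # Softplus keeps the learned per-sample loss non-negative.
        per_sample = F.softplus(self.net(torch.cat([s_logits, t_logits], dim=-1)))
        return per_sample.mean()


def bilevel_step(student, teacher, loss_net, x_train, x_val, y_val,
                 inner_lr=0.1, outer_opt=None):
    """One outer iteration: unroll a single inner SGD step of the student under
    the learned loss, then update the loss network on the student's validation loss."""
    with torch.no_grad():
        t_logits = teacher(x_train)

    # Inner step: differentiable (unrolled) update of the student weights.
    inner_loss = loss_net(student(x_train), t_logits)
    grads = torch.autograd.grad(inner_loss, list(student.parameters()),
                                create_graph=True)
    fast_weights = [w - inner_lr * g for w, g in zip(student.parameters(), grads)]

    # Functional forward pass of the updated student on validation data.
    # Assumes `student` is an nn.Sequential of Linear/ReLU layers (illustrative).
    h, idx = x_val, 0
    for layer in student:
        if isinstance(layer, nn.Linear):
            h = F.linear(h, fast_weights[idx], fast_weights[idx + 1])
            idx += 2
        else:
            h = layer(h)
    val_loss = F.cross_entropy(h, y_val)

    # Outer step: backpropagate the validation loss into the loss network only.
    outer_opt.zero_grad()
    val_loss.backward()
    outer_opt.step()
    return val_loss.item()
```

In this sketch, `outer_opt` would be an optimizer over `loss_net.parameters()`; alternating such outer steps with ordinary student training, and sampling previously trained students at different convergence stages, corresponds to the iterative, sampling-based scheme the abstract outlines.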
Suggested Citation
Sheng Ran & Tao Huang & Wuyue Yang, 2025.
"Tailored knowledge distillation with automated loss function learning,"
PLOS ONE, Public Library of Science, vol. 20(6), pages 1-16, June.
Handle:
RePEc:plo:pone00:0325599
DOI: 10.1371/journal.pone.0325599