Semi-Supervised Knowledge Distillation Framework towards Lightweight Large Language Model for Spoken Language Translation
Abstract
Even though large language models (LLMs) have demonstrated remarkable performance across various natural language processing tasks, their application to speech-related tasks has remained largely underexplored. This work addresses that gap by incorporating acoustic features into an LLM that can be fine-tuned for downstream direct speech-to-text translation and automatic speech recognition tasks. To address the computational demands of fine-tuning LLMs, a novel self- and semi-supervised knowledge distillation technique is proposed to train a lightweight LLM with 50% fewer parameters. Validated on the MuST-C and LibriSpeech datasets, the technique retains over 92% of the larger LLM's performance, demonstrating both robust accuracy and computational efficiency. © 2025 IEEE.
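The abstract does not spell out the distillation objective, so for orientation only, below is a minimal sketch of a standard temperature-scaled knowledge-distillation loss (after Hinton et al.), where the semi-supervised aspect could correspond to relying solely on the teacher's soft targets for unlabeled audio. The function name, the temperature T, and the mixing weight alpha are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels=None, T=2.0, alpha=0.5):
    """Hypothetical KD objective: soft-target KL plus optional hard-label CE.

    This is a generic sketch; the paper's self- and semi-supervised variant
    is not detailed in the abstract. T and alpha are assumed hyperparameters.
    """
    # Soften both distributions with temperature T; scale by T^2 so the
    # gradient magnitude stays comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Unlabeled (semi-supervised) batch: the teacher's predictions act as
    # the only training signal for the smaller student.
    if labels is None:
        return soft

    # Labeled batch: blend the soft teacher signal with hard-label
    # cross-entropy on the ground-truth transcripts/translations.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In this reading, the lightweight student (roughly half the teacher's parameters) would be trained on a mix of labeled batches (both terms) and unlabeled speech batches (teacher soft targets alone); the actual objective in the paper may differ.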