MFCS-based Barrier-Free Audio Description Production System for Film and Television Programs
Abstract
With the rapid advancement of artificial intelligence technology, the field of accessible film and television production is encountering unprecedented opportunities and challenges. Traditional audio description production processes are often complex, costly, and labour-intensive, and struggle to meet the growing and evolving demands for accessibility. In this paper, we propose and implement an automatic audio description generation system for accessible film and television programs based on the Model Function Calling Standard (MFCS). Leveraging the powerful semantic understanding and generation capabilities of large language models (LLMs), combined with MFCS's standardized tool invocation mechanism, the system integrates external tools and services such as natural language processing APIs, speech recognition and synthesis engines, and multilingual translation models. It achieves automatic generation of audio descriptions, multilingual support, emotional adaptation, and speech synthesis, constituting an efficient, flexible, and scalable audio description production platform. Experimental results show that the system significantly improves the efficiency and quality of audio description production and reduces costs, thereby providing visually impaired audiences with a more diverse and personalized viewing experience. Furthermore, we explore the potential for further application of MFCS in accessible film and television production, offering new insights for promoting the widespread adoption of accessible information services and technological innovation.
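The abstract describes an LLM that drives external tools (TTS engines, translation models) through MFCS's standardized invocation mechanism. The MFCS wire format is not reproduced here, so the sketch below is a hypothetical illustration only: the JSON schema (`name`/`arguments`), the tool registry, and the `synthesize_speech` tool are all assumptions modeled on common LLM function-calling conventions, not the paper's actual specification.

```python
import json

# Hypothetical tool registry; tool names and signatures are illustrative,
# not taken from the MFCS specification.
TOOL_REGISTRY = {}

def register_tool(name):
    """Decorator that exposes a Python function as a callable tool."""
    def decorator(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator

@register_tool("synthesize_speech")
def synthesize_speech(text, language="en", emotion="neutral"):
    # Stand-in for a real speech-synthesis engine call; returns a tag
    # describing the audio that would be produced.
    return f"<audio:{language}:{emotion}:{text}>"

def dispatch(call_json):
    """Parse a model-emitted tool call (assumed JSON shape) and invoke it."""
    call = json.loads(call_json)
    fn = TOOL_REGISTRY[call["name"]]
    return fn(**call.get("arguments", {}))

# Example: the LLM emits a structured call to voice a scene description
# with an emotion matched to the scene's mood.
model_output = json.dumps({
    "name": "synthesize_speech",
    "arguments": {"text": "A man walks alone into the rain.",
                  "emotion": "somber"},
})
print(dispatch(model_output))
```

The point of the pattern is that the LLM only emits structured calls; each engine (TTS, translation, recognition) is registered once behind a uniform interface, which is what makes the pipeline extensible to new tools without changing the model prompt logic.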