In December 2024, the NLP group had one paper accepted by AAAI 2025. AAAI 2025, in full the Thirty-Ninth AAAI Conference on Artificial Intelligence, is one of the top conferences in artificial intelligence. It is organized annually by the Association for the Advancement of Artificial Intelligence (AAAI). AAAI 2025 will be held in Philadelphia, USA, from February 25 to March 4, 2025.
The accepted paper is summarized as follows:
- Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation (Shoutao Guo, Shaolei Zhang, Zhengrui Ma, Yang Feng)
- AAAI Main Conference, long paper
Streaming generation models generate responses while reading inputs, often requiring a policy-decision maker to determine the appropriate timing for output. Existing streaming generation methods typically adopt traditional encoder-decoder architectures and use complex dynamic programming techniques to learn generation and decision-making capabilities jointly. While current large language models (LLMs) excel in text generation, they face significant challenges when acting as decision-makers under conventional training methods, which has limited their exploration in streaming generation. To overcome these limitations, we propose a novel LLM-driven Streaming Generation framework (LSG) that allows an off-the-shelf LLM to decide the timing of generation while generating output simultaneously. Specifically, LSG chooses a latency-minimizing generation strategy as the baseline policy. With this reference policy, LSG enables LLMs to devise improved generation policies that achieve a better balance between latency and generation quality, and to generate results accordingly. Our experiments on streaming text-to-text translation, streaming speech-to-text translation, and streaming automatic speech recognition tasks show that our approach achieves state-of-the-art performance using open-source LLMs and demonstrates its practical applicability in real-world scenarios.
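To make the idea of a read/write policy concrete, the following is a minimal sketch of a generic streaming generation loop. It is not the LSG method from the paper: the policy shown is a simple fixed wait-k rule swapped in for illustration, and the names `wait_k_policy`, `simulate`, and `toy_generate` are hypothetical. The toy generator also assumes the output has the same length as the input, which real translation or speech recognition does not guarantee.

```python
from typing import Callable, List, Tuple

# Hypothetical policy type: given the number of source tokens read and the
# number of target tokens written so far, return "READ" or "WRITE".
Policy = Callable[[int, int], str]


def wait_k_policy(k: int) -> Policy:
    """Illustrative fixed policy: write only when the reader is k tokens ahead."""
    def decide(num_read: int, num_written: int) -> str:
        return "WRITE" if num_read - num_written >= k else "READ"
    return decide


def toy_generate(prefix: List[str], output: List[str]) -> str:
    """Stand-in for a real model: uppercase the next unread-by-output token."""
    return prefix[len(output)].upper()


def simulate(source: List[str], policy: Policy) -> Tuple[List[str], List[str]]:
    """Interleave READ and WRITE actions until the output is complete."""
    num_read = 0
    output: List[str] = []
    actions: List[str] = []
    # Toy assumption: one output token per source token.
    while len(output) < len(source):
        can_read = num_read < len(source)
        if can_read and policy(num_read, len(output)) == "READ":
            num_read += 1  # consume one more source token
            actions.append("READ")
        else:
            # Generate from the source prefix seen so far; once the source
            # is exhausted, the loop can only write.
            output.append(toy_generate(source[:num_read], output))
            actions.append("WRITE")
    return output, actions
```

A smaller `k` writes earlier (lower latency) but sees less of the input before each output token, which is the latency-quality trade-off the abstract refers to.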