Abstract
In this work, we implement music production for silent film clips using an LLM-driven method.
Given the strong professional demands of film music production, we propose FilmComposer, which simulates the actual workflows of professional musicians.
FilmComposer is the first to combine large generative models with a multi-agent approach, leveraging the advantages of both waveform music and symbolic music generation.
Additionally, FilmComposer is the first to focus on the three core elements of music production for film—audio quality, musicality, and musical development—and introduces various controls, such as rhythm, semantics, and visuals, to enhance these key aspects.
Specifically, FilmComposer consists of a visual processing module, a rhythm-controllable MusicGen, and multi-agent assessment, arrangement, and mixing.
In addition, our framework integrates seamlessly into actual music production pipelines and allows user intervention at every step, providing strong interactivity and a high degree of creative freedom.
Furthermore, to address the lack of a professional, high-quality film music dataset, we introduce MusicPro-7k, which includes 7,418 film clips, music, descriptions, rhythm spots, and main melodies.
Finally, both the standard metrics and the new specialized metrics we propose demonstrate that the music generated by our model achieves state-of-the-art performance in terms of quality, consistency with video, diversity, musicality, and musical development.