Prompt Relay: Inference-Time Prompt Routing for Temporal Control in Multi-Event Video Generation


S-Lab, Nanyang Technological University

Prompt Relay is now integrated into Wan 🎉



TL;DR

Existing video generation models lack mechanisms for fine-grained temporal control in multi-event video generation. To address this, we propose Prompt Relay, an inference-time, training-free, plug-and-play method that provides granular control over the temporal placement of each text prompt.

[Figure: Temporal cross-attention teaser]

Method

Given a sequence of temporally constrained text prompts $\{(p_s, t_s^{\text{start}}, t_s^{\text{end}})\}_{s=1}^{N}$, our goal is to generate a video in which each prompt $p_s$ is realized within its designated temporal interval $[t_s^{\text{start}}, t_s^{\text{end}}]$. The generated video should preserve global coherence while ensuring that each prompt influences only its assigned temporal region.
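To make the problem setup concrete, the following is a minimal sketch of how the temporally constrained prompts could be represented and mapped onto frames. It is not the Prompt Relay implementation and does not show how the method routes prompts inside the model. The names TimedPrompt and frame_prompt_mask are hypothetical, and the choices to express intervals in seconds and to test each frame at its centre timestamp are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimedPrompt:
    """One temporally constrained prompt (p_s, t_s^start, t_s^end).

    Hypothetical container; the unit (seconds) is an assumption.
    """
    text: str
    start: float
    end: float

    def __post_init__(self) -> None:
        if not self.start < self.end:
            raise ValueError("start must be strictly less than end")


def frame_prompt_mask(
    prompts: List[TimedPrompt], num_frames: int, duration: float
) -> List[List[bool]]:
    """Return mask[f][s], True when frame f lies in prompt s's interval.

    Each frame is represented by its centre timestamp (an assumption),
    and intervals are treated as closed: [start, end].
    """
    mask = []
    for f in range(num_frames):
        t = (f + 0.5) * duration / num_frames  # centre timestamp of frame f
        mask.append([p.start <= t <= p.end for p in prompts])
    return mask


# Illustrative input: two events in a 4-second, 8-frame clip.
prompts = [
    TimedPrompt("a cat walks into the room", 0.0, 2.0),
    TimedPrompt("the cat jumps onto the sofa", 2.0, 4.0),
]
mask = frame_prompt_mask(prompts, num_frames=8, duration=4.0)
```

In this sketch, frames 0 to 3 are assigned to the first prompt and frames 4 to 7 to the second. A mask of this shape is one possible way to express the requirement that each prompt influences only its assigned temporal region.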

Video Gallery

Citation

If you find Prompt Relay useful in your research or projects, please consider citing our paper:

@article{chen2026prompt,
  title={Prompt Relay: Inference-Time Temporal Control for Multi-Event Video Generation},
  author={Chen, Gordon and Huang, Ziqi and Liu, Ziwei},
  journal={arXiv preprint arXiv:2604.10030},
  year={2026}
}