LiveSVG turns a static SVG into a prompt-aligned animation, using video generation for motion guidance.
The pipeline separates motion generation from SVG optimization, so the target motion is explicit before vector fitting begins.
An LLM groups related SVG elements so coherent parts can move together.
Temporary sphere-packing colors separate paths and improve pixel-space correspondence.
An image-to-video model creates a previewable motion target from the SVG and prompt.
Group homographies and local Bézier offsets deform the original vector geometry over time.
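The deformation model in the last step can be illustrated with a minimal sketch. It assumes that every control point in a group is first mapped by one shared 3×3 homography and then shifted by its own small offset; the order of composition, the function names, and the example values are assumptions for illustration, not details confirmed by the source.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography (with perspective divide)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def deform_control_points(points, H, offsets):
    """Deform one group's Bézier control points for a single time step.

    points:  list of (x, y) control points from the original SVG paths
    H:       3x3 homography shared by the whole group (coarse motion)
    offsets: list of (dx, dy), one per control point (fine local motion)
    Assumption: the homography is applied first, then the offsets are added.
    """
    deformed = []
    for point, (dx, dy) in zip(points, offsets):
        u, v = apply_homography(H, point)
        deformed.append((u + dx, v + dy))
    return deformed


# One cubic Bézier segment: start point, two handles, end point.
segment = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]

# Hypothetical group motion for this frame: a pure translation by (10, 5).
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, 5.0],
     [0.0, 0.0, 1.0]]

# Hypothetical local offsets: nudge only the first handle upward by 0.5.
offsets = [(0.0, 0.0), (0.0, 0.5), (0.0, 0.0), (0.0, 0.0)]

frame = deform_control_points(segment, H, offsets)
```

Because only control-point coordinates change, the number and type of path commands stay fixed, which is what keeps the result an editable SVG with the original topology.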
Pipeline overview. The target video provides the motion; differentiable rendering transfers it back to editable SVG paths.
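To make the temporary coloring step concrete, the sketch below picks well-separated RGB colors so that neighboring paths are easy to tell apart in pixel space. It uses greedy farthest-point sampling on a coarse RGB grid as a simple stand-in for sphere packing; that substitution, the grid resolution, and the path ids are assumptions rather than the authors' procedure.

```python
from itertools import product


def sq_dist(p, q):
    """Squared Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def spread_colors(n, levels=6):
    """Greedily pick n RGB colors that are far apart from each other.

    Candidates lie on a levels^3 grid in the RGB cube. Each round adds the
    candidate whose nearest already-chosen color is farthest away.
    """
    step = 255 // (levels - 1)
    candidates = [tuple(c * step for c in rgb)
                  for rgb in product(range(levels), repeat=3)]
    if not 1 <= n <= len(candidates):
        raise ValueError("n must be between 1 and levels**3")

    chosen = [candidates[0]]  # start from black
    nearest = [sq_dist(c, chosen[0]) for c in candidates]
    while len(chosen) < n:
        idx = max(range(len(candidates)), key=nearest.__getitem__)
        chosen.append(candidates[idx])
        nearest = [min(d, sq_dist(c, candidates[idx]))
                   for c, d in zip(candidates, nearest)]
    return chosen


def to_hex(rgb):
    """Format an RGB triple as an SVG fill value such as '#ff0033'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical path ids; the real ids would come from the input SVG.
path_ids = ["body", "wing_left", "wing_right", "beak", "eye"]
temp_fills = {pid: to_hex(rgb)
              for pid, rgb in zip(path_ids, spread_colors(len(path_ids)))}
```

The original fills would be restored after fitting, since the colors are only a temporary aid for correspondence.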
LiveSVG (highlighted) produces large, topology-preserving deformations. SDS- and LLM-based baselines remain close to the static input or exhibit only limited motion.

Different random seeds produce distinct motions for the same input. Users can preview each candidate before committing to SVG fitting, a key advantage of decoupling video generation from optimization.

All clips are live SVG files animated with SMIL; the original editable vector geometry is preserved in every frame.
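As a rough illustration of how per-frame geometry can be packaged as a SMIL animation, the sketch below writes a path whose `d` attribute is animated through a list of keyframes with the standard `<animate>` element. The helper names, fill color, duration, and example keyframes are assumptions for illustration and are not taken from the LiveSVG export code.

```python
from xml.sax.saxutils import quoteattr


def smil_path(frames, fill="#4a90d9", dur="2s"):
    """Build a <path> whose geometry is animated through the given keyframes.

    frames: list of path-data strings, one per keyframe. SMIL can only
    interpolate between keyframes that share the same sequence of path
    commands, so every string must have identical structure.
    """
    values = ";".join(frames)
    return (
        f"<path d={quoteattr(frames[0])} fill={quoteattr(fill)}>"
        f'<animate attributeName="d" values={quoteattr(values)} '
        f'dur={quoteattr(dur)} repeatCount="indefinite"/>'
        "</path>"
    )


def smil_svg(paths, width=100, height=100):
    """Wrap animated paths in a standalone SVG document."""
    body = "".join(paths)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">{body}</svg>'
    )


# Hypothetical keyframes: one cubic curve bending down and back up.
frames = [
    "M 10 50 C 30 10, 70 10, 90 50",
    "M 10 50 C 30 90, 70 90, 90 50",
    "M 10 50 C 30 10, 70 10, 90 50",
]
svg = smil_svg([smil_path(frames)])
```

Saving `svg` to a `.svg` file and opening it in a browser that supports SMIL plays the loop. The same-structure requirement on keyframes is where topology-preserving deformation matters.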

AniClipart covers simple single-subject clipart. ChallengeSVG adds 35 complex, multi-object SVGs from SVGX-Core-250k that expose failure modes beyond this narrow setting.
AniClipart: single-subject clipart with clean backgrounds and roughly 20 paths per example, biased toward human and animal subjects.
ChallengeSVG: multi-object scenes with layered occlusions, non-empty backgrounds, dense path counts, and open-domain subjects.

LiveSVG wins human preference on both benchmarks and has lower runtime and GPU memory use than every optimization-based baseline.
LiveSVG achieves the best prompt alignment (X-CLIP) and, among optimization-based methods, the best appearance preservation (LPIPS, SSIM), the lowest runtime, and less than half the GPU memory of any baseline.

AniClipart:

| Method | X-CLIP↑ | LPIPS↓ | SSIM↑ | DOVER↑ | Time↓ | VMem↓ |
|---|---|---|---|---|---|---|
| No animation | 0.211 | 0.000 | 1.000 | 0.444 | — | — |
| **LLM-based** | | | | | | |
| Vector Prism | 0.211 | 0.032 | 0.973 | 0.451 | 4.6m | — |
| **Video SDS optimization** | | | | | | |
| LiveSketch | 0.206 | 0.153 | 0.910 | 0.496 | 22.3m | 40.0G |
| AniClipart | 0.214 | 0.104 | 0.937 | 0.427 | 6.9m | 27.8G |
| FlexiClip | 0.213 | 0.092 | 0.938 | 0.431 | 14.5m | 28.1G |
| LINR-Bridge | 0.215 | 0.174 | 0.925 | 0.433 | 10.7m | 16.8G |
| **Target-video fitting (ours)** | | | | | | |
| LiveSVG (Veo 3.1) (best) | 0.216 | 0.087 | 0.942 | 0.447 | 5.2m | 7.4G |
| LiveSVG (LTX 2.3) | 0.215 | 0.105 | 0.940 | 0.445 | 5.2m | 7.4G |
| LiveSVG (WAN 2.2) | 0.214 | 0.116 | 0.938 | 0.446 | 5.2m | 7.4G |

ChallengeSVG:

| Method | X-CLIP↑ | LPIPS↓ | SSIM↑ | DOVER↑ | Time↓ | VMem↓ |
|---|---|---|---|---|---|---|
| No animation | 0.214 | 0.000 | 1.000 | 0.470 | — | — |
| **LLM-based** | | | | | | |
| Vector Prism | 0.211 | 0.139 | 0.867 | 0.461 | 5.2m | — |
| **Video SDS optimization** | | | | | | |
| LiveSketch | 0.182 | 0.503 | 0.609 | 0.397 | 5.6m | 40.4G |
| AniClipart | 0.204 | 0.274 | 0.781 | 0.433 | 26.1m | 28.1G |
| FlexiClip | 0.201 | 0.298 | 0.773 | 0.424 | 50.5m | 28.1G |
| LINR-Bridge | 0.205 | 0.491 | 0.729 | 0.426 | 15.7m | 17.2G |
| **Target-video fitting (ours)** | | | | | | |
| LiveSVG (WAN 2.2) (best) | 0.215 | 0.208 | 0.844 | 0.476 | 4.7m | 7.4G |
BibTeX coming soon. A citation entry will be made available once the paper is published.