LiveSVG: Zero-Shot SVG Animation via Video Generation

Overview

Animated motion while preserving editable SVG geometry

LiveSVG animates an existing SVG from a motion prompt without training on vector-animation data. Instead of optimizing against a noisy in-loop video prior, it first generates a concrete target video that can be previewed, then fits the original SVG paths to that target with differentiable rendering. The result is prompt-aligned motion that remains editable as SVG.

Method

How LiveSVG works

The pipeline separates motion generation from SVG optimization, so the target motion is explicit before vector fitting begins.

Group

An LLM groups related SVG elements so coherent parts can move together.
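As a rough illustration of what the grouping step produces, the sketch below wraps paths in `<g>` elements given a mapping from group names to path ids. The JSON shape, the function name `apply_grouping`, and the assumption that every path is a direct child of the root with an `id` are hypothetical; the source does not specify the LLM's output format.

```python
import json
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)


def apply_grouping(svg_text: str, grouping_json: str) -> str:
    """Wrap paths in <g> elements per a {group_name: [path_id, ...]} mapping.

    The JSON shape is an assumption standing in for an LLM reply.
    """
    root = ET.fromstring(svg_text)
    grouping = json.loads(grouping_json)
    # Index the root's direct children by their id attribute.
    by_id = {el.get("id"): el for el in list(root) if el.get("id")}
    for name, ids in grouping.items():
        members = [by_id[i] for i in ids if i in by_id]
        if not members:
            continue
        group = ET.Element(f"{{{SVG_NS}}}g", {"id": name})
        # Place the group where its first member sat in the draw order.
        position = list(root).index(members[0])
        root.insert(position, group)
        for el in members:
            root.remove(el)
            group.append(el)
    return ET.tostring(root, encoding="unicode")
```

Moving paths into a group can change draw order, so a real pipeline would need to check that occlusions are preserved.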

Recolor

Temporary sphere-packing colors separate paths and improve pixel-space correspondence.
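One way to picture the recoloring is to pick colors that sit far apart in the RGB cube. The sketch below uses greedy farthest-point sampling over a coarse color grid as a simple stand-in for sphere packing; it is not the authors' procedure, and `spread_colors` and its `steps` parameter are hypothetical names.

```python
import itertools
import math


def spread_colors(n: int, steps: int = 6) -> list[tuple[int, int, int]]:
    """Pick n well-separated RGB colors by greedy farthest-point sampling.

    Assumption: this approximates sphere packing in the RGB cube.
    """
    levels = [round(255 * i / (steps - 1)) for i in range(steps)]
    candidates = list(itertools.product(levels, repeat=3))
    if n > len(candidates):
        raise ValueError("increase steps to get more candidate colors")
    chosen = [candidates[0]]  # start from black (0, 0, 0)
    # Distance from each candidate to its nearest already-chosen color.
    nearest = [math.dist(c, chosen[0]) for c in candidates]
    while len(chosen) < n:
        # Take the candidate that is farthest from everything chosen so far.
        best = max(range(len(candidates)), key=nearest.__getitem__)
        chosen.append(candidates[best])
        nearest = [min(d, math.dist(c, candidates[best]))
                   for d, c in zip(nearest, candidates)]
    return chosen
```

Each path would receive one of these colors before rendering the input frame, and the original fills would be restored after fitting.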

Generate video

An image-to-video model creates a previewable motion target from the SVG and prompt.

Fit SVG

Group homographies and local Bézier offsets deform the original vector geometry over time.
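To make the deformation model concrete, the sketch below applies a 3x3 homography to a path's Bézier control points and then adds a per-point offset. The order of composition (homography first, offset second) and the names `apply_homography` and `deform_path` are assumptions; in the actual method these parameters would be optimized per frame through a differentiable renderer, which is not shown here.

```python
Point = tuple[float, float]
Matrix3 = list[list[float]]


def apply_homography(h: Matrix3, p: Point) -> Point:
    """Map a 2D point through a 3x3 homography in homogeneous coordinates."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def deform_path(control_points: list[Point], h: Matrix3,
                offsets: list[Point]) -> list[Point]:
    """Apply the group homography, then add a local offset per control point."""
    moved = [apply_homography(h, p) for p in control_points]
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(moved, offsets)]
```

Because only control-point positions change, the number and type of path segments stay the same, which is what keeps the output editable.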

Pipeline overview. The target video provides the motion; differentiable rendering transfers it back to editable SVG paths.

Qualitative Comparison

LiveSVG vs. prior methods

LiveSVG produces large, topology-preserving deformations, while baselines based on score distillation sampling (SDS) or LLMs remain close to the static input or exhibit only limited motion.

"Full gymnastic split on the floor, both hands raised high."

[Video panels: LiveSVG (Ours), AniClipart, FlexiClip, LiveSketch, LINR-Bridge, Vector Prism]

"A man sits on the floor."

[Video panels: LiveSVG (Ours), AniClipart, FlexiClip, LiveSketch, LINR-Bridge, Vector Prism]

Diversity

Multiple plausible motions from a single prompt

Different random seeds produce distinct motions for the same input. Users preview each candidate before committing to SVG fitting — a key advantage of decoupling video generation from optimization.

Prompt — "A person rapping into a microphone"

[Video panels: Variant A, Variant B, Variant C, Variant D]

Prompt — "A woman standing and waving her hand"

[Video panels: Variant A, Variant B, Variant C, Variant D]

Gallery

Animated SVGs across diverse subjects

All clips are live SVG files animating via SMIL — the original editable vector geometry is preserved in every frame.
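For readers unfamiliar with SMIL, the sketch below shows one plausible way to export fitted keyframes: a `<path>` whose `d` attribute is driven by an `<animate>` element with a semicolon-separated `values` list. The function name `smil_path_animation` and the choice to animate `d` directly are assumptions; the source only states that the clips animate via SMIL.

```python
from xml.sax.saxutils import quoteattr


def smil_path_animation(path_id: str, frames: list[str], duration_s: float) -> str:
    """Build an SVG <path> whose `d` attribute is animated with SMIL.

    Every frame should use the same sequence of path commands so that the
    renderer can interpolate between keyframes.
    """
    if len(frames) < 2:
        raise ValueError("need at least two keyframes")
    values = ";".join(frames)
    return (
        f'<path id={quoteattr(path_id)} d={quoteattr(frames[0])}>'
        f'<animate attributeName="d" values={quoteattr(values)} '
        f'dur="{duration_s:g}s" repeatCount="indefinite"/>'
        '</path>'
    )
```

The returned fragment can be placed inside an `<svg>` element; the static `d` value remains in the file, so the shape stays editable in a vector editor.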

From the AniClipart benchmark

Dancing energetically

Surfboard tilts, legs crouch

Joyful bird flying up

Swaying torso, tail flicks

Breakdancing on one hand

Quick side-to-side slips

ChallengeSVG — complex multi-object scenes

Man surfing on a wave

Two men wrestling

Mom feeding her baby

Statue of Liberty

Man juggling

Woman climbing a mountain

New Benchmark

Introducing ChallengeSVG

AniClipart covers simple single-subject clipart. ChallengeSVG adds 35 complex, multi-object SVGs from SVGX-Core-250k that expose failure modes beyond this narrow setting.

Existing benchmark: AniClipart (43 examples)

Single-subject clipart with clean backgrounds and roughly 20 paths per example, biased toward human and animal subjects.

Single-subject clipart
Clean white backgrounds
Skeleton-friendly subjects

New benchmark: ChallengeSVG (35 examples)

Multi-object scenes with layered occlusions, non-empty backgrounds, dense path counts, and open-domain subjects.

Multi-object & multi-part scenes
Non-empty backgrounds
Not skeleton-centric

Example inputs from ChallengeSVG

Climber

Cyclist

Rider

Jellyfish

Surfer

Earth

Juggler

Comparison on a ChallengeSVG example

"The Earth spinning around its axis." (ChallengeSVG)

[Video panels: LiveSVG (Ours), AniClipart, FlexiClip, LiveSketch, LINR-Bridge, Vector Prism]

"A moving jellyfish." (ChallengeSVG)

[Video panels: LiveSVG (Ours), AniClipart, FlexiClip, LiveSketch, LINR-Bridge, Vector Prism]

Evaluation

Human preference and automatic metrics

LiveSVG wins human preference on both benchmarks and achieves the lowest runtime and GPU footprint among optimization-based baselines.

AniClipart human preference (Oracle): 86.7%
ChallengeSVG human preference: 84.8%
Runtime per SVG: 5.2 min (vs. up to 22.3 min)
GPU memory: 7.4 GB (vs. up to 40 GB)

Human preference: LiveSVG wins vs. each baseline

[Chart legend: LiveSVG preferred (> 50%), baseline preferred, 50% chance level]
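A pairwise preference rate is the share of votes that favor one method, and a one-sided binomial test is one common way to check it against the 50% chance level. The sketch below shows that arithmetic; the source does not describe which statistical test was used, and the vote counts in the usage note are hypothetical.

```python
from math import comb


def preference_rate(wins: int, total: int) -> float:
    """Fraction of pairwise votes that preferred the method."""
    if total <= 0:
        raise ValueError("total must be positive")
    return wins / total


def binomial_p_value(wins: int, total: int) -> float:
    """One-sided probability of at least `wins` successes under a fair coin."""
    if not 0 <= wins <= total:
        raise ValueError("wins must lie between 0 and total")
    return sum(comb(total, k) for k in range(wins, total + 1)) / 2 ** total
```

With hypothetical counts of 40 wins out of 50 votes, `preference_rate(40, 50)` gives 0.8, and `binomial_p_value(40, 50)` gives the probability of a result at least that lopsided if raters had no preference.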

Automatic Metrics

Quantitative evaluation

LiveSVG achieves the best prompt alignment (X-CLIP), the best appearance preservation (LPIPS, SSIM) among optimization-based methods, and the lowest runtime and GPU memory among optimization-based methods (7.4 GB versus 16.8 GB or more).

AniClipart Benchmark

Method	X-CLIP↑	LPIPS↓	SSIM↑	DOVER↑	Time↓	VMem↓
No animation	0.211	0.000	1.000	0.444	—	—
LLM-based
Vector Prism	0.211	0.032	0.973	0.451	4.6m	—
Video SDS optimization
LiveSketch	0.206	0.153	0.910	0.496	22.3m	40.0G
AniClipart	0.214	0.104	0.937	0.427	6.9m	27.8G
FlexiClip	0.213	0.092	0.938	0.431	14.5m	28.1G
LINR-Bridge	0.215	0.174	0.925	0.433	10.7m	16.8G
Target-video fitting (ours)
LiveSVG (Veo 3.1) [Best]	0.216	0.087	0.942	0.447	5.2m	7.4G
LiveSVG (LTX 2.3)	0.215	0.105	0.940	0.445	5.2m	7.4G
LiveSVG (WAN 2.2)	0.214	0.116	0.938	0.446	5.2m	7.4G

ChallengeSVG Benchmark

Method	X-CLIP↑	LPIPS↓	SSIM↑	DOVER↑	Time↓	VMem↓
No animation	0.214	0.000	1.000	0.470	—	—
LLM-based
Vector Prism	0.211	0.139	0.867	0.461	5.2m	—
Video SDS optimization
LiveSketch	0.182	0.503	0.609	0.397	5.6m	40.4G
AniClipart	0.204	0.274	0.781	0.433	26.1m	28.1G
FlexiClip	0.201	0.298	0.773	0.424	50.5m	28.1G
LINR-Bridge	0.205	0.491	0.729	0.426	15.7m	17.2G
Target-video fitting (ours)
LiveSVG (WAN 2.2) [Best]	0.215	0.208	0.844	0.476	4.7m	7.4G

Citation

Cite this work

@article{levy2026livesvg,
  title={LiveSVG: Zero-Shot SVG Animation via Video Generation},
  author={Levy, Matan and Margolin, Ran and Cavia, Bar and Samuel, Dvir and Pritch, Yael and Peleg, Shmuel and Rav-Acha, Alex and Shamir, Ariel and Lischinski, Dani},
  journal={arXiv preprint arXiv:2605.30174},
  year={2026}
}