922 changes: 922 additions & 0 deletions cache_dit.hpp

Large diffs are not rendered by default.

126 changes: 126 additions & 0 deletions docs/caching.md
@@ -0,0 +1,126 @@
## Caching

Caching methods accelerate diffusion inference by reusing intermediate computations when changes between steps are small.

### Cache Modes

| Mode | Target | Description |
|------|--------|-------------|
| `ucache` | UNET models | Condition-level caching with error tracking |
| `easycache` | DiT models | Condition-level cache |
| `dbcache` | DiT models | Block-level L1 residual threshold |
| `taylorseer` | DiT models | Taylor series approximation |
| `cache-dit` | DiT models | Combined DBCache + TaylorSeer |

### UCache (UNET Models)

UCache caches the residual (output - input) of a computed step and reuses it when the change in the input is below the threshold.

```bash
sd-cli -m model.safetensors -p "a cat" --cache-mode ucache --cache-option "threshold=1.5"
```

#### Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `threshold` | Error threshold for reuse decision | 1.0 |
| `start` | Start caching at this fraction of total steps (0-1) | 0.15 |
| `end` | Stop caching at this fraction of total steps (0-1) | 0.95 |
| `decay` | Error decay rate (0-1) | 1.0 |
| `relative` | Scale threshold by output norm (0/1) | 1 |
| `reset` | Reset error after computing (0/1) | 1 |

#### Reset Parameter

The `reset` parameter controls error accumulation behavior:

- `reset=1` (default): Resets the accumulated error after each computed step. This caches more aggressively and works well with most samplers.
- `reset=0`: Keeps the accumulated error across computed steps. This is more conservative and is recommended for the `euler_a` sampler.
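
The way `threshold`, `decay`, `relative`, and `reset` interact can be pictured with the sketch below. It is an illustration only: the struct `UCacheSketch`, its members, and the exact update rule are assumptions made for this example and may differ from the actual implementation.

```cpp
#include <cassert>

// Hypothetical sketch of an accumulated-error reuse decision.
// Field names mirror the documented options; the update rule is an assumption.
struct UCacheSketch {
    float threshold   = 1.0f;  // error budget before a recompute is forced
    float decay       = 1.0f;  // factor applied to the carried-over error each step
    bool  relative    = true;  // scale the budget by the output norm
    bool  reset       = true;  // clear the error after a computed step
    float accumulated = 0.0f;  // running error estimate

    // input_change: estimated change of the input since the previous step
    // output_norm:  norm of the most recently computed output
    // Returns true when the cached residual may be reused for this step.
    bool should_reuse(float input_change, float output_norm) {
        assert(decay >= 0.0f && decay <= 1.0f);
        accumulated = accumulated * decay + input_change;
        const float budget = relative ? threshold * output_norm : threshold;
        if (accumulated < budget) {
            return true;  // skip the model call and add the cached residual
        }
        if (reset) {
            accumulated = 0.0f;  // reset=1: start a fresh budget after computing
        }
        return false;  // run the model and refresh the cached residual
    }
};
```

In this sketch, `reset = false` leaves the error above the budget after a recompute, so the following steps are recomputed as well. That is one way to read the "more conservative" behavior described above.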

### EasyCache (DiT Models)

Condition-level caching for DiT models. Caches and reuses outputs when input changes are below threshold.

```bash
--cache-mode easycache --cache-option "threshold=0.3"
```

#### Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `threshold` | Input change threshold for reuse | 0.2 |
| `start` | Start caching at this fraction of total steps (0-1) | 0.15 |
| `end` | Stop caching at this fraction of total steps (0-1) | 0.95 |
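
As a rough illustration of the `start`/`end` window, the helper below treats both values as fractions of the total step count, which is an assumption based on the defaults 0.15 and 0.95. The function name `in_cache_window` and the half-open interval are hypothetical choices for this sketch.

```cpp
#include <cassert>

// Hypothetical helper: is the 0-based `step` inside the caching window?
// `start` and `end` are fractions of the total step count.
bool in_cache_window(int step, int total_steps, float start, float end) {
    assert(total_steps > 0);
    const float progress = static_cast<float>(step) / static_cast<float>(total_steps);
    return progress >= start && progress < end;  // half-open window [start, end)
}
```

With 20 steps and the default window, the first few steps and the last step would always be computed under this reading.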

### Cache-DIT (DiT Models)

For DiT models like FLUX and QWEN, the block-level caching modes `dbcache`, `taylorseer`, and `cache-dit` are available.

#### DBCache

Caches block outputs and reuses them while the L1 residual difference stays below `threshold`:

```bash
--cache-mode dbcache --cache-option "threshold=0.25,warmup=4"
```
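
A minimal sketch of an L1 residual check follows. It assumes the difference is measured relative to the previous residual, i.e. `sum(|cur - prev|) / sum(|prev|)`; the function names and this normalization are assumptions for illustration, not a description of the actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Relative L1 difference between two residuals of equal length.
float relative_l1_diff(const std::vector<float>& prev, const std::vector<float>& cur) {
    assert(prev.size() == cur.size());
    float diff = 0.0f;
    float base = 0.0f;
    for (std::size_t i = 0; i < prev.size(); ++i) {
        diff += std::fabs(cur[i] - prev[i]);
        base += std::fabs(prev[i]);
    }
    // With no reference magnitude, report "infinitely different" so nothing is reused.
    return base > 0.0f ? diff / base : std::numeric_limits<float>::infinity();
}

// Hypothetical decision: reuse cached block outputs only for small changes.
bool can_reuse_blocks(const std::vector<float>& prev, const std::vector<float>& cur,
                      float threshold) {
    return relative_l1_diff(prev, cur) < threshold;
}
```

Under this sketch, a 10% relative change would pass `threshold=0.25` but fail the default of `0.08`.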

#### TaylorSeer

Uses Taylor series approximation to predict block outputs:

```bash
--cache-mode taylorseer
```
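
The idea can be sketched with a first-order extrapolation: estimate the per-step derivative from the two most recent computed outputs and project it forward. The function `taylor_predict` is hypothetical, and the order of the expansion used by the implementation is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// First-order Taylor prediction: f(t + k) ~= f(t) + k * (f(t) - f(t - 1)).
std::vector<float> taylor_predict(const std::vector<float>& previous,
                                  const std::vector<float>& latest,
                                  int steps_ahead) {
    assert(previous.size() == latest.size());
    std::vector<float> predicted(latest.size());
    for (std::size_t i = 0; i < latest.size(); ++i) {
        const float slope = latest[i] - previous[i];  // finite-difference derivative per step
        predicted[i] = latest[i] + static_cast<float>(steps_ahead) * slope;
    }
    return predicted;
}
```

A predicted output replaces the block computation on skipped steps, so the quality of the approximation depends on how smoothly the outputs change between steps.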

#### Cache-DIT (Combined)

Combines DBCache and TaylorSeer:

```bash
--cache-mode cache-dit --cache-preset fast
```

#### Parameters

These options apply to `dbcache`, `taylorseer`, and `cache-dit` and are set with `--cache-option`.

| Parameter | Description | Default |
|-----------|-------------|---------|
| `Fn` | Number of front blocks that are always computed | 8 |
| `Bn` | Number of back blocks that are always computed | 0 |
| `threshold` | L1 residual difference threshold | 0.08 |
| `warmup` | Number of steps before caching starts | 8 |
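
To make `Fn` and `Bn` concrete, the sketch below labels each transformer block as always computed or cacheable. The enum `BlockRole` and the function `assign_block_roles` are hypothetical names for this example; only the front/back split itself comes from the table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class BlockRole { AlwaysCompute, Cacheable };

// The first `fn` and last `bn` blocks are always computed;
// the blocks in between are candidates for caching.
std::vector<BlockRole> assign_block_roles(int num_blocks, int fn, int bn) {
    assert(num_blocks >= 0 && fn >= 0 && bn >= 0);
    std::vector<BlockRole> roles(static_cast<std::size_t>(num_blocks),
                                 BlockRole::AlwaysCompute);
    for (int i = fn; i < num_blocks - bn; ++i) {
        roles[static_cast<std::size_t>(i)] = BlockRole::Cacheable;
    }
    return roles;
}
```

If `fn + bn` covers every block, nothing is marked cacheable in this sketch.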

#### Presets

Available presets: `slow`, `medium`, `fast`, `ultra` (or `s`, `m`, `f`, `u`).

```bash
--cache-mode cache-dit --cache-preset fast
```

#### SCM Options

The Steps Computation Mask (SCM) marks which steps must be computed and which may use the cache:

```bash
--scm-mask "1,1,1,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,1"
```

Mask values: `1` = compute, `0` = can cache.

| Policy | Description |
|--------|-------------|
| `dynamic` | Check the threshold before using the cache on cacheable steps (default) |
| `static` | Always use the cache on cacheable steps |

```bash
--scm-policy dynamic
```
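
The mask and policy together can be read as a two-stage decision, sketched below. The names `parse_scm_mask`, `ScmPolicy`, and `may_use_cache` are hypothetical. Treating steps beyond the end of the mask as "compute" is an assumption of this sketch.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Parse "1,1,0,..." into a per-step flag: true = must compute, false = can cache.
// Any entry other than "0" is treated as "compute" to stay on the safe side.
std::vector<bool> parse_scm_mask(const std::string& text) {
    std::vector<bool> must_compute;
    std::stringstream stream(text);
    std::string item;
    while (std::getline(stream, item, ',')) {
        must_compute.push_back(item != "0");
    }
    return must_compute;
}

enum class ScmPolicy { Dynamic, Static };

// Decide whether the cache may be used at `step`.
bool may_use_cache(const std::vector<bool>& must_compute, std::size_t step,
                   ScmPolicy policy, bool below_threshold) {
    if (step >= must_compute.size() || must_compute[step]) {
        return false;  // masked as compute (or outside the mask)
    }
    // static: always use the cache; dynamic: only when the threshold check passes
    return policy == ScmPolicy::Static ? true : below_threshold;
}
```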

### Performance Tips

- Start with default thresholds and adjust based on output quality
- Lower threshold = better quality, less speedup
- Higher threshold = more speedup, potential quality loss
- More steps generally means more caching opportunities
9 changes: 8 additions & 1 deletion examples/cli/README.md
@@ -126,5 +126,12 @@ Generation Options:
--skip-layers layers to skip for SLG steps (default: [7,8,9])
--high-noise-skip-layers (high noise) layers to skip for SLG steps (default: [7,8,9])
-r, --ref-image reference image for Flux Kontext models (can be used multiple times)
- --easycache enable EasyCache for DiT models with optional "threshold,start_percent,end_percent" (default: 0.2,0.15,0.95)
+ --cache-mode caching method: 'easycache' (DiT), 'ucache' (UNET), 'dbcache'/'taylorseer'/'cache-dit' (DiT block-level)
+ --cache-option named cache params (key=value format, comma-separated):
+     - easycache/ucache: threshold=,start=,end=,decay=,relative=,reset=
+     - dbcache/taylorseer/cache-dit: Fn=,Bn=,threshold=,warmup=
+     Examples: "threshold=0.25" or "threshold=1.5,reset=0"
+ --cache-preset cache-dit preset: 'slow'/'s', 'medium'/'m', 'fast'/'f', 'ultra'/'u'
+ --scm-mask SCM steps mask: comma-separated 0/1 (1=compute, 0=can cache)
+ --scm-policy SCM policy: 'dynamic' (default) or 'static'
```
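
The `key=value`, comma-separated format accepted by `--cache-option` could be parsed along the lines of the sketch below. The function `parse_cache_option` is a hypothetical name, and skipping malformed entries silently is an assumption of this example rather than the behavior of the CLI.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Split "threshold=1.5,reset=0" into {"reset": "0", "threshold": "1.5"}.
std::map<std::string, std::string> parse_cache_option(const std::string& text) {
    std::map<std::string, std::string> options;
    std::stringstream stream(text);
    std::string pair;
    while (std::getline(stream, pair, ',')) {
        const std::size_t eq = pair.find('=');
        if (eq == std::string::npos || eq == 0) {
            continue;  // no '=' or empty key: skip the entry
        }
        options[pair.substr(0, eq)] = pair.substr(eq + 1);
    }
    return options;
}
```

Values are kept as strings here; a caller would convert them (for example with `std::stof`) once it knows which option each key refers to.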
4 changes: 2 additions & 2 deletions examples/cli/main.cpp
@@ -610,7 +610,7 @@ int main(int argc, const char* argv[]) {
gen_params.pm_style_strength,
}, // pm_params
ctx_params.vae_tiling_params,
- gen_params.easycache_params,
+ gen_params.cache_params,
};

results = generate_image(sd_ctx, &img_gen_params);
@@ -635,7 +635,7 @@ int main(int argc, const char* argv[]) {
gen_params.seed,
gen_params.video_frames,
gen_params.vace_strength,
- gen_params.easycache_params,
+ gen_params.cache_params,
};

results = generate_video(sd_ctx, &vid_gen_params, &num_results);