| author | Paul Buetow <paul@buetow.org> | 2026-03-18 19:08:17 +0200 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2026-03-18 19:08:17 +0200 |
| commit | 1c906b2378c49d28b47889e0db659cb6d9cf5395 (patch) | |
| tree | c6ed75b7f2566c0fdca7b012a8a26667dbd70f8e /snippets/hyperstack/hyperstack-vm1.toml | |
| parent | 3b01d5cb2c8932207127e7dd72848cea91c6347d (diff) | |
vllm: skip docker pull on model switch, persist torch compile cache
- model switch now passes pull_image: false to avoid surprise multi-GB
image downloads when the upstream vLLM image has been updated;
docker pull still runs on initial install (pull_image: true default)
- mount /ephemeral/vllm_cache → /root/.cache/vllm so torch.compile
artifacts survive container restarts; saves ~30-60 s on warm switches
- add vllm_compile_cache_dir helper (sibling of hug_cache_dir)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
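
The behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the repository's actual code: only the names `pull_image` and `vllm_compile_cache_dir` and the two paths come from the commit message, while `docker_commands`, the `root` parameter, and the image tag are assumptions made for the example.

```python
from typing import List

# Host-side root taken from the commit message; treating it as a
# configurable default is an assumption of this sketch.
EPHEMERAL_ROOT = "/ephemeral"

# Container-side path where vLLM keeps torch.compile artifacts,
# as named in the commit message.
CONTAINER_CACHE_DIR = "/root/.cache/vllm"


def vllm_compile_cache_dir(root: str = EPHEMERAL_ROOT) -> str:
    """Return the host directory that backs the compile cache."""
    return f"{root}/vllm_cache"


def docker_commands(image: str, pull_image: bool = True) -> List[List[str]]:
    """Build the docker commands for starting a vLLM container.

    Hypothetical helper: pull_image defaults to True (initial install);
    a model switch would pass pull_image=False to skip the pull.
    """
    commands: List[List[str]] = []
    if pull_image:
        commands.append(["docker", "pull", image])
    commands.append([
        "docker", "run", "-d",
        # Bind-mount the cache so compile artifacts survive restarts.
        "-v", f"{vllm_compile_cache_dir()}:{CONTAINER_CACHE_DIR}",
        image,
    ])
    return commands


# Placeholder image tag, used only for illustration.
IMAGE = "vllm/vllm-openai:latest"
install_cmds = docker_commands(IMAGE)                   # pull, then run
switch_cmds = docker_commands(IMAGE, pull_image=False)  # run only
```

With these assumptions, an initial install yields a `docker pull` followed by a `docker run`, while a model switch yields only the `docker run`, reusing whatever image is already present locally.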
Diffstat (limited to 'snippets/hyperstack/hyperstack-vm1.toml')
0 files changed, 0 insertions, 0 deletions
