author Paul Buetow <paul@buetow.org> 2026-03-18 19:08:17 +0200
committer Paul Buetow <paul@buetow.org> 2026-03-18 19:08:17 +0200
commit 1c906b2378c49d28b47889e0db659cb6d9cf5395
tree c6ed75b7f2566c0fdca7b012a8a26667dbd70f8e /snippets/hyperstack/hyperstack-vm1.toml
parent 3b01d5cb2c8932207127e7dd72848cea91c6347d
vllm: skip docker pull on model switch, persist torch compile cache

- model switch now passes pull_image: false, so an updated upstream vLLM image no longer triggers a surprise multi-GB download; docker pull is still run on initial install (pull_image: true is the default)
- mount /ephemeral/vllm_cache → /root/.cache/vllm so torch.compile artifacts survive container restarts; saves ~30-60 s on warm switches
- add vllm_compile_cache_dir helper (sibling of hug_cache_dir)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
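
The behaviour described in this commit message could look roughly like the sketch below. It is an illustration in Python, not the repository's code. The function build_commands, its signature, the image tag, and the model name are assumptions made for the example. Only pull_image, vllm_compile_cache_dir, and the two mount paths come from the message itself.

```python
from pathlib import PurePosixPath

# Hypothetical sketch: the real helpers live in the repository and are not
# shown on this page. Only the paths and the pull_image flag are from the
# commit message.
EPHEMERAL_ROOT = PurePosixPath("/ephemeral")
CONTAINER_CACHE = "/root/.cache/vllm"


def vllm_compile_cache_dir(root: PurePosixPath = EPHEMERAL_ROOT) -> PurePosixPath:
    """Host directory that keeps torch.compile artifacts across restarts."""
    return root / "vllm_cache"


def build_commands(image: str, model: str, pull_image: bool = True) -> list[list[str]]:
    """Return the docker commands needed to start vLLM serving `model`."""
    commands = []
    if pull_image:
        # Initial install: fetch the image (the default).
        commands.append(["docker", "pull", image])
    commands.append([
        "docker", "run", "--detach", "--gpus", "all",
        # Bind-mount the host cache so compiled artifacts outlive the container.
        "--volume", f"{vllm_compile_cache_dir()}:{CONTAINER_CACHE}",
        image, "--model", model,
    ])
    return commands


# Initial install pulls the image; a model switch reuses the local one.
install = build_commands("vllm/vllm-openai:latest", "example/model")
switch = build_commands("vllm/vllm-openai:latest", "example/model", pull_image=False)
```

Under these assumptions, install holds a docker pull followed by a docker run, while switch holds only the docker run, so a model switch never downloads a newer image on its own.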
Diffstat (limited to 'snippets/hyperstack/hyperstack-vm1.toml')
0 files changed, 0 insertions, 0 deletions