path: root/snippets/hyperstack/README.md
author: Paul Buetow <paul@buetow.org> 2026-03-18 12:06:07 +0200
committer: Paul Buetow <paul@buetow.org> 2026-03-18 12:06:07 +0200
commit: 2a2704fa4cac96a6754d4fea1bc341a27c5bb6c8
tree: 6bb555b988c8bef2b738c36a21905327567f27eb /snippets/hyperstack/README.md
parent: b49cb03bb629a20dc459b8146ad8e735578d925d
Add vLLM model presets and live model switching

- New [vllm.presets.*] TOML section with two presets:
    qwen3-coder-next  bullpoint/Qwen3-Coder-Next-AWQ-4bit (256k ctx, coding)
    nemotron-super    solidrust/Llama-3.3-Nemotron-Super-49B-v1-AWQ (131k ctx, analysis)
- New CLI subcommand: `model list`: show presets, mark the active one
- New CLI subcommand: `model switch PRESET [--dry-run]`: switch the running VM
  to a different preset without redeploying:
    1. stops old Docker container (if container_name differs)
    2. starts new container and waits for model readiness
    3. hot-reloads LiteLLM config via litellm_reload_script (no venv reinstall)
    4. updates state file with new vllm_model / vllm_container_name / vllm_preset
- New `create --model PRESET` flag: deploy with a non-default preset
- vllm_install_script and litellm_install_script now accept preset_config: /
  model_override: so callers can override individual fields without
  duplicating the full config
- State file now tracks vllm_container_name and vllm_preset for clean
  container lifecycle management across switches

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'snippets/hyperstack/README.md')
0 files changed, 0 insertions, 0 deletions