hypr - My "local" LLM setup with Hyperstack.


author	Paul Buetow <paul@buetow.org>	2026-04-06 11:02:43 +0300
committer	Paul Buetow <paul@buetow.org>	2026-04-06 11:02:43 +0300
commit	0664ffcc62b2fb240286fde463635e510a41df84 (patch)
tree	c3528d94974a36e975d967c673bfc29890ac9fae /lib/hyperstack/config.rb
parent	ce6adba0cfb47b06506976636bd2b4861112ddd8 (diff)

hyperstack: switch to Gemma 4 31B on VM2, Qwen3-Coder-Next on VM1

VM1 (hyperstack-vm1-coder.toml, renamed from hyperstack-vm1-gptoss.toml):
- Default model switched from gpt-oss-120b to qwen3-coder-next
- Config file renamed to reflect actual default model

VM2 (hyperstack-vm2.toml):
- Default model switched from qwen3-coder-next to Gemma 4 31B AWQ
- Uses vLLM nightly image + transformers==5.5.0 workaround: Gemma 4
  architecture is registered in transformers 5.x but vLLM stable pins <5
- max_model_len=131072 (128K context); KV cache fills ~95% of H100-80GB VRAM
- Added gemma4-31b preset

watcher.rb:
- Add loading_status field to VmSnapshot to show live model-load progress
  (last relevant log line during startup instead of generic "loading" message)
- fetch_vm_stats now captures both Engine 0 stats and loading-phase log lines
  in a single SSH call using a shell variable to avoid two docker log
  invocations
- clean_log_line() strips vLLM PID/timestamp prefix for readable display

cli.rb: update all hardcoded hyperstack-vm1-gptoss.toml references to
hyperstack-vm1-coder.toml

hypr.fish: replace pi-hyperstack-nemotron with pi-hyperstack-coder (VM1),
add pi-hyperstack-gemma4 (VM2)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
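
The watcher.rb change mentions clean_log_line() stripping the vLLM PID/timestamp prefix. The diff is not shown in this view, so the following is only a minimal sketch of how such a helper might look, not the code from this commit. The log-line shape ("(Name pid=N) LEVEL MM-DD HH:MM:SS [file.py:line] message") and the constant name VLLM_PREFIX are assumptions made for illustration.

```ruby
# Hypothetical sketch, NOT the implementation from watcher.rb.
# Assumed vLLM log line shape:
#   (EngineCore_0 pid=1234) INFO 04-06 11:02:43 [loader.py:447] message
VLLM_PREFIX = /
  \A\s*
  (?:\([^)]*pid=\d+\)\s*)?              # optional process tag with a pid
  (?:DEBUG|INFO|WARNING|ERROR)\s+       # log level
  \d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}       # month-day and time of day
  (?:\s+\[[^\]]*\])?                    # optional source file tag
  \s*
/x

# Returns the message part of a log line; lines that do not match the
# assumed prefix are returned unchanged apart from trimmed whitespace.
def clean_log_line(line)
  line.to_s.sub(VLLM_PREFIX, "").strip
end
```

A line without the assumed prefix passes through untouched, so a status display built on this would fall back to showing the raw log text rather than an empty string.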

Diffstat (limited to 'lib/hyperstack/config.rb')

0 files changed, 0 insertions, 0 deletions

