From 1acb65324e8d7d520483535843a753757b0dd4a0 Mon Sep 17 00:00:00 2001
From: Paul Buetow
Date: Sat, 21 Mar 2026 14:56:38 +0200
Subject: Fix Nemotron OOM; add VM lifecycle fish abbrs; document automated setup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- hyperstack-vm1/vm2.toml: reduce nemotron-super max_model_len 262144→131072
  and add --enforce-eager to disable CUDA graph capture (~3-4 GB overhead).
  Nemotron 120B weights (~60 GB) leave too little VRAM headroom for KV cache
  allocation and CUDA graph buffers at 262K context on a single A100 80GB.
  131K context with eager mode is stable. README VRAM table updated to match.

- hyperstack.fish: add hyperstack-create/delete/test and
  hyperstack-create/delete-both abbreviations for VM lifecycle management
  alongside the existing pi-* aliases.

- README.md: add "Automated setup reference" section with single-VM and
  two-VM command flows before the manual vLLM Docker setup section.

End-to-end tested: single VM (GPT-OSS 120B), dual VM (Nemotron + Qwen3-Coder),
pi queries on all three models — all passed.

Co-Authored-By: Claude Sonnet 4.6
---
 hyperstack.fish | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

(limited to 'hyperstack.fish')

diff --git a/hyperstack.fish b/hyperstack.fish
index 1b32a37..09706b5 100644
--- a/hyperstack.fish
+++ b/hyperstack.fish
@@ -1,3 +1,11 @@
+# Single-VM setup (hyperstack-vm.toml → hyperstack.wg1)
 abbr pi-hyperstack pi --model hyperstack/openai/gpt-oss-120b
-abbr pi-hyperstack-nemotron pi --model hyperstack1/cyankiwi/NVIDIA-Nemotron-3-Super-120B-A12B-AWQ-4bit
-abbr pi-hyperstack-coder pi --model hyperstack2/bullpoint/Qwen3-Coder-Next-AWQ-4bit
+abbr hyperstack-create ruby ~/git/hyperstack/hyperstack.rb create
+abbr hyperstack-delete ruby ~/git/hyperstack/hyperstack.rb delete
+abbr hyperstack-test ruby ~/git/hyperstack/hyperstack.rb test
+
+# Dual-VM setup (hyperstack-vm1/vm2.toml → hyperstack1/2.wg1)
+abbr pi-hyperstack-nemotron pi --model hyperstack1/cyankiwi/NVIDIA-Nemotron-3-Super-120B-A12B-AWQ-4bit
+abbr pi-hyperstack-coder pi --model hyperstack2/bullpoint/Qwen3-Coder-Next-AWQ-4bit
+abbr hyperstack-create-both ruby ~/git/hyperstack/hyperstack.rb create-both
+abbr hyperstack-delete-both ruby ~/git/hyperstack/hyperstack.rb delete-both
-- 
cgit v1.2.3