|
- Add per-VM 10s fetch timeout so one dead VM cannot stall the dashboard
- Make fallback logic check VM state (public_ip + ACTIVE status) instead of
just file existence, so a stale/deleted VM1 state does not block watch
- Auto-replace cached SSH host keys when a VM is recreated instead of failing
- Suppress Ruby thread exception noise on killed SSH threads
Fixes 'just watch' showing a blank screen when VM1 is deleted but has a stale
state file, and the SSH host-key mismatch on VM recreation.
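The per-VM timeout described above could look roughly like the sketch below. It is not the watcher's actual code: `fetch_all_with_timeout` and the shape of the error hash are hypothetical, and only the 10s default comes from the message. Each VM is fetched in its own thread, and a slow VM yields an error entry instead of blocking the others.

```ruby
require 'timeout'

# Hypothetical helper: fetch stats for every VM in parallel, bounding each
# fetch with its own timeout so one dead VM cannot stall the rest.
def fetch_all_with_timeout(vms, seconds: 10, &fetch)
  threads = vms.map do |vm|
    Thread.new do
      # Keep Ruby from printing a backtrace if this thread dies or is killed.
      Thread.current.report_on_exception = false
      begin
        Timeout.timeout(seconds) { [vm, fetch.call(vm)] }
      rescue Timeout::Error
        [vm, { error: "timed out after #{seconds}s" }]
      rescue StandardError => e
        [vm, { error: e.message }]
      end
    end
  end
  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value).to_h
end
```

A caller would pass the real per-VM fetch as the block, for example `fetch_all_with_timeout(%w[vm1 vm2]) { |vm| stats_for(vm) }`, where `stats_for` is a placeholder name.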
|
|
provisioned
|
|
Remove the vm_api_reachable? filter from run_watch so VMs that are
currently booting are not silently dropped from the dashboard.
Add exponential-backoff retry logic (up to 4 attempts, sleeping
2s, 4s, 8s, 16s) inside VllmWatcher#fetch_vm_stats for transient
SSH/WireGuard errors such as connection refused, host unreachable,
and exit 255. This lets watch automatically recover while a VM
is still starting up instead of failing immediately.
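A minimal sketch of the backoff loop, under stated assumptions: `with_backoff`, `TRANSIENT_ERRORS`, and the injectable `sleeper` are hypothetical names, and the sketch reads the four listed delays as four retries after the first attempt, which is only one interpretation of the message. The real method also treats SSH exit 255 as transient, which is not modelled here.

```ruby
# Assumed set of transient errors; the actual list in fetch_vm_stats may differ.
TRANSIENT_ERRORS = [Errno::ECONNREFUSED, Errno::EHOSTUNREACH].freeze

# Hypothetical helper: retry the block on transient errors, doubling the
# delay each time (2s, 4s, 8s, 16s with the defaults).
def with_backoff(max_retries: 4, base_delay: 2, sleeper: ->(s) { sleep(s) })
  retries = 0
  begin
    yield
  rescue *TRANSIENT_ERRORS
    # Give up and re-raise once the retry budget is spent.
    raise if retries >= max_retries

    sleeper.call(base_delay * (2**retries))
    retries += 1
    retry
  end
end
```

Passing the sleeper as a lambda keeps the helper testable without real waits; production use would rely on the default.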
|
|
- Drop single-VM default hyperstack-vm.toml and @config_path/@config_explicit machinery
- Add global --vm flag (default: 1) mapping to hyperstack-vm1.toml and/or hyperstack-vm2.toml
- Fold create-both and delete-both into create/delete --vm both
- Teach status, watch, test, model to accept --vm (default: 1)
- Update help text and README/AGENTS/fish abbreviations accordingly
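The flag-to-config mapping might be expressed along these lines. This is a sketch only: `VM_CONFIGS` and `parse_vm_configs` are hypothetical names, and whether every subcommand accepts `both` is an assumption. The file names and the default of 1 come from the message.

```ruby
require 'optparse'

# Hypothetical mapping from the --vm value to the config files it selects.
VM_CONFIGS = {
  '1' => ['hyperstack-vm1.toml'],
  '2' => ['hyperstack-vm2.toml'],
  'both' => ['hyperstack-vm1.toml', 'hyperstack-vm2.toml']
}.freeze

# Parse a global --vm flag and return the config file names it maps to.
def parse_vm_configs(argv)
  selection = '1' # default when --vm is not given
  parser = OptionParser.new do |opts|
    # Passing the allowed values makes OptionParser reject anything else.
    opts.on('--vm VM', VM_CONFIGS.keys, 'VM to target: 1, 2, or both') do |value|
      selection = value
    end
  end
  parser.parse(argv) # non-destructive; positional arguments are ignored here
  VM_CONFIGS.fetch(selection)
end
```

An unknown value such as `--vm 3` raises `OptionParser::InvalidArgument`, so the caller can print the help text instead of loading a missing file.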
|
|
|
|
In run_create_both, VM1's thread rescue unconditionally set
vm1_wg_state[:error], even when the WireGuard step had already
signaled success (vm1_wg_state[:done] = true). If VM2 was
waiting on the condition variable at that moment, it would raise
'VM1 WireGuard setup failed' and abort needlessly.
Now the rescue only sets :error when :done is still false, so a
downstream VM1 failure (e.g. vLLM install) no longer leaks to VM2.
Resolves agent task ic.
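The guarded rescue can be illustrated with a small handshake object. This is a sketch, not the code in run_create_both: the `WgHandshake` class and its method names are hypothetical, while the `:done`/`:error` keys and the error text follow the message.

```ruby
# Hypothetical wrapper around the shared WireGuard state that VM2 waits on.
class WgHandshake
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @state = { done: false, error: nil }
  end

  # VM1 calls this once its WireGuard step has succeeded.
  def signal_done
    @mutex.synchronize do
      @state[:done] = true
      @cond.broadcast
    end
  end

  # VM1's rescue calls this. The error is recorded only while :done is still
  # false, so a later failure (e.g. vLLM install) does not leak to VM2.
  def signal_failure(error)
    @mutex.synchronize do
      @state[:error] = error unless @state[:done]
      @cond.broadcast
    end
  end

  # VM2 blocks here until WireGuard has either succeeded or failed.
  def wait!
    @mutex.synchronize do
      @cond.wait(@mutex) until @state[:done] || @state[:error]
      raise 'VM1 WireGuard setup failed' if @state[:error]
    end
  end
end
```

Checking `:done` inside the same mutex that guards the write is what closes the race: the waiter can never observe an error that was set after success was signaled.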
|
|
|
|
Move the former hyperstack-vm1-coder.toml to hyperstack-vm1.toml as the
standard VM1 profile (Qwen3-Coder-Next on single GPU). Preserve the
dual-H100 Nemotron-3-Super stack as hyperstack-vm1-nemotron.toml. Point
create-both at hyperstack-vm1.toml and refresh README for current defaults.
Made-with: Cursor
|
|
VM1 (hyperstack-vm1-coder.toml, renamed from hyperstack-vm1-gptoss.toml):
- Default model switched from gpt-oss-120b to qwen3-coder-next
- Config file renamed to reflect actual default model
VM2 (hyperstack-vm2.toml):
- Default model switched from qwen3-coder-next to Gemma 4 31B AWQ
- Uses vLLM nightly image + transformers==5.5.0 workaround: Gemma 4
architecture is registered in transformers 5.x but vLLM stable pins <5
- max_model_len=131072 (128K context); KV cache fills ~95% of H100-80GB VRAM
- Added gemma4-31b preset
watcher.rb:
- Add loading_status field to VmSnapshot to show live model-load progress
(last relevant log line during startup instead of generic "loading" message)
- fetch_vm_stats now captures both Engine 0 stats and loading-phase log lines
in a single SSH call, using a shell variable to avoid two `docker logs` invocations
- clean_log_line() strips vLLM PID/timestamp prefix for readable display
cli.rb: update all hardcoded hyperstack-vm1-gptoss.toml references to
hyperstack-vm1-coder.toml
hypr.fish: replace pi-hyperstack-nemotron with pi-hyperstack-coder (VM1),
add pi-hyperstack-gemma4 (VM2)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
Add watch_config_loaders that filters status_config_loaders results with
a TCP probe on each VM's WireGuard inference port. VMs with stale state
files (deleted from the console without `hyperstack.rb delete`) are
excluded from the watch loop. Falls back to all tracked loaders when
none are reachable so the watcher can still render error output when
WireGuard is down.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
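The probe-and-fallback behaviour could be sketched as follows. The names `port_reachable?` and `reachable_or_all`, the 2s timeout, and the representation of a loader as a hash with `:host` and `:port` are all assumptions made for illustration; the real watch_config_loaders works on config loader objects.

```ruby
require 'socket'

# Hypothetical probe: true if a TCP connection to host:port opens in time.
def port_reachable?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, unresolvable, or timed out.
  false
end

# Keep only loaders whose inference port answers; if none do, return all of
# them so the watcher can still render error output.
def reachable_or_all(loaders, probe: method(:port_reachable?))
  reachable = loaders.select { |loader| probe.call(loader[:host], loader[:port]) }
  reachable.empty? ? loaders : reachable
end
```

The fallback matters when WireGuard itself is down: every probe fails, and an empty list would otherwise leave the dashboard with nothing to show.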
|
|
end-to-end test on create
- cli: introduce REPO_ROOT constant so create-both/delete-both/watch
find TOML configs at the repo root instead of lib/hyperstack/
- manager: with_polling prints a heartbeat every 30s so silent waits
(SSH, VM ready, etc.) are visibly alive
- provisioning: bootstrap_guest streams SSH output in real time so
apt-lock waits and setup steps are visible as they happen
- provisioning: vLLM wait loop reads docker logs to show the current
startup stage (shard loading %, torch.compile, CUDA graphs, API up)
instead of a plain "not ready yet" counter
- manager: create automatically runs the end-to-end inference test
after provisioning completes, removing the manual 'test' step
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
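A polling loop with a periodic heartbeat might be structured like this. Only the 30s heartbeat interval comes from the message; the signature, the 5s poll interval, the 600s timeout, and the injectable `clock`, `sleeper`, and `out` parameters are assumptions, so this is a sketch rather than the manager's actual with_polling.

```ruby
# Hypothetical polling helper: call the block until it returns a truthy
# value, printing a heartbeat line every heartbeat_every seconds.
def with_polling(label, interval: 5, heartbeat_every: 30, timeout: 600,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(s) { sleep(s) }, out: $stdout)
  started = clock.call
  last_beat = started
  loop do
    result = yield
    return result if result

    now = clock.call
    raise "#{label}: timed out after #{timeout}s" if now - started >= timeout

    if now - last_beat >= heartbeat_every
      out.puts "#{label}: still waiting (#{(now - started).round}s elapsed)"
      last_beat = now
    end
    sleeper.call(interval)
  end
end
```

Using the monotonic clock keeps the elapsed time correct even if the wall clock is adjusted during a long wait.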
|
|
Extracts all classes from hyperstack.rb into focused library files:
- lib/hyperstack/config.rb — ConfigLoader + Config (TOML loading, validation)
- lib/hyperstack/state.rb — StateStore + PrefixedOutput (JSON state, threaded output)
- lib/hyperstack/client.rb — HyperstackClient (REST API + retry logic)
- lib/hyperstack/wireguard.rb — LocalWireGuard (wg1.conf peer management, /etc/hosts)
- lib/hyperstack/provisioning.rb — ProvisioningScripts + RemoteProvisioner (SSH bootstrap)
- lib/hyperstack/manager.rb — Manager (VM lifecycle orchestration)
- lib/hyperstack/watcher.rb — VllmWatcher (Prometheus + GPU dashboard)
- lib/hyperstack/cli.rb — CLI (OptionParser command dispatch)
hyperstack.rb becomes a 46-line entry point with require_relative calls.
All files pass `ruby -c` syntax check and `--help` runs correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|