summaryrefslogtreecommitdiff
path: root/lib/hyperstack/provisioning.rb
AgeCommit message (Collapse)Author
2026-05-24fix(provisioning): recover from vLLM readiness timeout and increase poll windowPaul Buetow
When create timed out during vLLM readiness polling (common for large models like Qwen3.6-27B-FP8), rerunning create would stop and restart the already-running container, restarting the whole startup sequence. Now the vLLM install script checks if the container is already running and serving the correct model before touching it. If it detects a healthy container, it skips the stop/pull/start cycle entirely. Also increases the readiness timeout from 20 min (240x5s) to 30 min (360x5s) to accommodate cold starts with model download and CUDA graph capture on large models.
2026-05-24chore(config): remove gpt-oss-120b references since qwen3.6 is betterPaul Buetow
2026-05-24cleanup: remove ComfyUI and photo-related code from lib/hyperstackPaul Buetow
2026-05-24fix(provisioning): narrow ComfyUI install chmod to models_dir and ↵Paul Buetow
output_dir\n\nThe comfyui_install_script previously ran\n\n chmod -R 0777 File.dirname(models_dir)\n\nwhich chmods the *parent* directory (e.g. /ephemeral). If models_dir\nis configured directly under /ephemeral that gives world-write access to\nall sibling directories (vLLM hug cache, Ollama models, etc.).\n\nNow chmod only the two directories that actually need it: models_dir\nand output_dir.
2026-05-24fix(provisioning): chown models_dir itself, not its parentPaul Buetow
2026-04-06provisioner: support docker_image and pre_start_cmd for Gemma 4 startupPaul Buetow
Adds docker_image and pre_start_cmd config fields to config.rb and provisioning.rb so the Gemma 4 31B workarounds are baked in: - docker_image = "vllm/vllm-openai:nightly" (stable lacks Gemma 4 support) - pre_start_cmd = "pip install -q transformers==5.5.0" (stable pins <5) - extra_docker_env = ["CUDA_VISIBLE_DEVICES=0"] (required with --entrypoint bash) When pre_start_cmd is set, the provisioner switches to --entrypoint bash and chains the patch command before launching vLLM, so create-both works end-to-end without manual container replacement. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26hyperstack: fix TOML paths, add live provisioning progress, and auto ↵Paul Buetow
end-to-end test on create - cli: introduce REPO_ROOT constant so create-both/delete-both/watch find TOML configs at the repo root instead of lib/hyperstack/ - manager: with_polling prints a heartbeat every 30s so silent waits (SSH, VM ready, etc.) are visibly alive - provisioning: bootstrap_guest streams SSH output in real time so apt-lock waits and setup steps are visible as they happen - provisioning: vLLM wait loop reads docker logs to show the current startup stage (shard loading %, torch.compile, CUDA graphs, API up) instead of a plain "not ready yet" counter - manager: create automatically runs the end-to-end inference test after provisioning completes, removing the manual 'test' step Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25hyperstack: split 3335-line monolith into lib/hyperstack/ modulesPaul Buetow
Extracts all classes from hyperstack.rb into focused library files: - lib/hyperstack/config.rb — ConfigLoader + Config (TOML loading, validation) - lib/hyperstack/state.rb — StateStore + PrefixedOutput (JSON state, threaded output) - lib/hyperstack/client.rb — HyperstackClient (REST API + retry logic) - lib/hyperstack/wireguard.rb — LocalWireGuard (wg1.conf peer management, /etc/hosts) - lib/hyperstack/provisioning.rb — ProvisioningScripts + RemoteProvisioner (SSH bootstrap) - lib/hyperstack/manager.rb — Manager (VM lifecycle orchestration) - lib/hyperstack/watcher.rb — VllmWatcher (Prometheus + GPU dashboard) - lib/hyperstack/cli.rb — CLI (OptionParser command dispatch) hyperstack.rb becomes a 46-line entry point with require_relative calls. All files pass `ruby -c` syntax check and `--help` runs correctly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>