| -rw-r--r-- | AGENTS.md | 10 |
| -rw-r--r-- | README.md | 50 |
| -rw-r--r-- | hyperstack-vm1.toml | 2 |
| -rw-r--r-- | hyperstack-vm2.toml | 2 |
| -rw-r--r-- | hypr.fish | 4 |
| -rw-r--r-- | lib/hyperstack/cli.rb | 197 |
6 files changed, 135 insertions, 130 deletions
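Reviewer note: this change replaces `--config PATH` with a `--vm 1|2|both` selector that resolves to `hyperstack-vm1.toml` and `hyperstack-vm2.toml`. The following is a minimal sketch of that mapping, not the code from `lib/hyperstack/cli.rb`. The helper names `parse_vm` and `config_paths_for`, the `root:` parameter, and the use of `ArgumentError` are assumptions made for illustration; the real CLI raises its own `Error` class and loads configs through `ConfigLoader`.

```ruby
require 'optparse'

# Accepted --vm values, matching the help text in the diff.
VALID_VMS = %w[1 2 both].freeze

# Hypothetical helper: extracts --vm from argv and returns [vm, remaining_args].
# Only --vm is registered here, so any other option would raise
# OptionParser::InvalidOption in this sketch.
def parse_vm(argv)
  vm = '1' # default target when --vm is omitted
  parser = OptionParser.new
  parser.on('--vm TARGET', 'Target VM: 1, 2, or both (default: 1)') { |value| vm = value }
  rest = parser.parse(argv) # non-destructive; returns the non-option arguments
  unless VALID_VMS.include?(vm)
    raise ArgumentError, "Invalid --vm value #{vm.inspect}. Use 1, 2, or both."
  end
  [vm, rest]
end

# Hypothetical helper: maps a validated --vm value to the TOML config path(s).
# `root` stands in for the REPO_ROOT constant used by the real CLI.
def config_paths_for(vm, root: Dir.pwd)
  targets = vm == 'both' ? %w[1 2] : [vm]
  targets.map { |n| File.join(root, "hyperstack-vm#{n}.toml") }
end
```

Under these assumptions, `parse_vm(%w[create --vm both])` yields the target `both` with `create` left over as the command, and `config_paths_for('both', root: '/repo')` returns the two per-VM config paths. The flag can appear before or after the command because `OptionParser#parse` permutes arguments by default.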
@@ -18,7 +18,7 @@ The H100 80GB (`n3-H100x1`) is the fallback. # hyperstack-vm2.toml flavor_name = "n3-A100x1" ``` -2. Run `ruby hyperstack.rb --config hyperstack-vm2.toml create`. +2. Run `ruby hyperstack.rb --vm 2 create`. 3. If the API returns a flavor-not-available error, switch to H100: ```toml flavor_name = "n3-H100x1" @@ -43,7 +43,7 @@ failed to extract layer ... EOF The provisioner retries twice automatically. If all attempts fail, just re-run create: ```bash -ruby hyperstack.rb --config hyperstack-vm2.toml create +ruby hyperstack.rb --vm 2 create ``` The VM already exists and is tracked in the state file; `create` resumes from where it @@ -144,7 +144,7 @@ curl -s http://192.168.3.3:11434/v1/models | python3 -c \ "import sys,json; print([m['id'] for m in json.load(sys.stdin)['data']])" # 4. Full automated test -ruby hyperstack.rb --config hyperstack-vm2.toml test +ruby hyperstack.rb --vm 2 test ``` Note: `curl` to the public IP will time out — port 11434 is firewalled to @@ -191,7 +191,7 @@ If `create` exits non-zero partway through (e.g. WireGuard retries exhausted, Do EOF), the VM is still running and the state file tracks it. Simply re-run: ```bash -ruby hyperstack.rb --config hyperstack-vm2.toml create +ruby hyperstack.rb --vm 2 create ``` The script checks `vllm_setup_at` and `bootstrapped_at` in the state file and skips @@ -204,7 +204,7 @@ already-completed steps. Typical resume flow: If you want to force a full reprovision from scratch: ```bash -ruby hyperstack.rb --config hyperstack-vm2.toml create --replace +ruby hyperstack.rb --vm 2 create --replace ``` This deletes the existing VM, clears the state file, and starts over. @@ -78,7 +78,7 @@ Runs two A100 VMs concurrently — each serving a different model — with [Pi]( ## WireGuard setup -`hyperstack.rb` runs `wg1-setup.sh` automatically during `create` / `create-both`. +`hyperstack.rb` runs `wg1-setup.sh` automatically during `create` and `create --vm both`. 
This section explains the tunnel design for reference and manual troubleshooting. ### Tunnel design @@ -137,19 +137,11 @@ ssh ubuntu@<vm-public-ip> 'sudo systemctl start wg-quick@wg1' ```bash # Deploy both VMs in parallel, set up WireGuard + vLLM (~10 min) -ruby hyperstack.rb create-both +ruby hyperstack.rb create --vm both # Verify both VMs are working -ruby hyperstack.rb --config hyperstack-vm1.toml test -ruby hyperstack.rb --config hyperstack-vm2.toml test - -# Launch Pi coding agents — one per terminal (fish abbreviations from hypr.fish) -pi-hyperstack-coder # Qwen3-Coder-Next on VM1 -pi-hyperstack-qwen36 # Qwen3.6 27B FP8 on VM2 -pi-hyperstack-gemma4 # Gemma 4 31B on VM2 - -# Tear down both VMs -ruby hyperstack.rb delete-both +ruby hyperstack.rb test --vm 1 +ruby hyperstack.rb test --vm 2 ``` ## Using Pi @@ -270,8 +262,8 @@ No API key or account required. Uses DuckDuckGo's free HTML endpoint. Each VM has independent state files so they can be managed separately: ```bash -ruby hyperstack.rb --config hyperstack-vm1.toml status -ruby hyperstack.rb --config hyperstack-vm2.toml status +ruby hyperstack.rb --vm 1 status +ruby hyperstack.rb --vm 2 status ``` ## Switching models @@ -279,8 +271,8 @@ ruby hyperstack.rb --config hyperstack-vm2.toml status Each VM has named model presets in its TOML config. 
Hot-switch without reprovisioning: ```bash -ruby hyperstack.rb --config hyperstack-vm1.toml model switch qwen3-coder-next -ruby hyperstack.rb --config hyperstack-vm2.toml model switch qwen3-coder-next +ruby hyperstack.rb --vm 1 model switch qwen3-coder-next +ruby hyperstack.rb --vm 2 model switch qwen3-coder-next ``` Available presets (both VMs share the same set): @@ -301,23 +293,23 @@ Available presets (both VMs share the same set): ## CLI reference ``` -ruby hyperstack.rb [--config path] <command> [options] +ruby hyperstack.rb [--vm 1|2|both] <command> [options] Commands: create Deploy a new VM and run full provisioning - create-both Deploy VM1 + VM2 in parallel (uses hyperstack-vm1/vm2.toml) delete Destroy the tracked VM - delete-both Destroy both VM1 and VM2 status Show VM and WireGuard status watch Live dashboard: vLLM + GPU stats for all active VMs (refreshes every 5 s) test Run end-to-end inference tests (vLLM) model switch <preset> Hot-switch the running vLLM model -create / create-both options: - --replace Delete existing tracked VM before creating - --dry-run Print the plan without making changes - --vllm / --no-vllm Override config: enable/disable vLLM setup +create options: + --replace Delete existing tracked VM before creating + --dry-run Print the plan without making changes + --vllm / --no-vllm Override config: enable/disable vLLM setup --ollama / --no-ollama Override config: enable/disable Ollama setup + +All commands accept --vm 1|2|both (default: 1). 
``` ## Configuration @@ -367,11 +359,11 @@ ruby hyperstack.rb delete ```bash # Deploy both VMs in parallel, set up tunnel and vLLM on each (~10 min) -ruby hyperstack.rb create-both +ruby hyperstack.rb create --vm both # Test each VM individually -ruby hyperstack.rb --config hyperstack-vm1.toml test -ruby hyperstack.rb --config hyperstack-vm2.toml test +ruby hyperstack.rb test --vm 1 +ruby hyperstack.rb test --vm 2 # Launch Pi coding agents — one per terminal pi-hyperstack-coder # fish abbreviation → Qwen3-Coder-Next on VM1 @@ -379,15 +371,15 @@ pi-hyperstack-qwen36 # fish abbreviation → Qwen3.6 27B FP8 on VM2 pi-hyperstack-gemma4 # fish abbreviation → Gemma 4 31B on VM2 # Tear down both VMs -ruby hyperstack.rb delete-both +ruby hyperstack.rb delete --vm both ``` ### Hot-switching models without reprovisioning ```bash # Switch the running vLLM container to a different model preset -ruby hyperstack.rb --config hyperstack-vm1.toml model switch qwen3-coder-next -ruby hyperstack.rb --config hyperstack-vm2.toml model switch qwen3-coder-next +ruby hyperstack.rb --vm 1 model switch qwen3-coder-next +ruby hyperstack.rb --vm 2 model switch qwen3-coder-next ``` See the [VM configuration](#vm-configuration) and [Switching models](#switching-models) diff --git a/hyperstack-vm1.toml b/hyperstack-vm1.toml index 5a5be2a..c6fb2df 100644 --- a/hyperstack-vm1.toml +++ b/hyperstack-vm1.toml @@ -70,7 +70,7 @@ gpu_memory_utilization = 0.92 tensor_parallel_size = 1 tool_call_parser = "qwen3_coder" -# Named model presets for 'ruby hyperstack.rb --config hyperstack-vm1.toml model switch <name>'. +# Named model presets for 'ruby hyperstack.rb --vm 1 model switch <name>'. # Each preset overrides the matching [vllm] field; unset fields fall back to [vllm] defaults. 
[vllm.presets.qwen3-coder-next] diff --git a/hyperstack-vm2.toml b/hyperstack-vm2.toml index 14f1c6a..c3605ff 100644 --- a/hyperstack-vm2.toml +++ b/hyperstack-vm2.toml @@ -74,7 +74,7 @@ tensor_parallel_size = 1 tool_call_parser = "qwen3_coder" extra_vllm_args = ["--reasoning-parser", "qwen3"] -# Named model presets for 'ruby hyperstack.rb --config hyperstack-vm2.toml model switch <name>'. +# Named model presets for 'ruby hyperstack.rb --vm 2 model switch <name>'. # Core model fields override the matching [vllm] values; preset-only extras such as # extra_vllm_args / extra_docker_env / docker_image / pre_start_cmd must be set explicitly. @@ -2,5 +2,5 @@ abbr pi-hyperstack-coder pi --model hyperstack1/bullpoint/Qwen3-Coder-Next-AWQ-4bit abbr pi-hyperstack-qwen36 pi --model hyperstack2/Qwen/Qwen3.6-27B-FP8 abbr pi-hyperstack-gemma4 pi --model hyperstack2/cyankiwi/gemma-4-31B-it-AWQ-4bit -abbr hyperstack-create-both ruby ~/git/hyperstack/hyperstack.rb create-both -abbr hyperstack-delete-both ruby ~/git/hyperstack/hyperstack.rb delete-both +abbr hyperstack-create-both ruby ~/git/hyperstack/hyperstack.rb create --vm both +abbr hyperstack-delete-both ruby ~/git/hyperstack/hyperstack.rb delete --vm both diff --git a/lib/hyperstack/cli.rb b/lib/hyperstack/cli.rb index 9be78b0..2669186 100644 --- a/lib/hyperstack/cli.rb +++ b/lib/hyperstack/cli.rb @@ -11,36 +11,31 @@ module HyperstackVM def initialize(argv) @argv = argv.dup - @config_path = File.join(REPO_ROOT, 'hyperstack-vm.toml') - @config_explicit = false + @vm = '1' end def show_help puts @global_parser puts puts 'Commands:' - puts ' create [--replace] [--dry-run] [--vllm|--no-vllm] [--ollama|--no-ollama] [--model PRESET]' - puts ' create-both [--replace] [--dry-run] [--vllm|--no-vllm] [--ollama|--no-ollama]' - puts ' Provision hyperstack-vm1.toml and hyperstack-vm2.toml concurrently.' - puts ' WireGuard setup is serialized: VM1 writes the base wg1.conf first,' - puts ' then VM2 adds its peer. 
Requires both TOML files next to the script.' - puts ' delete [--vm-id ID] [--dry-run]' - puts ' delete-both [--dry-run]' - puts ' Delete the VMs tracked by hyperstack-vm1.toml and hyperstack-vm2.toml.' + puts ' create [--replace] [--dry-run] [--vllm|--no-vllm] [--ollama|--no-ollama] [--model PRESET]' + puts ' delete [--vm-id ID] [--dry-run]' puts ' status' puts ' watch' - puts ' Poll all active VMs for vLLM and GPU stats every 60 s.' + puts ' Poll active VMs for vLLM and GPU stats every 60 s.' puts ' test' puts ' model list' puts ' model switch PRESET [--dry-run]' + puts + puts 'All commands accept --vm 1|2|both (default: 1).' end def run @global_parser = OptionParser.new do |opts| - opts.banner = 'Usage: ruby hyperstack.rb [--config path] <create|delete|status> [options]' - opts.on('--config PATH', "Path to TOML config (default: #{@config_path})") do |value| - @config_path = value - @config_explicit = true + opts.banner = 'Usage: ruby hyperstack.rb [--vm 1|2|both] <create|delete|status|watch|test|model> [options]' + opts.on('--vm 1|2|both', 'Target VM (default: 1)') do |value| + raise Error, "Invalid --vm value #{value.inspect}. Use 1, 2, or both." unless %w[1 2 both].include?(value) + @vm = value end opts.on('-h', '--help', 'Show help') do show_help @@ -55,79 +50,68 @@ module HyperstackVM exit 0 end - # create-both loads its own config files and does not use the default config path. - # Parse it before building the manager so we avoid loading the default config needlessly. - if command == 'create-both' - opts = parse_create_options(@argv, include_model_preset: false) - run_create_both(**opts) - return - end - - if command == 'delete-both' - opts = parse_delete_both_options(@argv) - run_delete_both(**opts) - return - end - - if command == 'status' - run_status - return - end - - if command == 'watch' - run_watch - return - end - - # All other commands operate on a single VM defined by the --config path. 
- config_loader = ConfigLoader.load(@config_path) - manager = build_manager(config_loader.config) - case command when 'create' - opts = parse_create_options(@argv) - manager.create(**opts) + if @vm == 'both' + opts = parse_create_options(@argv, include_model_preset: false) + run_create_both(**opts) + else + opts = parse_create_options(@argv) + build_manager_for_vm(@vm).create(**opts) + end when 'delete' - vm_id = nil - dry_run = false - parser = OptionParser.new do |opts| - opts.on('--vm-id ID', Integer, 'Delete a VM by ID instead of using the local state file') do |value| - vm_id = value + if @vm == 'both' + opts = parse_delete_options(@argv) + run_delete_both(**opts) + else + vm_id = nil + dry_run = false + parser = OptionParser.new do |opts| + opts.on('--vm-id ID', Integer, 'Delete a VM by ID instead of using the local state file') do |value| + vm_id = value + end + opts.on('--dry-run', 'Show which VM would be deleted without deleting it') { dry_run = true } end - opts.on('--dry-run', 'Show which VM would be deleted without deleting it') { dry_run = true } + parser.parse!(@argv) + build_manager_for_vm(@vm).delete(vm_id: vm_id, dry_run: dry_run) end - parser.parse!(@argv) - manager.delete(vm_id: vm_id, dry_run: dry_run) + when 'status' + run_status + when 'watch' + run_watch when 'test' - manager.test + run_test when 'model' - sub = @argv.shift - raise Error, 'Missing model subcommand. Use: model list | model switch PRESET [--dry-run]' if sub.nil? - - case sub - when 'list' - manager.list_models - when 'switch' - preset = @argv.shift - raise Error, 'Missing preset name. Usage: model switch PRESET [--dry-run]' if preset.nil? - - dry_run = false - OptionParser.new { |o| o.on('--dry-run') { dry_run = true } }.parse!(@argv) - manager.switch_model(preset_name: preset, dry_run: dry_run) - else - raise Error, "Unknown model subcommand #{sub.inspect}. Use list or switch." - end + run_model else raise Error, - "Unknown command #{command.inspect}. 
Use create, create-both, delete, delete-both, status, watch, test, or model." + "Unknown command #{command.inspect}. Use create, delete, status, watch, test, or model." end end private + def vm_config_path(vm) + File.join(REPO_ROOT, "hyperstack-vm#{vm}.toml") + end + + def build_manager_for_vm(vm) + loader = ConfigLoader.load(vm_config_path(vm)) + build_manager(loader.config) + end + + def selected_config_loaders + case @vm + when 'both' + pair_config_loaders + else + [ConfigLoader.load(vm_config_path(@vm))] + end + end + # Parses the shared --replace / --dry-run / --vllm / --ollama / --model flags - # used by both 'create' and 'create-both'. When include_model_preset is false - # (create-both), the --model flag is not registered because each VM uses its own + # used by 'create' and by 'create --vm both'. When include_model_preset is false + # (both), the --model flag is not registered because each VM uses its own # TOML default. Returns a hash suitable for splatting into Manager#create. def parse_create_options(argv, include_model_preset: true) opts = { replace: false, dry_run: false, install_vllm: nil, install_ollama: nil, @@ -148,7 +132,7 @@ module HyperstackVM opts end - def parse_delete_both_options(argv) + def parse_delete_options(argv) opts = { dry_run: false } OptionParser.new do |o| o.on('--dry-run', 'Show which VMs would be deleted without deleting them') { opts[:dry_run] = true } @@ -180,18 +164,61 @@ module HyperstackVM ) end + def run_test + loaders = selected_config_loaders + loaders.each do |loader| + if loaders.size > 1 + puts + puts "[#{File.basename(loader.path)}]" + end + build_manager(loader.config).test + end + end + + def run_model + sub = @argv.shift + raise Error, 'Missing model subcommand. Use: model list | model switch PRESET [--dry-run]' if sub.nil? 
+ + case sub + when 'list' + loaders = selected_config_loaders + loaders.each do |loader| + if loaders.size > 1 + puts + puts "[#{File.basename(loader.path)}]" + end + build_manager(loader.config).list_models + end + when 'switch' + preset = @argv.shift + raise Error, 'Missing preset name. Usage: model switch PRESET [--dry-run]' if preset.nil? + + dry_run = false + OptionParser.new { |o| o.on('--dry-run') { dry_run = true } }.parse!(@argv) + loaders = selected_config_loaders + loaders.each do |loader| + if loaders.size > 1 + puts + puts "[#{File.basename(loader.path)}]" + end + build_manager(loader.config).switch_model(preset_name: preset, dry_run: dry_run) + end + else + raise Error, "Unknown model subcommand #{sub.inspect}. Use list or switch." + end + end + # Starts the VllmWatcher dashboard restricted to VMs that are currently reachable. # Uses watch_config_loaders instead of status_config_loaders so VMs whose state # files are stale (e.g. deleted from the console without `delete`) are excluded. def run_watch loaders = watch_config_loaders - raise Error, 'No active VMs found. Run `create` or `create-both` first.' if loaders.empty? - + raise Error, 'No active VMs found. Run `create --vm 1|2|both` first.' if loaders.empty? VllmWatcher.new(config_loaders: loaders).run end def run_status - loaders = status_config_loaders + loaders = selected_config_loaders if loaders.one? build_manager(loaders.first.config).status return @@ -214,7 +241,7 @@ module HyperstackVM # Falls back to all state-tracked loaders when none are reachable (e.g. WireGuard down), # so the watcher can still render meaningful error output instead of raising. def watch_config_loaders - loaders = status_config_loaders + loaders = selected_config_loaders reachable = loaders.select { |l| vm_api_reachable?(l.config) } reachable.empty? ? 
loaders : reachable end @@ -230,20 +257,6 @@ module HyperstackVM false end - def status_config_loaders - return [ConfigLoader.load(@config_path)] if @config_explicit - - candidates = [ - @config_path, - File.join(REPO_ROOT, 'hyperstack-vm1.toml'), - File.join(REPO_ROOT, 'hyperstack-vm2.toml') - ].uniq.select { |path| File.exist?(path) } - - loaders = candidates.map { |path| ConfigLoader.load(path) } - tracked = loaders.select { |loader| File.exist?(loader.config.state_file) } - tracked.empty? ? [ConfigLoader.load(@config_path)] : tracked - end - def pair_config_loaders [ ConfigLoader.load(File.join(REPO_ROOT, 'hyperstack-vm1.toml')), |
