hypr - My "local" LLM setup with Hyperstack.

Age	Commit message	Author
13 days	fix(loop-scheduler): replace ctx.waitForIdle() with isIdle() polling in agent_end handler [main]	Paul Buetow
	The agent_end event handler receives ExtensionContext, not ExtensionCommandContext, so ctx.waitForIdle() is not available. Replace with a polling loop using ctx.isIdle() to wait for the run to finish before draining pending jobs, preventing stuck follow-up messages.
13 days	fix(loop-scheduler): await waitForIdle in agent_end before draining	Paul Buetow
	Inside an agent_end listener, agent.state.isStreaming is still true: finishRun() only clears it in the finally block of runWithLifecycle, after all agent_end listeners settle. So when we dispatched a pending job from agent_end and called pi.sendUserMessage(..., { deliverAs: 'followUp' }), the message was routed into agent.followUpQueue. The agent loop had already passed its getFollowUpMessages() check, so it exited without draining the queue. The message sat there as a stuck 'Follow-up: ...' in pi's UI, agentBusy stayed true forever, and every subsequent pending loop was blocked because no further agent_end fired.
	Await ctx.waitForIdle() in the agent_end handler before resetting agentBusy and calling drainPendingJobs. By then finishRun() has cleared isStreaming, so sendUserMessage starts a fresh run instead of enqueueing into a dead followUp queue, and pending loops drain serially as designed.
	Amp-Thread-ID: https://ampcode.com/threads/T-019e5de9-a0c3-7559-9cf0-f81ce751e763
	Co-authored-by: Amp <amp@ampcode.com>
13 days	fix(watch): auto-recover when default VM is dead or replaced	Paul Buetow
	- Add per-VM 10s fetch timeout so one dead VM cannot stall the dashboard
	- Make fallback logic check VM state (public_ip + ACTIVE status) instead of just file existence, so a stale/deleted VM1 state does not block watch
	- Auto-replace cached SSH host keys when a VM is recreated instead of failing
	- Suppress Ruby thread exception noise on killed SSH threads
	Fixes 'just watch' showing a blank screen when VM1 is deleted but has a stale state file, and SSH host-key mismatch on VM recreation.
13 days	update hyperstack2 VM state and config after recreation	Paul Buetow

14 days	fix(provisioning): recover from vLLM readiness timeout and increase poll window	Paul Buetow
	When create timed out during vLLM readiness polling (common for large models like Qwen3.6-27B-FP8), rerunning create would stop and restart the already-running container, restarting the whole startup sequence. Now the vLLM install script checks if the container is already running and serving the correct model before touching it. If it detects a healthy container, it skips the stop/pull/start cycle entirely. Also increases the readiness timeout from 20 min (240x5s) to 30 min (360x5s) to accommodate cold starts with model download and CUDA graph capture on large models.
14 days	feat(tooling): add ollama fish abbreviations for kimi, qwen, glm, minimax	Paul Buetow

14 days	feat(pi): add ollama provider with kimi-k2.6:cloud, qwen3.5:cloud, glm-5.1:cloud, minimax-m2.7:cloud	Paul Buetow

14 days	chore(config): revert vm2 default to n3-A100x1; simplify justfile	Paul Buetow

14 days	chore(tooling): add justfile for common VM lifecycle, observability, and debugging commands	Paul Buetow

2026-05-24	chore(vm2): H100 provisioning, L40 plan, and H100-specific vLLM tuning	Paul Buetow

2026-05-24	fix(cli): watch/status/test auto-detect active VMs when default VM1 is not provisioned	Paul Buetow

2026-05-24	chore(config): remove gpt-oss-120b references since qwen3.6 is better	Paul Buetow

2026-05-24	fix(watcher): show actionable error when VM not provisioned or SSH fails	Paul Buetow

2026-05-24	replace qwen3-coder-next with qwen3.6-27b across configs, docs, and tooling	Paul Buetow

2026-05-24	feat(watch): retry SSH connection failures with exponential backoff	Paul Buetow
	Remove the vm_api_reachable? filter from run_watch so VMs that are currently booting are not silently dropped from the dashboard. Add exponential-backoff retry logic (up to 4 attempts, sleeping 2s, 4s, 8s, 16s) inside VllmWatcher#fetch_vm_stats for transient SSH/WireGuard errors such as connection refused, host unreachable, and exit 255. This lets watch automatically recover while a VM is still starting up instead of failing immediately.
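The retry schedule described above can be sketched roughly as follows. This is a minimal illustration, not the code in VllmWatcher#fetch_vm_stats: `with_backoff`, `TransientSshError`, and the injectable `sleeper` are hypothetical names, and the sketch assumes "2s, 4s, 8s, 16s" means four retries after the initial try.

```ruby
# Hypothetical error class standing in for transient SSH/WireGuard failures
# (connection refused, host unreachable, exit 255).
class TransientSshError < StandardError; end

# Runs the block and retries on TransientSshError with doubling delays.
# `sleeper` is injectable so the schedule can be inspected without waiting.
def with_backoff(max_retries: 4, base_delay: 2, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    yield attempt
  rescue TransientSshError
    raise if attempt >= max_retries          # give up after the last retry
    sleeper.call(base_delay * (2**attempt))  # 2, 4, 8, 16 seconds
    attempt += 1
    retry
  end
end

# Usage: record the delays instead of sleeping.
delays = []
stats = with_backoff(sleeper: ->(s) { delays << s }) do |attempt|
  raise TransientSshError, "connection refused" if attempt < 2
  "stats after #{attempt} retries"
end
```

Non-transient errors are deliberately not rescued, so they still fail on the first attempt.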
2026-05-24	chore: add pi/prompt-history.json to .gitignore	Paul Buetow

2026-05-24	fix(loop-scheduler): always pass deliverAs followUp for scheduled messages	Paul Buetow
	The runtime now requires a streamingBehavior (steer/followUp) to queue a message when the agent is already processing. Previously only Gemma 4 models passed { deliverAs: 'followUp' }, causing all other models to throw 'Agent is already processing' and leaving the job stuck in pending. Scheduled and watch messages are independent prompts, so followUp is the correct behavior for all models.
2026-05-24	chore(gitignore): ignore hyperstack state temp files	Paul Buetow
	Add patterns for .hyperstack--state.json and .hyperstack--state.json.known_hosts to keep ephemeral VM state and WireGuard artifacts out of version control.
2026-05-24	docs: refresh README, hypr.fish, AGENTS.md for consolidated --vm CLI	Paul Buetow

2026-05-24	feat(cli): replace --config with --vm 1|2|both, remove create-both/delete-both	Paul Buetow
	- Drop single-VM default hyperstack-vm.toml and @config_path/@config_explicit machinery
	- Add global --vm flag (default: 1) mapping to hyperstack-vm1.toml and/or hyperstack-vm2.toml
	- Fold create-both and delete-both into create/delete --vm both
	- Teach status, watch, test, model to accept --vm (default: 1)
	- Update help text and README/AGENTS/fish abbreviations accordingly
2026-05-24	docs: remove single-VM and ComfyUI/photo references	Paul Buetow

2026-05-24	cleanup: remove ComfyUI and photo-related code from lib/hyperstack	Paul Buetow

2026-05-24	chore: remove photo/ComfyUI top-level files	Paul Buetow
	Delete hyperstack-vm-photo.toml, photo-enhance.rb, photo-enhance-review.md, smart_photo_node.py, workflows/photo-enhance.json (and empty workflows/ dir), and __pycache__/smart_photo_node.cpython-314.pyc (and empty __pycache__/ dir). No .hyperstack-vm-photo-state.json* state files were present. ComfyUI references in lib/hyperstack/*.rb intentionally left for task T2.
2026-05-24	fix(loop-scheduler): reset agentBusy when drainPending detects idle context	Paul Buetow
	The agentBusy flag could get stuck if an agent_start event fired but no matching agent_end followed (e.g. crash or forced shutdown). The scheduler would then show 'pending' forever even though the agent was completely idle. Now drainPendingJobs() and drainPendingWatchJobs() ask the ExtensionContext's isIdle() as a ground-truth fallback whenever agentBusy is true. If the context reports idle, we reset agentBusy = false and proceed to dispatch pending jobs instead of bailing out.
2026-05-24	fix(provisioning): narrow ComfyUI install chmod to models_dir and output_dir	Paul Buetow
	The comfyui_install_script previously ran "chmod -R 0777 File.dirname(models_dir)", which chmods the parent directory (e.g. /ephemeral). If models_dir is configured directly under /ephemeral that gives world-write access to all sibling directories (vLLM hug cache, Ollama models, etc.). Now chmod only the two directories that actually need it: models_dir and output_dir.
2026-05-24	fix(wireguard): handle leading whitespace in /etc/hosts lines	Paul Buetow
	In prune_host_line, body.split(/\s+/) on a line with leading whitespace produced tokens starting with an empty string, and that empty string was then shifted off in place of the IP. This caused the rewritten /etc/hosts entry to lose its IP silently. Fix by stripping the body before splitting: body.strip.split(/\s+/). Refs: hc
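The splitting behaviour behind this fix is easy to reproduce in isolation. The sketch below is not the body of prune_host_line; `split_hosts_body` is a hypothetical helper that only shows why stripping first matters when the separator is a regular expression.

```ruby
# Hypothetical helper: split an /etc/hosts line body into IP and hostnames.
def split_hosts_body(body)
  ip, *names = body.strip.split(/\s+/)
  [ip, names]
end

# With a regexp separator, leading whitespace yields a leading empty token.
unstripped = "  10.0.0.1 hyperstack1.wg1".split(/\s+/)
# unstripped.first is "", so shifting it off would lose the real IP.

ip, names = split_hosts_body("  10.0.0.1 hyperstack1.wg1")
# ip is "10.0.0.1" and names is ["hyperstack1.wg1"]
```

The addresses and hostname here are placeholder values for illustration only.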
2026-05-24	fix(cli): avoid false VM2 abort when VM1 fails after WG step succeeded	Paul Buetow
	In run_create_both, VM1's thread rescue unconditionally set vm1_wg_state[:error], even when the WireGuard step had already signaled success (vm1_wg_state[:done] = true). If VM2 was waiting on the condition variable at that moment, it would raise 'VM1 WireGuard setup failed' and abort needlessly. Now the rescue only sets :error when :done is still false, so a downstream VM1 failure (e.g. vLLM install) no longer leaks to VM2. Resolves agent task ic.
2026-05-24	fix(watcher): remove Timeout.timeout to prevent orphaning SSH child processes	Paul Buetow
	Replace Timeout.timeout(15) around Open3.capture3 with SSH-level keepalive options (ServerAliveInterval=5, ServerAliveCountMax=3). Ruby's Timeout raises in a background thread but leaves the ssh process running; SSH's own timeouts self-terminate cleanly.
2026-05-24	cleanup	Paul Buetow

2026-05-24	feat: improve task plan mode widget display and update settings version	Paul Buetow

2026-05-24	fix(cli): synchronize access to errors hash in run_create_both	Paul Buetow

2026-05-24	fix(provisioning): chown models_dir itself, not its parent	Paul Buetow

2026-05-24	fix(config): memoize detected_operator_cidr failure to avoid repeated probes	Paul Buetow
	When all public IP probes fail (network down, DNS broken), detect_public_operator_cidr raises HyperstackVM::Error. The old code did not cache this failure, so every call to resolved_allowed_cidrs re-ran all probes, compounding slowness. Add a rescue block in detected_operator_cidr that stores the exception in @detected_operator_cidr_error and re-raises it. On subsequent calls the cached error is re-raised immediately, preventing redundant probe retries.
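A minimal sketch of caching a failure alongside a success, under the assumption that the probe is an injected block. `CidrDetector`, `ProbeError`, and `probe_runs` are hypothetical names for illustration; only the idea of storing the exception and re-raising it on later calls comes from the commit message.

```ruby
# Hypothetical error standing in for HyperstackVM::Error.
class ProbeError < StandardError; end

class CidrDetector
  attr_reader :probe_runs

  def initialize(&probe)
    @probe = probe   # stands in for the public IP probes
    @probe_runs = 0
  end

  def detected_operator_cidr
    # Re-raise a cached failure immediately instead of probing again.
    raise @detected_operator_cidr_error if @detected_operator_cidr_error
    return @detected_operator_cidr if @detected_operator_cidr

    @probe_runs += 1
    @detected_operator_cidr = @probe.call
  rescue ProbeError => e
    @detected_operator_cidr_error = e
    raise
  end
end

# Usage: the failing probe runs once even though the method is called twice.
detector = CidrDetector.new { raise ProbeError, "all probes failed" }
2.times do
  begin
    detector.detected_operator_cidr
  rescue ProbeError
    # expected on both calls
  end
end
```

One trade-off of this design is that the cached error persists for the lifetime of the object, so a later network recovery is not noticed without creating a new instance.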
2026-05-24	fix(smart_photo_node): strip .orient.<ext> suffix for all image formats	Paul Buetow

2026-05-24	fix(manager): only delete state file when VM deletion is confirmed	Paul Buetow
	Ensure Manager#delete does not wipe the state file on generic/transient API failures. The rescue now checks whether the error message indicates the VM is already gone (404, not_found, does not exist) before removing state. This prevents orphaned billable VMs after exhausted retries or transient network errors.
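The guard described above might look roughly like this. It is a sketch, not Manager#delete itself: `delete_vm`, `vm_already_gone?`, and `GONE_PATTERN` are hypothetical names, the API call is represented by a block, and matching on the error message is taken from the commit text.

```ruby
# Messages that indicate the VM no longer exists (per the commit message).
GONE_PATTERN = /404|not_found|does not exist/i

def vm_already_gone?(error)
  error.message.match?(GONE_PATTERN)
end

# Deletes the state file only when the VM is confirmed deleted or already gone.
# The block stands in for the (hypothetical) API delete call.
def delete_vm(state_path)
  yield
  File.delete(state_path) if File.exist?(state_path)
  :deleted
rescue StandardError => e
  raise unless vm_already_gone?(e)   # transient failure: keep the state file
  File.delete(state_path) if File.exist?(state_path)
  :already_gone
end
```

Substring matching on "404" is coarse; a structured status code would be more robust if the API client exposes one.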
2026-05-24	fix(photo-enhance): ensure .orient tempfiles are always cleaned up in enhance_one	Paul Buetow
	Wrap enhance_one body in begin/ensure to unconditionally delete upload_path (and tmp_png) on every exit path, not just success. Prevents 50+ MB RAW->TIFF leaks when upload_image/submit_prompt/wait_for_output/save_with_corrections raises or when ComfyUI connection errors are caught.
2026-05-24	fix(photo-enhance.rb): add open_timeout and read_timeout to all ComfyUI HTTP calls	Paul Buetow

2026-05-24	fix(wg1-setup.sh): escape WG_HOSTNAME in sed /etc/hosts cleanup	Paul Buetow
	Escape regex metacharacters in WG_HOSTNAME before embedding into the sed delete pattern so '.' (always present in hostnames like hyperstack1.wg1) is treated as a literal dot rather than a wildcard. Replace the literal-space anchor with [[:space:]] so tab-indented lines in /etc/hosts are also matched and removed correctly.
2026-04-24	add qwen	Paul Buetow

2026-04-24	task 78: make Qwen3.6-27B the VM2 default	Paul Buetow

2026-04-11	remove the path	Paul Buetow

2026-04-11	pi: point task CLI docs and matching from do to ask	Paul Buetow
	DO_CLI_REF and resolveDoExecutable use ~/go/bin/ask; matchDoInvocation still accepts legacy do prefixes. Update README and Nemotron hints. Made-with: Cursor
2026-04-11	update	Paul Buetow

2026-04-11	Pi extensions: document and invoke task CLI as ~/go/bin/do	Paul Buetow
	Use DO_CLI_REF and resolveDoExecutable in agent-plan-mode; accept both do and ~/go/bin/do in bash guards. Ask-mode shares matchDoInvocation. Nemotron/Qwen tool discipline points to ~/go/bin/do done. Made-with: Cursor
2026-04-11	Rename VM1 configs: default hyperstack-vm1.toml, Nemotron in -nemotron	Paul Buetow
	Move the former hyperstack-vm1-coder.toml to hyperstack-vm1.toml as the standard VM1 profile (Qwen3-Coder-Next on single GPU). Preserve the dual-H100 Nemotron-3-Super stack as hyperstack-vm1-nemotron.toml. Point create-both at hyperstack-vm1.toml and refresh README for current defaults. Made-with: Cursor
2026-04-08	pi: use do CLI instead of ask for task management	Paul Buetow
	Rename task-wrapper invocations and prompts from ask to do in agent-plan-mode (exec, bash guards, workflow strings), plan-mode README, ask-mode readonly-command detection, and nemotron-tool-repair discipline text. Internal helpers renamed for consistency (runDo, isSafeDoCommand). Made-with: Cursor
2026-04-06	provisioner: support docker_image and pre_start_cmd for Gemma 4 startup	Paul Buetow
	Adds docker_image and pre_start_cmd config fields to config.rb and provisioning.rb so the Gemma 4 31B workarounds are baked in:
	- docker_image = "vllm/vllm-openai:nightly" (stable lacks Gemma 4 support)
	- pre_start_cmd = "pip install -q transformers==5.5.0" (stable pins <5)
	- extra_docker_env = ["CUDA_VISIBLE_DEVICES=0"] (required with --entrypoint bash)
	When pre_start_cmd is set, the provisioner switches to --entrypoint bash and chains the patch command before launching vLLM, so create-both works end-to-end without manual container replacement.
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06	hyperstack: switch to Gemma 4 31B on VM2, Qwen3-Coder-Next on VM1	Paul Buetow
	VM1 (hyperstack-vm1-coder.toml, renamed from hyperstack-vm1-gptoss.toml):
	- Default model switched from gpt-oss-120b to qwen3-coder-next
	- Config file renamed to reflect actual default model
	VM2 (hyperstack-vm2.toml):
	- Default model switched from qwen3-coder-next to Gemma 4 31B AWQ
	- Uses vLLM nightly image + transformers==5.5.0 workaround: Gemma 4 architecture is registered in transformers 5.x but vLLM stable pins <5
	- max_model_len=131072 (128K context); KV cache fills ~95% of H100-80GB VRAM
	- Added gemma4-31b preset
	watcher.rb:
	- Add loading_status field to VmSnapshot to show live model-load progress (last relevant log line during startup instead of generic "loading" message)
	- fetch_vm_stats now captures both Engine 0 stats and loading-phase log lines in a single SSH call using a shell variable to avoid two docker log invocations
	- clean_log_line() strips vLLM PID/timestamp prefix for readable display
	cli.rb: update all hardcoded hyperstack-vm1-gptoss.toml references to hyperstack-vm1-coder.toml
	hypr.fish: replace pi-hyperstack-nemotron with pi-hyperstack-coder (VM1), add pi-hyperstack-gemma4 (VM2)
	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27	agent: use ask ids in task extensions	Paul Buetow

2026-03-26	eee97223-bfde-48d7-93f5-d1bb0ecddaba add /watch command	Paul Buetow