| Age | Commit message | Author |
|
agent_end handler
The agent_end event handler receives ExtensionContext, not
ExtensionCommandContext, so ctx.waitForIdle() is not available.
Replace with a polling loop using ctx.isIdle() to wait for the
run to finish before draining pending jobs, preventing stuck
follow-up messages.
|
|
Inside an agent_end listener, agent.state.isStreaming is still true —
finishRun() only clears it in the finally block of runWithLifecycle,
after all agent_end listeners settle. So when we dispatched a pending
job from agent_end and called pi.sendUserMessage(..., { deliverAs:
'followUp' }), the message was routed into agent.followUpQueue. The
agent loop had already passed its getFollowUpMessages() check, so it
exited without draining the queue. The message sat there as a stuck
'Follow-up: ...' in pi's UI, agentBusy stayed true forever, and every
subsequent pending loop was blocked because no further agent_end fired.
Await ctx.waitForIdle() in the agent_end handler before resetting
agentBusy and calling drainPendingJobs. By then finishRun() has cleared
isStreaming, so sendUserMessage starts a fresh run instead of enqueueing
into a dead followUp queue, and pending loops drain serially as designed.
Amp-Thread-ID: https://ampcode.com/threads/T-019e5de9-a0c3-7559-9cf0-f81ce751e763
Co-authored-by: Amp <amp@ampcode.com>
|
|
- Add per-VM 10s fetch timeout so one dead VM cannot stall the dashboard
- Make fallback logic check VM state (public_ip + ACTIVE status) instead of
just file existence, so a stale/deleted VM1 state does not block watch
- Auto-replace cached SSH host keys when a VM is recreated instead of failing
- Suppress Ruby thread exception noise on killed SSH threads
Fixes 'just watch' showing blank screen when VM1 is deleted but has a stale
state file, and SSH host-key mismatch on VM recreation.
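A minimal sketch of the state-based fallback check described above, assuming the state file is JSON with a `public_ip` field and a `status` field (the `status` key name and the `vm_usable?` helper are hypothetical, not taken from the repository):

```ruby
require "json"

# Hypothetical helper: a VM counts as usable only if its state records a
# public IP and an ACTIVE status, not merely because a state file exists.
def vm_usable?(state_json)
  state = JSON.parse(state_json)
  ip = state["public_ip"].to_s
  !ip.empty? && state["status"] == "ACTIVE"
rescue JSON::ParserError
  false # an unreadable state file is treated like a missing VM
end

stale  = vm_usable?('{"public_ip": null, "status": "DELETED"}')
active = vm_usable?('{"public_ip": "203.0.113.7", "status": "ACTIVE"}')
```

With a check of this shape, a leftover state file for a deleted VM1 no longer hides VM2 from the dashboard.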
|
|
When create timed out during vLLM readiness polling (common for large
models like Qwen3.6-27B-FP8), rerunning create would stop and restart
the already-running container, restarting the whole startup sequence.
Now the vLLM install script checks if the container is already running
and serving the correct model before touching it. If it detects a
healthy container, it skips the stop/pull/start cycle entirely.
Also increases the readiness timeout from 20 min (240x5s) to 30 min
(360x5s) to accommodate cold starts with model download and CUDA graph
capture on large models.
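The timeout arithmetic can be sketched as a bounded polling loop: 360 attempts at 5 s intervals is 1800 s, or 30 minutes. The `wait_until_ready` name and the injectable `sleeper` are assumptions for illustration; the real install script is a shell script and may be structured differently.

```ruby
# Hypothetical readiness poller: calls the block up to `attempts` times,
# sleeping `interval` seconds after each failed check.
def wait_until_ready(attempts: 360, interval: 5, sleeper: ->(s) { sleep(s) })
  attempts.times do
    return true if yield
    sleeper.call(interval)
  end
  false # gave up after attempts * interval seconds
end

waited = 0
checks = 0
ready = wait_until_ready(sleeper: ->(s) { waited += s }) do
  checks += 1
  checks >= 3 # pretend the server answers on the third check
end
```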
|
|
glm-5.1:cloud, minimax-m2.7:cloud
|
|
debugging commands
|
|
provisioned
|
|
Remove the vm_api_reachable? filter from run_watch so VMs that are
currently booting are not silently dropped from the dashboard.
Add exponential-backoff retry logic (up to 4 attempts, sleeping
2s, 4s, 8s, 16s) inside VllmWatcher#fetch_vm_stats for transient
SSH/WireGuard errors such as connection refused, host unreachable,
and exit 255. This lets watch automatically recover while a VM
is still starting up instead of failing immediately.
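A rough sketch of the backoff pattern, not the actual VllmWatcher#fetch_vm_stats code. It reads the message as up to four retries with delays of 2, 4, 8 and 16 seconds; the exact attempt count in the real implementation may differ. `TransientSshError`, `with_backoff` and the `sleeper` parameter are hypothetical names.

```ruby
# Hypothetical error class standing in for connection refused,
# host unreachable, and ssh exit 255.
class TransientSshError < StandardError; end

# Retries the block on transient errors, doubling the delay each time.
def with_backoff(retries: 4, base_delay: 2, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    yield
  rescue TransientSshError
    raise if attempt >= retries            # out of retries: surface the error
    sleeper.call(base_delay * (2**attempt)) # 2, 4, 8, 16 seconds
    attempt += 1
    retry
  end
end

delays = []
calls = 0
result = with_backoff(sleeper: ->(s) { delays << s }) do
  calls += 1
  raise TransientSshError, "connection refused" if calls < 3
  :stats
end
```

Only errors classified as transient are retried; anything else propagates immediately, so a genuine misconfiguration still fails fast.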
|
|
The runtime now requires a streamingBehavior (steer/followUp) to queue a
message when the agent is already processing. Previously only Gemma 4
models passed { deliverAs: 'followUp' }, causing all other models to
throw 'Agent is already processing' and leaving the job stuck in pending.
Scheduled and watch messages are independent prompts, so followUp is
the correct behavior for all models.
|
|
Add patterns for .hyperstack-*-state.json and .hyperstack-*-state.json.known_hosts
to keep ephemeral VM state and WireGuard artifacts out of version control.
|
|
- Drop single-VM default hyperstack-vm.toml and @config_path/@config_explicit machinery
- Add global --vm flag (default: 1) mapping to hyperstack-vm1.toml and/or hyperstack-vm2.toml
- Fold create-both and delete-both into create/delete --vm both
- Teach status, watch, test, model to accept --vm (default: 1)
- Update help text and README/AGENTS/fish abbreviations accordingly
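The flag-to-file mapping in the list above could look roughly like this. The `config_paths_for` helper is hypothetical; only the flag values and file names come from the commit message.

```ruby
# Hypothetical helper: maps a --vm value to the config files it selects.
def config_paths_for(vm = "1")
  case vm.to_s
  when "1"    then ["hyperstack-vm1.toml"]
  when "2"    then ["hyperstack-vm2.toml"]
  when "both" then ["hyperstack-vm1.toml", "hyperstack-vm2.toml"]
  else raise ArgumentError, "unknown --vm value: #{vm.inspect}"
  end
end

default_paths = config_paths_for
both_paths = config_paths_for("both")
```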
|
|
Delete hyperstack-vm-photo.toml, photo-enhance.rb, photo-enhance-review.md,
smart_photo_node.py, workflows/photo-enhance.json (and empty workflows/ dir),
and __pycache__/smart_photo_node.cpython-314.pyc (and empty __pycache__/ dir).
No .hyperstack-vm-photo-state.json* state files were present.
ComfyUI references in lib/hyperstack/*.rb intentionally left for task T2.
|
|
The agentBusy flag could get stuck if an agent_start event fired but no
matching agent_end followed (e.g. crash or forced shutdown). The
scheduler would then show 'pending' forever even though the agent was
completely idle.
Now drainPendingJobs() and drainPendingWatchJobs() ask the ExtensionContext's
isIdle() as a ground-truth fallback whenever agentBusy is true. If the
context reports idle, we reset agentBusy = false and proceed to dispatch
pending jobs instead of bailing out.
|
|
output_dir
The comfyui_install_script previously ran
  chmod -R 0777 File.dirname(models_dir)
which chmods the *parent* directory (e.g. /ephemeral). If models_dir
is configured directly under /ephemeral that gives world-write access to
all sibling directories (vLLM hug cache, Ollama models, etc.).
Now chmod only the two directories that actually need it: models_dir
and output_dir.
|
|
In prune_host_line, body.split(/\s+/) on a line with leading whitespace
produced tokens starting with an empty string, so the first token shifted
off (the one used as the IP) was ''. This caused the rewritten /etc/hosts
entry to lose its IP silently.
Fix by stripping the body before splitting: body.strip.split(/\s+/).
Refs: hc
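The split behaviour behind this bug is easy to reproduce in isolation. The sample line and the `split_host_tokens` helper below are made up for illustration; the real prune_host_line may be structured differently.

```ruby
# Hypothetical helper showing the fixed tokenisation.
def split_host_tokens(body)
  # strip first so leading spaces or tabs cannot yield an empty first token
  body.strip.split(/\s+/)
end

line = "  10.8.0.2 hyperstack1.wg1 vm1"

naive = line.split(/\s+/)       # leading whitespace produces "" as the first token
fixed = split_host_tokens(line) # the first token is the IP again

naive_ip = naive.first
fixed_ip = fixed.first
```

Splitting with a regexp keeps the empty leading field, unlike a plain `split` with no argument, which is why the bug only showed up on indented lines.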
|
|
In run_create_both, VM1's thread rescue unconditionally set
vm1_wg_state[:error], even when the WireGuard step had already
signaled success (vm1_wg_state[:done] = true). If VM2 was
waiting on the condition variable at that moment, it would raise
'VM1 WireGuard setup failed' and abort needlessly.
Now the rescue only sets :error when :done is still false, so a
downstream VM1 failure (e.g. vLLM install) no longer leaks to VM2.
Resolves agent task ic.
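A simplified sketch of the guarded rescue, assuming the shared state is a Hash protected by a Mutex and ConditionVariable. The `record_vm1_failure` helper is hypothetical; only the :done and :error keys come from the commit message.

```ruby
# Hypothetical helper: called from VM1's rescue block. It marks the
# WireGuard step as failed only if that step has not already succeeded.
def record_vm1_failure(state, mutex, cond, error)
  mutex.synchronize do
    state[:error] = error unless state[:done]
    cond.broadcast # wake any VM2 thread waiting on the WireGuard result
  end
end

mutex = Mutex.new
cond = ConditionVariable.new

# WireGuard already succeeded; a later vLLM install failure must not leak.
after_wg = { done: true }
record_vm1_failure(after_wg, mutex, cond, RuntimeError.new("vLLM install failed"))

# WireGuard itself failed; VM2 should see the error.
before_wg = { done: false }
record_vm1_failure(before_wg, mutex, cond, RuntimeError.new("wg-quick failed"))
```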
|
|
Replace Timeout.timeout(15) around Open3.capture3 with SSH-level
keepalive options (ServerAliveInterval=5, ServerAliveCountMax=3).
Ruby's Timeout raises in a background thread but leaves the ssh
process running; SSH's own timeouts self-terminate cleanly.
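As a sketch, the keepalive options can be passed as ordinary `-o` arguments. With an interval of 5 s and a maximum of 3 unanswered probes, ssh gives up after roughly 15 s of silence, about the same budget as the old Timeout. The `ssh_command` helper, the user, and the address are placeholders, not the repository's code.

```ruby
require "open3"

# Hypothetical helper: builds an ssh argv whose keepalive settings make
# ssh terminate itself when the server stops responding.
def ssh_command(user_host, remote_cmd, interval: 5, count_max: 3)
  [
    "ssh",
    "-o", "ServerAliveInterval=#{interval}",
    "-o", "ServerAliveCountMax=#{count_max}",
    user_host,
    remote_cmd
  ]
end

argv = ssh_command("ubuntu@10.8.0.2", "uptime")
# stdout, stderr, status = Open3.capture3(*argv) # not run here: needs a reachable host
```

Passing the command as an argv array rather than a single string also avoids an intermediate shell on the local side.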
|
|
When all public IP probes fail (network down, DNS broken), detect_public_operator_cidr
raises HyperstackVM::Error. The old code did not cache this failure, so every
call to resolved_allowed_cidrs re-ran all probes, compounding slowness.
Add a rescue block in detected_operator_cidr that stores the exception in
@detected_operator_cidr_error and re-raises it. On subsequent calls the cached
error is re-raised immediately, preventing redundant probe retries.
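The failure-caching idea can be sketched as follows. `CidrDetector`, `ProbeError` and the injected probe block are stand-ins for illustration; the real code raises HyperstackVM::Error from detect_public_operator_cidr.

```ruby
# Stand-in for HyperstackVM::Error.
class ProbeError < StandardError; end

class CidrDetector
  attr_reader :probe_runs

  def initialize(&probe)
    @probe = probe
    @probe_runs = 0
  end

  # Memoizes both outcomes: a successful CIDR and a probe failure.
  def detected_operator_cidr
    raise @detected_operator_cidr_error if @detected_operator_cidr_error

    @detected_operator_cidr ||= begin
      @probe_runs += 1
      @probe.call
    end
  rescue ProbeError => e
    @detected_operator_cidr_error = e # cache the failure for later calls
    raise
  end
end

detector = CidrDetector.new { raise ProbeError, "all public IP probes failed" }
failures = 0
2.times do
  begin
    detector.detected_operator_cidr
  rescue ProbeError
    failures += 1
  end
end
```

The trade-off is that a cached failure persists for the lifetime of the object, which suits a short-lived CLI run but would need an explicit reset in a long-running process.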
|
|
Ensure Manager#delete does not wipe the state file on generic/transient API failures. The rescue now checks whether the error message indicates the VM is already gone (404, not_found, does not exist) before removing state. This prevents orphaned billable VMs after exhausted retries or transient network errors.
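A minimal sketch of the "already gone" check, assuming the decision is made by matching the error message against the three markers listed above. The constant and helper names are hypothetical, and the real Manager#delete may inspect the error differently.

```ruby
# Assumed pattern built from the markers named in the commit message.
GONE_MARKERS = /404|not_found|does not exist/i

# Hypothetical predicate: true when the API error says the VM no longer exists.
def vm_already_gone?(error)
  GONE_MARKERS.match?(error.message)
end

# Decides what to do with the local state file after a failed delete call.
def state_action_after_failed_delete(error)
  vm_already_gone?(error) ? :remove_state : :keep_state
end

gone      = state_action_after_failed_delete(StandardError.new("HTTP 404: VM not_found"))
transient = state_action_after_failed_delete(StandardError.new("connection reset by peer"))
```

Keeping the state file on transient errors means a later delete can be retried against the same VM ID instead of losing track of a machine that is still billing.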
|
|
enhance_one
Wrap enhance_one body in begin/ensure to unconditionally delete upload_path
(and tmp_png) on every exit path, not just success. Prevents 50+ MB RAW->TIFF
leaks when upload_image/submit_prompt/wait_for_output/save_with_corrections
raises or when ComfyUI connection errors are caught.
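The cleanup pattern can be sketched with a method-level ensure clause. `with_upload_cleanup` is a hypothetical stand-in for enhance_one, and the file written here is a small dummy rather than a real TIFF.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for enhance_one: runs the given work, then always
# removes the temporary upload file, whether or not the work raised.
def with_upload_cleanup(upload_path)
  yield upload_path
ensure
  FileUtils.rm_f(upload_path) # rm_f ignores a file that is already gone
end

upload = File.join(Dir.mktmpdir, "frame.tiff")
File.write(upload, "fake image bytes")

begin
  with_upload_cleanup(upload) { raise IOError, "simulated upload failure" }
rescue IOError
  # the error still propagates to the caller; only the temp file is gone
end

leaked = File.exist?(upload)
```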
|
|
calls
|
|
Escape regex metacharacters in WG_HOSTNAME before embedding into the sed
delete pattern so '.' (always present in hostnames like hyperstack1.wg1)
is treated as a literal dot rather than a wildcard.
Replace the literal-space anchor with [[:space:]] so tab-indented lines
in /etc/hosts are also matched and removed correctly.
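The same two pitfalls can be shown with Ruby regexps as an analogue; the actual change is in a sed pattern inside a shell script, which is not reproduced here. The sample hosts lines are made up.

```ruby
hostname = "hyperstack1.wg1"

# Unescaped: "." matches any character, and the literal space misses tabs.
loose = / #{hostname}$/
# Escaped hostname plus a POSIX whitespace class, mirroring the fix.
strict = /[[:space:]]#{Regexp.escape(hostname)}$/

lookalike   = "10.8.0.9 hyperstack1xwg1"  # different host, must not match
tab_entry   = "10.8.0.2\thyperstack1.wg1" # tab-separated entry, must match
space_entry = "10.8.0.2 hyperstack1.wg1"  # space-separated entry, must match

loose_hits_lookalike  = loose.match?(lookalike)
loose_hits_tab        = loose.match?(tab_entry)
strict_hits_lookalike = strict.match?(lookalike)
strict_hits_tab       = strict.match?(tab_entry)
strict_hits_space     = strict.match?(space_entry)
```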
|
|
DO_CLI_REF and resolveDoExecutable use ~/go/bin/ask; matchDoInvocation
still accepts legacy do prefixes. Update README and Nemotron hints.
Made-with: Cursor
|
|
Use DO_CLI_REF and resolveDoExecutable in agent-plan-mode; accept both
do and ~/go/bin/do in bash guards. Ask-mode shares matchDoInvocation.
Nemotron/Qwen tool discipline points to ~/go/bin/do done.
Made-with: Cursor
|
|
Move the former hyperstack-vm1-coder.toml to hyperstack-vm1.toml as the
standard VM1 profile (Qwen3-Coder-Next on single GPU). Preserve the
dual-H100 Nemotron-3-Super stack as hyperstack-vm1-nemotron.toml. Point
create-both at hyperstack-vm1.toml and refresh README for current defaults.
Made-with: Cursor
|
|
Rename task-wrapper invocations and prompts from ask to do in
agent-plan-mode (exec, bash guards, workflow strings), plan-mode README,
ask-mode readonly-command detection, and nemotron-tool-repair discipline
text. Internal helpers renamed for consistency (runDo, isSafeDoCommand).
Made-with: Cursor
|
|
Adds docker_image and pre_start_cmd config fields to config.rb and
provisioning.rb so the Gemma 4 31B workarounds are baked in:
- docker_image = "vllm/vllm-openai:nightly" (stable lacks Gemma 4 support)
- pre_start_cmd = "pip install -q transformers==5.5.0" (stable pins <5)
- extra_docker_env = ["CUDA_VISIBLE_DEVICES=0"] (required with --entrypoint bash)
When pre_start_cmd is set, the provisioner switches to --entrypoint bash and
chains the patch command before launching vLLM, so create-both works end-to-end
without manual container replacement.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
VM1 (hyperstack-vm1-coder.toml, renamed from hyperstack-vm1-gptoss.toml):
- Default model switched from gpt-oss-120b to qwen3-coder-next
- Config file renamed to reflect actual default model
VM2 (hyperstack-vm2.toml):
- Default model switched from qwen3-coder-next to Gemma 4 31B AWQ
- Uses vLLM nightly image + transformers==5.5.0 workaround: Gemma 4
architecture is registered in transformers 5.x but vLLM stable pins <5
- max_model_len=131072 (128K context); KV cache fills ~95% of H100-80GB VRAM
- Added gemma4-31b preset
watcher.rb:
- Add loading_status field to VmSnapshot to show live model-load progress
(last relevant log line during startup instead of generic "loading" message)
- fetch_vm_stats now captures both Engine 0 stats and loading-phase log lines
in a single SSH call using a shell variable to avoid two docker log invocations
- clean_log_line() strips vLLM PID/timestamp prefix for readable display
cli.rb: update all hardcoded hyperstack-vm1-gptoss.toml references to
hyperstack-vm1-coder.toml
hypr.fish: replace pi-hyperstack-nemotron with pi-hyperstack-coder (VM1),
add pi-hyperstack-gemma4 (VM2)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|