path: root/lib/hyperstack/manager.rb
author Paul Buetow <paul@buetow.org> 2026-05-24 22:56:19 +0300
committer Paul Buetow <paul@buetow.org> 2026-05-24 22:56:19 +0300
commit 5343872a58f30fa7470011d740b404cfdd7ecdf2
tree b5add2fd535e44eb08bedbbb42af0d955b233e3f /lib/hyperstack/manager.rb
parent d3787698a8a16b92006d4b5a9d285b170881f225
fix(provisioning): recover from vLLM readiness timeout and increase poll window
When create timed out during vLLM readiness polling (common for large models such as Qwen3.6-27B-FP8), rerunning create stopped and restarted the already-running container, which restarted the entire startup sequence.

The vLLM install script now checks whether the container is already running and serving the correct model before touching it. If it finds a healthy container, it skips the stop/pull/start cycle entirely.

Also increases the readiness timeout from 20 min (240 x 5 s) to 30 min (360 x 5 s) to accommodate cold starts that include model download and CUDA graph capture on large models.
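
The behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from lib/hyperstack/manager.rb or the install script: the names ensure_vllm, serving_model?, probe, restart and sleeper are hypothetical, and the probe is assumed to return the list of model names the container currently serves. Only the 360 attempts and the 5 s interval come from the commit message.

```ruby
# Hypothetical sketch; none of these names are taken from the real code base.
READINESS_ATTEMPTS = 360 # 360 x 5 s = 30 min (previously 240 x 5 s = 20 min)
READINESS_INTERVAL = 5   # seconds between readiness probes

# True when the probe reports that the expected model is being served.
# Any probe error (for example a refused connection) counts as "not ready".
def serving_model?(expected_model, probe)
  served = probe.call
  served.is_a?(Array) && served.include?(expected_model)
rescue StandardError
  false
end

# Returns :skipped if a healthy container already serves the model,
# :ready once a restarted container becomes ready, or :timeout.
def ensure_vllm(expected_model, probe:, restart:,
                sleeper: ->(seconds) { sleep(seconds) },
                attempts: READINESS_ATTEMPTS, interval: READINESS_INTERVAL)
  # Leave a healthy container alone instead of running stop/pull/start.
  return :skipped if serving_model?(expected_model, probe)

  restart.call
  attempts.times do
    return :ready if serving_model?(expected_model, probe)

    sleeper.call(interval)
  end
  :timeout
end
```

Passing the probe, restart action and sleeper as callables keeps the sketch testable without a real container; in practice the probe might query the server's model listing, which is an assumption here rather than something the commit message states.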
Diffstat (limited to 'lib/hyperstack/manager.rb')
0 files changed, 0 insertions, 0 deletions