| author | Paul Buetow <paul@buetow.org> | 2026-05-24 22:56:19 +0300 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2026-05-24 22:56:19 +0300 |
| commit | 5343872a58f30fa7470011d740b404cfdd7ecdf2 (patch) | |
| tree | b5add2fd535e44eb08bedbbb42af0d955b233e3f /lib/hyperstack/manager.rb | |
| parent | d3787698a8a16b92006d4b5a9d285b170881f225 (diff) | |
fix(provisioning): recover from vLLM readiness timeout and increase poll window

When create timed out during vLLM readiness polling (common for large
models like Qwen3.6-27B-FP8), rerunning create would stop and restart
the already-running container, restarting the whole startup sequence.

Now the vLLM install script checks if the container is already running
and serving the correct model before touching it. If it detects a
healthy container, it skips the stop/pull/start cycle entirely.

Also increases the readiness timeout from 20 min (240x5s) to 30 min
(360x5s) to accommodate cold starts with model download and CUDA graph
capture on large models.
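
The behaviour described above could be sketched roughly as follows. This is an illustration only, not the code in lib/hyperstack/manager.rb or the install script: every name here (`healthy_container?`, `ensure_vllm`, `wait_until_ready`, the `status` hash and its keys) is hypothetical, and the real check may well be a shell snippet run on the VM. The only values taken from the commit message are the 5 s interval and the 360-poll budget.

```ruby
# Hypothetical sketch; names are not taken from the actual code base.
POLL_INTERVAL_SECONDS = 5
MAX_POLLS = 360 # 360 x 5 s = 1800 s = 30 min (previously 240 x 5 s = 20 min)

# True only when the container is running AND serving the wanted model.
# `status` is an assumed hash such as { running: true, served_model: "..." }.
def healthy_container?(status, wanted_model)
  status.fetch(:running, false) && status[:served_model] == wanted_model
end

# Skips the stop/pull/start cycle when the container is already healthy;
# otherwise calls the supplied `restart` callable.
def ensure_vllm(status, wanted_model, restart:)
  return :skipped if healthy_container?(status, wanted_model)

  restart.call
  :restarted
end

# Calls the block once per poll until it returns true or the budget runs out.
# Returns the number of polls used, or nil on timeout.
# `sleeper` is injectable so the loop can be exercised without real waiting.
def wait_until_ready(max_polls: MAX_POLLS, interval: POLL_INTERVAL_SECONDS,
                     sleeper: ->(seconds) { sleep(seconds) })
  max_polls.times do |attempt|
    return attempt + 1 if yield

    sleeper.call(interval)
  end
  nil
end
```

With this shape, rerunning create after a timeout would reach `ensure_vllm`, find the container already serving the requested model, and go straight back to `wait_until_ready` instead of restarting the startup sequence.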
Diffstat (limited to 'lib/hyperstack/manager.rb')
0 files changed, 0 insertions, 0 deletions
