| Age | Commit message (Collapse) | Author |
|
Include garage in f3s host list so DNS, TLS (acme), and httpd/relayd
templates generate config for the new hostname.
Made-with: Cursor
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d6727-d603-72c5-97a0-c1e419211767
Co-authored-by: Amp <amp@ampcode.com>
|
|
remove duplicate controllers.server
Amp-Thread-ID: https://ampcode.com/threads/T-019d6154-8fdf-74fe-b865-f796d8a4214a
Co-authored-by: Amp <amp@ampcode.com>
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d6154-8fdf-74fe-b865-f796d8a4214a
Co-authored-by: Amp <amp@ampcode.com>
|
|
threads, increase worker timeout
Amp-Thread-ID: https://ampcode.com/threads/T-019d6154-8fdf-74fe-b865-f796d8a4214a
Co-authored-by: Amp <amp@ampcode.com>
|
|
- Increase liveness probe tolerance (60s delay, 30s period, 10s timeout, 6 failures)
- Increase readiness probe tolerance (15s delay, 10s period, 5s timeout, 6 failures)
- Add resource requests (100m CPU, 512Mi RAM) and limits (2Gi RAM)
- Fixes crash loop caused by probe killing postgres during recovery
Amp-Thread-ID: https://ampcode.com/threads/T-019d5f54-27f2-740c-ac41-0f980e7aecd3
Co-authored-by: Amp <amp@ampcode.com>
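As a rough sanity check on the liveness numbers above, the window before a restart can be estimated from the probe settings. This is a sketch only, assuming standard Kubernetes probe semantics (first probe after the initial delay, restart after that many consecutive failures); `worst_case_restart_seconds` is a hypothetical helper and not part of the chart.

```python
def worst_case_restart_seconds(delay, period, timeout, failures):
    """Approximate seconds from container start until a restart,
    assuming every liveness probe runs into its timeout."""
    # The first probe starts after `delay`; each later one starts `period` after the previous.
    last_probe_start = delay + (failures - 1) * period
    # The final probe only counts as failed once its timeout has elapsed.
    return last_probe_start + timeout


# Liveness values from the commit message: 60s delay, 30s period, 10s timeout, 6 failures.
liveness_window = worst_case_restart_seconds(delay=60, period=30, timeout=10, failures=6)
print(liveness_window)
```

Under these assumptions postgres gets a few minutes to finish recovery before the probe can trigger a restart.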
|
|
apply
|
|
resources
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d47a3-2deb-75c3-8a75-b0f39006a35d
Co-authored-by: Amp <amp@ampcode.com>
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d47a3-2deb-75c3-8a75-b0f39006a35d
Co-authored-by: Amp <amp@ampcode.com>
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d47a3-2deb-75c3-8a75-b0f39006a35d
Co-authored-by: Amp <amp@ampcode.com>
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d47a3-2deb-75c3-8a75-b0f39006a35d
Co-authored-by: Amp <amp@ampcode.com>
|
|
|
|
Add a QEMU/KVM OpenBSD VM for native compilation of CGo packages
(e.g. dtail with DataDog/zstd). The VM is fully automated via expect
driving the serial console installer.
- packages/buildvm/: setup, provision, start, stop scripts and expect installer
- packages/scripts/pkg-dtail-openbsd.sh: multi-binary package with signify signing
- packages/Makefile: build VM management and dtail-openbsd target using git archive
- frontends/Rexfile: dtail_install task uses custom pkg repo, dtail task enabled
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
Packaging logic is now OS-agnostic shell scripts + Makefile, reusable for
any Go project. Cross-compiles locally, SCPs to target host for native
packaging, and uploads to the PV.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
Packages are now signed via pkg_sign with the custom-pkg signify key
on the OpenBSD build host. The public key at /etc/signify/custom-pkg.pub
on each client allows pkg_add to verify without -D unsigned.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
Replace manual binary copy in gogios_install with pkg install (FreeBSD)
and pkg_add (OpenBSD). Add pkgrepo_setup task that configures PKG_PATH
in root's .profile on OpenBSD frontends. The gogios task now calls
gogios_install automatically.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
FreeBSD: use -p plist flag so files are actually included in the package.
OpenBSD: use -D COMMENT flag and separate desc file as required by
pkg_create, auto-detect OS version for repo path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
The root path returns 404 by design, so probes need a dedicated
/healthz endpoint that returns 200.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
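The routing described above can be illustrated with a minimal handler. This is a sketch using Python's standard `http.server`, not the deployed server configuration; `status_for` and `Handler` are hypothetical names.

```python
from http.server import BaseHTTPRequestHandler


def status_for(path):
    # /healthz answers 200 for probes; every other path, including "/", stays 404.
    return 200 if path == "/healthz" else 404


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        code = status_for(self.path)
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n" if code == 200 else b"not found\n")
```

To try it locally, pass `Handler` to `http.server.HTTPServer(("127.0.0.1", 8080), Handler)` and call `serve_forever()`.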
|
|
Serve custom-built FreeBSD and OpenBSD packages via nginx in the k3s
cluster. Includes helm chart, ArgoCD app, test artifact build script,
and DNS entry via frontends Rexfile.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
The wait-for-nfs init container was checking for nfs.DO_NOT_REMOVE but
the actual file on disk is k3svolumes.DO_NOT_REMOVE. This caused every
new pod from the rolling update to be permanently stuck in Init:0/1,
leaving two postgres pods running indefinitely (old + stuck new).
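The failure mode above is a wait loop polling for a file name that never appears. A minimal sketch of such a check with a timeout, so that a wrong marker name fails visibly instead of hanging; `wait_for_marker` and the mount path in the usage note are hypothetical, and the real init container may be implemented differently.

```python
import os
import time


def wait_for_marker(path, timeout=300.0, interval=5.0):
    """Poll until `path` exists. Return True if found, False once `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A call such as `wait_for_marker("/mnt/k3svolumes.DO_NOT_REMOVE")` (path assumed) would return False after five minutes if the marker name were wrong, which an init script could turn into a non-zero exit.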
|
|
Amp-Thread-ID: https://ampcode.com/threads/T-019d14d5-4dbf-71a7-a619-d9c5afed3f7c
Co-authored-by: Amp <amp@ampcode.com>
|
|
hyperstack.rb and wg1-setup.sh for multi-VM WireGuard support
|
|
- model switch now passes pull_image: false to avoid surprise multi-GB
image downloads when the upstream vLLM image has been updated;
docker pull is still run on initial install (pull_image: true default)
- mount /ephemeral/vllm_cache → /root/.cache/vllm so torch.compile
artifacts survive container restarts; saves ~30-60 s on warm switches
- add vllm_compile_cache_dir helper (sibling of hug_cache_dir)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
|
The model's config.json sets max_position_embeddings=131072; exceeding it
causes NaN/CUDA OOB errors, and vLLM rejected 163840 at startup. A
conversation that hits the 135K-token error has to be restarted as a fresh
opencode conversation instead.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
|
131K was still too small: conversations of 135K tokens were observed in practice.
Physical KV capacity is about 168K tokens, so 160K is safe without OOM.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
|
MXFP4 KV cache is compact enough that vLLM allocated about 168K tokens of
KV cache (10560 blocks × 16 tokens) at 0.92 utilization. The 40K limit was
too conservative and caused negative max_tokens errors in long Claude Code
sessions.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
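The capacity figure above is simple arithmetic. A small worked check, assuming "10560×16" means 10560 KV cache blocks of 16 tokens each and that "40K" means 40 × 1024 tokens; the constant names are illustrative only.

```python
NUM_BLOCKS = 10560   # KV cache blocks reported by vLLM at 0.92 utilization
BLOCK_SIZE = 16      # tokens per block (assumed from "10560×16")

kv_capacity_tokens = NUM_BLOCKS * BLOCK_SIZE


def fits(max_model_len):
    """True if a single sequence of this length fits in the KV cache."""
    return max_model_len <= kv_capacity_tokens


old_limit = 40 * 1024   # assumed meaning of "40K"
print(kv_capacity_tokens)               # 168960
print(fits(old_limit))                  # True
print(kv_capacity_tokens // old_limit)  # 4
```

Under these assumptions the old limit used only about a quarter of the allocated cache.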
|
|
- Created ConfigLoader for TOML loading and validation
- Kept Config for configuration value access only
- Reduced Config from 489 lines to ~200 lines
- Fixed CLI to use ConfigLoader and pass @path to Config
|
|
Show the currently loaded model (from state file, or config default)
so it's immediately visible without running `model list`.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
|
Tested 1M context (NoPE allows arbitrary max_position_embeddings without
YaRN), but it OOMs on A100 80GB: too little VRAM remains after the 60GB of
model weights. 256K (262144) is the practical ceiling on this hardware.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
|
Both Nemotron and Qwen3-XML use identical <tool_call><function=name>
<parameter=p>value</parameter></function></tool_call> format.
qwen3_xml correctly parses Nemotron's output; tool calling now works
with opencode and other API clients.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
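To make the shared format concrete, here is a minimal regex-based sketch that extracts calls from such output. It is an illustration only and not vLLM's qwen3_xml parser; `parse_tool_calls` and the `read_file` sample are hypothetical.

```python
import re

# One <tool_call> wraps one <function=NAME> element containing <parameter=KEY> children.
TOOL_CALL = re.compile(
    r"<tool_call>\s*<function=([^>]+)>(.*?)</function>\s*</tool_call>", re.DOTALL
)
PARAM = re.compile(r"<parameter=([^>]+)>(.*?)</parameter>", re.DOTALL)


def parse_tool_calls(text):
    """Return a list of {"name": ..., "arguments": {...}} dicts found in `text`."""
    calls = []
    for name, body in TOOL_CALL.findall(text):
        args = {key: value.strip() for key, value in PARAM.findall(body)}
        calls.append({"name": name, "arguments": args})
    return calls


sample = (
    "<tool_call><function=read_file>\n"
    "<parameter=path>README.md</parameter></function></tool_call>"
)
print(parse_tool_calls(sample))
```

Parameter values are kept as plain strings here; a real parser would also need to handle streaming output and type conversion.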
|
|
vLLM 0.17.1 has no tool call parser for Nemotron's custom XML format
(<tool_call><function=...><parameter=...>). Setting llama3_json produced
garbage output. Reverted to tool_call_parser="" with a clear comment.
Added --reasoning-parser nemotron_v3 via extra_vllm_args so <think> tokens
are properly exposed as reasoning_content in the API response.
For agentic work requiring tool calls, switch to qwen3-coder-next or devstral.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
|
known_hosts
- hyperstack-vm.toml: set tool_call_parser=llama3_json for nemotron-super so vLLM
accepts tool_choice requests from opencode; model won't spontaneously call tools
so the vLLM 0.17.1 token_ids crash in llama3_json won't trigger
- hyperstack.rb: wait_for_ssh now also removes the WireGuard hostname
(hyperstack.wg1) from known_hosts alongside the IP, preventing
StrictHostKeyChecking failures across VM recreates
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|