author: Paul Buetow <paul@buetow.org> 2026-03-26 09:15:57 +0200
committer: Paul Buetow <paul@buetow.org> 2026-03-26 09:15:57 +0200
commit: acaea730cfca43bf96f4e70b3104559db7977a3f (patch)
tree: c9f05056ab8de446715d6c7ce284e00217d44850 /lib/hyperstack
parent: 5281c4fe10b7600da0aa5170963f1f44119058aa (diff)
hyperstack: tune nemotron-super preset for single A100-80GB
Model weights occupy ~73.6 GiB, leaving ~5.6 GiB for the KV cache. Reduce max_model_len to 32768 and raise gpu_memory_utilization to 0.98 so the model fits. Add --enforce-eager to disable CUDA graph capture, whose profiling phase requires ~2 GiB of headroom that is not available on a single A100.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
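
The settings above can be sketched as a small Python helper. This is an illustration only, not the preset code from lib/hyperstack: it assumes the preset drives a vLLM-style server (the option names max_model_len, gpu_memory_utilization and --enforce-eager match vLLM's), and the function names and the "example/model" placeholder are hypothetical. The memory figures are the approximate ones quoted in the message.

```python
# Approximate figures quoted in the commit message (GiB).
KV_CACHE_GIB = 5.6
CUDA_GRAPH_HEADROOM_GIB = 2.0


def serve_args(model: str,
               max_model_len: int = 32768,
               gpu_memory_utilization: float = 0.98,
               enforce_eager: bool = True) -> list[str]:
    """Build a vLLM-style command line (assumed backend; hypothetical helper)."""
    args = [
        "vllm", "serve", model,
        "--max-model-len", str(max_model_len),
        "--gpu-memory-utilization", str(gpu_memory_utilization),
    ]
    if enforce_eager:
        # Skip CUDA graph capture so its headroom stays available to the KV cache.
        args.append("--enforce-eager")
    return args


def kv_left_with_graph_capture(kv_gib: float = KV_CACHE_GIB,
                               graph_gib: float = CUDA_GRAPH_HEADROOM_GIB) -> float:
    """KV-cache budget that would remain if graph capture took its headroom."""
    return kv_gib - graph_gib


if __name__ == "__main__":
    # "example/model" is a placeholder, not the real model identifier.
    print(" ".join(serve_args("example/model")))
    print(f"KV cache left with graph capture: ~{kv_left_with_graph_capture():.1f} GiB")
```

With the quoted figures, keeping graph capture enabled would leave only about 3.6 GiB for the KV cache, which is the trade-off the --enforce-eager flag avoids.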
Diffstat (limited to 'lib/hyperstack')
0 files changed, 0 insertions, 0 deletions