diff options
| author | Paul Buetow <paul@buetow.org> | 2025-07-01 21:03:53 +0300 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2025-07-01 21:03:53 +0300 |
| commit | 4e4034a129e3400ffccb23975a6f3a0cd3011309 (patch) | |
| tree | 4f77f8b72c9fe93481c66053749a56aa78ee9651 /benchmarks/trace_overhead_report.md | |
| parent | de974de3d8f4ed38ff7fac6447fdcfec2c00cfc1 (diff) | |
perf: optimize turbo mode for 2.87x faster serverless performance
Major performance improvements in turbo mode:
- Fixed trace logging overhead by adding early level checks before expensive runtime.Caller() operations
- Improved buffering strategy by removing forced immediate flush in serverless mode
- Turbo mode now 2.87x faster (was 3-5x slower before optimization)
Changes:
- internal/io/dlog/dlog.go: Added early return in Trace() and Devel() when logging disabled
- internal/server/handlers/turbo_writer.go: Removed serverless immediate flush condition
Performance results:
- Before: Turbo mode was 3-5x slower than non-turbo mode
- After: Turbo mode is 2.87x faster (65% improvement)
- All integration tests pass
Added comprehensive benchmarking tools in benchmarks/ directory
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Diffstat (limited to 'benchmarks/trace_overhead_report.md')
| -rw-r--r-- | benchmarks/trace_overhead_report.md | 89 |
1 files changed, 89 insertions, 0 deletions
diff --git a/benchmarks/trace_overhead_report.md b/benchmarks/trace_overhead_report.md new file mode 100644 index 0000000..6aa5d83 --- /dev/null +++ b/benchmarks/trace_overhead_report.md @@ -0,0 +1,89 @@ +# Trace Logging Performance Impact in Turbo Mode + +## Summary + +The `dlog.Server.Trace` calls are causing significant performance overhead in turbo mode, even when trace logging is disabled (default log level is "info"). The overhead comes primarily from `runtime.Caller(1)` which is called on every trace invocation to get file/line information. + +## Key Findings + +### 1. Trace Call Distribution +- **42 total trace calls** across the codebase +- **16 calls in turbo_writer.go** - the hot path for turbo mode +- **8 trace calls in WriteLineData** - called for every line processed + +### 2. Performance Measurements + +#### Per-Call Overhead +- **497 nanoseconds per trace call** when logging is disabled +- Almost all overhead (>99%) comes from `runtime.Caller(1)` +- The log level check itself has negligible overhead + +#### Real-World Impact +Testing with a 100,000 line file showed: +- **With trace calls**: 240ms (416K lines/sec) +- **Without trace calls**: 19ms (5.36M lines/sec) +- **Performance degradation**: 1,188% slower (12x slower!) +- **Throughput loss**: 4.9 million lines/sec + +#### Extrapolated Impact on Large Files +- 1M lines: 2.2 seconds overhead +- 10M lines: 22 seconds overhead +- 100M lines: 3 minutes 41 seconds overhead + +### 3. Hot Path Analysis + +The most impactful trace calls are in `WriteLineData`: +1. Entry trace with line number and content length +2. Stats trace with lines/bytes written +3. Channel operation traces (3-4 calls) +4. Success/retry traces + +For a 1M line file, this results in 8M trace calls, adding ~4 seconds of overhead. + +## Root Cause + +Even when trace logging is disabled, each `dlog.Server.Trace` call: +1. Calls `runtime.Caller(1)` to get file/line info (expensive!) +2. Allocates memory for the args slice +3. Formats the file/line string +4. Only then checks if trace level is enabled + +The current implementation in `dlog.go`: +```go +func (d *DLog) Trace(args ...interface{}) string { + _, file, line, _ := runtime.Caller(1) // Always executed! + args = append(args, fmt.Sprintf("at %s:%d", file, line)) + return d.log(Trace, args) // Level check happens inside here +} +``` + +## Recommendations + +### 1. Immediate Fix - Add Early Level Check +```go +func (d *DLog) Trace(args ...interface{}) string { + if d.maxLevel < Trace { + return "" // Skip runtime.Caller when trace is disabled + } + _, file, line, _ := runtime.Caller(1) + args = append(args, fmt.Sprintf("at %s:%d", file, line)) + return d.log(Trace, args) +} +``` + +### 2. Build-Time Solution - Conditional Compilation +Use build tags to completely remove trace calls in production: +```go +// +build !debug + +func (d *DLog) Trace(args ...interface{}) string { + return "" // No-op in production builds +} +``` + +### 3. Long-Term Solution - Remove from Hot Paths +Consider removing trace calls from performance-critical paths like `WriteLineData` or using a sampling approach (e.g., trace every 1000th line). + +## Conclusion + +The trace logging overhead is severe enough to negate much of the performance benefit of turbo mode. With the simple fix of checking the log level before calling `runtime.Caller`, we can eliminate 99% of the overhead and restore turbo mode's full performance potential.
\ No newline at end of file |
