summaryrefslogtreecommitdiff
path: root/benchmarks/trace_overhead_report.md
diff options
context:
space:
mode:
authorPaul Buetow <paul@buetow.org>2025-07-01 21:03:53 +0300
committerPaul Buetow <paul@buetow.org>2025-07-01 21:03:53 +0300
commit4e4034a129e3400ffccb23975a6f3a0cd3011309 (patch)
tree4f77f8b72c9fe93481c66053749a56aa78ee9651 /benchmarks/trace_overhead_report.md
parentde974de3d8f4ed38ff7fac6447fdcfec2c00cfc1 (diff)
perf: optimize turbo mode for 2.87x faster serverless performance
Major performance improvements in turbo mode: - Fixed trace logging overhead by adding early level checks before expensive runtime.Caller() operations - Improved buffering strategy by removing forced immediate flush in serverless mode - Turbo mode now 2.87x faster (was 3-5x slower before optimization) Changes: - internal/io/dlog/dlog.go: Added early return in Trace() and Devel() when logging disabled - internal/server/handlers/turbo_writer.go: Removed serverless immediate flush condition Performance results: - Before: Turbo mode was 3-5x slower than non-turbo mode - After: Turbo mode is 2.87x faster (65% improvement) - All integration tests pass Added comprehensive benchmarking tools in benchmarks/ directory 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Diffstat (limited to 'benchmarks/trace_overhead_report.md')
-rw-r--r--benchmarks/trace_overhead_report.md89
1 files changed, 89 insertions, 0 deletions
diff --git a/benchmarks/trace_overhead_report.md b/benchmarks/trace_overhead_report.md
new file mode 100644
index 0000000..6aa5d83
--- /dev/null
+++ b/benchmarks/trace_overhead_report.md
@@ -0,0 +1,89 @@
+# Trace Logging Performance Impact in Turbo Mode
+
+## Summary
+
+The `dlog.Server.Trace` calls are causing significant performance overhead in turbo mode, even when trace logging is disabled (default log level is "info"). The overhead comes primarily from `runtime.Caller(1)` which is called on every trace invocation to get file/line information.
+
+## Key Findings
+
+### 1. Trace Call Distribution
+- **42 total trace calls** across the codebase
+- **16 calls in turbo_writer.go** - the hot path for turbo mode
+- **8 trace calls in WriteLineData** - called for every line processed
+
+### 2. Performance Measurements
+
+#### Per-Call Overhead
+- **497 nanoseconds per trace call** when logging is disabled
+- Almost all overhead (>99%) comes from `runtime.Caller(1)`
+- The log level check itself has negligible overhead
+
+#### Real-World Impact
+Testing with a 100,000 line file showed:
+- **With trace calls**: 240ms (416K lines/sec)
+- **Without trace calls**: 19ms (5.36M lines/sec)
+- **Performance degradation**: 1,188% slower (12x slower!)
+- **Throughput loss**: 4.9 million lines/sec
+
+#### Extrapolated Impact on Large Files
+- 1M lines: 2.2 seconds overhead
+- 10M lines: 22 seconds overhead
+- 100M lines: 3 minutes 41 seconds overhead
+
+### 3. Hot Path Analysis
+
+The most impactful trace calls are in `WriteLineData`:
+1. Entry trace with line number and content length
+2. Stats trace with lines/bytes written
+3. Channel operation traces (3-4 calls)
+4. Success/retry traces
+
+For a 1M line file, this results in 8M trace calls, adding ~4 seconds of overhead.
+
+## Root Cause
+
+Even when trace logging is disabled, each `dlog.Server.Trace` call:
+1. Calls `runtime.Caller(1)` to get file/line info (expensive!)
+2. Allocates memory for the args slice
+3. Formats the file/line string
+4. Only then checks if trace level is enabled
+
+The current implementation in `dlog.go`:
+```go
+func (d *DLog) Trace(args ...interface{}) string {
+ _, file, line, _ := runtime.Caller(1) // Always executed!
+ args = append(args, fmt.Sprintf("at %s:%d", file, line))
+ return d.log(Trace, args) // Level check happens inside here
+}
+```
+
+## Recommendations
+
+### 1. Immediate Fix - Add Early Level Check
+```go
+func (d *DLog) Trace(args ...interface{}) string {
+ if d.maxLevel < Trace {
+ return "" // Skip runtime.Caller when trace is disabled
+ }
+ _, file, line, _ := runtime.Caller(1)
+ args = append(args, fmt.Sprintf("at %s:%d", file, line))
+ return d.log(Trace, args)
+}
+```
+
+### 2. Build-Time Solution - Conditional Compilation
+Use build tags to completely remove trace calls in production:
+```go
+// +build !debug
+
+func (d *DLog) Trace(args ...interface{}) string {
+ return "" // No-op in production builds
+}
+```
+
+### 3. Long-Term Solution - Remove from Hot Paths
+Consider removing trace calls from performance-critical paths like `WriteLineData` or using a sampling approach (e.g., trace every 1000th line).
+
+## Conclusion
+
+The trace logging overhead is severe enough to negate much of the performance benefit of turbo mode. With the simple fix of checking the log level before calling `runtime.Caller`, we can eliminate 99% of the overhead and restore turbo mode's full performance potential. \ No newline at end of file