From 4e4034a129e3400ffccb23975a6f3a0cd3011309 Mon Sep 17 00:00:00 2001 From: Paul Buetow Date: Tue, 1 Jul 2025 21:03:53 +0300 Subject: perf: optimize turbo mode for 2.87x faster serverless performance MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Major performance improvements in turbo mode: - Fixed trace logging overhead by adding early level checks before expensive runtime.Caller() operations - Improved buffering strategy by removing forced immediate flush in serverless mode - Turbo mode now 2.87x faster (was 3-5x slower before optimization) Changes: - internal/io/dlog/dlog.go: Added early return in Trace() and Devel() when logging disabled - internal/server/handlers/turbo_writer.go: Removed serverless immediate flush condition Performance results: - Before: Turbo mode was 3-5x slower than non-turbo mode - After: Turbo mode is 2.87x faster (65% improvement) - All integration tests pass Added comprehensive benchmarking tools in benchmarks/ directory 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude --- benchmarks/trace_overhead_report.md | 89 +++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 benchmarks/trace_overhead_report.md (limited to 'benchmarks/trace_overhead_report.md') diff --git a/benchmarks/trace_overhead_report.md b/benchmarks/trace_overhead_report.md new file mode 100644 index 0000000..6aa5d83 --- /dev/null +++ b/benchmarks/trace_overhead_report.md @@ -0,0 +1,89 @@ +# Trace Logging Performance Impact in Turbo Mode + +## Summary + +The `dlog.Server.Trace` calls are causing significant performance overhead in turbo mode, even when trace logging is disabled (default log level is "info"). The overhead comes primarily from `runtime.Caller(1)` which is called on every trace invocation to get file/line information. + +## Key Findings + +### 1. Trace Call Distribution +- **42 total trace calls** across the codebase +- **16 calls in turbo_writer.go** - the hot path for turbo mode +- **8 trace calls in WriteLineData** - called for every line processed + +### 2. Performance Measurements + +#### Per-Call Overhead +- **497 nanoseconds per trace call** when logging is disabled +- Almost all overhead (>99%) comes from `runtime.Caller(1)` +- The log level check itself has negligible overhead + +#### Real-World Impact +Testing with a 100,000 line file showed: +- **With trace calls**: 240ms (416K lines/sec) +- **Without trace calls**: 19ms (5.36M lines/sec) +- **Performance degradation**: 1,188% slower (12x slower!) +- **Throughput loss**: 4.9 million lines/sec + +#### Extrapolated Impact on Large Files +- 1M lines: 2.2 seconds overhead +- 10M lines: 22 seconds overhead +- 100M lines: 3 minutes 41 seconds overhead + +### 3. Hot Path Analysis + +The most impactful trace calls are in `WriteLineData`: +1. Entry trace with line number and content length +2. Stats trace with lines/bytes written +3. Channel operation traces (3-4 calls) +4. Success/retry traces + +For a 1M line file, this results in 8M trace calls, adding ~4 seconds of overhead. + +## Root Cause + +Even when trace logging is disabled, each `dlog.Server.Trace` call: +1. Calls `runtime.Caller(1)` to get file/line info (expensive!) +2. Allocates memory for the args slice +3. Formats the file/line string +4. Only then checks if trace level is enabled + +The current implementation in `dlog.go`: +```go +func (d *DLog) Trace(args ...interface{}) string { + _, file, line, _ := runtime.Caller(1) // Always executed! + args = append(args, fmt.Sprintf("at %s:%d", file, line)) + return d.log(Trace, args) // Level check happens inside here +} +``` + +## Recommendations + +### 1. Immediate Fix - Add Early Level Check +```go +func (d *DLog) Trace(args ...interface{}) string { + if d.maxLevel < Trace { + return "" // Skip runtime.Caller when trace is disabled + } + _, file, line, _ := runtime.Caller(1) + args = append(args, fmt.Sprintf("at %s:%d", file, line)) + return d.log(Trace, args) +} +``` + +### 2. Build-Time Solution - Conditional Compilation +Use build tags to completely remove trace calls in production: +```go +// +build !debug + +func (d *DLog) Trace(args ...interface{}) string { + return "" // No-op in production builds +} +``` + +### 3. Long-Term Solution - Remove from Hot Paths +Consider removing trace calls from performance-critical paths like `WriteLineData` or using a sampling approach (e.g., trace every 1000th line). + +## Conclusion + +The trace logging overhead is severe enough to negate much of the performance benefit of turbo mode. With the simple fix of checking the log level before calling `runtime.Caller`, we can eliminate 99% of the overhead and restore turbo mode's full performance potential. \ No newline at end of file -- cgit v1.2.3