summaryrefslogtreecommitdiff
path: root/benchmarks/turbo_final_report.md
blob: a9009338b347dd3525d078529edc693c1a01778f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
# DTail Turbo Mode vs Non-Turbo Mode Benchmark Analysis

## Executive Summary

Based on our benchmark analysis and code investigation, we've discovered important insights about DTail's turbo mode performance:

### Key Findings

1. **Turbo Mode is Slower in Serverless Mode**: Our benchmarks show turbo mode actually degrades performance when running in serverless (direct output) mode
   - Non-Turbo DCat: ~210ms for 10MB file  
   - Turbo DCat: ~678ms for 10MB file (3.2x slower)

2. **Design Intent**: Turbo mode was designed specifically for server mode network transmission optimization, not for local/serverless operation

3. **Implementation Issues in Serverless Mode**:
   - Immediate per-line writes without buffering
   - Unnecessary protocol formatting (REMOTE|hostname|...) for local output
   - Per-line color processing overhead
   - Bypasses efficient channel-based batching

## Benchmark Results

### Quick Benchmark (10MB files, serverless mode)

| Operation | Non-Turbo (ns/op) | Turbo (ns/op) | Performance Impact |
|-----------|-------------------|---------------|-------------------|
| DCat | 210,587,327 | 678,704,048 | -222% (slower) |
| DGrep (1% hits) | 96,118,628 | 570,982,203 | -494% (slower) |
| DGrep (10% hits) | 93,247,339 | 599,087,604 | -542% (slower) |

### Throughput Comparison

| Operation | Non-Turbo | Turbo | 
|-----------|-----------|--------|
| DCat | 334,946 lines/sec | 103,468 lines/sec |
| DGrep | 733,250 lines/sec | 123,046 lines/sec |

## Root Cause Analysis

From examining the code in `turbo_writer.go` and `readcommand.go`:

1. **Serverless Mode Issues**:
   ```go
   // DirectTurboWriter forces flush after each line
   tw.buf.Flush() // Line 125 - defeats buffering purpose
   ```

2. **Protocol Overhead**:
   - Adds `REMOTE|hostname|100|lineNum|sourceID|` prefix even for local output
   - This formatting is meant for network protocol, not stdout

3. **No Batching Benefits**:
   - Traditional channel approach can batch multiple lines
   - Turbo mode writes each line immediately

## When to Use Turbo Mode

### ✅ **USE** Turbo Mode When:
- Running in server mode (`--server` flag)
- Processing large files over network
- Network latency is a bottleneck
- NOT using MapReduce operations

### ❌ **AVOID** Turbo Mode When:
- Running in serverless/local mode
- Using MapReduce/aggregate operations
- Processing small files
- Output goes directly to stdout

## Recommendations

1. **For Best Performance**:
   - Serverless mode: Keep `DTAIL_TURBOBOOST_ENABLE` unset
   - Server mode with large files: Set `DTAIL_TURBOBOOST_ENABLE=yes`

2. **Code Improvements Needed**:
   - Disable turbo mode automatically in serverless mode
   - Remove protocol formatting for local output
   - Implement proper buffering for serverless turbo mode

3. **Configuration**:
   ```bash
   # For server mode with large files
   export DTAIL_TURBOBOOST_ENABLE=yes
   dcat --server myserver:2222 /var/log/large.log
   
   # For serverless mode (better without turbo)
   unset DTAIL_TURBOBOOST_ENABLE
   dcat /var/log/large.log
   ```

## Conclusion

Turbo mode is a specialized optimization for server mode network transmission that actually harms performance in serverless mode. Users should only enable it when using DTail in server mode with network communication, particularly for large file transfers where network latency is the primary bottleneck.

The significant performance degradation in serverless mode (3-5x slower) indicates that the turbo mode implementation needs refinement to either:
1. Automatically detect and disable itself in serverless mode
2. Implement different optimization strategies for serverless vs server modes

Until these improvements are made, users should be cautious about enabling turbo mode globally and instead enable it selectively for server mode operations only.