doc/turboboost_optimization.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118

# DTail Turbo Boost Optimization

## Overview

This document describes the turbo boost optimization feature that provides significant performance improvements for DTail operations by using channel-less processing and optimized I/O.

## Problem Statement

The original dgrep implementation used multiple channels in a pipeline:
- `rawLines chan *bytes.Buffer` (buffer: 100) - Raw lines read from file
- `lines chan *line.Line` (buffer: 100) - Filtered lines to send to client

This created several performance issues:
1. Fixed channel buffer sizes causing blocking under high throughput
2. Context switching overhead between goroutines
3. Channel synchronization overhead
4. Memory allocations for channel operations

## Solution

The channel-less implementation replaces the channel pipeline with direct function calls using a `LineProcessor` interface.

### Key Components

1. **LineProcessor Interface** (`internal/io/line/processor.go`)
   - Defines methods for processing lines without channels
   - `ProcessLine()` - Handle a single line
   - `Flush()` - Ensure buffered data is written
   - `Close()` - Clean up resources

2. **GrepLineProcessor** (`internal/server/handlers/lineprocessor.go`)
   - Implements LineProcessor for grep operations
   - Writes directly to the network connection
   - Uses internal buffering for efficiency (64KB buffer)
   - Thread-safe with mutex protection

3. **Modified File Reading** (`internal/io/fs/readfile_processor.go`)
   - `StartWithProcessor()` - Channel-less file reading
   - Direct callbacks instead of channel sends
   - Inline regex filtering without goroutines

4. **Optimized File Reading** (`internal/io/fs/readfile_processor_optimized.go`)
   - Uses buffered line reading instead of byte-by-byte
   - Custom scanner with 256KB buffer
   - Efficient handling of long lines
   - Special optimization for tail mode

### Feature Flags

The implementation can be controlled via environment variables:
- `DTAIL_CHANNELLESS_GREP=yes` - Enable channel-less grep implementation
- `DTAIL_OPTIMIZED_READER=yes` - Use optimized buffered reader

### Benefits

1. **Reduced Latency**: No channel queuing delays
2. **Lower Memory Usage**: No channel buffers
3. **Better CPU Efficiency**: Fewer context switches
4. **Simpler Code Flow**: Direct processing without goroutine coordination
5. **Predictable Performance**: No channel blocking

### Backward Compatibility

- Original channel-based implementation remains available
- Same command-line interface
- Protocol compatibility maintained
- All integration tests pass unchanged

### Performance Testing

Use the provided script to compare performance:

```bash
./test_channelless_performance.sh
```

This will test:
1. Original channel-based implementation
2. Channel-less implementation
3. Optimized channel-less implementation

### Usage

To use the channel-less implementation:

```bash
# Enable channel-less grep
export DTAIL_CHANNELLESS_GREP=yes

# Also enable optimized reader
export DTAIL_OPTIMIZED_READER=yes

# Run dgrep normally
dgrep -regex "pattern" file.log
```

### Future Improvements

1. Extend channel-less approach to other commands (dcat, dtail)
2. Add configurable buffer sizes
3. Implement zero-copy optimizations
4. Add performance metrics collection
5. Consider using io_uring on Linux for async I/O

## Usage

To enable turbo boost optimizations:

```bash
export DTAIL_TURBOBOOST_ENABLE=yes
```

This enables:
- Channel-less implementation for grep and cat operations
- Optimized buffered I/O reader (256KB buffer)
- Buffer pooling to reduce memory allocations

The turbo boost mode is designed to be extended to other DTail commands in the future.