# Profile-Guided Optimization (PGO) Implementation for DTail

## Overview

This document describes the Profile-Guided Optimization (PGO) implementation for DTail tools. PGO is a compiler optimization technique that uses runtime profiling data to guide optimization decisions, resulting in better performance for real-world usage patterns.

## Implementation Details

### Architecture

The PGO implementation is integrated into the `dtail-tools` command as a subcommand:

```bash
dtail-tools pgo [options] [commands...]
```

### Core Components

1. **PGO Module** (`internal/tools/pgo/pgo.go`)
   - Handles the complete PGO workflow
   - Manages profile generation, merging, and PGO builds
   - Provides performance comparison

2. **Profiling Integration**
   - All dtail commands support the `-profile` flag
   - dserver exposes an HTTP pprof endpoint for profiling
   - Profiles are generated while realistic workloads run

3. **Makefile Integration**
   - `make pgo` - Complete PGO workflow
   - `make pgo-quick` - Quick PGO with smaller datasets
   - `make pgo-generate` - Generate profiles only
   - `make build-pgo` - Build with existing profiles
   - `make install-pgo` - Install PGO-optimized binaries
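
To illustrate how a `-profile` flag can be backed by Go's CPU profiler, the following is a minimal sketch built on the standard `runtime/pprof` package. The helper name `startCPUProfile`, the output file name, and the busy loop standing in for a workload are all hypothetical; DTail's actual wiring may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/pprof"
)

// startCPUProfile (hypothetical helper) begins CPU profiling into path and
// returns a function that stops the profiler and closes the file.
func startCPUProfile(path string) (func() error, error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close()
		return nil, err
	}
	stop := func() error {
		pprof.StopCPUProfile() // flushes buffered samples to f
		return f.Close()
	}
	return stop, nil
}

func main() {
	// Assumed output location; a real tool would take this from its flags.
	path := filepath.Join(os.TempDir(), "example-cpu.pprof")
	stop, err := startCPUProfile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "profiling disabled:", err)
		return
	}

	// Stand-in workload so the sampler has something to observe.
	sum := 0
	for i := 0; i < 50_000_000; i++ {
		sum += i % 7
	}

	if err := stop(); err != nil {
		fmt.Fprintln(os.Stderr, "closing profile:", err)
		return
	}
	fmt.Println("checksum:", sum, "profile written to", path)
}
```

Returning a stop function keeps start and stop paired at the call site, which makes it harder to forget to flush the profile before the process exits.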

### Workflow

1. **Build Baseline Binaries**: Standard Go builds without PGO
2. **Generate Profiles**: Run workloads to collect CPU profiles
3. **Merge Profiles**: Combine multiple profile iterations
4. **Build with PGO**: Use profiles to guide optimization
5. **Compare Performance**: Measure improvement

### Profile Generation Details

Each command has specific workloads designed to exercise common code paths:

- **dcat**: Reading large log files
- **dgrep**: Pattern matching with various regex patterns
- **dmap**: MapReduce queries on CSV data
- **dtail**: Following growing log files with filtering
- **dserver**: Handling concurrent client connections

### Special Handling

1. **Empty Profiles**: The CPU profiler samples on-CPU time, so I/O-bound operations may record few or no samples and produce empty profiles. The implementation handles this gracefully by creating empty profile files that allow the workflow to continue.

2. **dserver Profiling**: Uses the HTTP pprof endpoint instead of command-line flags, allowing profiles to be captured while the server is running.

3. **dtail Workload**: Simulates a growing log file with various log levels to exercise the tail functionality.
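
For the dserver case, a capture over HTTP could look roughly like the sketch below. It assumes the server registers the standard `net/http/pprof` handlers, whose CPU profile endpoint is `/debug/pprof/profile?seconds=N`. The address `localhost:6060`, the 30-second window, and the helper names are placeholders, not values from the DTail source.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
)

// cpuProfileURL builds the URL of the standard net/http/pprof CPU profile
// endpoint for a server listening on addr (host:port).
func cpuProfileURL(addr string, seconds int) string {
	q := url.Values{}
	q.Set("seconds", strconv.Itoa(seconds))
	u := url.URL{
		Scheme:   "http",
		Host:     addr,
		Path:     "/debug/pprof/profile",
		RawQuery: q.Encode(),
	}
	return u.String()
}

// fetchProfile downloads a profile and saves it to dest, returning the
// number of bytes written. The request blocks for the whole sampling
// window, so the workload must run against the server in the meantime.
func fetchProfile(profileURL, dest string) (int64, error) {
	resp, err := http.Get(profileURL)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s", resp.Status)
	}
	f, err := os.Create(dest)
	if err != nil {
		return 0, err
	}
	n, copyErr := io.Copy(f, resp.Body)
	if closeErr := f.Close(); copyErr == nil {
		copyErr = closeErr
	}
	return n, copyErr
}

func main() {
	// Placeholder address; only the URL is printed here so the example
	// runs without a live server.
	fmt.Println("would fetch:", cpuProfileURL("localhost:6060", 30))
	_ = fetchProfile // call this once a dserver instance is running
}
```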

## Performance Results

Based on testing with PGO optimization:

### Individual Command Improvements
- **dcat**: 3.75-5.40% improvement
- **dgrep**: Up to 19% improvement (varies by pattern hit rate)
- **dmap**: Up to 39% improvement for specific queries

### Overall Performance Progression
From pre-turbo to turbo+PGO:
- **dcat**: 14-21x faster overall
- **dgrep**: 9-15x faster overall
- **dmap**: 9-29% faster overall
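
The figures above mix two ways of expressing a gain: a percentage improvement and a speedup factor. The document does not state how they were computed, so the sketch below assumes the common definitions based on wall-clock time: improvement is the reduction relative to the baseline, and speedup is baseline divided by optimized. The timings in `main` are made-up inputs, not measurements.

```go
package main

import "fmt"

// improvementPercent returns the reduction in run time as a percentage of
// the baseline run time (both values in the same unit, e.g. seconds).
func improvementPercent(baseline, optimized float64) float64 {
	return (baseline - optimized) / baseline * 100
}

// speedupFactor returns how many times faster the optimized run is.
func speedupFactor(baseline, optimized float64) float64 {
	return baseline / optimized
}

func main() {
	// Illustrative timings only.
	baseline, optimized := 2.00, 1.90
	fmt.Printf("improvement: %.2f%%\n", improvementPercent(baseline, optimized))
	fmt.Printf("speedup: %.2fx\n", speedupFactor(baseline, optimized))
}
```

Under these definitions a 50% improvement corresponds to a 2x speedup, which is worth keeping in mind when comparing the percentage figures with the "x faster" figures.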

## Usage Examples

### Generate PGO-Optimized Binaries
```bash
# Full PGO workflow
make pgo

# Quick PGO with smaller datasets
make pgo-quick

# Generate profiles only
make pgo-generate

# Build with existing profiles
make build-pgo
```

### Using dtail-tools Directly
```bash
# Optimize all commands
dtail-tools pgo

# Optimize specific commands
dtail-tools pgo dcat dgrep

# Verbose mode with custom iterations
dtail-tools pgo -v -iterations 5

# Generate profiles only
dtail-tools pgo -profileonly
```

### Custom PGO Options
```bash
# Custom data size
dtail-tools pgo -datasize 5000000

# Custom profile directory
dtail-tools pgo -profiledir my-profiles

# Custom output directory
dtail-tools pgo -outdir my-pgo-build
```

## Technical Considerations

1. **Profile Quality**: The quality of PGO optimization depends on how representative the profiling workload is of real-world usage.

2. **Binary Size**: PGO-optimized binaries may be slightly larger, mainly because hot functions are inlined more aggressively.

3. **Build Time**: Building with PGO takes longer than standard builds due to profile processing.

4. **Go Version**: PGO requires Go 1.20 or later. It was introduced as a preview in Go 1.20 and became generally available in Go 1.21.
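
A build helper could check the version requirement up front. The following is a sketch only: `supportsPGO` is a hypothetical function that parses version strings of the form returned by `runtime.Version()` (for example `go1.21.5`). It conservatively returns false for anything it cannot parse, such as development builds.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// supportsPGO reports whether a Go version string such as "go1.21.5" or
// "go1.21rc1" denotes Go 1.20 or later. Unparseable strings yield false.
func supportsPGO(version string) bool {
	if !strings.HasPrefix(version, "go") {
		return false
	}
	parts := strings.SplitN(strings.TrimPrefix(version, "go"), ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	// Keep only the leading digits of the minor part ("21rc1" -> "21").
	minorDigits := parts[1]
	for i, r := range minorDigits {
		if r < '0' || r > '9' {
			minorDigits = minorDigits[:i]
			break
		}
	}
	minor, err := strconv.Atoi(minorDigits)
	if err != nil {
		return false
	}
	return major > 1 || (major == 1 && minor >= 20)
}

func main() {
	// runtime.Version reports the toolchain that built this binary, which
	// may differ from the `go` command found on PATH.
	v := runtime.Version()
	fmt.Println(v, "supports PGO:", supportsPGO(v))
}
```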

## Integration with CI/CD

To integrate PGO into your build pipeline:

1. Generate profiles periodically with production-like workloads
2. Store profiles in version control or an artifact repository
3. Use `make build-pgo` in your build process
4. Monitor performance metrics to validate improvements

## Profile Files

Profile files are stored in the `pgo-profiles/` directory:
- `dcat.pprof` - DCat CPU profile
- `dgrep.pprof` - DGrep CPU profile
- `dmap.pprof` - DMap CPU profile
- `dtail.pprof` - DTail CPU profile (may be empty for I/O-bound operations)
- `dserver.pprof` - DServer CPU profile

## Troubleshooting

### Empty Profiles
Some commands may generate empty profiles if they are I/O-bound, because they spend little time on the CPU for the profiler to sample. This is normal, and the PGO workflow handles it gracefully.
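
One possible build-time guard is to fall back to a non-PGO build when a profile file is missing or zero bytes long. This is not necessarily how DTail handles the case. The sketch relies on the documented `-pgo=off` value of `go build`; the helper names and the `pgo-profiles/` paths in `main` are illustrative.

```go
package main

import (
	"fmt"
	"os"
)

// pgoFlagForSize picks the -pgo argument for a profile of the given size
// in bytes: PGO is switched off for an empty profile.
func pgoFlagForSize(profilePath string, size int64) string {
	if size <= 0 {
		return "-pgo=off"
	}
	return "-pgo=" + profilePath
}

// pgoFlag stats the profile on disk and treats a missing or unreadable
// file the same way as an empty one.
func pgoFlag(profilePath string) string {
	info, err := os.Stat(profilePath)
	if err != nil {
		return "-pgo=off"
	}
	return pgoFlagForSize(profilePath, info.Size())
}

func main() {
	for _, name := range []string{"dcat", "dtail"} {
		profile := "pgo-profiles/" + name + ".pprof"
		fmt.Println(name, "->", pgoFlag(profile))
	}
}
```

A size check only catches zero-byte files. A profile can be non-empty on disk and still contain no samples, so inspecting it with `go tool pprof` remains the more reliable check.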

### Profile Merge Failures
If profile merging fails, check that:
- All profile files are valid
- Go tools are properly installed
- Sufficient disk space is available

### Performance Not Improving
If PGO doesn't show improvement:
- Ensure profiles represent real workloads
- Check that the profile has sufficient samples
- Verify the correct profile is being used during build

## Future Enhancements

1. **Automated Profile Collection**: Collect profiles from production deployments
2. **Profile Versioning**: Track profile versions with code changes
3. **Multi-Architecture Support**: Generate architecture-specific profiles
4. **Continuous Profiling**: Regular profile updates based on usage patterns