Diffstat (limited to 'doc')
| -rw-r--r-- | doc/pgo_implementation.md | 171 |
1 files changed, 171 insertions, 0 deletions
diff --git a/doc/pgo_implementation.md b/doc/pgo_implementation.md
new file mode 100644
index 0000000..edcfe40
--- /dev/null
+++ b/doc/pgo_implementation.md
@@ -0,0 +1,171 @@
+# Profile-Guided Optimization (PGO) Implementation for DTail
+
+## Overview
+
+This document describes the Profile-Guided Optimization (PGO) implementation for DTail tools. PGO is a compiler optimization technique that uses runtime profiling data to guide optimization decisions, resulting in better performance for real-world usage patterns.
+
+## Implementation Details
+
+### Architecture
+
+The PGO implementation is integrated into the dtail-tools command as a subcommand:
+
+```bash
+dtail-tools pgo [options] [commands...]
+```
+
+### Core Components
+
+1. **PGO Module** (`internal/tools/pgo/pgo.go`)
+   - Handles the complete PGO workflow
+   - Manages profile generation, merging, and PGO builds
+   - Provides performance comparison
+
+2. **Profiling Integration**
+   - All dtail commands now support the `-profile` flag
+   - dserver uses HTTP pprof endpoint for profiling
+   - Profiles are generated during realistic workloads
+
+3. **Makefile Integration**
+   - `make pgo` - Complete PGO workflow
+   - `make pgo-quick` - Quick PGO with smaller datasets
+   - `make pgo-generate` - Generate profiles only
+   - `make build-pgo` - Build with existing profiles
+   - `make install-pgo` - Install PGO-optimized binaries
+
+### Workflow
+
+1. **Build Baseline Binaries**: Standard Go builds without PGO
+2. **Generate Profiles**: Run workloads to collect CPU profiles
+3. **Merge Profiles**: Combine multiple profile iterations
+4. **Build with PGO**: Use profiles to guide optimization
+5. **Compare Performance**: Measure improvement
+
+### Profile Generation Details
+
+Each command has specific workloads designed to exercise common code paths:
+
+- **dcat**: Reading large log files
+- **dgrep**: Pattern matching with various regex patterns
+- **dmap**: MapReduce queries on CSV data
+- **dtail**: Following growing log files with filtering
+- **dserver**: Handling concurrent client connections
+
+### Special Handling
+
+1. **Empty Profiles**: I/O-bound operations may generate empty profiles. The implementation handles this gracefully by creating empty profile files that allow the workflow to continue.
+
+2. **dserver Profiling**: Uses HTTP pprof endpoint instead of command-line flags, allowing profile capture during server operation.
+
+3. **dtail Workload**: Simulates a growing log file with various log levels to exercise the tail functionality.
+
+## Performance Results
+
+Based on testing with PGO optimization:
+
+### Individual Command Improvements
+- **dcat**: 3.75-5.40% improvement
+- **dgrep**: Up to 19% improvement (varies by pattern hit rate)
+- **dmap**: Up to 39% improvement for specific queries
+
+### Overall Performance Progression
+From pre-turbo to turbo+PGO:
+- **dcat**: 14-21x faster overall
+- **dgrep**: 9-15x faster overall
+- **dmap**: 9-29% faster overall
+
+## Usage Examples
+
+### Generate PGO-Optimized Binaries
+```bash
+# Full PGO workflow
+make pgo
+
+# Quick PGO with smaller datasets
+make pgo-quick
+
+# Generate profiles only
+make pgo-generate
+
+# Build with existing profiles
+make build-pgo
+```
+
+### Using dtail-tools Directly
+```bash
+# Optimize all commands
+dtail-tools pgo
+
+# Optimize specific commands
+dtail-tools pgo dcat dgrep
+
+# Verbose mode with custom iterations
+dtail-tools pgo -v -iterations 5
+
+# Generate profiles only
+dtail-tools pgo -profileonly
+```
+
+### Custom PGO Options
+```bash
+# Custom data size
+dtail-tools pgo -datasize 5000000
+
+# Custom profile directory
+dtail-tools pgo -profiledir my-profiles
+
+# Custom output directory
+dtail-tools pgo -outdir my-pgo-build
+```
+
+## Technical Considerations
+
+1. **Profile Quality**: The quality of PGO optimization depends on how representative the profiling workload is of real-world usage.
+
+2. **Binary Size**: PGO-optimized binaries may be slightly larger due to function cloning and inlining decisions.
+
+3. **Build Time**: Building with PGO takes longer than standard builds due to profile processing.
+
+4. **Go Version**: PGO requires Go 1.20 or later.
+
+## Integration with CI/CD
+
+To integrate PGO into your build pipeline:
+
+1. Generate profiles periodically with production-like workloads
+2. Store profiles in version control or artifact repository
+3. Use `make build-pgo` in your build process
+4. Monitor performance metrics to validate improvements
+
+## Profile Files
+
+Profile files are stored in the `pgo-profiles/` directory:
+- `dcat.pprof` - DCat CPU profile
+- `dgrep.pprof` - DGrep CPU profile
+- `dmap.pprof` - DMap CPU profile
+- `dtail.pprof` - DTail CPU profile (may be empty for I/O-bound operations)
+- `dserver.pprof` - DServer CPU profile
+
+## Troubleshooting
+
+### Empty Profiles
+Some commands may generate empty profiles if they are I/O-bound. This is normal and the PGO workflow handles it gracefully.
+
+### Profile Merge Failures
+If profile merging fails, check that:
+- All profile files are valid
+- Go tools are properly installed
+- Sufficient disk space is available
+
+### Performance Not Improving
+If PGO doesn't show improvement:
+- Ensure profiles represent real workloads
+- Check that the profile has sufficient samples
+- Verify the correct profile is being used during build
+
+## Future Enhancements
+
+1. **Automated Profile Collection**: Collect profiles from production deployments
+2. **Profile Versioning**: Track profile versions with code changes
+3. **Multi-Architecture Support**: Generate architecture-specific profiles
+4. **Continuous Profiling**: Regular profile updates based on usage patterns
\ No newline at end of file
