Diffstat (limited to 'doc/pgo_implementation.md')
-rw-r--r-- doc/pgo_implementation.md 171
1 files changed, 171 insertions, 0 deletions
diff --git a/doc/pgo_implementation.md b/doc/pgo_implementation.md
new file mode 100644
index 0000000..edcfe40
--- /dev/null
+++ b/doc/pgo_implementation.md
@@ -0,0 +1,171 @@
+# Profile-Guided Optimization (PGO) Implementation for DTail
+
+## Overview
+
+This document describes the Profile-Guided Optimization (PGO) implementation for DTail tools. PGO is a compiler optimization technique that uses runtime profiling data to guide optimization decisions, resulting in better performance for real-world usage patterns.
+
+## Implementation Details
+
+### Architecture
+
+The PGO implementation is integrated into the dtail-tools command as a subcommand:
+
+```bash
+dtail-tools pgo [options] [commands...]
+```
+
+### Core Components
+
+1. **PGO Module** (`internal/tools/pgo/pgo.go`)
+ - Handles the complete PGO workflow
+ - Manages profile generation, merging, and PGO builds
+ - Provides performance comparison
+
+2. **Profiling Integration**
+ - All dtail commands now support the `-profile` flag
+ - dserver uses HTTP pprof endpoint for profiling
+ - Profiles are generated during realistic workloads
+
+3. **Makefile Integration**
+ - `make pgo` - Complete PGO workflow
+ - `make pgo-quick` - Quick PGO with smaller datasets
+ - `make pgo-generate` - Generate profiles only
+ - `make build-pgo` - Build with existing profiles
+ - `make install-pgo` - Install PGO-optimized binaries
+
+### Workflow
+
+1. **Build Baseline Binaries**: Standard Go builds without PGO
+2. **Generate Profiles**: Run workloads to collect CPU profiles
+3. **Merge Profiles**: Combine multiple profile iterations
+4. **Build with PGO**: Use profiles to guide optimization
+5. **Compare Performance**: Measure improvement
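+
+Steps 3 and 4 map onto standard Go toolchain commands: `go tool pprof -proto` merges profiles and `go build -pgo=<file>` consumes one. The sketch below is an assumption about how these steps could be run by hand, not a description of what `dtail-tools pgo` does internally. The `./cmd/<name>` package path, the `OUT_DIR` default, and the `DRY_RUN` switch are hypothetical.
+
+```bash
+# Hypothetical output location; adjust to the repository layout.
+OUT_DIR="${OUT_DIR:-pgo-build}"
+
+# Step 3: merge several CPU profiles into one file.
+# With DRY_RUN=1 the command is only printed, not executed.
+merge_profiles() {
+  local out="$1"
+  shift
+  echo "go tool pprof -proto $* > $out"
+  if [ "${DRY_RUN:-0}" != "1" ]; then
+    go tool pprof -proto "$@" > "$out"
+  fi
+}
+
+# Step 4: build one command with the merged profile.
+# The ./cmd/<name> package path is an assumption.
+build_with_pgo() {
+  local name="$1" profile="$2"
+  echo "go build -pgo=$profile -o $OUT_DIR/$name ./cmd/$name"
+  if [ "${DRY_RUN:-0}" != "1" ]; then
+    go build -pgo="$profile" -o "$OUT_DIR/$name" "./cmd/$name"
+  fi
+}
+```
+
+Setting `DRY_RUN=1` before calling `build_with_pgo dcat pgo-profiles/dcat.pprof` prints the build command without running it, which is a cheap way to check paths before a long build.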
+
+### Profile Generation Details
+
+Each command has specific workloads designed to exercise common code paths:
+
+- **dcat**: Reading large log files
+- **dgrep**: Pattern matching with various regex patterns
+- **dmap**: MapReduce queries on CSV data
+- **dtail**: Following growing log files with filtering
+- **dserver**: Handling concurrent client connections
+
+### Special Handling
+
+1. **Empty Profiles**: I/O-bound operations spend little time on the CPU, so their CPU profiles may contain no samples. In that case the implementation writes an empty profile file so that the rest of the workflow can continue.
+
+2. **dserver Profiling**: Uses HTTP pprof endpoint instead of command-line flags, allowing profile capture during server operation.
+
+3. **dtail Workload**: Simulates a growing log file with various log levels to exercise the tail functionality.
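+
+For the dserver case, Go's standard `net/http/pprof` handlers serve a CPU profile at `/debug/pprof/profile?seconds=N`. The sketch below assumes dserver exposes those standard handlers; the address and the helper names are placeholders, not part of DTail.
+
+```bash
+# Build the CPU-profile URL for a server exposing Go's standard
+# net/http/pprof handlers. The address is a placeholder.
+pprof_cpu_url() {
+  local addr="$1" seconds="${2:-30}"
+  printf 'http://%s/debug/pprof/profile?seconds=%s\n' "$addr" "$seconds"
+}
+
+# Download a CPU profile while a workload is running against the server.
+capture_server_profile() {
+  local addr="$1" seconds="$2" out="$3"
+  curl --silent --show-error --fail --output "$out" \
+    "$(pprof_cpu_url "$addr" "$seconds")"
+}
+```
+
+The request blocks for the requested number of seconds, so the client workload has to be running during that window for the profile to contain useful samples.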
+
+## Performance Results
+
+Results from testing PGO-optimized builds:
+
+### Individual Command Improvements
+- **dcat**: 3.75-5.40% improvement
+- **dgrep**: Up to 19% improvement (varies by pattern hit rate)
+- **dmap**: Up to 39% improvement for specific queries
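+
+This document does not state how the percentages were computed. One common convention, assumed here, is the relative reduction in wall-clock time between a baseline run and a PGO run. The helper name is hypothetical.
+
+```bash
+# Percent improvement of a PGO run over a baseline run.
+# Both arguments are durations in the same unit (for example seconds).
+improvement_pct() {
+  awk -v base="$1" -v pgo="$2" 'BEGIN {
+    if (base <= 0) {
+      print "baseline must be positive" > "/dev/stderr"
+      exit 1
+    }
+    printf "%.2f\n", (base - pgo) / base * 100
+  }'
+}
+```
+
+For example, `improvement_pct 4.00 3.00` prints `25.00`.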
+
+### Overall Performance Progression
+From pre-turbo to turbo+PGO:
+- **dcat**: 14-21x faster overall
+- **dgrep**: 9-15x faster overall
+- **dmap**: 9-29% faster overall
+
+## Usage Examples
+
+### Generate PGO-Optimized Binaries
+```bash
+# Full PGO workflow
+make pgo
+
+# Quick PGO with smaller datasets
+make pgo-quick
+
+# Generate profiles only
+make pgo-generate
+
+# Build with existing profiles
+make build-pgo
+```
+
+### Using dtail-tools Directly
+```bash
+# Optimize all commands
+dtail-tools pgo
+
+# Optimize specific commands
+dtail-tools pgo dcat dgrep
+
+# Verbose mode with custom iterations
+dtail-tools pgo -v -iterations 5
+
+# Generate profiles only
+dtail-tools pgo -profileonly
+```
+
+### Custom PGO Options
+```bash
+# Custom data size
+dtail-tools pgo -datasize 5000000
+
+# Custom profile directory
+dtail-tools pgo -profiledir my-profiles
+
+# Custom output directory
+dtail-tools pgo -outdir my-pgo-build
+```
+
+## Technical Considerations
+
+1. **Profile Quality**: The quality of PGO optimization depends on how representative the profiling workload is of real-world usage.
+
+2. **Binary Size**: PGO-optimized binaries may be slightly larger because the compiler inlines hot functions more aggressively.
+
+3. **Build Time**: Building with PGO takes longer than standard builds due to profile processing.
+
+4. **Go Version**: PGO requires Go 1.20 or later. It was a preview feature in Go 1.20 and has been generally available since Go 1.21.
+
+## Integration with CI/CD
+
+To integrate PGO into your build pipeline:
+
+1. Generate profiles periodically with production-like workloads
+2. Store profiles in version control or artifact repository
+3. Use `make build-pgo` in your build process
+4. Monitor performance metrics to validate improvements
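+
+A pipeline step following these points might fall back to a regular build when no usable profile is present. The sketch below assumes profiles live in `pgo-profiles/` as described in this document; the `build` fallback target and the helper name are hypothetical.
+
+```bash
+# Pick the make target for a CI build: use PGO only when at least one
+# non-empty profile is available. "build" is an assumed fallback target.
+ci_build_target() {
+  local dir="${1:-pgo-profiles}" f
+  for f in "$dir"/*.pprof; do
+    if [ -s "$f" ]; then
+      echo build-pgo
+      return 0
+    fi
+  done
+  echo build
+}
+
+# In a pipeline step:
+#   make "$(ci_build_target)"
+```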
+
+## Profile Files
+
+Profile files are stored in the `pgo-profiles/` directory:
+- `dcat.pprof` - DCat CPU profile
+- `dgrep.pprof` - DGrep CPU profile
+- `dmap.pprof` - DMap CPU profile
+- `dtail.pprof` - DTail CPU profile (may be empty for I/O-bound operations)
+- `dserver.pprof` - DServer CPU profile
+
+## Troubleshooting
+
+### Empty Profiles
+Some commands may generate empty profiles if they are I/O-bound. This is expected: the PGO workflow writes an empty profile file and continues.
+
+### Profile Merge Failures
+If profile merging fails, check that:
+- All profile files are valid
+- Go tools are properly installed
+- Sufficient disk space is available
+
+### Performance Not Improving
+If PGO doesn't show improvement:
+- Ensure profiles represent real workloads
+- Check that the profile has sufficient samples
+- Verify the correct profile is being used during build
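+
+The last two checks can be done with the Go toolchain. The sketch below assumes the toolchain records the `-pgo` setting in the binary's build information, which recent Go releases do; the binary and profile paths are placeholders and the helper name is hypothetical.
+
+```bash
+# Extract the -pgo build setting from `go version -m` output on stdin.
+# Prints nothing if the binary was built without a recorded profile.
+pgo_setting() {
+  sed -n 's/^[[:space:]]*build[[:space:]]\{1,\}-pgo=\(.*\)$/\1/p'
+}
+
+# Usage (paths are placeholders):
+#   go version -m ./pgo-build/dcat | pgo_setting
+#
+# Sample counts in a profile can be inspected with:
+#   go tool pprof -top pgo-profiles/dcat.pprof
+```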
+
+## Future Enhancements
+
+1. **Automated Profile Collection**: Collect profiles from production deployments
+2. **Profile Versioning**: Track profile versions with code changes
+3. **Multi-Architecture Support**: Generate architecture-specific profiles
+4. **Continuous Profiling**: Regular profile updates based on usage patterns
\ No newline at end of file