| Age | Commit message (Collapse) | Author |
|
|
- Add comprehensive PGO module in internal/tools/pgo/
- Integrate PGO into dtail-tools command with full CLI support
- Add Makefile targets for PGO workflow:
  - make pgo: Full PGO workflow
  - make pgo-quick: Quick PGO with smaller datasets
  - make pgo-generate: Generate profiles only
  - make build-pgo: Build with existing profiles
  - make install-pgo: Install optimized binaries
- Add convenience functions to data generator for PGO
- Document PGO workflow in CLAUDE.md
Performance improvements observed:
- DCat: 3.8-7.0% additional improvement over turbo mode
- DGrep: Up to 19% improvement for low hit rates
- DMap: Variable impact, up to 64% for min_max on large files
Benchmarks show total performance gains (pre-turbo → turbo+PGO):
- DCat: 14-21x faster
- DGrep: 9-15x faster
- DMap: 9-29% faster
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
This commit adds comprehensive performance benchmarking comparing DTail v4.3.0
(before turbo mode) with the current implementation that has turbo boost enabled
by default.
Performance Improvements:
- DCat: 2,535% improvement (26.3x faster)
- DGrep: 1,334-1,811% improvement (14-19x faster depending on hit rate)
- DMap: 25-55% improvement for most query types
Files added:
- benchmarks/baselines/baseline_20250704_130947_turbo-enabled.txt
  New baseline with turbo mode enabled for future comparisons
- doc/turbo_performance_analysis.md
  Detailed technical analysis of performance improvements including
  methodology, results, and implementation details
- benchmark_comparison_report.md
  Summary report comparing v4.3.0 baseline with turbo-enabled baseline
The turbo mode optimizations bypass channels for direct output operations
and use direct line processing for MapReduce in server mode, resulting in
dramatic performance improvements while maintaining compatibility.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
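The percentage and "x faster" figures above can be related with simple arithmetic. The following is a minimal sketch, assuming the percentages are throughput increases relative to the v4.3.0 baseline (the commit message does not state this). The function names are hypothetical and not part of DTail. Under this assumption a 26.3x speedup works out to about +2,530%, close to the 2,535% quoted, so the quoted factor is presumably rounded.

```go
package main

import "fmt"

// PercentImprovement converts a speedup factor (new throughput divided by
// old throughput) into a percentage increase: a 2x speedup is +100%.
func PercentImprovement(speedup float64) float64 {
	return (speedup - 1) * 100
}

// TimeReduction converts a speedup factor into the share of run time
// saved: a 2x speedup means the job takes 50% less time.
func TimeReduction(speedup float64) float64 {
	return (1 - 1/speedup) * 100
}

func main() {
	// Speedup factors quoted in the commit message.
	for _, s := range []float64{14, 19, 26.3} {
		fmt.Printf("%.1fx faster = +%.0f%% throughput, %.1f%% less run time\n",
			s, PercentImprovement(s), TimeReduction(s))
	}
}
```

The two conversions give very different numbers for the same speedup, which is why a bare "N% faster" is ambiguous without stating the metric.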
|
|
- Created new baseline with turbo mode enabled (default configuration)
- Added comprehensive performance analysis comparing v4.3.0 to turbo mode
- Documented significant performance improvements:
  - DCat: Up to 93% faster on large files
  - DGrep: Up to 93% faster with better scaling
  - DMap: 27-39% improvements across all operations
- Analysis shows turbo mode is especially effective for large files
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
- Delete temporary benchmark shell scripts (7 files)
- Delete temporary log files from root and integrationtests
- Delete .out test output files
- Delete temporary Python analysis scripts
- Move documentation to doc/ directory:
  - TURBOBOOST_OPTIMIZATION.md → doc/turboboost_optimization.md
  - performance_optimization_summary.md → doc/performance_optimization_summary.md
  - integrationtests/REFACTORING_GUIDE.md → doc/refactoring_guide.md
  - benchmarks/PROFILING.md → doc/profiling.md
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
- Add scanner_pool.go with tiered buffer pools (1MB, 64KB, 4KB)
- Modify readWithProcessorOptimized to use pooled scanner buffers
- Update tailWithProcessorOptimized to pool 64KB read buffers
- Increase BytesBuffer pool initial capacity from 128B to 4KB
- Add buffer_pool_test.go to benchmark pooling effectiveness
This reduces memory allocations by ~36% in turbo mode by reusing
buffers instead of allocating new ones for each file operation.
All integration tests pass.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
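The tiered pooling described in this commit can be sketched roughly as follows. This is not the contents of scanner_pool.go: only the three tier sizes (1MB, 64KB, 4KB) come from the commit message, and the names GetBuffer, PutBuffer, and newPool are hypothetical. The sketch assumes sync.Pool is the pooling mechanism.

```go
package main

import (
	"fmt"
	"sync"
)

// Tier sizes taken from the commit message; everything else is hypothetical.
const (
	smallSize  = 4 * 1024
	mediumSize = 64 * 1024
	largeSize  = 1024 * 1024
)

// newPool returns a pool whose buffers all have the given size.
// Pointers to slices are pooled to avoid an allocation on every Put.
func newPool(size int) *sync.Pool {
	return &sync.Pool{New: func() interface{} {
		b := make([]byte, size)
		return &b
	}}
}

var (
	smallPool  = newPool(smallSize)
	mediumPool = newPool(mediumSize)
	largePool  = newPool(largeSize)
)

// GetBuffer returns a buffer from the smallest tier that holds n bytes.
// Requests above the largest tier get a one-off allocation.
func GetBuffer(n int) *[]byte {
	switch {
	case n <= smallSize:
		return smallPool.Get().(*[]byte)
	case n <= mediumSize:
		return mediumPool.Get().(*[]byte)
	case n <= largeSize:
		return largePool.Get().(*[]byte)
	default:
		b := make([]byte, n)
		return &b
	}
}

// PutBuffer returns a buffer to the tier matching its capacity.
// Buffers with any other capacity are dropped for the GC to collect.
func PutBuffer(b *[]byte) {
	switch cap(*b) {
	case smallSize:
		smallPool.Put(b)
	case mediumSize:
		mediumPool.Put(b)
	case largeSize:
		largePool.Put(b)
	}
}

func main() {
	buf := GetBuffer(10 * 1024) // served by the 64KB tier
	fmt.Println(len(*buf))      // prints 65536
	PutBuffer(buf)
}
```

Choosing the smallest sufficient tier keeps small reads from pinning 1MB buffers, while still letting large scans reuse their allocation.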
|
|
Major performance improvements in turbo mode:
- Fixed trace logging overhead by adding early level checks before expensive runtime.Caller() operations
- Improved buffering strategy by removing forced immediate flush in serverless mode
- Turbo mode is now 2.87x faster than non-turbo mode (it was 3-5x slower before the optimization)
Changes:
- internal/io/dlog/dlog.go: Added early return in Trace() and Devel() when logging disabled
- internal/server/handlers/turbo_writer.go: Removed serverless immediate flush condition
Performance results:
- Before: Turbo mode was 3-5x slower than non-turbo mode
- After: Turbo mode is 2.87x faster than non-turbo mode (about 65% less run time)
- All integration tests pass
Added comprehensive benchmarking tools in the benchmarks/ directory.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
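The early level check mentioned in this commit follows a common pattern: test whether the message would be emitted before paying for runtime.Caller(). The sketch below illustrates the idea only. The Logger and Level types and the callerCalls counter are hypothetical and do not reflect the real dlog API.

```go
package main

import (
	"fmt"
	"runtime"
)

// Level is a hypothetical log level; higher values are more verbose.
type Level int

const (
	Info Level = iota
	Trace
)

// Logger is a stand-in for a real logger, not the dlog type.
type Logger struct {
	max         Level
	callerCalls int // counts expensive lookups, for illustration only
}

// Trace formats a trace message with the caller's file and line.
// It returns early, before runtime.Caller, when trace logging is off.
func (l *Logger) Trace(msg string) string {
	if l.max < Trace {
		return "" // cheap path: no stack inspection, no formatting
	}
	l.callerCalls++
	_, file, line, ok := runtime.Caller(1)
	if !ok {
		file, line = "?", 0
	}
	return fmt.Sprintf("TRACE %s:%d %s", file, line, msg)
}

func main() {
	quiet := &Logger{max: Info}
	for i := 0; i < 1000; i++ {
		quiet.Trace("processing line")
	}
	fmt.Println("caller lookups with tracing off:", quiet.callerCalls)

	verbose := &Logger{max: Trace}
	fmt.Println(verbose.Trace("processing line") != "")
}
```

On a hot path that logs once per line, skipping the stack lookup when tracing is disabled removes a per-line cost that otherwise scales with input size.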
|
|
- Fixed race condition in periodicTruncateCheck by using context cancellation
- Added turbo mode support to TestDCat2 server configuration
- Removed problematic wait for pending files in readCommand.Start
- Fixed potential panic when truncate channel is closed while goroutine is running
The test now properly enables turbo mode on both client and server, preventing
the timeout issues that occurred when only the client had turbo mode enabled.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
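Stopping a periodic goroutine through context cancellation, as the first bullet of this commit describes, can look like the sketch below. It is an illustration of the general pattern, not the actual periodicTruncateCheck code. The function names periodicCheck and runFor are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// periodicCheck calls check on every tick until ctx is cancelled.
// Cancellation replaces closing a shared channel, so no goroutine can
// touch a closed channel while the loop is still running.
func periodicCheck(ctx context.Context, interval time.Duration, check func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			check()
		}
	}
}

// runFor runs periodicCheck for duration d, cancels it, waits for the
// goroutine to exit, and returns how many checks were performed.
func runFor(d, interval time.Duration) int {
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	count := 0
	wg.Add(1)
	go func() {
		defer wg.Done()
		periodicCheck(ctx, interval, func() { count++ })
	}()
	time.Sleep(d)
	cancel()
	wg.Wait() // after this, count is no longer written concurrently
	return count
}

func main() {
	n := runFor(50*time.Millisecond, 5*time.Millisecond)
	fmt.Println("checks performed:", n)
}
```

Waiting on the WaitGroup after cancel guarantees the goroutine has exited before shared state is read or torn down.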
|
|
Following the successful refactoring to Go-based tooling, this commit:
1. Removes all obsolete bash scripts:
   - benchmarks/benchmark.sh
   - profiling/profile.sh
   - profiling/profile_benchmarks.sh
   - profiling/profile_dmap.sh
   - profiling/profile_quick.sh
2. Updates all documentation to use dtail-tools:
   - README.md: Updated benchmark commands to use dtail-tools
   - PROFILING.md: Updated profiling instructions to use dtail-tools
3. Updates Go code references:
   - profile_runner.go: Uses dtail-tools instead of profile.sh
   - profile_example.go: Uses dtail-tools for profile analysis
The new dtail-tools provides all the functionality of the old bash
scripts with better cross-platform compatibility, error handling,
and maintainability.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
This major refactoring replaces all bash-based profiling and benchmarking
scripts with a unified Go tool (dtail-tools) that provides:
- Better cross-platform compatibility
- Improved error handling and reliability
- Structured data generation for test files
- Consistent command-line interface
- Easier maintenance and extensibility
Key changes:
- Created dtail-tools command with profile and benchmark subcommands
- Implemented common utilities for data generation and file operations
- Updated Makefile to use the new Go-based tools
- Maintained backward compatibility with existing make targets
- Fixed ParseSize to handle single-letter suffixes (10M, 1G, etc.)
The new tool supports all previous functionality:
- profile-quick, profile-all, profile-dmap
- benchmark creation, comparison, and management
- Test data generation with multiple formats
- Profile analysis and listing
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
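The ParseSize fix for single-letter suffixes can be illustrated with the sketch below. This is a hypothetical reimplementation, not the function from dtail-tools. It assumes binary multiples (1M = 1024 * 1024 bytes) and that a trailing "B" is optional, neither of which the commit message confirms.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseSize converts strings such as "512", "10M", "10MB", or "1G" into
// a byte count. Binary multiples are an assumption of this sketch.
func ParseSize(input string) (int64, error) {
	s := strings.ToUpper(strings.TrimSpace(input))
	// Accept both "10MB" and "10M" by dropping an optional trailing "B".
	s = strings.TrimSuffix(s, "B")
	if s == "" {
		return 0, fmt.Errorf("invalid size %q", input)
	}

	mult := int64(1)
	switch s[len(s)-1] {
	case 'K':
		mult = 1 << 10
	case 'M':
		mult = 1 << 20
	case 'G':
		mult = 1 << 30
	}
	if mult != 1 {
		s = s[:len(s)-1] // strip the unit letter, leaving the number
	}

	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", input, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("invalid size %q: negative value", input)
	}
	return n * mult, nil
}

func main() {
	for _, in := range []string{"512", "10M", "10MB", "1G"} {
		n, err := ParseSize(in)
		fmt.Println(in, n, err)
	}
}
```

Stripping the optional "B" first means the one-letter and two-letter forms share a single code path.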
|
|
- Moved profile_benchmarks.sh, profile_dmap.sh, and profile_quick.sh
  to the profiling/ directory where they belong
- Updated Makefile targets to reference new locations
- Fixed profile_dmap.sh to remove outfile clauses since they're not
  needed for profiling and were preventing proper execution
- Updated .gitignore to exclude generated files in profiling/
This better separates benchmarking (performance comparison) from
profiling (performance analysis).
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
- Move main package files to benchmarks/cmd/ to fix test failures
- Update CLAUDE.md with comprehensive benchmarking and profiling instructions
- Fix unused imports in serverless.go
- Remove experimental buffered pipe/copy implementations
- Remove outdated documentation files
All integration tests now pass successfully.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
Implement channel-based bidirectional copying in serverless connector
to prevent deadlocks that occur with io.Copy when processing large files.
Changes:
- Replace direct io.Copy with channel-based approach in serverless.go
- Add bufferedpipe and bufferedcopy utilities (for future use)
- Add tests to verify deadlock prevention
- Fix dmap profiling example to use absolute paths
The fix successfully handles files up to ~10KB in serverless mode.
Larger files still experience issues and will be addressed in a
follow-up fix.
Fixes the profiling hang that occurred when using -cfg none without servers.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
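A channel-based copy of the kind this commit describes can be sketched for one direction as follows. A bidirectional connector would presumably run one such copy per direction in separate goroutines. This is an illustration of the general technique only, not the code from serverless.go, and it does not claim to reproduce the original fix or its ~10KB limitation. The names copyViaChannel and roundTrip, the 4096-byte chunk size, and the channel capacity of 16 are all hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// copyViaChannel moves data from src to dst through a buffered channel.
// A reader goroutine fills the channel while the caller drains it, so
// reading can continue while a write is in progress, up to the channel
// capacity.
func copyViaChannel(dst io.Writer, src io.Reader) (int64, error) {
	chunks := make(chan []byte, 16)
	readErr := make(chan error, 1)

	go func() {
		defer close(chunks)
		buf := make([]byte, 4096)
		for {
			n, err := src.Read(buf)
			if n > 0 {
				chunk := make([]byte, n) // copy: buf is reused next iteration
				copy(chunk, buf[:n])
				chunks <- chunk
			}
			if err != nil {
				if err != io.EOF {
					readErr <- err
				}
				return
			}
		}
	}()

	var written int64
	var writeErr error
	for chunk := range chunks {
		if writeErr != nil {
			continue // keep draining so the reader goroutine can exit
		}
		n, err := dst.Write(chunk)
		written += int64(n)
		writeErr = err
	}
	if writeErr != nil {
		return written, writeErr
	}
	select {
	case err := <-readErr:
		return written, err
	default:
		return written, nil
	}
}

// roundTrip copies s through copyViaChannel into an in-memory buffer.
func roundTrip(s string) (string, error) {
	var out bytes.Buffer
	_, err := copyViaChannel(&out, strings.NewReader(s))
	return out.String(), err
}

func main() {
	in := strings.Repeat("line\n", 10000)
	out, err := roundTrip(in)
	fmt.Println(len(out) == len(in), err)
}
```

Draining the channel even after a write error lets the reader goroutine finish instead of blocking forever on a full channel.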
|
|
Added || true to the kill and wait commands in run_profile_dmap() so the
script does not fail when a process exits with a non-zero code after
receiving SIGINT. A non-zero exit is expected when dmap is interrupted.
Note: There appears to be a separate issue where dtail commands (dcat,
dgrep, dmap) hang when run in serverless mode with file arguments. This
needs further investigation.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
dmap is designed to run continuously and report MapReduce results at
intervals, which caused it to hang during profiling. Fixed by:
- Adding a run_profile_dmap() function that runs dmap in the background
- Sending SIGINT after 3 seconds so that dmap exits cleanly
- Updating all dmap profiling calls to use the new function
- Applying the fix to both profile_benchmarks.sh and profile_dmap.sh
This ensures dmap can be profiled successfully without timing out.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
Fixed multiple issues preventing dmap from being profiled correctly:
- Updated profile_dmap.sh to use DTail default log format
- Fixed MapReduce queries to use correct field syntax
- Reduced file sizes and run counts for faster profiling
- Added proper command echoing to all data generation steps
Optimizations:
- Reduced PROFILE_RUNS from 3 to 1
- Reduced test data sizes (1MB/10MB instead of 10MB/100MB/1GB)
- Commented out medium/large file tests for faster runs
- Reduced dmap test data from 1000/10000 to 100/1000 lines
The profiling framework now successfully profiles all three commands
(dcat, dgrep, dmap) with reasonable execution times.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
Created a comprehensive profiling framework for dtail commands (dcat, dgrep, dmap)
to analyze CPU usage and memory allocations. The framework now prints all executed
commands to stdout for full transparency.
Key features:
- Integrated Go profiling (CPU, memory, allocations) into all three commands
- Created profile.sh bash script for analyzing pprof profiles
- Added multiple Makefile targets for different profiling scenarios
- Automated profiling scripts with command echoing
- Support for different data sizes (quick, normal, full)
- Special handling for dmap MapReduce format
All profiling commands are now echoed to stdout before execution, making it
easy to understand what the framework is doing and reproduce commands manually.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
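Integrating CPU and memory profiling into a Go command usually relies on the standard runtime/pprof package. The sketch below shows one way to do it. The helper names, the file names cpu.prof and mem.prof, and the busy-loop workload are hypothetical, and the real DTail flags and output paths are not shown in the commit message.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"runtime/pprof"
)

// startCPUProfile begins CPU profiling into path. The returned function
// stops profiling and closes the file.
func startCPUProfile(path string) (func() error, error) {
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		f.Close()
		return nil, err
	}
	return func() error {
		pprof.StopCPUProfile()
		return f.Close()
	}, nil
}

// writeHeapProfile writes a heap profile after forcing a GC so the
// statistics are up to date.
func writeHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC()
	return pprof.WriteHeapProfile(f)
}

// profileRun profiles work and writes cpu.prof and mem.prof into dir.
func profileRun(dir string, work func()) error {
	stop, err := startCPUProfile(filepath.Join(dir, "cpu.prof"))
	if err != nil {
		return err
	}
	work()
	if err := stop(); err != nil {
		return err
	}
	return writeHeapProfile(filepath.Join(dir, "mem.prof"))
}

// runDemo profiles a small workload in a temporary directory and checks
// that both profile files were created.
func runDemo() error {
	dir, err := os.MkdirTemp("", "profile-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	work := func() {
		sum := 0
		for i := 0; i < 1000000; i++ {
			sum += i % 7
		}
		_ = sum
	}
	if err := profileRun(dir, work); err != nil {
		return err
	}
	for _, name := range []string{"cpu.prof", "mem.prof"} {
		if _, err := os.Stat(filepath.Join(dir, name)); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println("profiling error:", runDemo())
}
```

The resulting files can be inspected with `go tool pprof`, which is the standard viewer for profiles in this format.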
|
|
Created baseline snapshot of current performance metrics for version v4.3.0.
This baseline can be used for future performance comparisons.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
|
- Create benchmark framework to measure performance of dcat, dgrep, and dmap
- Generate test files of 10MB, 100MB, and 1GB with configurable patterns
- Support benchmarking with gzip and zstd compressed files
- Implement tool-specific benchmarks:
  * DCat: Simple reading, multiple files, compressed files
  * DGrep: Pattern matching, regex complexity, context lines, inverted grep
  * DMap: Aggregations, group by operations, complex queries, time intervals
- Track performance metrics: throughput (MB/sec), lines/sec, memory usage
- Save results in multiple formats: JSON, CSV, and Markdown reports
- Add Makefile targets: benchmark, benchmark-quick, benchmark-full
- Support environment variables for configuration (sizes, timeouts, etc.)
- Automatically clean up temporary .tmp files after benchmarks
The framework provides consistent performance testing across the DTail toolset
and enables tracking performance regressions between commits.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
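The throughput metrics listed in this commit (MB/sec and lines/sec) can be derived from a byte count, a line count, and an elapsed time. The sketch below is a minimal illustration and not the framework's code. The Result type and Measure function are hypothetical, and 1 MB is assumed to mean 1024 * 1024 bytes.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"time"
)

// Result holds the raw measurements of one benchmark run.
type Result struct {
	Bytes   int64
	Lines   int64
	Elapsed time.Duration
}

// MBPerSec returns throughput in MB/sec (1 MB = 1024 * 1024 bytes here).
func (r Result) MBPerSec() float64 {
	if r.Elapsed <= 0 {
		return 0
	}
	return float64(r.Bytes) / (1024 * 1024) / r.Elapsed.Seconds()
}

// LinesPerSec returns the number of lines processed per second.
func (r Result) LinesPerSec() float64 {
	if r.Elapsed <= 0 {
		return 0
	}
	return float64(r.Lines) / r.Elapsed.Seconds()
}

// Measure reads r to the end, counting bytes and lines and timing the
// read. A final line without a trailing newline still counts as a line.
func Measure(r io.Reader) (Result, error) {
	start := time.Now()
	br := bufio.NewReader(r)
	var res Result
	for {
		line, err := br.ReadBytes('\n')
		res.Bytes += int64(len(line))
		if len(line) > 0 {
			res.Lines++
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return res, err
		}
	}
	res.Elapsed = time.Since(start)
	return res, nil
}

func main() {
	input := strings.Repeat("2025-07-04 INFO request served\n", 100000)
	res, err := Measure(strings.NewReader(input))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes, %d lines, %.1f MB/sec, %.0f lines/sec\n",
		res.Bytes, res.Lines, res.MBPerSec(), res.LinesPerSec())
}
```

Keeping the raw counts in the result and deriving the rates on demand makes it straightforward to export the same run as JSON, CSV, or Markdown.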
|