path: root/scripts
Age | Commit message | Author
2025-06-19 | Implement line ending preservation and address CLAUDE comments | Paul Buetow
- Fix server-side line ending preservation in plain mode by updating basehandler to not add protocol delimiters, preserving original CRLF/LF line endings
- Add comprehensive documentation to ProcessLine methods in all processors
- Remove all CLAUDE comments and replace with proper function documentation
- Update DCat test to include --quiet flag for cleaner server output
- Clean up PGO script and report files from scripts directory
- Improve code formatting and consistency across processor files

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-18 | Complete channelless migration for DTail operations | Paul Buetow
- Implement channelless MapReduce with streaming aggregation
- Add channelless tail with proper file following capability
- Fix TestDTailWithServer by implementing ServerHandlerWriter for client-server mode
- Add proper serverless mode detection for standalone operations
- Remove temporary benchmark scripts
- All integration tests now pass

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
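As a rough illustration of what streaming aggregation without channels can look like, here is a small Go sketch. The `countAggregator` type and its grouping rule (first whitespace-separated field) are assumptions made for the example and are not taken from the DTail source; only the idea of updating the aggregate once per line comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// countAggregator is a hypothetical streaming aggregator: it keeps only the
// running counts per group, never the lines themselves.
type countAggregator struct {
	counts map[string]int
}

func newCountAggregator() *countAggregator {
	return &countAggregator{counts: make(map[string]int)}
}

// ProcessLine updates the aggregate for a single line. The group key is the
// first whitespace-separated field (an assumption for this sketch).
func (a *countAggregator) ProcessLine(line string) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return // ignore blank lines
	}
	a.counts[fields[0]]++
}

func main() {
	agg := newCountAggregator()
	for _, line := range []string{"ERROR disk full", "INFO started", "ERROR timeout"} {
		agg.ProcessLine(line) // direct call per line, no channel hand-off
	}
	fmt.Println(agg.counts["ERROR"], agg.counts["INFO"])
}
```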
2025-06-17 | Add comprehensive performance benchmark results for channelless implementation | Paul Buetow
- Document 4-5x performance improvements for dgrep operations
- Include detailed test results across different scenarios (basic filtering, context lines, rare patterns)
- Provide technical analysis of why channelless architecture is faster
- Identify optimal use cases and limitations
- 50MB test file with 698k lines shows consistent speedup across all grep scenarios

Key results:
- Basic ERROR filtering: 4.5x faster (0.528s → 0.117s)
- ERROR with context lines: 5.4x faster (1.224s → 0.225s)
- Rare pattern filtering: 4.2x faster (0.428s → 0.103s)
- DCAT full file read: 19% slower (expected due to protocol overhead)

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-17 | Implement channelless architecture for DTail server | Paul Buetow
This commit introduces a high-performance channelless processing pipeline that eliminates channel coordination overhead while maintaining full compatibility with DTail's distributed functionality.

## Key Features

### Performance Improvements
- Eliminates 26%+ CPU overhead from channel operations (runtime.selectgo)
- Achieves 51% faster processing (2.04x speedup)
- Increases throughput from 233K to 477K lines/sec (104% improvement)
- Direct line-by-line processing without goroutine coordination

### Architecture Changes
- **DirectProcessor framework**: Pluggable LineProcessor interface
- **NetworkOutputWriter**: Direct network streaming for distributed mode
- **Command-specific processors**: Grep, Cat, Tail, Map implementations
- **Channelless mode**: Controlled via DTAIL_USE_CHANNELLESS=true

### Compatibility & Correctness
- All integration tests pass (TestDGrep1, TestDCat1-3, TestDGrepContext2, TestDCatColors)
- Bit-for-bit identical output to original implementation
- Full ANSI color support with exact brush.Colorfy() formatting
- Preserves DTail protocol format and network connectivity

### Implementation Details
- **Line processing**: Direct ProcessLine() calls eliminate channel overhead
- **Color formatting**: Server-side ANSI color application with reset sequences
- **Protocol compliance**: Exact REMOTE|hostname|100|count|sourceID|content format
- **Stats tracking**: Maintains transmission percentages and line counts
- **Memory efficiency**: Reduced allocation patterns vs channel-based pipeline

### Bug Fixes
- Fixed server command routing (grep/cat mode assignment)
- Corrected line ending preservation (CRLF vs LF)
- Implemented proper line splitting for MaxLineLength limits
- Added missing color reset prefixes and final color termination

### Benchmarking
- Comprehensive benchmark suite comparing both implementations
- Identified and corrected channel-based implementation bug (67% data processing)
- Performance analysis with multiple file sizes and statistical validation

The channelless architecture successfully delivers the performance benefits identified in PGO analysis while maintaining 100% functional compatibility with DTail's distributed log processing capabilities.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 | implement true Profile-Guided Optimization with Go compiler -pgo flag | Paul Buetow
- Refactor PGO script to use actual Go compiler PGO instead of just profiling
- Add proper baseline vs PGO-optimized binary comparison
- Break script into maintainable functions for better organization
- Update Makefile and documentation to reflect PGO process
- Generate comprehensive performance reports with before/after analysis

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 | Refactor PBO to PGO with true Profile-Guided Optimization | Paul Buetow
- Rename PBO (Profile-Based Optimization) to PGO (Profile-Guided Optimization)
- Implement true PGO using Go's -pgo compiler flag
- Refactor script into maintainable functions:
  - setup_environment(): Initialize paths and variables
  - create_test_file(): Generate 100MB test file with 1M lines
  - build_baseline(): Build version without PGO optimizations
  - collect_training_data(): Generate CPU profiles for training
  - build_pgo_optimized(): Build with -pgo flag using training profile
  - run_pgo_performance_test(): Profile PGO-optimized version
  - run_performance_comparison(): Compare baseline vs PGO performance
  - generate_detailed_analysis(): Create comprehensive profile analysis
  - cleanup(): Remove temporary files
  - show_summary(): Display results and process summary
- Update Makefile target from 'pbo' to 'pgo'
- Update .gitignore patterns for PGO temporary files
- Update CLAUDE.md documentation for new PGO process
- Remove git stash dependencies for simpler automation
- Generate before/after performance comparison reports

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 | Organize PBO files in scripts directory and add .gitignore entries | Paul Buetow
- Modify pbo.sh to store all temporary files in scripts/ directory
- Update path handling with proper SCRIPT_DIR and PROJECT_ROOT variables
- Add comprehensive .gitignore entries for PBO temporary files:
  - scripts/pbo_*.prof (CPU and memory profiles)
  - scripts/pbo_report.txt (analysis report)
  - scripts/test_100mb.txt (test data file)
- Update CLAUDE.md documentation to reflect new file organization
- Keep project root directory clean by organizing all PBO artifacts

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 | Implement Profile-Based Optimization (PBO) automation with 39.9% performance improvement | Paul Buetow
- Add comprehensive PBO script (scripts/pbo.sh) for automated performance analysis
- Implement timer allocation reduction using reusable timers (chunkedreader.go, stats.go, baseclient.go)
- Optimize I/O operations with pre-allocated buffers and bulk writes (chunkedreader.go)
- Enhance memory allocation patterns with improved buffer pooling
- Add CPU and memory profiling support to dgrep command
- Update Makefile with clean PBO target calling scripts/pbo.sh
- Add PBO documentation to CLAUDE.md

Performance improvements:
- 39.9% faster execution time (2.918s → 1.753s average)
- 38% reduction in CPU samples (3.04s → 1.87s)
- Reduced byte-by-byte operations from 21.71% to 8.56% CPU usage
- Eliminated repeated timer allocations across all components

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>