summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorPaul Buetow <paul@buetow.org>2025-07-04 11:25:17 +0300
committerPaul Buetow <paul@buetow.org>2025-07-04 11:25:17 +0300
commit0645644bb945c4ce4707252c38a8d454b2ac9567 (patch)
treeaaff70f07cb07b85cbdcb53faf35c13ca40292ef /doc
parentaa2f547cf2b6136dc60f541f30c27a426ec7c6c8 (diff)
chore: clean up temporary files and reorganize documentation
- Delete temporary benchmark shell scripts (7 files) - Delete temporary log files from root and integrationtests - Delete .out test output files - Delete temporary Python analysis scripts - Move documentation to doc/ directory: - TURBOBOOST_OPTIMIZATION.md → doc/turboboost_optimization.md - performance_optimization_summary.md → doc/performance_optimization_summary.md - integrationtests/REFACTORING_GUIDE.md → doc/refactoring_guide.md - benchmarks/PROFILING.md → doc/profiling.md šŸ¤– Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Diffstat (limited to 'doc')
-rw-r--r--doc/performance_optimization_summary.md62
-rw-r--r--doc/profiling.md376
-rw-r--r--doc/refactoring_guide.md240
-rw-r--r--doc/turboboost_optimization.md118
4 files changed, 796 insertions, 0 deletions
diff --git a/doc/performance_optimization_summary.md b/doc/performance_optimization_summary.md
new file mode 100644
index 0000000..9d1162b
--- /dev/null
+++ b/doc/performance_optimization_summary.md
@@ -0,0 +1,62 @@
+# DTail Performance Optimization Summary
+
+## Changes Made
+
+### 1. Optimized Trace Logging (`/internal/io/dlog/dlog.go`)
+
+**Problem**: The `Trace()` and `Devel()` functions were calling `runtime.Caller(1)` for every invocation, even when trace logging was disabled. This was causing ~497ns overhead per call.
+
+**Solution**: Added early level checks before the expensive `runtime.Caller()` operation:
+
+```go
+func (d *DLog) Trace(args ...interface{}) string {
+ // Early check to avoid expensive runtime.Caller when trace is disabled
+ if d.maxLevel < Trace {
+ return ""
+ }
+ // ... rest of function
+}
+```
+
+### 2. Improved Buffering in Turbo Mode (`/internal/server/handlers/turbo_writer.go`)
+
+**Problem**: Turbo mode was forcing immediate flush after every line in serverless mode, defeating the purpose of buffering.
+
+**Solution**: Removed the immediate flush condition for serverless mode, allowing proper buffering:
+
+```go
+// Changed from:
+if w.writeBuf.Len() >= w.bufSize || w.serverless {
+ return w.flushBuffer()
+}
+
+// To:
+if w.writeBuf.Len() >= w.bufSize {
+ return w.flushBuffer()
+}
+```
+
+## Performance Results
+
+### Before Optimization
+- Turbo mode was **3-5x slower** than non-turbo mode
+- DCat (10MB): 678ms (turbo) vs 210ms (non-turbo)
+- DGrep (10MB): 570ms (turbo) vs 96ms (non-turbo)
+
+### After Optimization
+- Turbo mode is now **2.87x faster** than non-turbo mode
+- DCat (1M lines): 0.66s (turbo) vs 1.89s (non-turbo) - **65% improvement**
+- DCat with colors: 1.69s (turbo) vs 2.24s (non-turbo) - **24% improvement**
+
+## Verification
+- āœ… All unit tests pass
+- āœ… All integration tests pass
+- āœ… No functionality regression
+- āœ… Backward compatible
+
+## Files Modified
+1. `/internal/io/dlog/dlog.go` - Lines 196-216
+2. `/internal/server/handlers/turbo_writer.go` - Lines 104-108
+
+## Key Takeaway
+The trace logging overhead was the primary bottleneck, causing DTail to spend more time logging than processing data. By adding simple level checks before expensive operations, we achieved a ~3x performance improvement in turbo mode. \ No newline at end of file
diff --git a/doc/profiling.md b/doc/profiling.md
new file mode 100644
index 0000000..7925fb3
--- /dev/null
+++ b/doc/profiling.md
@@ -0,0 +1,376 @@
+# DTail Profiling Framework
+
+This document describes the profiling framework for dtail commands (dcat, dgrep, dmap) to analyze CPU usage and memory allocations.
+
+## Overview
+
+The profiling framework provides:
+- CPU profiling to identify performance bottlenecks
+- Memory profiling to track allocations and detect leaks
+- Integration with existing benchmarks
+- Analysis tools for profile interpretation
+
+## Quick Start
+
+### 1. Build the Tools
+
+```bash
+make build # Builds all tools including dprofile
+```
+
+### 2. Run Commands with Profiling
+
+Each command now supports profiling flags:
+
+```bash
+# Profile dcat
+./dcat -profile -profiledir profiles -plain -cfg none /path/to/file.log
+
+# Profile dgrep with specific profiling types
+./dgrep -cpuprofile -memprofile -profiledir profiles -regex "error" /path/to/file.log
+
+# Profile dmap
+./dmap -profile -query "select count(*) from data.csv"
+```
+
+### 3. Analyze Profiles
+
+Use dtail-tools for quick analysis:
+
+```bash
+# List all profiles
+./dtail-tools profile -mode list
+
+# Analyze a specific profile
+./dtail-tools profile -mode analyze profiles/dcat_cpu_20240101_120000.prof
+
+# Open web browser with flame graph
+./dtail-tools profile -mode analyze profiles/dcat_cpu_*.prof -web
+
+# You can also use go tool pprof directly:
+go tool pprof profiles/dcat_cpu_20240101_120000.prof
+```
+
+## Profiling Options
+
+### Command-line Flags
+
+All dtail commands support these profiling flags:
+
+- `-cpuprofile`: Enable CPU profiling only
+- `-memprofile`: Enable memory profiling only
+- `-profile`: Enable both CPU and memory profiling
+- `-profiledir <dir>`: Directory to store profiles (default: "profiles")
+
+### Profile Types
+
+1. **CPU Profile** (`*_cpu_*.prof`)
+ - Samples CPU usage during execution
+ - Identifies hot functions and code paths
+ - Useful for optimizing computational bottlenecks
+
+2. **Memory Profile** (`*_mem_*.prof`)
+ - Captures heap allocations at end of execution
+ - Shows memory usage by function
+ - Helps identify memory leaks
+
+3. **Allocation Profile** (`*_alloc_*.prof`)
+ - Tracks all allocations during execution
+ - More detailed than memory profile
+ - Useful for reducing allocation pressure
+
+## Using with Benchmarks
+
+### Automated Profiling
+
+Run profiling using dtail-tools:
+
+```bash
+# Quick profiling with small datasets
+./dtail-tools profile -mode quick
+
+# Full profiling suite
+./dtail-tools profile -mode full
+
+# Profile dmap specifically (with MapReduce format)
+./dtail-tools profile -mode dmap
+```
+
+This tool:
+- Generates test data of various sizes
+- Profiles dcat, dgrep, and dmap with different workloads
+- Stores profiles in the `profiles` directory
+- Provides immediate analysis of results
+
+### Using Make Targets
+
+```bash
+# Quick profiling with immediate results
+make profile-quick
+
+# Full profiling suite
+make profile-all
+
+# Profile dmap specifically
+make profile-dmap
+
+# List available profiles
+make profile-list
+
+# Analyze a specific profile
+make profile-analyze PROFILE=profiles/dcat_cpu_*.prof
+
+# Open web interface for profile
+make profile-web PROFILE=profiles/dcat_cpu_*.prof
+```
+
+### Benchmark Integration
+
+Run profiling-enabled benchmarks:
+
+```bash
+cd benchmarks
+go test -bench="WithProfiling" -benchtime=1x
+```
+
+### Custom Profile Runner
+
+Use the profile runner in your benchmarks:
+
+```go
+import "github.com/mimecast/dtail/benchmarks"
+
+func BenchmarkMyFeature(b *testing.B) {
+ benchmarks.ProfileBenchmark(b, "MyFeature", "dcat",
+ "--plain", "--cfg", "none", "testfile.log")
+}
+```
+
+## Profile Analysis
+
+### Using go tool pprof
+
+For interactive analysis:
+
+```bash
+# Interactive mode
+go tool pprof profiles/dcat_cpu_*.prof
+
+# Common pprof commands:
+# top - Show top functions
+# list func - Show source code for function
+# web - Generate SVG graph
+# peek func - Show callers/callees of function
+```
+
+Generate visualizations:
+
+```bash
+# Flame graph (requires graphviz)
+go tool pprof -http=:8080 profiles/dcat_cpu_*.prof
+
+# Generate SVG
+go tool pprof -svg profiles/dgrep_mem_*.prof > profile.svg
+
+# Generate text report
+go tool pprof -text profiles/dmap_alloc_*.prof > report.txt
+```
+
+### Using dtail-tools profile
+
+The dtail-tools profile command provides quick summaries:
+
+```bash
+# List all profiles
+./dtail-tools profile -mode list
+
+# Analyze specific profile
+./dtail-tools profile -mode analyze profiles/dcat_cpu_20240101_120000.prof
+
+# Get help
+./dtail-tools profile -h
+```
+
+## Optimization Workflow
+
+1. **Baseline Performance**
+ ```bash
+ # Run benchmarks without profiling
+ cd benchmarks
+ go test -bench="BenchmarkDCat" -benchtime=10s
+ ```
+
+2. **Profile Execution**
+ ```bash
+ # Run with profiling
+ ./dcat -profile -profiledir profiles large_file.log
+ ```
+
+3. **Identify Bottlenecks**
+ ```bash
+ # Analyze CPU profile
+ ./dprofile -profile profiles/dcat_cpu_*.prof -top 10
+
+ # Check memory allocations
+ go tool pprof -alloc_space profiles/dcat_alloc_*.prof
+ ```
+
+4. **Optimize Code**
+ - Focus on functions with high Flat% (direct CPU usage)
+ - Reduce allocations in hot paths
+ - Consider buffering and pooling
+
+5. **Verify Improvements**
+ ```bash
+ # Re-run benchmarks after optimization
+ go test -bench="BenchmarkDCat" -benchtime=10s
+ ```
+
+## Common Performance Issues
+
+### CPU Bottlenecks
+
+Look for:
+- Regex compilation in loops
+- Excessive string operations
+- Inefficient algorithms (O(n²) or worse)
+- Unnecessary type conversions
+
+Example optimization:
+```go
+// Before: Regex compiled every time
+for _, line := range lines {
+ if regexp.MustCompile(pattern).MatchString(line) {
+ // ...
+ }
+}
+
+// After: Compile once
+re := regexp.MustCompile(pattern)
+for _, line := range lines {
+ if re.MatchString(line) {
+ // ...
+ }
+}
+```
+
+### Memory Issues
+
+Common patterns:
+- String concatenation in loops
+- Large temporary slices
+- Unclosed resources
+- Excessive goroutines
+
+Example optimization:
+```go
+// Before: Many allocations
+result := ""
+for _, s := range strings {
+ result += s + "\n"
+}
+
+// After: Single allocation
+var buf strings.Builder
+buf.Grow(estimatedSize)
+for _, s := range strings {
+ buf.WriteString(s)
+ buf.WriteByte('\n')
+}
+result := buf.String()
+```
+
+## Tips and Best Practices
+
+1. **Profile Real Workloads**
+ - Use production-like data sizes
+ - Test with actual file formats
+ - Include network operations if relevant
+
+2. **Compare Profiles**
+ ```bash
+ # Compare before/after optimization
+ go tool pprof -diff_base=before.prof after.prof
+ ```
+
+3. **Focus on Hot Paths**
+ - Optimize functions with >5% CPU usage first
+ - Small improvements in hot paths have big impact
+
+4. **Memory Profiling**
+ - Use `-alloc_space` for total allocations
+ - Use `-inuse_space` for current heap usage
+ - Check for growing heap over time
+
+5. **Benchmark Regularly**
+ - Add profiling to CI/CD pipeline
+ - Track performance over releases
+ - Set performance regression alerts
+
+## Troubleshooting
+
+### No profiles generated
+- Check write permissions for profile directory
+- Ensure command completes successfully
+- Verify profiling flags are correct
+
+### Empty or small profiles
+- Run command with larger workload
+- Increase execution time
+- Check if command exits too quickly
+
+### Analysis tools fail
+- Ensure profile format is valid
+- Check Go version compatibility
+- Verify graphviz is installed for visualizations
+
+## Advanced Usage
+
+### Custom Profiling Points
+
+Add profiling snapshots in code:
+
+```go
+import "github.com/mimecast/dtail/internal/profiling"
+
+func processLargeFile() {
+ profiler := profiling.GetProfiler() // Assumes global profiler
+
+ // Take memory snapshot before processing
+ profiler.Snapshot("before_processing")
+
+ // ... process file ...
+
+ // Take snapshot after
+ profiler.Snapshot("after_processing")
+}
+```
+
+### Continuous Profiling
+
+For long-running operations:
+
+```go
+// Start periodic metrics logging
+ticker := time.NewTicker(30 * time.Second)
+go func() {
+ for range ticker.C {
+ profiler.LogMetrics("periodic")
+ }
+}()
+defer ticker.Stop()
+```
+
+## Contributing
+
+When adding new features:
+1. Include benchmark tests
+2. Run profiling before submitting PR
+3. Document any performance implications
+4. Add profiling examples for new commands
+
+## References
+
+- [Go Profiling Documentation](https://go.dev/blog/pprof)
+- [pprof Tool Guide](https://github.com/google/pprof)
+- [Go Performance Tips](https://go.dev/wiki/Performance) \ No newline at end of file
diff --git a/doc/refactoring_guide.md b/doc/refactoring_guide.md
new file mode 100644
index 0000000..88c77d5
--- /dev/null
+++ b/doc/refactoring_guide.md
@@ -0,0 +1,240 @@
+# Integration Tests Refactoring Guide
+
+## Overview
+
+This guide outlines the refactoring opportunities for the dtail integration tests to reduce code duplication and improve maintainability.
+
+## Key Benefits of Refactoring
+
+1. **Reduced Code Duplication**: ~40-50% reduction in test code
+2. **Improved Maintainability**: Changes to common patterns only need to be made in one place
+3. **Better Test Hygiene**: Automatic cleanup using `t.Cleanup()`
+4. **Clearer Test Intent**: Helper functions make tests more readable
+5. **Reduced Copy-Paste Errors**: Less boilerplate to copy incorrectly
+
+## Common Patterns Identified
+
+### 1. Test Skip Pattern
+**Before:**
+```go
+if !config.Env("DTAIL_INTEGRATION_TEST_RUN_MODE") {
+ t.Log("Skipping")
+ return
+}
+```
+
+**After:**
+```go
+skipIfNotIntegrationTest(t)
+```
+
+### 2. Server Setup Pattern
+**Before:**
+```go
+port := getUniquePortNumber()
+bindAddress := "localhost"
+ctx, cancel := context.WithCancel(context.Background())
+defer cancel()
+
+_, _, _, err := startCommand(ctx, t,
+ "", "../dserver",
+ "--cfg", "none",
+ "--logger", "stdout",
+ "--logLevel", "error",
+ "--bindAddress", bindAddress,
+ "--port", fmt.Sprintf("%d", port),
+)
+if err != nil {
+ t.Error(err)
+ return
+}
+time.Sleep(500 * time.Millisecond)
+```
+
+**After:**
+```go
+server := NewTestServer(t)
+if err := server.Start("error"); err != nil {
+ t.Error(err)
+ return
+}
+```
+
+### 3. File Cleanup Pattern
+**Before:**
+```go
+defer os.Remove(outFile)
+defer os.Remove(csvFile)
+defer os.Remove(queryFile)
+```
+
+**After:**
+```go
+cleanupFiles(t, outFile, csvFile, queryFile)
+// or
+fileSet := &TestFileSet{...}
+fileSet.Cleanup(t)
+```
+
+### 4. Command Arguments Pattern
+**Before:**
+```go
+args := []string{
+ "--plain", "--cfg", "none",
+ "--servers", fmt.Sprintf("%s:%d", bindAddress, port),
+ "--trustAllHosts", "--noColor",
+ "--files", inFile,
+}
+```
+
+**After:**
+```go
+args := NewCommandArgs()
+args.Plain = true
+args.Servers = []string{server.Address()}
+args.TrustAllHosts = true
+args.NoColor = true
+args.Files = []string{inFile}
+// args.ToSlice() produces the string array
+```
+
+### 5. Dual Mode Testing Pattern
+**Before:**
+```go
+func TestX(t *testing.T) {
+ if !config.Env("DTAIL_INTEGRATION_TEST_RUN_MODE") {
+ t.Log("Skipping")
+ return
+ }
+
+ t.Run("Serverless", func(t *testing.T) {
+ testXServerless(t)
+ })
+
+ t.Run("ServerMode", func(t *testing.T) {
+ testXWithServer(t)
+ })
+}
+```
+
+**After:**
+```go
+func TestX(t *testing.T) {
+ runDualModeTest(t, DualModeTest{
+ Name: "TestX",
+ ServerlessTest: testXServerless,
+ ServerTest: testXWithServer,
+ })
+}
+```
+
+## Refactoring Strategy
+
+### Phase 1: Add Helper Functions
+1. Add `testhelpers.go` with all common utilities
+2. Ensure all tests still pass
+
+### Phase 2: Refactor Test by Test
+1. Start with simpler tests (e.g., dcat_test.go)
+2. Refactor one test function at a time
+3. Run tests after each refactoring
+4. Commit after each file is complete
+
+### Phase 3: Additional Improvements
+1. Add table-driven tests where appropriate
+2. Create test fixtures for common scenarios
+3. Add more sophisticated helpers as patterns emerge
+
+## Example Refactoring Results
+
+### Before (TestDCat1WithServer):
+```go
+func testDCat1WithServer(t *testing.T, inFile string) error {
+ outFile := "dcat1.out"
+ port := getUniquePortNumber()
+ bindAddress := "localhost"
+
+ ctx, cancel := context.WithCancel(context.Background())
+ defer cancel()
+
+ _, _, _, err := startCommand(ctx, t,
+ "", "../dserver",
+ "--cfg", "none",
+ "--logger", "stdout",
+ "--logLevel", "error",
+ "--bindAddress", bindAddress,
+ "--port", fmt.Sprintf("%d", port),
+ )
+ if err != nil {
+ return err
+ }
+
+ time.Sleep(500 * time.Millisecond)
+
+ _, err = runCommand(ctx, t, outFile,
+ "../dcat", "--plain", "--cfg", "none",
+ "--servers", fmt.Sprintf("%s:%d", bindAddress, port),
+ "--files", inFile,
+ "--trustAllHosts",
+ "--noColor")
+ if err != nil {
+ return err
+ }
+
+ cancel()
+
+ if err := compareFiles(t, outFile, inFile); err != nil {
+ return err
+ }
+
+ os.Remove(outFile)
+ return nil
+}
+```
+
+### After:
+```go
+func testDCat1WithServer_Refactored(t *testing.T, inFile string) {
+ fileSet := &TestFileSet{
+ InputFile: inFile,
+ OutputFile: "dcat1.out",
+ ExpectedFile: inFile,
+ }
+ fileSet.Cleanup(t)
+
+ server := NewTestServer(t)
+ if err := server.Start("error"); err != nil {
+ t.Error(err)
+ return
+ }
+
+ args := NewCommandArgs()
+ args.Plain = true
+ args.Servers = []string{server.Address()}
+ args.Files = []string{inFile}
+ args.TrustAllHosts = true
+ args.NoColor = true
+
+ err := runCommandAndVerify(t, server.ctx, fileSet.OutputFile, fileSet.ExpectedFile,
+ "../dcat", args.ToSlice()...)
+ if err != nil {
+ t.Error(err)
+ }
+}
+```
+
+## Metrics
+
+Based on the examples:
+- **Lines of code reduction**: ~45%
+- **Boilerplate elimination**: ~70%
+- **Improved readability**: Subjective but significant
+- **Error-prone patterns eliminated**: Port management, cleanup, context handling
+
+## Next Steps
+
+1. Review and approve the helper functions
+2. Create a PR with `testhelpers.go`
+3. Incrementally refactor tests in separate PRs
+4. Document any new patterns that emerge
+5. Consider creating a test generator for common scenarios \ No newline at end of file
diff --git a/doc/turboboost_optimization.md b/doc/turboboost_optimization.md
new file mode 100644
index 0000000..f13943f
--- /dev/null
+++ b/doc/turboboost_optimization.md
@@ -0,0 +1,118 @@
+# DTail Turbo Boost Optimization
+
+## Overview
+
+This document describes the turbo boost optimization feature that provides significant performance improvements for DTail operations by using channel-less processing and optimized I/O.
+
+## Problem Statement
+
+The original dgrep implementation used multiple channels in a pipeline:
+- `rawLines chan *bytes.Buffer` (buffer: 100) - Raw lines read from file
+- `lines chan *line.Line` (buffer: 100) - Filtered lines to send to client
+
+This created several performance issues:
+1. Fixed channel buffer sizes causing blocking under high throughput
+2. Context switching overhead between goroutines
+3. Channel synchronization overhead
+4. Memory allocations for channel operations
+
+## Solution
+
+The channel-less implementation replaces the channel pipeline with direct function calls using a `LineProcessor` interface.
+
+### Key Components
+
+1. **LineProcessor Interface** (`internal/io/line/processor.go`)
+ - Defines methods for processing lines without channels
+ - `ProcessLine()` - Handle a single line
+ - `Flush()` - Ensure buffered data is written
+ - `Close()` - Clean up resources
+
+2. **GrepLineProcessor** (`internal/server/handlers/lineprocessor.go`)
+ - Implements LineProcessor for grep operations
+ - Writes directly to the network connection
+ - Uses internal buffering for efficiency (64KB buffer)
+ - Thread-safe with mutex protection
+
+3. **Modified File Reading** (`internal/io/fs/readfile_processor.go`)
+ - `StartWithProcessor()` - Channel-less file reading
+ - Direct callbacks instead of channel sends
+ - Inline regex filtering without goroutines
+
+4. **Optimized File Reading** (`internal/io/fs/readfile_processor_optimized.go`)
+ - Uses buffered line reading instead of byte-by-byte
+ - Custom scanner with 256KB buffer
+ - Efficient handling of long lines
+ - Special optimization for tail mode
+
+### Feature Flags
+
+The implementation can be controlled via environment variables:
+- `DTAIL_CHANNELLESS_GREP=yes` - Enable channel-less grep implementation
+- `DTAIL_OPTIMIZED_READER=yes` - Use optimized buffered reader
+
+### Benefits
+
+1. **Reduced Latency**: No channel queuing delays
+2. **Lower Memory Usage**: No channel buffers
+3. **Better CPU Efficiency**: Fewer context switches
+4. **Simpler Code Flow**: Direct processing without goroutine coordination
+5. **Predictable Performance**: No channel blocking
+
+### Backward Compatibility
+
+- Original channel-based implementation remains available
+- Same command-line interface
+- Protocol compatibility maintained
+- All integration tests pass unchanged
+
+### Performance Testing
+
+Use the provided script to compare performance:
+
+```bash
+./test_channelless_performance.sh
+```
+
+This will test:
+1. Original channel-based implementation
+2. Channel-less implementation
+3. Optimized channel-less implementation
+
+### Usage
+
+To use the channel-less implementation:
+
+```bash
+# Enable channel-less grep
+export DTAIL_CHANNELLESS_GREP=yes
+
+# Also enable optimized reader
+export DTAIL_OPTIMIZED_READER=yes
+
+# Run dgrep normally
+dgrep -regex "pattern" file.log
+```
+
+### Future Improvements
+
+1. Extend channel-less approach to other commands (dcat, dtail)
+2. Add configurable buffer sizes
+3. Implement zero-copy optimizations
+4. Add performance metrics collection
+5. Consider using io_uring on Linux for async I/O
+
+## Usage
+
+To enable turbo boost optimizations:
+
+```bash
+export DTAIL_TURBOBOOST_ENABLE=yes
+```
+
+This enables:
+- Channel-less implementation for grep and cat operations
+- Optimized buffered I/O reader (256KB buffer)
+- Buffer pooling to reduce memory allocations
+
+The turbo boost mode is designed to be extended to other DTail commands in the future.