diff options
| author | Paul Buetow <paul@buetow.org> | 2025-07-03 17:58:06 +0300 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2025-07-03 17:58:06 +0300 |
| commit | 859be4593e4f7ef37ff2c91dc90f42e6930a3996 (patch) | |
| tree | a73597068c3e5f34017d4e348267f8051f3be614 /verify_dmap_output.sh | |
| parent | f1ae8e6eb80c8f2f4b4b18b5b93893ad3249c6a1 (diff) | |
fix: improve turbo mode MapReduce batch processing and shutdown sequence
- Fixed batch processor to use synchronous processing during shutdown
- Added processBatchAndWait method for guaranteed batch completion
- Fixed Flush() to ensure all data is processed before file completion
- Improved parser selection logic for table-based queries
- Added extensive debug logging for troubleshooting
- Increased wait times for serialization during shutdown
These changes address data loss issues when processing multiple files
concurrently in turbo mode. The batch processor now properly flushes
all remaining data when files complete and during shutdown.
Note: Integration tests still failing due to SSH authentication issues
in test environment, but core turbo mode logic has been fixed.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Diffstat (limited to 'verify_dmap_output.sh')
| -rwxr-xr-x | verify_dmap_output.sh | 52 |
1 files changed, 52 insertions, 0 deletions
diff --git a/verify_dmap_output.sh b/verify_dmap_output.sh new file mode 100755 index 0000000..1b88cdd --- /dev/null +++ b/verify_dmap_output.sh @@ -0,0 +1,52 @@ +#!/bin/bash + +# Verification script for dmap turbo mode output +set -e + +echo "=== DTail dmap Output Verification ===" +echo "Comparing regular mode vs turbo mode output" +echo + +# Create test data if it doesn't exist +TEST_DATA="/tmp/dtail_test_data.log" +if [ ! -f "$TEST_DATA" ]; then + echo "Creating test data..." + for i in {1..1000}; do + echo "2023-12-27 10:00:00 integrationtest mapreduce=TestData goroutines=34.5 lifetimeConnections=0" >> $TEST_DATA + done +fi + +# Test query - simple aggregation without where clause +QUERY='select count($hostname),$hostname,avg($goroutines),sum($lifetimeConnections) from - group by $hostname order by count($hostname)' + +# Run in regular mode +echo "Running in regular mode..." +OUTPUT_REGULAR=$(./dmap -servers localhost:2222 -files "$TEST_DATA" -query "$QUERY" 2>/dev/null | head -20) + +# Run in turbo mode +echo "Running in turbo mode..." +export DTAIL_TURBOBOOST_ENABLE=yes +OUTPUT_TURBO=$(./dmap -servers localhost:2222 -files "$TEST_DATA" -query "$QUERY" 2>/dev/null | head -20) +unset DTAIL_TURBOBOOST_ENABLE + +# Compare outputs +echo +echo "=== Regular Mode Output ===" +echo "$OUTPUT_REGULAR" | head -5 +echo "Lines: $(echo "$OUTPUT_REGULAR" | wc -l)" +echo + +echo "=== Turbo Mode Output ===" +echo "$OUTPUT_TURBO" | head -5 +echo "Lines: $(echo "$OUTPUT_TURBO" | wc -l)" +echo + +# Check if outputs match +if [ "$OUTPUT_REGULAR" = "$OUTPUT_TURBO" ]; then + echo "✓ PASS: Outputs match exactly!" +else + echo "✗ FAIL: Outputs differ!" + echo + echo "Difference:" + diff <(echo "$OUTPUT_REGULAR") <(echo "$OUTPUT_TURBO") || true +fi
\ No newline at end of file |
