summaryrefslogtreecommitdiff
path: root/REVIEW.md
diff options
context:
space:
mode:
authorPaul Buetow <paul@buetow.org>2025-08-31 15:15:50 +0300
committerPaul Buetow <paul@buetow.org>2025-08-31 15:15:50 +0300
commit114379d591fbbf6afeb649521be0a1aab2bbd3aa (patch)
tree6a9d206b375b518c7cf6de70f0733630eafcfbcd /REVIEW.md
parentc508f9b24be667d5d078034b4d4e7bb07b28da69 (diff)
Implement REVIEW.md suggestions:\n- Simplify gemfeed detection in Aggregator\n- Make log globs configurable for Logreader\n- Remove debug-only imports; add comments\n- Add HTML::Entities usage; document deps\n- Add tests under t/ and make script require-safe\n- Add .perltidyrc and Justfile; add dev tasks\n- Update README and REVIEW with changes
Diffstat (limited to 'REVIEW.md')
-rw-r--r--REVIEW.md59
1 files changed, 59 insertions, 0 deletions
diff --git a/REVIEW.md b/REVIEW.md
new file mode 100644
index 0000000..1f4b52e
--- /dev/null
+++ b/REVIEW.md
@@ -0,0 +1,59 @@
+# Review of foostats.pl
+
+This review provides suggestions for improving the `foostats.pl` Perl script. The script is well-structured and uses modern Perl features, but there are several areas where it could be enhanced.
+
+## 1. Code Duplication
+
+There are some instances of code duplication that could be refactored for better maintainability:
+
+* In `Foostats::Aggregator::add_feed_ips`, the code for handling `GEMFEED_URI` and `GEMFEED_URI_2` is identical. This could be simplified by using a loop or a regex.
+* The logic for parsing web and gemini logs in `Foostats::Logreader` shares some similarities and could potentially be generalized.
+
+Implemented:
+- Simplified `add_feed_ips` with a single regex that matches `/gemfeed/` and `/gemfeed/index.gmi` (including optional query strings), removing duplicated branches.
+- Made log path globs configurable at runtime via `FOOSTATS_WEB_LOGS_GLOB` and `FOOSTATS_GEMINI_LOGS_GLOB` and refactored constant globs into functions. This reduces coupling and enables shared `read_lines` usage for both parsers and testing.
+
+## 2. Dependencies
+
+* **List dependencies in `README.md`:** At a minimum, list the required modules in the `README.md` file with instructions on how to install them.
+
+Implemented:
+- Added a dedicated Dependencies section in `README.md` with core and CPAN modules.
+- Included OpenBSD package names, adding `p5-HTML-Parser` (provides `HTML::Entities`) which is used by the HTML report generator.
+
+## 3. Testing
+
+The script lacks an automated test suite. Adding tests would significantly improve its reliability and make future development easier and safer.
+
+* **Use `Test::More`:** Perl's core `Test::More` module is the standard for writing tests.
+* **Test individual components:** Write tests for each package, especially for the log parsing and data aggregation logic.
+* **Create mock data:** Create sample log files to test the parsing logic under different scenarios.
+
+Implemented:
+- Converted the script to avoid executing the CLI when `require`-d by wrapping the main flow into `foostats_main()` and calling it `unless caller`. This allows unit testing of internal packages.
+- Added basic tests under `t/`:
+ - `t/00-load.t`: loads the script.
+ - `t/01-aggregator.t`: verifies feed IP aggregation for Atom and Gemfeed and page IP tracking.
+ - `t/02-filter.t`: verifies excessive request filtering behavior.
+ - `t/03-logreader-web.t`: parses a minimal mocked web access log via `FOOSTATS_WEB_LOGS_GLOB`.
+- Introduced env-configurable log globs to enable tests to point at mock logs without touching system paths.
+- Added `.perltidyrc` and a `Justfile` with `format`, `test`, `check`, and `clean` tasks to standardize formatting and developer workflow (replaced `Makefile`).
+ - Added a `reports` task to the `Justfile` to generate Gemtext and HTML reports locally from the repo's `stats/` directory into `out_gmi/` and `out_html/`.
+
+## 4. Readability and Style
+
+The script is generally well-written, but a few things could improve its readability:
+
+* **Remove debugging code:** The script contains `use diagnostics;` and `use Data::Dumper;` with "TODO: UNDO" comments. This debugging code should be removed from the production version of the script.
+* **Add comments:** Some of the more complex parts of the script, like the data structures used for statistics, could benefit from comments explaining their structure and purpose.
+* **Consistent formatting:** While the formatting is mostly consistent, a tool like `perltidy` could be used to enforce a standard style across the entire script.
+
+Implemented:
+- Removed `use diagnostics;` and `use Data::Dumper;` from production code.
+- Added comments describing the stats data model in `Foostats::Aggregator`.
+- Added explicit `use HTML::Entities qw(encode_entities);` in `Foostats::Reporter` and documented dependency.
+
+Open items / Next steps:
+- Consider further generalization between web and gemini parsers if beneficial (e.g., a unified parse event builder).
+- Optionally run `perltidy` and adopt a formatter config for consistent style across the codebase.
+ - Consider adding more parser tests (Gemini vger/relayd sample lines, edge cases like query strings and unusual paths).