diff options
| author | Paul Buetow <paul@buetow.org> | 2025-08-31 15:15:50 +0300 |
|---|---|---|
| committer | Paul Buetow <paul@buetow.org> | 2025-08-31 15:15:50 +0300 |
| commit | 114379d591fbbf6afeb649521be0a1aab2bbd3aa (patch) | |
| tree | 6a9d206b375b518c7cf6de70f0733630eafcfbcd /REVIEW.md | |
| parent | c508f9b24be667d5d078034b4d4e7bb07b28da69 (diff) | |
Implement REVIEW.md suggestions:\n- Simplify gemfeed detection in Aggregator\n- Make log globs configurable for Logreader\n- Remove debug-only imports; add comments\n- Add HTML::Entities usage; document deps\n- Add tests under t/ and make script require-safe\n- Add .perltidyrc and Justfile; add dev tasks\n- Update README and REVIEW with changes
Diffstat (limited to 'REVIEW.md')
| -rw-r--r-- | REVIEW.md | 59 |
1 files changed, 59 insertions, 0 deletions
diff --git a/REVIEW.md b/REVIEW.md new file mode 100644 index 0000000..1f4b52e --- /dev/null +++ b/REVIEW.md @@ -0,0 +1,59 @@ +# Review of foostats.pl + +This review provides suggestions for improving the `foostats.pl` Perl script. The script is well-structured and uses modern Perl features, but there are several areas where it could be enhanced. + +## 1. Code Duplication + +There are some instances of code duplication that could be refactored for better maintainability: + +* In `Foostats::Aggregator::add_feed_ips`, the code for handling `GEMFEED_URI` and `GEMFEED_URI_2` is identical. This could be simplified by using a loop or a regex. +* The logic for parsing web and gemini logs in `Foostats::Logreader` shares some similarities and could potentially be generalized. + +Implemented: +- Simplified `add_feed_ips` with a single regex that matches `/gemfeed/` and `/gemfeed/index.gmi` (including optional query strings), removing duplicated branches. +- Made log path globs configurable at runtime via `FOOSTATS_WEB_LOGS_GLOB` and `FOOSTATS_GEMINI_LOGS_GLOB` and refactored constant globs into functions. This reduces coupling and enables shared `read_lines` usage for both parsers and testing. + +## 2. Dependencies + +* **List dependencies in `README.md`:** At a minimum, list the required modules in the `README.md` file with instructions on how to install them. + +Implemented: +- Added a dedicated Dependencies section in `README.md` with core and CPAN modules. +- Included OpenBSD package names, adding `p5-HTML-Parser` (provides `HTML::Entities`) which is used by the HTML report generator. + +## 3. Testing + +The script lacks an automated test suite. Adding tests would significantly improve its reliability and make future development easier and safer. + +* **Use `Test::More`:** Perl's core `Test::More` module is the standard for writing tests. +* **Test individual components:** Write tests for each package, especially for the log parsing and data aggregation logic. +* **Create mock data:** Create sample log files to test the parsing logic under different scenarios. + +Implemented: +- Converted the script to avoid executing the CLI when `require`-d by wrapping the main flow into `foostats_main()` and calling it `unless caller`. This allows unit testing of internal packages. +- Added basic tests under `t/`: + - `t/00-load.t`: loads the script. + - `t/01-aggregator.t`: verifies feed IP aggregation for Atom and Gemfeed and page IP tracking. + - `t/02-filter.t`: verifies excessive request filtering behavior. + - `t/03-logreader-web.t`: parses a minimal mocked web access log via `FOOSTATS_WEB_LOGS_GLOB`. +- Introduced env-configurable log globs to enable tests to point at mock logs without touching system paths. +- Added `.perltidyrc` and a `Justfile` with `format`, `test`, `check`, and `clean` tasks to standardize formatting and developer workflow (replaced `Makefile`). + - Added a `reports` task to the `Justfile` to generate Gemtext and HTML reports locally from the repo's `stats/` directory into `out_gmi/` and `out_html/`. + +## 4. Readability and Style + +The script is generally well-written, but a few things could improve its readability: + +* **Remove debugging code:** The script contains `use diagnostics;` and `use Data::Dumper;` with "TODO: UNDO" comments. This debugging code should be removed from the production version of the script. +* **Add comments:** Some of the more complex parts of the script, like the data structures used for statistics, could benefit from comments explaining their structure and purpose. +* **Consistent formatting:** While the formatting is mostly consistent, a tool like `perltidy` could be used to enforce a standard style across the entire script. + +Implemented: +- Removed `use diagnostics;` and `use Data::Dumper;` from production code. +- Added comments describing the stats data model in `Foostats::Aggregator`. +- Added explicit `use HTML::Entities qw(encode_entities);` in `Foostats::Reporter` and documented dependency. + +Open items / Next steps: +- Consider further generalization between web and gemini parsers if beneficial (e.g., a unified parse event builder). +- Optionally run `perltidy` and adopt a formatter config for consistent style across the codebase. + - Consider adding more parser tests (Gemini vger/relayd sample lines, edge cases like query strings and unusual paths). |
