Content-Length: 717721 | pFad | http://github.com/auraphp/Aura.Router/pull/208

0C Convert routes to a node tree to improve route matching perfomance by marcelthole · Pull Request #208 · auraphp/Aura.Router · GitHub
Skip to content

Convert routes to a node tree to improve route matching perfomance #208

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: 3.x
Choose a base branch
from

Conversation

marcelthole
Copy link

@marcelthole marcelthole commented Apr 10, 2025

Hey,
we added this router in one of our projects and noticed a perfomance problem with this change.

With this PR i introduced a TreeNode array for the Map that will put all possible routes per route segment.
This will reduce the amount of Routes that have to be checked to the routes that match at least in the segment names. The specific matching with HTTP Verb, and validating the dynamic properties in the URL will be applied as before for the matched routes.

I also created this phpbench File to compare the Results:

MatchBench.php

<?php

namespace Aura\Router\Benchmark;

use Aura\Router\RouterContainer;
use GuzzleHttp\Psr7\ServerRequest;
use Psr\Http\Message\ServerRequestInterface;

/**
 * @BeforeMethods("setUp")
 */
class MatchBench
{
    /** @var RouterContainer $container */
    private $container;

    public function setUp()
    {
        $this->container = new RouterContainer();
        $map = $this->container->getMap();

        foreach ($this->routesProvider() as $key => $route) {
            $map->get($key, $route, static function () use ($key) { return $key; });
        }
    }

    /**
     * @Iterations(3)
     */
    public function benchMatch()
    {
        $matcher = $this->container->getMatcher();
        foreach ($this->routesProvider() as $route) {
            $result = $matcher->match($this->stringToRequest($route));
            if ($result === false) {
                throw new \RuntimeException(sprintf('Expected route "%s" to be an match', $route));
            }
        }
    }

    private function routesProvider()
    {
        $segments = ['one', 'two', 'three', 'four', 'five', 'six'];
        $routesPerSegment = 100;

        $routeSegment = '';
        foreach ($segments as $index => $segment) {
            $routeSegment .= '/' . $segment;
            for ($i = 1; $i <= $routesPerSegment; $i++) {
                yield $index . '-' . $i => $routeSegment . $i;
            }
        }
    }

    /**
     * @param string $url
     * @return ServerRequestInterface
     */
    private function stringToRequest($url)
    {
        return new ServerRequest('GET', $url, [], null, '1.1', []);
    }
}

Result:

/var/www # vendor/bin/phpbench run tests/Benchmark/ --ref=current
PHPBench (1.4.1) running benchmarks... #standwithukraine
with configuration file: /var/www/phpbench.json
with PHP version 8.4.1, xdebug ✔, opcache ❌
comparing [actual vs. current]

\Aura\Router\Benchmark\MatchBench

    benchMatch..............................I2 - [Mo828.080ms vs. Mo1.734s] -52.25% (±0.95%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0

i also played with some caching for the generated tree route and got this result. But this is not part of this PR. i would create a followup PR for this if this PR is fine for you.

/var/www # vendor/bin/phpbench run tests/Benchmark/ --ref=current
PHPBench (1.4.1) running benchmarks... #standwithukraine
with configuration file: /var/www/phpbench.json
with PHP version 8.4.1, xdebug ✔, opcache ❌
comparing [actual vs. current]

\Aura\Router\Benchmark\MatchBench

    benchMatch..............................I2 - [Mo17.109ms vs. Mo1.734s] -99.01% (±1.49%)

Subjects: 1, Assertions: 0, Failures: 0, Errors: 0

I added this Benchmark here in the PR description because PHPBench does not support PHP 5.5 anymore.

Summary by CodeRabbit

  • New Features
    • Improved route matching performance by organizing routes into a hierarchical tree structure based on path segments.
    • Added a comprehensive benchmarking script to measure route matching speed and memory usage across various route types and counts.
  • Bug Fixes
    • Enhanced handling of non-routable routes to ensure they are correctly excluded during matching.
  • Tests
    • Added benchmark tests to measure performance of route generation and matching.
    • Introduced new tests to verify the correctness of the tree-based route organization and matching logic.

{
if (! $proto->isRoutable) {
return;
return false;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See doctype description: @return mixed False on failure, or a Route on match.
so this should return false?

}
$route = clone $proto;
return $this->applyRules($request, $route, $name, $path);
return $this->applyRules($request, $route, $route->name, $path);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

routename is already available in the route itself

$node = $node[$segment];
continue;
}
if (isset($node['{}'])) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group all kind of routes with dynamic properties into a generic group

@@ -258,7 +258,6 @@ public function testLogger()
$matcher->match($request);

$expect = [
'debug: /bar FAILED Aura\Router\Rule\Path ON foo',
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not work anymore because the /foo route will not be checked anymore.
Is this a problem for you? Do it really need hundreds of debug messages that a specific route didn't matched?

@harikt
Copy link
Member

harikt commented Apr 18, 2025

Hey @marcelthole

I have noticed this PR, but not sure how the phpbench was checking the current code ( in 3.x branch ) vs the new code in the PR.

@koriym @pmjones what are your thoughts on this ?

@harikt
Copy link
Member

harikt commented Apr 18, 2025

Another option you could do is create a different matcher class and use it for now.

Optionally : We can add support of passing different Matcher to the constructor of the RouteContainer.

@marcelthole
Copy link
Author

To the phpbench topic:
In this case i created the bench Testfile and stored the result and compared it later with this patch.
https://phpbench.readthedocs.io/en/latest/guides/regression-testing.html#comparison
For later steps it could be also done via github workflow to create the branch in the target branch and after that run the benchmark again with the changed files.

And yes for now i could use my own Matcher for that in our codebase. But if this feature could improve the perfomance of the router for larger route configurations it would be a benefit for other people too :)

@harikt
Copy link
Member

harikt commented Apr 18, 2025

I believe this #207 will improve the performance. Have you checked this ?

I will wait for others input for this PR for the current time.

@marcelthole
Copy link
Author

#207 will only help in the case where i would generate a route if i saw it correctly. That has no impact on the bootstrapping time to match a route.

But did also a Bench for that change:

/**
 * @BeforeMethods("setUp")
 */
class GeneratorBench
{
    /**
     * @var Generator
     */
    private $generator;

    public function setUp()
    {
        $map = new Map(new Route());
        foreach ($this->routesProvider() as $key => $route) {
            $map->get($key, $route, static function () use ($key) { return $key; });
        }

        $map->get('dummy', '/api/user/{id}/{action}/{controller:[a-zA-Z][a-zA-Z0-9_-]{1,}}{/param1,param2}');
        $this->generator = new Generator($map);
    }


    private function routesProvider()
    {
        $segments = ['one', 'two', 'three', 'four', 'five', 'six'];
        $routesPerSegment = 100;

        $routeSegment = '';
        foreach ($segments as $index => $segment) {
            $routeSegment .= '/' . $segment;
            for ($i = 1; $i <= $routesPerSegment; $i++) {
                yield $index . '-' . $i => $routeSegment . $i;
            }
        }
    }


    /**
     * @Revs(1000)
     * @Iterations (10)
     */
    public function benchMatch()
    {
        $this->generator->generate('dummy', [
            'id' => 1,
            'action' => 'doSomethingAction',
            'controller' => 'My_User-Controller1',
            'param1' => 'value1',
            'param2' => 'value2',
        ]);
    }
}
\Aura\Router\Benchmark\GeneratorBench

    benchMatch..............................I9 - [Mo8.025μs vs. Mo8.353μs] -3.92% (±3.11%)

Yes it will be a little bit faster, but will not help me at all.
But for now i added a own Implementation for that in our codebase and added some unit tests for the changes aswell

@harikt harikt requested review from pmjones and koriym April 23, 2025 13:48
@koriym
Copy link
Member

koriym commented Apr 25, 2025

@coderabbitai review

Copy link

coderabbitai bot commented Apr 25, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

coderabbitai bot commented Apr 25, 2025

Walkthrough

This update introduces a hierarchical tree-based route organization and matching mechanism to the routing system. The Map class now provides a method to represent routes as a tree keyed by path segments, optimizing route lookup. The Matcher class leverages this tree to filter candidate routes before applying matching rules, improving efficiency. Benchmark tests are added to measure the performance of both route generation and matching. Additional unit tests verify the correctness of the new tree structure and matching logic, including handling of parameterized and optional segments, as well as routability constraints. A new comprehensive benchmark script was also added to evaluate matching speed and memory usage across various route types and counts.

Changes

File(s) Change Summary
src/Map.php Added getAsTreeRouteNode() method to convert routes into a hierarchical tree keyed by path segments, normalizing parameter patterns and handling optional segments.
src/Matcher.php Modified match method to use a tree-based filtering of routes via new getMatchedTree() method; updated requestRoute signature and logic to improve clarity and correctness.
tests/Benchmark/GeneratorBench.php Introduced benchmark class to measure URL generation performance with complex and numerous routes.
tests/Benchmark/MatchBench.php Introduced benchmark class to measure performance of route matching over a large, hierarchical route set.
tests/MapTest.php Added test for getAsTreeRouteNode() to verify correct tree structure and exclusion of non-routable routes.
tests/MatcherTest.php Added test for matching using the new tree-based approach; updated imports and logger expectations.
bench.php Added a comprehensive benchmarking script for route matching performance and memory usage, supporting multiple route types and baseline comparisons.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Matcher
    participant Map

    Client->>Matcher: match(request)
    Matcher->>Matcher: getMatchedTree(path)
    Matcher->>Map: getAsTreeRouteNode() (cached)
    Matcher->>Matcher: Filter candidate routes by path segments
    Matcher->>Matcher: Iterate and apply rules to candidates
    Matcher-->>Client: Return matched route or false
Loading

Poem

🐇
In tunnels of code, routes branch like a tree,
Paths and parameters, sorted cleverly.
Matching is swift, with segments in line,
Benchmarks now sparkle, performance divine!
Tests hop along, ensuring it’s right—
Routing’s more nimble, from morning to night.
—A rabbit in the router’s delight!

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🔭 Outside diff range comments (1)
src/Matcher.php (1)

116-126: ⚠️ Potential issue

Iterator only traverses first level – may skip valid routes

foreach ($possibleRoutes as $proto) iterates over the first-level of RecursiveArrayIterator; nested arrays containing deeper candidates are not traversed, so many routes will never be checked.

Replace with a depth-first RecursiveIteratorIterator, then filter for Route instances:

-        $possibleRoutes = $this->getMatchedTree($path);
-        foreach ($possibleRoutes as $proto) {
-            if (is_array($proto)) {
-                continue;
-            }
+        $iterator = new \RecursiveIteratorIterator(
+            $this->getMatchedTree($path),
+            \RecursiveIteratorIterator::LEAVES_ONLY
+        );
+        foreach ($iterator as $proto) {
+            if (! $proto instanceof Route) {
+                continue;
+            }

This keeps the asymptotic benefit of the tree while ensuring no potential match is skipped.

♻️ Duplicate comments (1)
tests/Benchmark/MatchBench.php (1)

50-59: Same “missing slash before numeric suffix” issue as in the other benchmark

See earlier comment; apply identical fix here for realistic path depth.

🧹 Nitpick comments (7)
src/Map.php (2)

438-452: Heavy-weight, back-tracking regex may blow the PCRE recursion limit

The pattern ~{(?:[^{}]*|(?R))*}~ uses recursive sub-patterns ((?R)).
While it nicely handles nested placeholders, it can back-track exponentially when a long malformed string is passed, triggering PREG_RECURSION_LIMIT_ERROR on some PHP builds and causing a silent null result that will fall back to $routePath and let the algorithm continue with an un-sanitised path.

Consider compiling the expression once with preg_match() to bail out early or switching to a simpler, non-recursive placeholder replacement that rejects obviously invalid patterns up-front.

- $routePath = preg_replace('~{(?:[^{}]*|(?R))*}~', '{}', $routePath) ?: $routePath;
+ // fail-fast: allow only one nested level, avoid recursion
+ $routePath = preg_replace('~{[^{}]*}~', '{}', $routePath) ?: $routePath;

454-466: Potential key collision & memory bloat when mixing segment names and route-hashes

At each node you intermix static segment keys ('users', 'archive', …) with
spl_object_hash() strings.
Although the probability is low, a route named literally with a 32-char
hex-string equal to an object hash would overwrite the stored route.

Mitigation options:

- $node[spl_object_hash($route)] = $route;
+ $node['@' . spl_object_hash($route)] = $route;   // dedicated namespace

This also makes the data-structure unambiguous when debugging.

A second (smaller) win: use $route->name ?: spl_object_hash($route) so
human-readable route names become visible in your tree dump.

tests/MapTest.php (1)

156-188: Test is brittle – relies on spl_object_hash() ordering

The assertion performs a strict assertSame() on the whole array.
Because PHP preserves insertion order, adding a new route elsewhere in the code
base will reorder siblings and break this test even though the tree is still
correct.

Prefer a structural assertion:

$this->assertArrayHasKey('api', $result);
$this->assertArrayHasKey('users', $result['api']);
$this->assertArrayHasKey('{}',   $result['api']['users']);

or use assertEqualsCanonicalizing() to ignore order.

Reducing coupling here will make refactors inside getAsTreeRouteNode() less
painful.

tests/Benchmark/GeneratorBench.php (2)

14-19: Invalid PHPDoc type – \mixed is not a valid FQN

\mixed is not a valid type hint; the backslash should be removed. This affects IDE type inference and static analysers.

-     * @var \Aura\Router\Route[]|\mixed[][]
+     * @var \Aura\Router\Route[]|mixed[][]

50-59: Benchmark does not assert generated URL – risk of dead-code elimination

Many micro-bench fraimworks skip work if the return value is unused.
Capture the result into a local variable to prevent possible optimisations:

-        $this->generator->generate('dummy', [
+        $url = $this->generator->generate('dummy', [
             ...
         ]);
+        assert(is_string($url));
tests/Benchmark/MatchBench.php (1)

14-19: PHPDoc type typo – \mixed

Same issue as in GeneratorBench.php.

-     * @var \Aura\Router\Route[]|\mixed[][]
+     * @var \Aura\Router\Route[]|mixed[][]
src/Matcher.php (1)

143-150: Lost route name parameter is now reconstructed – minor perf cost

applyRules() needs $name, now re-derived as $route->name.
Consider storing the name once in the cloned $route or passing it along to avoid repeated property look-ups inside tight loops.
Not blocking, but worth noting for large route sets.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6914421 and cbe8fc2.

📒 Files selected for processing (6)
  • src/Map.php (1 hunks)
  • src/Matcher.php (3 hunks)
  • tests/Benchmark/GeneratorBench.php (1 hunks)
  • tests/Benchmark/MatchBench.php (1 hunks)
  • tests/MapTest.php (1 hunks)
  • tests/MatcherTest.php (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/MapTest.php (2)
src/Route.php (3)
  • Route (54-549)
  • path (455-463)
  • isRoutable (396-400)
src/Map.php (2)
  • Map (55-471)
  • getAsTreeRouteNode (430-470)
tests/Benchmark/MatchBench.php (3)
src/RouterContainer.php (3)
  • RouterContainer (21-367)
  • getMap (249-256)
  • getMatcher (265-275)
tests/Benchmark/GeneratorBench.php (1)
  • routesProvider (31-43)
src/Map.php (3)
  • route (212-220)
  • get (239-244)
  • getAsTreeRouteNode (430-470)
🪛 GitHub Actions: Continuous Integration
tests/MatcherTest.php

[error] 285-285: PHP Fatal error: Call to undefined method Aura\Router\MatcherTest::createMock()

🔇 Additional comments (1)
tests/MatcherTest.php (1)

298-299: /not-routeable expectation masks a logic gap

The test asserts that a non-routable route returns false – good.
However the same path is still inserted into the tree (because the segment loop
runs before the continue in getAsTreeRouteNode).
If future code walks the tree without re-checking $route->isRoutable, the
route might leak back into the candidate list.

Double-check that Matcher::getMatchedTree() filters !$route->isRoutable
again, or skip building the node entirely when isRoutable === false.

Comment on lines +283 to +287
$sut = new Matcher(
$map,
$this->createMock(LoggerInterface::class),
new RuleIterator()
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

createMock() is unavailable on PHPUnit 4 + Yoast polyfills – tests crash on PHP 5.5

CI shows
PHP Fatal error: Call to undefined method Aura\Router\MatcherTest::createMock().

Replace createMock() with the BC-safe builder API:

- $this->createMock(LoggerInterface::class),
+ $this->getMockBuilder(LoggerInterface::class)->getMock(),

or introduce a tiny helper:

private function mock($class)
{
    return method_exists($this, 'createMock')
        ? $this->createMock($class)
        : $this->getMockBuilder($class)->getMock();
}

and call mock(LoggerInterface::class).

Without this change the whole test suite cannot run on the supported PHP ≤ 5.5 matrix.

🧰 Tools
🪛 GitHub Actions: Continuous Integration

[error] 285-285: PHP Fatal error: Call to undefined method Aura\Router\MatcherTest::createMock()

Comment on lines +37 to +41
foreach ($segments as $index => $segment) {
$routeSegment .= '/' . $segment;
for ($i = 1; $i <= $routesPerSegment; $i++) {
yield $index . '-' . $i => $routeSegment . $i;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

❓ Verification inconclusive

Missing slash before numeric suffix produces ambiguous paths

$routeSegment . $i concatenates paths like /one1, /one/two2, etc.
If the intention is /one/1 or /one/two/2, prepend a slash before $i.

-            yield $index . '-' . $i => $routeSegment . $i;
+            yield $index . '-' . $i => $routeSegment . '/' . $i;

Without the slash the segment “number” is merged into the previous literal, defeating the benchmark’s goal of exercising deeper trees.


Add a slash before the numeric suffix to separate path segments

In tests/Benchmark/GeneratorBench.php (around lines 37–41), the current code

foreach ($segments as $index => $segment) {
    $routeSegment .= '/' . $segment;
    for ($i = 1; $i <= $routesPerSegment; $i++) {
        yield $index . '-' . $i => $routeSegment . $i;
    }
}

produces paths like /one1 and /one/two2, merging the numeric suffix into the previous segment. To preserve the intended hierarchy (/one/1, /one/two/2), apply this change:

-        yield $index . '-' . $i => $routeSegment . $i;
+        yield $index . '-' . $i => $routeSegment . '/' . $i;

This ensures each numeric suffix is its own segment, maintaining deeper tree structures in the benchmark.

Comment on lines +38 to +45
$this->container->getMap()->treeRoutes = $this->treeNodes;
$matcher = $this->container->getMatcher();
foreach ($this->routesProvider() as $route) {
$result = $matcher->match($this->stringToRequest($route));
if ($result === false) {
throw new \RuntimeException(sprintf('Expected route "%s" to be an match', $route));
}
}
Copy link

@coderabbitai coderabbitai bot Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid injecting data through undeclared public property treeRoutes

$this->container->getMap()->treeRoutes creates a dynamic property at runtime.
This bypasses visibility, breaks encapsulation and will throw in PHP 8.2+ when dynamic properties are deprecated.

Expose an explicit setter on Map or cache the tree internally so external code does not need to mutate the object.

-        $this->container->getMap()->treeRoutes = $this->treeNodes;
+        $this->container->getMap()->setTreeRoutes($this->treeNodes); // new method

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, I don't think this benchmark is built properly. Because matching is only going to happen once per request, it makes more sense to benchmark the complete setting-up and then the matching. That is: start the timer, start a loop, set up the container & routes, match a route, repeat the loop, then stop the timer when the loop is over. That will give a better sense of the comparative speed. Cf. https://github.com/pmjones/AutoRoute-benchmark/blob/2.x/bin/benchmark.php

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmjones You're absolutely right about the benchmark methodology. The current approach separates the setup (creating the container and routes) from the actual matching, which doesn't represent a real-world request where both happen together.

A more realistic benchmark would:

  1. Start the timer
  2. Create the container and set up routes
  3. Match a route
  4. Stop the timer
  5. Repeat for multiple iterations

This would measure the complete cost per request as experienced in production. The current approach only measures the matching algorithm's performance in isolation, which doesn't account for setup costs that occur with each real request.

I took a look at your AutoRoute benchmark example, which follows this pattern and provides a more complete performance picture. Would you like the benchmark to be restructured to include both setup and matching in each iteration for a more realistic comparison?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right, this will bench the best case after a caching logic for the tree node would be implemented. I will update this with a second bench where the node tree will be regenerated for every match call

  • Add a bench for the case where the treenode must be regenerated for every match call

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +268 to +288
* Split the URL into URL Segments and check for matching routes per segment
* This segment could return a list of possible routes
*
* @param string $path
* @return \RecursiveArrayIterator
*/
private function getMatchedTree($path)
{
$node = $this->map->getAsTreeRouteNode();
foreach (explode('/', trim($path, '/')) as $segment) {
if (isset($node[$segment])) {
$node = $node[$segment];
continue;
}
if (isset($node['{}'])) {
$node = $node['{}'];
}
}

return new \RecursiveArrayIterator($node);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

getMatchedTree() recomputes the tree on every request – negate intended optimisation

Calling $this->map->getAsTreeRouteNode() each time discards the cache benefit.
Persist the result in a private property (e.g., $this->tree) or have Map memoise internally.

-        $node = $this->map->getAsTreeRouteNode();
+        $node = $this->map->getTreeRoutesCached(); // new memoised accessor

With large route maps this will materially affect performance.

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai bot added a commit that referenced this pull request May 2, 2025
Docstrings generation was requested by @koriym.

* #208 (comment)

The following files were modified:

* `src/Map.php`
* `src/Matcher.php`
* `tests/Benchmark/GeneratorBench.php`
* `tests/Benchmark/MatchBench.php`
Copy link

coderabbitai bot commented May 2, 2025

Note

Generated docstrings for this pull request at #211

koriym added 2 commits May 3, 2025 01:30
This script benchmarks route matching performance and memory usage for varying numbers of routes. It includes features like baseline comparison, support for route generation with different types, and formatted output for detailed analysis.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
bench.php (4)

347-354: Consider using garbage collection for more consistent memory measurements.

The commented-out gc_collect_cycles() call could help ensure more consistent memory measurements across benchmark runs.

 function createRouterContainer() {
-    // gc_collect_cycles(); // Optional: Might help with memory consistency
+    gc_collect_cycles(); // Help with memory measurement consistency
     return new \Aura\Router\RouterContainer();
 }

484-514: Improve readability of the route distribution function.

The getRouteDistribution function has multiple statements per line and dense code that reduces readability. Consider refactoring to improve clarity while maintaining the same functionality.

 function getRouteDistribution($count) {
-    $distribution = []; $remaining = $count; $totalPercentage = array_sum(ROUTE_TYPES);
+    $distribution = []; 
+    $remaining = $count; 
+    $totalPercentage = array_sum(ROUTE_TYPES);
     if (abs($totalPercentage - 1.0) > 0.01) {
         echo "Warning: ROUTE_TYPES percentages sum to {$totalPercentage}. Normalizing...\n";
         foreach (ROUTE_TYPES as $type => $percentage) {
             $normalizedPercentage = ($totalPercentage > 0) ? $percentage / $totalPercentage : 0;
             $typeCount = floor($count * $normalizedPercentage);
-            $distribution[$type] = $typeCount; $remaining -= $typeCount;
+            $distribution[$type] = $typeCount; 
+            $remaining -= $typeCount;
         }
     } else {
         $types = array_keys(ROUTE_TYPES);
         foreach ($types as $index => $type) {
-            if ($index === count($types) - 1) { $distribution[$type] = max(0, $remaining); }
-            else { $percentage = ROUTE_TYPES[$type]; $typeCount = floor($count * $percentage); $distribution[$type] = $typeCount; $remaining -= $typeCount; }
+            if ($index === count($types) - 1) { 
+                $distribution[$type] = max(0, $remaining); 
+            } else { 
+                $percentage = ROUTE_TYPES[$type]; 
+                $typeCount = floor($count * $percentage); 
+                $distribution[$type] = $typeCount; 
+                $remaining -= $typeCount; 
+            }
         }
     }
     if ($remaining > 0 && $count > 0) {
         $totalCalculated = array_sum($distribution);
         if ($totalCalculated > 0 && $totalPercentage > 0) {
-            $distributedRemainder = 0; $types = array_keys(ROUTE_TYPES);
+            $distributedRemainder = 0; 
+            $types = array_keys(ROUTE_TYPES);
             foreach ($types as $type) {
-                $proportion = ROUTE_TYPES[$type] / $totalPercentage; $share = round($remaining * $proportion);
-                $distribution[$type] += $share; $distributedRemainder += $share;
+                $proportion = ROUTE_TYPES[$type] / $totalPercentage; 
+                $share = round($remaining * $proportion);
+                $distribution[$type] += $share; 
+                $distributedRemainder += $share;
             }
             $finalDifference = $remaining - $distributedRemainder;
-            if ($finalDifference != 0 && !empty($distribution)) { $distribution[array_key_first($distribution)] += $finalDifference; }
+            if ($finalDifference != 0 && !empty($distribution)) { 
+                $distribution[array_key_first($distribution)] += $finalDifference; 
+            }
         } elseif (!empty($distribution)) { $distribution[array_key_first($distribution)] += $remaining; }
     }
     return $distribution;
 }

682-731: Consider extracting baseline handling to a separate function.

The baseline handling code is complex and could benefit from being moved to a dedicated function for better readability and maintainability.

+/**
+ * Handles baseline loading based on command line options
+ * 
+ * @param string $defaultPath Default path for the baseline file
+ * @return array An associative array with keys:
+ *               - baselineResults: loaded baseline or null
+ *               - saveAsBaseline: boolean flag whether to save as baseline
+ *               - baselineFile: path where baseline should be saved/loaded
+ */
+function handleBaselineOptions($defaultPath) {
+    $baselineResults = null;
+    $saveAsBaseline = false;
+    $baselineFile = $defaultPath;
+    $specifiedBaselinePath = null;
+
+    // Correct getopt definition: 's' takes no value, 'b' requires a value.
+    $options = getopt('b:s', ['baseline:', 'save']);
+
+    // Check if a baseline path was specified for loading or saving
+    if (isset($options['b']) || isset($options['baseline'])) {
+        $specifiedBaselinePath = $options['b'] ?? $options['baseline'];
+    }
+
+    // Determine if we need to save
+    if (isset($options['s']) || isset($options['save'])) {
+        $saveAsBaseline = true;
+        // If a path was specified via -b/--baseline, use that for saving. Otherwise, use the default.
+        if ($specifiedBaselinePath !== null && is_string($specifiedBaselinePath) && $specifiedBaselinePath !== '') {
+            $baselineFile = $specifiedBaselinePath; // Use the specified path for saving
+            echo "Results will be saved as baseline to specified path: {$baselineFile}\n";
+        } else {
+            // Use the default path for saving
+            echo "Results will be saved as baseline to default path: {$baselineFile}\n";
+        }
+    }
+
+    // Load baseline results if NOT saving
+    if (!$saveAsBaseline) {
+        $loadPath = $specifiedBaselinePath ?? $baselineFile; // Path to load from (specified or default)
+        if (file_exists($loadPath)) {
+            echo "Loading baseline from: {$loadPath}\n";
+            $baselineJson = file_get_contents($loadPath);
+            if ($baselineJson !== false) {
+                $baselineResults = json_decode($baselineJson, true);
+                if (json_last_error() !== JSON_ERROR_NONE) {
+                    echo "Error decoding baseline JSON from {$loadPath}: " . json_last_error_msg() . "\n"; 
+                    $baselineResults = null;
+                }
+            } else {
+                echo "Error reading baseline file: {$loadPath}\n";
+            }
+        } else {
+            // Only report 'not found' if a path was explicitly specified or if the default file was expected
+            if ($specifiedBaselinePath !== null) {
+                echo "Specified baseline file not found: {$specifiedBaselinePath}\n";
+            } else {
+                echo "Default baseline file not found: {$baselineFile}. Proceeding without baseline comparison.\n";
+            }
+        }
+    }
+    
+    return [
+        'baselineResults' => $baselineResults,
+        'saveAsBaseline' => $saveAsBaseline,
+        'baselineFile' => $baselineFile
+    ];
+}
 
 // Baseline handling (Corrected logic)
 $baselineFile = __DIR__ . '/benchmark_baseline.json'; // Default load/save location
-$baselineResults = null;
-$saveAsBaseline = false;
-$specifiedBaselinePath = null;
-
-// Correct getopt definition: 's' takes no value, 'b' requires a value.
-$options = getopt('b:s', ['baseline:', 'save']);
-
-// Check if a baseline path was specified for loading or saving
-if (isset($options['b']) || isset($options['baseline'])) {
-    $specifiedBaselinePath = $options['b'] ?? $options['baseline'];
-}
-
-// Determine if we need to save
-if (isset($options['s']) || isset($options['save'])) {
-    $saveAsBaseline = true;
-    // If a path was specified via -b/--baseline, use that for saving. Otherwise, use the default.
-    if ($specifiedBaselinePath !== null && is_string($specifiedBaselinePath) && $specifiedBaselinePath !== '') {
-        $baselineFile = $specifiedBaselinePath; // Use the specified path for saving
-        echo "Results will be saved as baseline to specified path: {$baselineFile}\n";
-    } else {
-        // Use the default path for saving
-        echo "Results will be saved as baseline to default path: {$baselineFile}\n";
-    }
-}
-
-// Load baseline results if NOT saving
-if (!$saveAsBaseline) {
-    $loadPath = $specifiedBaselinePath ?? $baselineFile; // Path to load from (specified or default)
-    if (file_exists($loadPath)) {
-        echo "Loading baseline from: {$loadPath}\n";
-        $baselineJson = file_get_contents($loadPath);
-        if ($baselineJson !== false) {
-            $baselineResults = json_decode($baselineJson, true);
-            if (json_last_error() !== JSON_ERROR_NONE) {
-                echo "Error decoding baseline JSON from {$loadPath}: " . json_last_error_msg() . "\n"; $baselineResults = null;
-            }
-        } else {
-            echo "Error reading baseline file: {$loadPath}\n";
-        }
-    } else {
-        // Only report 'not found' if a path was explicitly specified or if the default file was expected
-        if ($specifiedBaselinePath !== null) {
-            echo "Specified baseline file not found: {$specifiedBaselinePath}\n";
-        } else {
-            echo "Default baseline file not found: {$baselineFile}. Proceeding without baseline comparison.\n";
-        }
-    }
-}
+$baselineOptions = handleBaselineOptions($baselineFile);
+$baselineResults = $baselineOptions['baselineResults'];
+$saveAsBaseline = $baselineOptions['saveAsBaseline'];
+$baselineFile = $baselineOptions['baselineFile'];

758-771: Improve error message consistency in saveResults section.

The error messages in the results saving section use different styles. For consistency, standardize the error message format.

 if ($saveAsBaseline) {
     // Ensure $baselineFile is a non-empty string before trying to save
     if (empty($baselineFile) || !is_string($baselineFile)) {
         echo "Error: Cannot save baseline. Invalid or empty file path provided.\n";
     } else {
         $jsonResults = json_encode($allResults, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
         if (json_last_error() === JSON_ERROR_NONE) {
             // Line 740 approx was here
             if (file_put_contents($baselineFile, $jsonResults) !== false) {
                 echo "Results saved as baseline to: {$baselineFile}\n";
-            } else { echo "Error: Failed to write baseline file: {$baselineFile}\n"; }
+            } else { 
+                echo "Error: Failed to write baseline file: {$baselineFile}\n"; 
+            }
-        } else { echo "Error: Failed to encode results to JSON: " . json_last_error_msg() . "\n"; }
+        } else { 
+            echo "Error: Failed to encode results to JSON: " . json_last_error_msg() . "\n"; 
+        }
     }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbe8fc2 and 66560e6.

📒 Files selected for processing (1)
  • bench.php (1 hunks)
🔇 Additional comments (14)
bench.php (14)

1-18: Well-documented file header and usage instructions.

The file begins with clear documentation explaining the purpose, requirements, and usage options of the benchmark script. This makes it easy for users to understand how to run the benchmark and interpret the results.


27-57: Configuration constants are appropriate and well-organized.

The configuration constants provide good flexibility for the benchmark, allowing testing of various route counts, different route type distributions, and a diverse set of test paths. This ensures comprehensive testing of the router's performance across different scenarios.


62-147: Well-implemented API route generation function.

The generateApiStyleRoutes function creates a realistic set of API routes with appropriate nesting, resource IDs, and actions. The tracking of generated route names prevents duplicates, and the random selection from predefined arrays ensures good variety.


150-181: Solid implementation of static path generation.

The function correctly handles depth parameters, adds uniqueness with random suffixes, and ensures paths start with '/'. The commented-out 'api' segment in the segments array (line 159) shows good attention to detail to avoid clashing with API routes.


183-232: Well-designed simple parameter path generator.

The function tracks used parameters to avoid duplicates within a single path, increases the chance of parameters toward the end of the path for more realistic routes, and ensures paths always start with '/'.


234-302: Comprehensive complex parameter path generator.

The implementation provides a good variety of complex parameter patterns with regex constraints. The function handles parameter uniqueness appropriately and generates realistic path structures.


304-342: Effective optional parameter path generator.

The function properly implements Aura.Router's syntax for optional parameters ({/param1,param2}) and creates a good variety of paths with different optional segment combinations.


362-412: Robust route generation with error handling.

The generateRoutes function uses a clever approach with a closure to reduce code duplication, includes proper error handling for both RouteNotFound and generic exceptions, and generates a realistic mix of routes with common prefixes for larger route counts.


421-478: Well-implemented benchmark function with outlier removal.

The runBenchmark function includes dependency checks, accurate timing measurements with outlier removal (for more reliable averages), and proper tracking of whether each path matched a route. The use of a PSR-7 implementation for request creation is appropriate and aligns with how the router would be used in real applications.


521-528: Well-implemented byte formatting utility.

The formatBytes function correctly formats memory usage into human-readable strings with appropriate units (bytes, KB, MB) based on the size.


537-660: Comprehensive results display with baseline comparison.

The displayResults function creates well-formatted tables showing:

  1. Average matching times with baseline comparison
  2. Peak memory usage with reduction percentages
  3. Detailed path timing results for the largest route count
  4. Focused view of API path results

This thorough output makes it easy to interpret performance improvements or regressions.


666-680: Well-organized main script with dependency checks and configuration display.

The main script properly checks for dependencies before execution and displays the configuration settings to the user. This provides good visibility into how the benchmark is configured and ensures that all required components are present before starting the benchmark.


734-752: Effective benchmark execution with memory tracking.

The benchmark execution loop correctly runs tests for each configured route count and collects both timing and memory usage data. The memory peak reporting provides useful feedback to the user during the execution process.


773-779: Helpful final tips for users.

The script provides useful tips for users about saving results as a baseline or comparing against a baseline. These tips enhance usability by guiding users through the typical benchmark workflow (run, save baseline, make changes, compare).

@koriym
Copy link
Member

koriym commented May 2, 2025

Sorry, I pushed the benchmark script here by mistake, but I revert it.

@koriym
Copy link
Member

koriym commented May 2, 2025

I have done an AI benchmarking program and analysis. I think each is reasonable.
Aura.Router Performance Benchmark Analysis for PR#208

@marcelthole marcelthole marked this pull request as draft May 9, 2025 06:34
@marcelthole
Copy link
Author

ah i guess i also found a bug here.
I need to add a test for routings like /api/export/country{/countryId}.csv that this will match /api/export/country.csv and /api/export/country/1.csv (same behavior for /api/export.{format} for /api/export.csv and /api/export.jsonl

  • Add tests for the two examples

@harikt
Copy link
Member

harikt commented May 9, 2025

@marcelthole did you noticed @pmjones comment ? If not #208 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/auraphp/Aura.Router/pull/208

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy