[OptionsResolver] Optimize splitOutsideParenthesis() - 2.91x faster #61239
Performance improvements are refactors, which means this should target 7.4. 7.3 being already released, the branch receives bug fixes only.
Also, would you mind sharing the test set you used to run benchmarks? In addition to the methodology used as explained in the description, having the opportunity to see the benchmark code would be a plus!
Hey @bendavies, thanks for this optimization proposal! For performance refactoring PRs, we usually target the latest dev branch, in this case 7.4.
- Fast path for simple types (no pipes)
- Fast path for unions without parentheses
- Eliminate string concatenation overhead
- Switch statement for character matching

Reduces form processing time significantly for large forms.
# Test individual optimizations
hyperfine --warmup 1 --runs 10 \
  --reference 'php test_original.php' \
  'php test_opt1_fast_path_simple.php' \
  'php test_opt2_fast_path_union.php' \
  'php test_opt3_no_string_concat.php' \
  'php test_opt4_switch_statement.php'
# Test combined optimization
hyperfine --warmup 1 --runs 10 \
  --reference 'php test_original.php' \
  'php test_optimized.php'

test_original.php - Original implementation

<?php
function splitOutsideParenthesis(string $type): array
{
    $parts = [];
    $currentPart = '';
    $parenthesisLevel = 0;
    $typeLength = \strlen($type);
    for ($i = 0; $i < $typeLength; ++$i) {
        $char = $type[$i];
        if ('(' === $char) {
            ++$parenthesisLevel;
        } elseif (')' === $char) {
            --$parenthesisLevel;
        }
        if ('|' === $char && 0 === $parenthesisLevel) {
            $parts[] = $currentPart;
            $currentPart = '';
        } else {
            $currentPart .= $char;
        }
    }
    if ('' !== $currentPart) {
        $parts[] = $currentPart;
    }
    return $parts;
}
$testCases = [
    'string',
    'int',
    'bool',
    'array',
    'string|int',
    'string|int|bool',
    'string|int|bool|array',
    'string|(int|bool)',
    '(string|int)|bool',
    'string|(int|(bool|float))',
    '(string|int)|(bool|float)',
    'MyClass',
    'string[]',
    'int[]',
    '\\Namespace\\Class',
    'string|int|bool|array|object|resource|callable',
];
$iterations = 100000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($testCases as $testCase) {
        splitOutsideParenthesis($testCase);
    }
}
$end = microtime(true);
$duration = ($end - $start) * 1000;
$peakMemory = memory_get_peak_usage(true) / 1024 / 1024;
echo "Original implementation: " . number_format($duration, 2) . "ms for " . ($iterations * count($testCases)) . " operations\n";
echo "Peak memory usage: " . number_format($peakMemory, 2) . " MB\n";

test_opt1_fast_path_simple.php - Optimization 1: Fast path for simple types

<?php
function splitOutsideParenthesis(string $type): array
{
    if (!\str_contains($type, '|')) {
        return [$type];
    }
    $parts = [];
    $currentPart = '';
    $parenthesisLevel = 0;
    $typeLength = \strlen($type);
    for ($i = 0; $i < $typeLength; ++$i) {
        $char = $type[$i];
        if ('(' === $char) {
            ++$parenthesisLevel;
        } elseif (')' === $char) {
            --$parenthesisLevel;
        }
        if ('|' === $char && 0 === $parenthesisLevel) {
            $parts[] = $currentPart;
            $currentPart = '';
        } else {
            $currentPart .= $char;
        }
    }
    if ('' !== $currentPart) {
        $parts[] = $currentPart;
    }
    return $parts;
}
$testCases = [
    'string',
    'int',
    'bool',
    'array',
    'string|int',
    'string|int|bool',
    'string|int|bool|array',
    'string|(int|bool)',
    '(string|int)|bool',
    'string|(int|(bool|float))',
    '(string|int)|(bool|float)',
    'MyClass',
    'string[]',
    'int[]',
    '\\Namespace\\Class',
    'string|int|bool|array|object|resource|callable',
];
$iterations = 100000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($testCases as $testCase) {
        splitOutsideParenthesis($testCase);
    }
}
$end = microtime(true);
$duration = ($end - $start) * 1000;
$peakMemory = memory_get_peak_usage(true) / 1024 / 1024;
echo "Optimization 1 (fast path simple): " . number_format($duration, 2) . "ms for " . ($iterations * count($testCases)) . " operations\n";
echo "Peak memory usage: " . number_format($peakMemory, 2) . " MB\n";

test_opt2_fast_path_union.php - Optimization 2: Fast path for union types

<?php
function splitOutsideParenthesis(string $type): array
{
    if (!\str_contains($type, '(') && !\str_contains($type, ')')) {
        return \explode('|', $type);
    }
    $parts = [];
    $currentPart = '';
    $parenthesisLevel = 0;
    $typeLength = \strlen($type);
    for ($i = 0; $i < $typeLength; ++$i) {
        $char = $type[$i];
        if ('(' === $char) {
            ++$parenthesisLevel;
        } elseif (')' === $char) {
            --$parenthesisLevel;
        }
        if ('|' === $char && 0 === $parenthesisLevel) {
            $parts[] = $currentPart;
            $currentPart = '';
        } else {
            $currentPart .= $char;
        }
    }
    if ('' !== $currentPart) {
        $parts[] = $currentPart;
    }
    return $parts;
}
$testCases = [
    'string',
    'int',
    'bool',
    'array',
    'string|int',
    'string|int|bool',
    'string|int|bool|array',
    'string|(int|bool)',
    '(string|int)|bool',
    'string|(int|(bool|float))',
    '(string|int)|(bool|float)',
    'MyClass',
    'string[]',
    'int[]',
    '\\Namespace\\Class',
    'string|int|bool|array|object|resource|callable',
];
$iterations = 100000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($testCases as $testCase) {
        splitOutsideParenthesis($testCase);
    }
}
$end = microtime(true);
$duration = ($end - $start) * 1000;
$peakMemory = memory_get_peak_usage(true) / 1024 / 1024;
echo "Optimization 2 (fast path union): " . number_format($duration, 2) . "ms for " . ($iterations * count($testCases)) . " operations\n";
echo "Peak memory usage: " . number_format($peakMemory, 2) . " MB\n";

test_opt3_no_string_concat.php - Optimization 3: Eliminate string concatenation

<?php
function splitOutsideParenthesis(string $type): array
{
    $parts = [];
    $start = 0;
    $parenthesisLevel = 0;
    $length = \strlen($type);
    for ($i = 0; $i < $length; ++$i) {
        $char = $type[$i];
        if ('(' === $char) {
            ++$parenthesisLevel;
        } elseif (')' === $char) {
            --$parenthesisLevel;
        } elseif ('|' === $char && 0 === $parenthesisLevel) {
            $parts[] = \substr($type, $start, $i - $start);
            $start = $i + 1;
        }
    }
    if ($start < $length) {
        $parts[] = \substr($type, $start);
    }
    return $parts;
}
$testCases = [
    'string',
    'int',
    'bool',
    'array',
    'string|int',
    'string|int|bool',
    'string|int|bool|array',
    'string|(int|bool)',
    '(string|int)|bool',
    'string|(int|(bool|float))',
    '(string|int)|(bool|float)',
    'MyClass',
    'string[]',
    'int[]',
    '\\Namespace\\Class',
    'string|int|bool|array|object|resource|callable',
];
$iterations = 100000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($testCases as $testCase) {
        splitOutsideParenthesis($testCase);
    }
}
$end = microtime(true);
$duration = ($end - $start) * 1000;
$peakMemory = memory_get_peak_usage(true) / 1024 / 1024;
echo "Optimization 3 (no string concat): " . number_format($duration, 2) . "ms for " . ($iterations * count($testCases)) . " operations\n";
echo "Peak memory usage: " . number_format($peakMemory, 2) . " MB\n";

test_opt4_switch_statement.php - Optimization 4: Switch statement

<?php
function splitOutsideParenthesis(string $type): array
{
    $parts = [];
    $currentPart = '';
    $parenthesisLevel = 0;
    $typeLength = \strlen($type);
    for ($i = 0; $i < $typeLength; ++$i) {
        $char = $type[$i];
        switch ($char) {
            case '(':
                ++$parenthesisLevel;
                $currentPart .= $char;
                break;
            case ')':
                --$parenthesisLevel;
                $currentPart .= $char;
                break;
            case '|':
                if (0 === $parenthesisLevel) {
                    $parts[] = $currentPart;
                    $currentPart = '';
                } else {
                    $currentPart .= $char;
                }
                break;
            default:
                $currentPart .= $char;
                break;
        }
    }
    if ('' !== $currentPart) {
        $parts[] = $currentPart;
    }
    return $parts;
}
$testCases = [
    'string',
    'int',
    'bool',
    'array',
    'string|int',
    'string|int|bool',
    'string|int|bool|array',
    'string|(int|bool)',
    '(string|int)|bool',
    'string|(int|(bool|float))',
    '(string|int)|(bool|float)',
    'MyClass',
    'string[]',
    'int[]',
    '\\Namespace\\Class',
    'string|int|bool|array|object|resource|callable',
];
$iterations = 100000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($testCases as $testCase) {
        splitOutsideParenthesis($testCase);
    }
}
$end = microtime(true);
$duration = ($end - $start) * 1000;
$peakMemory = memory_get_peak_usage(true) / 1024 / 1024;
echo "Optimization 4 (switch statement): " . number_format($duration, 2) . "ms for " . ($iterations * count($testCases)) . " operations\n";
echo "Peak memory usage: " . number_format($peakMemory, 2) . " MB\n";

test_optimized.php - Final optimized implementation (all optimizations combined)

<?php
function splitOutsideParenthesis(string $type): array
{
    if (!\str_contains($type, '|')) {
        return [$type];
    }
    if (!\str_contains($type, '(') && !\str_contains($type, ')')) {
        return \explode('|', $type);
    }
    $parts = [];
    $start = 0;
    $parenthesisLevel = 0;
    $length = \strlen($type);
    for ($i = 0; $i < $length; ++$i) {
        $char = $type[$i];
        switch ($char) {
            case '(':
                ++$parenthesisLevel;
                break;
            case ')':
                --$parenthesisLevel;
                break;
            case '|':
                if (0 === $parenthesisLevel) {
                    $parts[] = \substr($type, $start, $i - $start);
                    $start = $i + 1;
                }
                break;
        }
    }
    if ($start < $length) {
        $parts[] = \substr($type, $start);
    }
    return $parts;
}
$testCases = [
    'string',
    'int',
    'bool',
    'array',
    'string|int',
    'string|int|bool',
    'string|int|bool|array',
    'string|(int|bool)',
    '(string|int)|bool',
    'string|(int|(bool|float))',
    '(string|int)|(bool|float)',
    'MyClass',
    'string[]',
    'int[]',
    '\\Namespace\\Class',
    'string|int|bool|array|object|resource|callable',
];
$iterations = 100000;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($testCases as $testCase) {
        splitOutsideParenthesis($testCase);
    }
}
$end = microtime(true);
$duration = ($end - $start) * 1000;
$peakMemory = memory_get_peak_usage(true) / 1024 / 1024;
echo "Optimized implementation: " . number_format($duration, 2) . "ms for " . ($iterations * count($testCases)) . " operations\n";
echo "Peak memory usage: " . number_format($peakMemory, 2) . " MB\n";
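Because the fast paths change the control flow, it may be worth checking that the combined version returns the same parts as the original. The sketch below is not part of the PR: `splitOriginal`, `splitOptimized` and `findMismatches` are hypothetical names wrapping copies of the two functions above. It also probes two inputs with an empty segment, where the results differ; whether such strings can ever reach the method is an assumption left to the reader.

```php
<?php
// Hypothetical copy of the original implementation from test_original.php.
function splitOriginal(string $type): array
{
    $parts = [];
    $currentPart = '';
    $parenthesisLevel = 0;
    $typeLength = \strlen($type);
    for ($i = 0; $i < $typeLength; ++$i) {
        $char = $type[$i];
        if ('(' === $char) {
            ++$parenthesisLevel;
        } elseif (')' === $char) {
            --$parenthesisLevel;
        }
        if ('|' === $char && 0 === $parenthesisLevel) {
            $parts[] = $currentPart;
            $currentPart = '';
        } else {
            $currentPart .= $char;
        }
    }
    if ('' !== $currentPart) {
        $parts[] = $currentPart;
    }
    return $parts;
}

// Hypothetical copy of the combined implementation from test_optimized.php.
function splitOptimized(string $type): array
{
    if (!\str_contains($type, '|')) {
        return [$type];
    }
    if (!\str_contains($type, '(') && !\str_contains($type, ')')) {
        return \explode('|', $type);
    }
    $parts = [];
    $start = 0;
    $parenthesisLevel = 0;
    $length = \strlen($type);
    for ($i = 0; $i < $length; ++$i) {
        switch ($type[$i]) {
            case '(':
                ++$parenthesisLevel;
                break;
            case ')':
                --$parenthesisLevel;
                break;
            case '|':
                if (0 === $parenthesisLevel) {
                    $parts[] = \substr($type, $start, $i - $start);
                    $start = $i + 1;
                }
                break;
        }
    }
    if ($start < $length) {
        $parts[] = \substr($type, $start);
    }
    return $parts;
}

// Returns the inputs for which the two implementations disagree.
function findMismatches(array $types): array
{
    return \array_values(\array_filter(
        $types,
        static fn (string $t): bool => splitOriginal($t) !== splitOptimized($t)
    ));
}

// The nested and flat cases from the benchmark produce identical results.
$same = findMismatches(['string', 'string|int', 'string|(int|(bool|float))', '(string|int)|(bool|float)']);

// An empty string and a trailing pipe are handled differently:
// the original drops the empty part, the fast paths keep it.
$different = findMismatches(['', 'string|']);
```

Here `$same` is an empty array, while `$different` contains both probe strings: the original returns `[]` for `''` and `['string']` for `'string|'`, whereas the optimized version returns `['']` and `['string', '']`.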
@@ -1215,30 +1215,40 @@ private function verifyTypes(string $type, mixed $value, ?array &$invalidTypes =
     */
    private function splitOutsideParenthesis(string $type): array
    {
        if (!str_contains($type, '|')) {
Maybe we could even store a hardcoded list of known simple types somewhere to avoid the repeated calls to str_contains() in these cases.
Here is a list: https://www.php.net/manual/en/function.gettype.php
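One possible reading of this suggestion, sketched under assumptions: keep a constant map of known simple type names and test it with `isset()` before falling back to `str_contains()`. The class name, method name and the list itself are hypothetical; the entries are an illustrative mix of `gettype()` return values and their short aliases, not what the PR contains. Whether the lookup actually beats a single `str_contains()` call would need measuring.

```php
<?php
// Hypothetical helper illustrating a hardcoded lookup of simple type names.
final class SimpleTypeLookup
{
    // Illustrative, non-exhaustive list (assumption, not taken from the PR).
    private const KNOWN = [
        'bool' => true, 'boolean' => true,
        'int' => true, 'integer' => true,
        'float' => true, 'double' => true,
        'string' => true, 'array' => true,
        'object' => true, 'resource' => true,
        'null' => true, 'NULL' => true,
    ];

    // True when $type is a single type that needs no splitting.
    public static function isSingleType(string $type): bool
    {
        // Hash lookup first; only unknown names pay for the substring scan.
        return isset(self::KNOWN[$type]) || !\str_contains($type, '|');
    }
}
```

Class names such as `MyClass` are not in the map, so they still go through the `str_contains()` check.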
            return [$type];
        }

        if (!str_contains($type, '(') && !str_contains($type, ')')) {
We could likely optimize that too by looping over the characters and checking for both parentheses. This would avoid scanning the string twice.
A regexp that avoids the two calls to str_contains() may also be faster than the current implementation (but slower than the loop I propose).
I investigated @dunglas's suggestion to use a regex.
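The alternatives discussed in this thread can be sketched as follows. These are illustrations rather than the code that was benchmarked: `strpbrk()` stands in for a hand-written character loop, since it scans the string once for any of the listed characters, and the regex variant uses a plain character class. All three function names are hypothetical.

```php
<?php
// Baseline from the diff: up to two separate scans of the string.
function hasParenthesisTwoScans(string $type): bool
{
    return \str_contains($type, '(') || \str_contains($type, ')');
}

// Single scan: strpbrk() returns false when none of the characters occur.
function hasParenthesisSingleScan(string $type): bool
{
    return false !== \strpbrk($type, '()');
}

// Regex variant: one preg_match() call with a character class.
function hasParenthesisRegex(string $type): bool
{
    return 1 === \preg_match('/[()]/', $type);
}
```

All three return the same boolean for any input, so they can be swapped in the fast-path condition and timed with the same harness as the other variants.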
This PR optimises the splitOutsideParenthesis method in OptionsResolver.php, achieving a 2.91x performance improvement.
I discovered this method as a performance hotspot while benchmarking a large Symfony form with many fields. Profiling revealed that splitOutsideParenthesis was consuming a significant portion of the form processing time.
The splitOutsideParenthesis method (introduced in PR #59354) is called frequently during options resolution and has several performance bottlenecks:
Test Methodology
Here's how all performance measurements were conducted:
- string, int, bool, array
- string|int, string|int|bool, string|int|bool|array
- string|(int|bool), (string|int)|bool
- string|(int|(bool|float)), (string|int)|(bool|float)
- string[], int[]
- MyClass, \\Namespace\\Class
- string|int|bool|array|object|resource|callable
Each optimisation was tested in isolation to measure its individual impact, then all optimisations were combined for the final benchmark.
Optimisations
1. Fast Path for Simple Types (No Pipes)
Most type declarations are simple types like string, int, MyClass, etc. without any union types.
Implementation:
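A minimal sketch of this fast path, mirroring the check at the top of test_opt1_fast_path_simple.php. The wrapper name `splitSimpleType` is hypothetical, and returning null to mean "fall through to the character loop" is an assumption made only to keep the example self-contained.

```php
<?php
// Hypothetical helper isolating the fast path for types without a pipe.
// Returns null when the input still needs the full parser.
function splitSimpleType(string $type): ?array
{
    if (!\str_contains($type, '|')) {
        // No union separator at all: the whole string is the only part.
        return [$type];
    }

    return null;
}

$simple = splitSimpleType('MyClass');   // ['MyClass']
$union = splitSimpleType('string|int'); // null
```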
2. Fast Path for Union Types (No Parentheses)
Common union types like string|int|bool don't need complex parsing - PHP's explode() is much faster.
Implementation:
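A minimal sketch of this fast path, mirroring test_opt2_fast_path_union.php. The name `splitFlatUnion` and the null return for "needs the full parser" are hypothetical.

```php
<?php
// Hypothetical helper isolating the fast path for unions without parentheses.
function splitFlatUnion(string $type): ?array
{
    if (!\str_contains($type, '(') && !\str_contains($type, ')')) {
        // Without parentheses every pipe is a top-level separator.
        return \explode('|', $type);
    }

    return null;
}

$flat = splitFlatUnion('string|int|bool');     // ['string', 'int', 'bool']
$single = splitFlatUnion('string');            // ['string']
$nested = splitFlatUnion('string|(int|bool)'); // null
```

Since `explode()` returns the whole string as a single element when the separator is absent, this check on its own also covers simple types; the separate pipe check only saves the two parenthesis scans for them.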
3. Eliminate String Concatenation
String concatenation in loops creates memory overhead. Using substr() avoids creating intermediate strings.
Implementation:
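To make the offset bookkeeping visible, the sketch below separates it from the slicing. It follows the same loop as test_opt3_no_string_concat.php but first collects [offset, length] pairs and only then calls `substr()`; the two function names are hypothetical, and the split into two steps is for illustration only.

```php
<?php
// Hypothetical helper: [offset, length] pairs of each top-level part.
function topLevelSlices(string $type): array
{
    $slices = [];
    $start = 0;
    $level = 0;
    $length = \strlen($type);
    for ($i = 0; $i < $length; ++$i) {
        $char = $type[$i];
        if ('(' === $char) {
            ++$level;
        } elseif (')' === $char) {
            --$level;
        } elseif ('|' === $char && 0 === $level) {
            // Close the current part just before the pipe.
            $slices[] = [$start, $i - $start];
            $start = $i + 1;
        }
    }
    if ($start < $length) {
        // Remaining tail after the last top-level pipe.
        $slices[] = [$start, $length - $start];
    }

    return $slices;
}

// Cuts each part out of the input in one substr() call per part.
function splitBySlices(string $type): array
{
    return \array_map(
        static fn (array $slice): string => \substr($type, $slice[0], $slice[1]),
        topLevelSlices($type)
    );
}

$slices = topLevelSlices('string|(int|bool)'); // [[0, 6], [7, 10]]
$parts = splitBySlices('string|(int|bool)');   // ['string', '(int|bool)']
```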
4. Switch Statement Optimisation
Eliminates multiple conditional checks per character.
Implementation:
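A minimal sketch of the switch-based dispatch, reduced to finding the positions of top-level pipes so that the branching is the only thing on display. The function name is hypothetical; the case structure follows test_opt4_switch_statement.php.

```php
<?php
// Hypothetical helper: byte offsets of the pipes that sit outside parentheses.
function topLevelPipePositions(string $type): array
{
    $positions = [];
    $level = 0;
    $length = \strlen($type);
    for ($i = 0; $i < $length; ++$i) {
        // One dispatch per character instead of a chain of if/elseif tests.
        switch ($type[$i]) {
            case '(':
                ++$level;
                break;
            case ')':
                --$level;
                break;
            case '|':
                if (0 === $level) {
                    $positions[] = $i;
                }
                break;
        }
    }

    return $positions;
}

$flat = topLevelPipePositions('string|int|bool');             // [6, 10]
$nested = topLevelPipePositions('(string|int)|(bool|float)'); // [12]
```

Note that `switch` compares loosely, unlike the `===` checks in the original loop. For a single character matched against these three literals the outcome is the same.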
Benchmark Results
Individual Optimisation Impact
Testing each optimisation in isolation:
Combined Optimisation Impact
Combining all optimisations: