Commit 3359a81

Fix incorrect search for "x?" style matches in creviterdissect().

When the number of allowed iterations is limited (either a "?" quantifier or a bound expression), the last sub-match has to reach to the end of the target string. The previous coding here first tried the shortest possible match (one character, usually) and then gave up and back-tracked if that didn't work, typically leading to failure to match overall, as shown in bug #11478 from Christoph Berg.

The minimum change to fix that would be to not decrement k before "goto backtrack"; but that would be a pretty stupid solution, because we'd laboriously try each possible sub-match length before finally discovering that only ending at the end can work. Instead, force the sub-match endpoint limit up to the end for even the first shortest() call if we cannot have any more sub-matches after this one.

Bug introduced in my rewrite that added the iterdissect logic, commit 173e29a. The shortest-first search code was too closely modeled on the longest-first code, which hasn't got this issue since it tries a match reaching to the end to start with anyway.

Back-patch to all affected branches.

1 parent 5ff8c2d commit 3359a81

3 files changed: +16 -0 lines changed

src/backend/regex/regexec.c

Lines changed: 4 additions & 0 deletions
@@ -1190,6 +1190,10 @@ creviterdissect(struct vars * v,
                (k >= min_matches || min_matches - k < end - limit))
             limit++;
 
+        /* if this is the last allowed sub-match, it must reach to the end */
+        if (k >= max_matches)
+            limit = end;
+
         /* try to find an endpoint for the k'th sub-match */
         endpts[k] = shortest(v, d, endpts[k - 1], limit, end,
                              (chr **) NULL, (int *) NULL);
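
The effect of the added check can be modeled in isolation. The following is a minimal sketch, not code from regexec.c: `first_endpoint_limit` is a hypothetical helper, offsets stand in for the `chr *` pointers, and the zero-length-match condition is a simplified assumption based only on the context lines visible in the hunk. It shows how the first endpoint limit for the k'th sub-match is chosen once the fix is in place.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of regexec.c): return the smallest endpoint
 * offset that the k'th sub-match (1-based) should try first.
 *
 * prev_end    - offset where sub-match k-1 ended
 * end         - offset of the end of the target string
 * min_matches - minimum number of iterations required
 * max_matches - maximum number of iterations allowed
 */
static size_t
first_endpoint_limit(size_t prev_end, size_t end,
                     int k, int min_matches, int max_matches)
{
    size_t limit = prev_end;

    /*
     * Assumed simplification: skip a zero-length sub-match unless zero-length
     * matches are needed to reach min_matches.
     */
    if (limit != end &&
        (k >= min_matches || (size_t) (min_matches - k) < end - limit))
        limit++;

    /* the fix: if this is the last allowed sub-match, it must reach the end */
    if (k >= max_matches)
        limit = end;

    return limit;
}
```

With an 11-character target and an "x?" style iteration (`max_matches` of 1), the first sub-match starts with a limit of 11 rather than 1, so the shortest-first search never tries endpoints that cannot lead to an overall match.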

src/test/regress/expected/regex.out

Lines changed: 8 additions & 0 deletions
@@ -188,3 +188,11 @@ select regexp_matches('Programmer', '(\w)(.*?\1)', 'g');
  {m,m}
 (2 rows)
 
+-- Test for proper matching of non-greedy iteration (bug #11478)
+select regexp_matches('foo/bar/baz',
+                      '^([^/]+?)(?:/([^/]+?))(?:/([^/]+?))?$', '');
+ regexp_matches
+----------------
+ {foo,bar,baz}
+(1 row)
+

src/test/regress/sql/regex.sql

Lines changed: 4 additions & 0 deletions
@@ -46,3 +46,7 @@ select 'a' ~ '((((((a+|)+|)+|)+|)+|)+|)';
 -- https://core.tcl.tk/tcl/tktview/6585b21ca8fa6f3678d442b97241fdd43dba2ec0
 select 'Programmer' ~ '(\w).*?\1' as t;
 select regexp_matches('Programmer', '(\w)(.*?\1)', 'g');
+
+-- Test for proper matching of non-greedy iteration (bug #11478)
+select regexp_matches('foo/bar/baz',
+                      '^([^/]+?)(?:/([^/]+?))(?:/([^/]+?))?$', '');
