Skip to content

Commit fc2660f

Browse files
committed
Fix LIKE's special-case code for % followed by _. I'm not entirely sure that
this case is worth a special code path, but a special code path that gets the boundary condition wrong is definitely no good. Per bug #4821 from Andrew Gierth. In passing, clean up some minor code formatting issues (excess parentheses and blank lines in odd places). Back-patch to 8.3, where the bug was introduced.
1 parent c6c458e commit fc2660f

File tree

3 files changed

+96
-62
lines changed

3 files changed

+96
-62
lines changed

src/backend/utils/adt/like_match.c

Lines changed: 58 additions & 62 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@
99
* (UTF8 is a special case because we can use a much more efficient version
1010
* of NextChar than can be used for general multi-byte encodings.)
1111
*
12-
* Before the inclusion, we need to define following macros:
12+
* Before the inclusion, we need to define the following macros:
1313
*
1414
* NextChar
1515
* MatchText - to name of function wanted
@@ -19,47 +19,46 @@
1919
* Copyright (c) 1996-2009, PostgreSQL Global Development Group
2020
*
2121
* IDENTIFICATION
22-
* $PostgreSQL: pgsql/src/backend/utils/adt/like_match.c,v 1.24 2009/01/01 17:23:49 momjian Exp $
22+
* $PostgreSQL: pgsql/src/backend/utils/adt/like_match.c,v 1.25 2009/05/24 18:10:37 tgl Exp $
2323
*
2424
*-------------------------------------------------------------------------
2525
*/
2626

2727
/*
28-
** Originally written by Rich $alz, mirror!rs, Wed Nov 26 19:03:17 EST 1986.
29-
** Rich $alz is now <rsalz@bbn.com>.
30-
** Special thanks to Lars Mathiesen <thorinn@diku.dk> for the LABORT code.
31-
**
32-
** This code was shamelessly stolen from the "pql" code by myself and
33-
** slightly modified :)
34-
**
35-
** All references to the word "star" were replaced by "percent"
36-
** All references to the word "wild" were replaced by "like"
37-
**
38-
** All the nice shell RE matching stuff was replaced by just "_" and "%"
39-
**
40-
** As I don't have a copy of the SQL standard handy I wasn't sure whether
41-
** to leave in the '\' escape character handling.
42-
**
43-
** Keith Parks. <keith@mtcc.demon.co.uk>
44-
**
45-
** SQL92 lets you specify the escape character by saying
46-
** LIKE <pattern> ESCAPE <escape character>. We are a small operation
47-
** so we force you to use '\'. - ay 7/95
48-
**
49-
** Now we have the like_escape() function that converts patterns with
50-
** any specified escape character (or none at all) to the internal
51-
** default escape character, which is still '\'. - tgl 9/2000
52-
**
53-
** The code is rewritten to avoid requiring null-terminated strings,
54-
** which in turn allows us to leave out some memcpy() operations.
55-
** This code should be faster and take less memory, but no promises...
56-
** - thomas 2000-08-06
57-
**
58-
*/
28+
* Originally written by Rich $alz, mirror!rs, Wed Nov 26 19:03:17 EST 1986.
29+
* Rich $alz is now <rsalz@bbn.com>.
30+
* Special thanks to Lars Mathiesen <thorinn@diku.dk> for the LABORT code.
31+
*
32+
* This code was shamelessly stolen from the "pql" code by myself and
33+
* slightly modified :)
34+
*
35+
* All references to the word "star" were replaced by "percent"
36+
* All references to the word "wild" were replaced by "like"
37+
*
38+
* All the nice shell RE matching stuff was replaced by just "_" and "%"
39+
*
40+
* As I don't have a copy of the SQL standard handy I wasn't sure whether
41+
* to leave in the '\' escape character handling.
42+
*
43+
* Keith Parks. <keith@mtcc.demon.co.uk>
44+
*
45+
* SQL92 lets you specify the escape character by saying
46+
* LIKE <pattern> ESCAPE <escape character>. We are a small operation
47+
* so we force you to use '\'. - ay 7/95
48+
*
49+
* Now we have the like_escape() function that converts patterns with
50+
* any specified escape character (or none at all) to the internal
51+
* default escape character, which is still '\'. - tgl 9/2000
52+
*
53+
* The code is rewritten to avoid requiring null-terminated strings,
54+
* which in turn allows us to leave out some memcpy() operations.
55+
* This code should be faster and take less memory, but no promises...
56+
* - thomas 2000-08-06
57+
*/
5958

6059

6160
/*--------------------
62-
* Match text and p, return LIKE_TRUE, LIKE_FALSE, or LIKE_ABORT.
61+
* Match text and pattern, return LIKE_TRUE, LIKE_FALSE, or LIKE_ABORT.
6362
*
6463
* LIKE_TRUE: they match
6564
* LIKE_FALSE: they don't match
@@ -80,19 +79,18 @@ static int
8079
MatchText(char *t, int tlen, char *p, int plen)
8180
{
8281
/* Fast path for match-everything pattern */
83-
if ((plen == 1) && (*p == '%'))
82+
if (plen == 1 && *p == '%')
8483
return LIKE_TRUE;
8584

8685
/*
8786
* In this loop, we advance by char when matching wildcards (and thus on
8887
* recursive entry to this function we are properly char-synced). On other
8988
* occasions it is safe to advance by byte, as the text and pattern will
90-
* be in lockstep. This allows us to perform all comparisons between the
89+
* be in lockstep. This allows us to perform all comparisons between the
9190
* text and pattern on a byte by byte basis, even for multi-byte
9291
* encodings.
9392
*/
94-
95-
while ((tlen > 0) && (plen > 0))
93+
while (tlen > 0 && plen > 0)
9694
{
9795
if (*p == '\\')
9896
{
@@ -116,7 +114,7 @@ MatchText(char *t, int tlen, char *p, int plen)
116114

117115
/* %% is the same as % according to the SQL standard */
118116
/* Advance past all %'s */
119-
while ((plen > 0) && (*p == '%'))
117+
while (plen > 0 && *p == '%')
120118
NextByte(p, plen);
121119
/* Trailing percent matches everything. */
122120
if (plen <= 0)
@@ -127,22 +125,24 @@ MatchText(char *t, int tlen, char *p, int plen)
127125
* rest of the pattern.
128126
*/
129127
if (*p == '_')
130-
131128
{
132129
/* %_ is the same as _% - avoid matching _ repeatedly */
133130

134-
NextChar(t, tlen);
135-
NextByte(p, plen);
136-
137-
if (tlen <= 0)
138-
{
139-
return (plen <= 0) ? LIKE_TRUE : LIKE_ABORT;
140-
}
141-
else if (plen <= 0)
131+
do
142132
{
143-
return LIKE_FALSE;
144-
}
145-
133+
NextChar(t, tlen);
134+
NextByte(p, plen);
135+
} while (tlen > 0 && plen > 0 && *p == '_');
136+
137+
/*
138+
* If we are at the end of the pattern, succeed: % followed
139+
* by n _'s matches any string of at least n characters, and
140+
* we have now found there are at least n characters.
141+
*/
142+
if (plen <= 0)
143+
return LIKE_TRUE;
144+
145+
/* Look for a place that matches the rest of the pattern */
146146
while (tlen > 0)
147147
{
148148
int matched = MatchText(t, tlen, p, plen);
@@ -155,7 +155,6 @@ MatchText(char *t, int tlen, char *p, int plen)
155155
}
156156
else
157157
{
158-
159158
char firstpat = TCHAR(*p);
160159

161160
if (*p == '\\')
@@ -180,7 +179,6 @@ MatchText(char *t, int tlen, char *p, int plen)
180179
}
181180

182181
NextChar(t, tlen);
183-
184182
}
185183
}
186184

@@ -192,20 +190,20 @@ MatchText(char *t, int tlen, char *p, int plen)
192190
}
193191
else if (*p == '_')
194192
{
193+
/* _ matches any single character, and we know there is one */
195194
NextChar(t, tlen);
196195
NextByte(p, plen);
197196
continue;
198197
}
199-
else if (TCHAR(*t) != TCHAR(*p))
198+
else if (TCHAR(*p) != TCHAR(*t))
200199
{
201-
/*
202-
* Not the single-character wildcard and no explicit match? Then
203-
* time to quit...
204-
*/
200+
/* non-wildcard pattern char fails to match text char */
205201
return LIKE_FALSE;
206202
}
207203

208204
/*
205+
* Pattern and text match, so advance.
206+
*
209207
* It is safe to use NextByte instead of NextChar here, even for
210208
* multi-byte character sets, because we are not following immediately
211209
* after a wildcard character. If we are in the middle of a multibyte
@@ -222,9 +220,8 @@ MatchText(char *t, int tlen, char *p, int plen)
222220
if (tlen > 0)
223221
return LIKE_FALSE; /* end of pattern, but not of text */
224222

225-
/* End of input string. Do we have matching pattern remaining? */
226-
while ((plen > 0) && (*p == '%')) /* allow multiple %'s at end of
227-
* pattern */
223+
/* End of text string. Do we have matching pattern remaining? */
224+
while (plen > 0 && *p == '%') /* allow multiple %'s at end of pattern */
228225
NextByte(p, plen);
229226

230227
if (plen <= 0)
@@ -354,5 +351,4 @@ do_like_escape(text *pat, text *esc)
354351

355352
#ifdef MATCH_LOWER
356353
#undef MATCH_LOWER
357-
358354
#endif

src/test/regress/expected/strings.out

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -902,6 +902,33 @@ SELECT 'Hawkeye' NOT ILIKE 'h%' AS "false";
902902
f
903903
(1 row)
904904

905+
--
906+
-- test %/_ combination cases, cf bug #4821
907+
--
908+
SELECT 'foo' LIKE '_%' as t, 'f' LIKE '_%' as t, '' LIKE '_%' as f;
909+
t | t | f
910+
---+---+---
911+
t | t | f
912+
(1 row)
913+
914+
SELECT 'foo' LIKE '%_' as t, 'f' LIKE '%_' as t, '' LIKE '%_' as f;
915+
t | t | f
916+
---+---+---
917+
t | t | f
918+
(1 row)
919+
920+
SELECT 'foo' LIKE '__%' as t, 'foo' LIKE '___%' as t, 'foo' LIKE '____%' as f;
921+
t | t | f
922+
---+---+---
923+
t | t | f
924+
(1 row)
925+
926+
SELECT 'foo' LIKE '%__' as t, 'foo' LIKE '%___' as t, 'foo' LIKE '%____' as f;
927+
t | t | f
928+
---+---+---
929+
t | t | f
930+
(1 row)
931+
905932
--
906933
-- test implicit type conversion
907934
--

src/test/regress/sql/strings.sql

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -279,6 +279,17 @@ SELECT 'hawkeye' NOT ILIKE 'H%Eye' AS "false";
279279
SELECT 'Hawkeye' ILIKE 'h%' AS "true";
280280
SELECT 'Hawkeye' NOT ILIKE 'h%' AS "false";
281281

282+
--
283+
-- test %/_ combination cases, cf bug #4821
284+
--
285+
286+
SELECT 'foo' LIKE '_%' as t, 'f' LIKE '_%' as t, '' LIKE '_%' as f;
287+
SELECT 'foo' LIKE '%_' as t, 'f' LIKE '%_' as t, '' LIKE '%_' as f;
288+
289+
SELECT 'foo' LIKE '__%' as t, 'foo' LIKE '___%' as t, 'foo' LIKE '____%' as f;
290+
SELECT 'foo' LIKE '%__' as t, 'foo' LIKE '%___' as t, 'foo' LIKE '%____' as f;
291+
292+
282293
--
283294
-- test implicit type conversion
284295
--

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy