Commit c6b3fb4

Fix inadequately-sized output buffer in contrib/unaccent.

The output buffer size in unaccent_lexize() was calculated as input string length times pg_database_encoding_max_length(), which effectively assumes that replacement strings aren't more than one character. While that was all that we previously documented it to support, the code actually has always allowed replacement strings of arbitrary length; so if you tried to make use of longer strings, you were at risk of buffer overrun. To fix, use an expansible StringInfo buffer instead of trying to determine the maximum space needed a-priori.

This would be a security issue if unaccent rules files could be installed by unprivileged users; but fortunately they can't, so in the back branches the problem can be labeled as improper configuration by a superuser. Nonetheless, a memory stomp isn't a nice way of reacting to improper configuration, so let's back-patch the fix.

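To make the sizing problem concrete, the following standalone sketch (not PostgreSQL code) compares the space the old formula reserved with the space a multi-character replacement can need. The helper names `old_buffer_size` and `needed_size` are hypothetical, and the numbers in the note below are illustrative assumptions rather than values taken from a real rules file.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Space the pre-fix code reserved: input length in bytes times the
 * maximum bytes per character in the encoding, plus one for '\0'.
 */
static size_t
old_buffer_size(size_t len, size_t max_char_bytes)
{
    return len * max_char_bytes + 1;
}

/*
 * Worst-case space needed when each of nchars input characters is
 * replaced by a string of replace_bytes bytes, plus one for '\0'.
 */
static size_t
needed_size(size_t nchars, size_t replace_bytes)
{
    return nchars * replace_bytes + 1;
}
```

For example, with a single-byte encoding and a 4-character input, `old_buffer_size(4, 1)` is 5 bytes, while a hypothetical rule that maps each character to a 3-byte string needs `needed_size(4, 3)`, or 13 bytes, so the old `memcpy` calls would write past the end of the allocation.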
1 parent d51a600 commit c6b3fb4

1 file changed, with 27 additions and 24 deletions

contrib/unaccent/unaccent.c

Lines changed: 27 additions & 24 deletions
@@ -16,6 +16,7 @@
 #include "fmgr.h"
 #include "catalog/namespace.h"
 #include "commands/defrem.h"
+#include "lib/stringinfo.h"
 #include "mb/pg_wchar.h"
 #include "tsearch/ts_cache.h"
 #include "tsearch/ts_locale.h"
@@ -267,46 +268,48 @@ unaccent_lexize(PG_FUNCTION_ARGS)
     SuffixChar *rootSuffixTree = (SuffixChar *) PG_GETARG_POINTER(0);
     char *srcchar = (char *) PG_GETARG_POINTER(1);
     int32 len = PG_GETARG_INT32(2);
-    char *srcstart,
-         *trgchar = NULL;
-    int charlen;
-    TSLexeme *res = NULL;
-    SuffixChar *node;
+    char *srcstart = srcchar;
+    TSLexeme *res;
+    StringInfoData buf;
+
+    /* we allocate storage for the buffer only if needed */
+    buf.data = NULL;

-    srcstart = srcchar;
     while (srcchar - srcstart < len)
     {
+        SuffixChar *node;
+        int charlen;
+
         charlen = pg_mblen(srcchar);

         node = findReplaceTo(rootSuffixTree, (unsigned char *) srcchar, charlen);
         if (node && node->replaceTo)
         {
-            if (!res)
+            if (buf.data == NULL)
             {
-                /* allocate res only it it's needed */
-                res = palloc0(sizeof(TSLexeme) * 2);
-                res->lexeme = trgchar = palloc(len * pg_database_encoding_max_length() + 1 /* \0 */ );
-                res->flags = TSL_FILTER;
+                /* initialize buffer */
+                initStringInfo(&buf);
+                /* insert any data we already skipped over */
                 if (srcchar != srcstart)
-                {
-                    memcpy(trgchar, srcstart, srcchar - srcstart);
-                    trgchar += (srcchar - srcstart);
-                }
+                    appendBinaryStringInfo(&buf, srcstart, srcchar - srcstart);
             }
-            memcpy(trgchar, node->replaceTo, node->replacelen);
-            trgchar += node->replacelen;
-        }
-        else if (res)
-        {
-            memcpy(trgchar, srcchar, charlen);
-            trgchar += charlen;
+            appendBinaryStringInfo(&buf, node->replaceTo, node->replacelen);
         }
+        else if (buf.data != NULL)
+            appendBinaryStringInfo(&buf, srcchar, charlen);

         srcchar += charlen;
     }

-    if (res)
-        *trgchar = '\0';
+    /* return a result only if we made at least one substitution */
+    if (buf.data != NULL)
+    {
+        res = (TSLexeme *) palloc0(sizeof(TSLexeme) * 2);
+        res->lexeme = buf.data;
+        res->flags = TSL_FILTER;
+    }
+    else
+        res = NULL;

     PG_RETURN_POINTER(res);
 }
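
The pattern the patch adopts (allocate the output lazily, copy the skipped-over prefix on the first substitution, then append through a buffer that grows on demand) can be sketched outside PostgreSQL as follows. This is a simplified illustration under stated assumptions: `GrowBuf`, `growbuf_append`, and `replace_lazy` are hypothetical names standing in for `StringInfoData`, `appendBinaryStringInfo()`, and the loop in `unaccent_lexize()`; it uses `malloc`-family allocation instead of `palloc`, and it handles single-byte characters only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; stands in for PostgreSQL's StringInfoData. */
typedef struct GrowBuf
{
    char   *data;   /* NULL until the first append */
    size_t  len;    /* bytes used, not counting the trailing '\0' */
    size_t  cap;    /* bytes allocated */
} GrowBuf;

/* Append n bytes, doubling the allocation as needed; keeps data '\0'-terminated. */
static void
growbuf_append(GrowBuf *buf, const char *src, size_t n)
{
    if (buf->len + n + 1 > buf->cap)
    {
        size_t  newcap = buf->cap ? buf->cap : 16;
        char   *newdata;

        while (newcap < buf->len + n + 1)
            newcap *= 2;
        newdata = realloc(buf->data, newcap);
        if (newdata == NULL)
            abort();            /* out of memory: give up in this sketch */
        buf->data = newdata;
        buf->cap = newcap;
    }
    memcpy(buf->data + buf->len, src, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}

/*
 * Replace every byte equal to 'from' with the string 'to'.
 * Returns NULL if nothing was replaced, else a malloc'd string
 * that the caller must free.
 */
static char *
replace_lazy(const char *src, size_t len, char from, const char *to)
{
    GrowBuf buf = {NULL, 0, 0};
    size_t  i;

    for (i = 0; i < len; i++)
    {
        if (src[i] == from)
        {
            /* first substitution: copy any data we already skipped over */
            if (buf.data == NULL && i > 0)
                growbuf_append(&buf, src, i);
            growbuf_append(&buf, to, strlen(to));
        }
        else if (buf.data != NULL)
            growbuf_append(&buf, &src[i], 1);
    }
    return buf.data;
}
```

Because the buffer resizes itself on every append, the replacement string can be any length without the caller having to predict the total output size, which is the property the fixed-size `palloc` in the old code lacked.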
