Commit d535136

Don't downcase non-ascii identifier chars in multi-byte encodings.
Long-standing code has called tolower() on identifier character bytes with the high bit set. This is clearly an error and produces junk output when the encoding is multi-byte. This patch therefore restricts this activity to cases where there is a character with the high bit set AND the encoding is single-byte. There have been numerous gripes about this, most recently from Martin Schäfer. Backpatch to all live releases.
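The failure mode described above can be illustrated without depending on the host locale. The following is a minimal sketch, not PostgreSQL code: latin1_tolower() is a hypothetical stand-in for what tolower() typically does under an ISO-8859-1 locale, and bytewise_downcase() mimics the pre-patch behaviour of downcasing every high-bit byte independently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for tolower() under an ISO-8859-1 (Latin-1) locale:
 * the uppercase letters 0xC0..0xDE (except 0xD7, the multiplication sign)
 * map to their lowercase forms 0x20 higher. Not PostgreSQL code.
 */
static unsigned char
latin1_tolower(unsigned char ch)
{
    if (ch >= 0xC0 && ch <= 0xDE && ch != 0xD7)
        return (unsigned char) (ch + 0x20);
    return ch;
}

/*
 * Pre-patch behaviour, sketched: downcase every high-bit byte in place,
 * without asking whether the byte is a whole character or only part of one.
 */
static void
bytewise_downcase(unsigned char *buf, size_t len)
{
    size_t      i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
    {
        if (buf[i] & 0x80)
            buf[i] = latin1_tolower(buf[i]);
    }
}
```

In Latin-1, "Ä" is the single byte 0xC4, and the sketch turns it into 0xE4 ("ä"), as intended. In UTF-8, "Ä" is the two bytes 0xC3 0x84; the sketch rewrites the first byte to 0xE3 and leaves the second alone, giving 0xE3 0x84. Since 0xE3 introduces a three-byte UTF-8 sequence, the result is no longer valid UTF-8, whereas the correct lowercase form would be 0xC3 0xA4.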
1 parent 94e3311 commit d535136

1 file changed: +5 -3 lines changed

src/backend/parser/scansup.c

Lines changed: 5 additions & 3 deletions
@@ -132,25 +132,27 @@ downcase_truncate_identifier(const char *ident, int len, bool warn)
 {
     char       *result;
     int         i;
+    bool        enc_is_single_byte;

     result = palloc(len + 1);
+    enc_is_single_byte = pg_database_encoding_max_length() == 1;

     /*
      * SQL99 specifies Unicode-aware case normalization, which we don't yet
      * have the infrastructure for. Instead we use tolower() to provide a
      * locale-aware translation. However, there are some locales where this
      * is not right either (eg, Turkish may do strange things with 'i' and
      * 'I'). Our current compromise is to use tolower() for characters with
-     * the high bit set, and use an ASCII-only downcasing for 7-bit
-     * characters.
+     * the high bit set, as long as they aren't part of a multi-byte character,
+     * and use an ASCII-only downcasing for 7-bit characters.
      */
     for (i = 0; i < len; i++)
     {
         unsigned char ch = (unsigned char) ident[i];

         if (ch >= 'A' && ch <= 'Z')
             ch += 'a' - 'A';
-        else if (IS_HIGHBIT_SET(ch) && isupper(ch))
+        else if (enc_is_single_byte && IS_HIGHBIT_SET(ch) && isupper(ch))
             ch = tolower(ch);
         result[i] = (char) ch;
     }
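
For experimenting with the patched loop outside the backend, a standalone sketch might look like the following. It is an approximation under stated assumptions, not the code in scansup.c: encoding_max_len is a hypothetical parameter standing in for pg_database_encoding_max_length(), SKETCH_HIGHBIT_SET is a local stand-in for IS_HIGHBIT_SET, malloc() replaces palloc(), and the truncation and warning handling of the real function is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* High-bit test; a local stand-in for PostgreSQL's IS_HIGHBIT_SET macro. */
#define SKETCH_HIGHBIT_SET(ch) ((unsigned char) (ch) & 0x80)

/*
 * Sketch of the patched downcasing loop. encoding_max_len is a hypothetical
 * parameter replacing pg_database_encoding_max_length(). The result is
 * NUL-terminated and must be released by the caller with free().
 */
static char *
downcase_identifier_sketch(const char *ident, int len, int encoding_max_len)
{
    char       *result;
    int         i;
    bool        enc_is_single_byte = (encoding_max_len == 1);

    assert(ident != NULL && len >= 0);
    result = malloc((size_t) len + 1);
    if (result == NULL)
        return NULL;

    for (i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) ident[i];

        if (ch >= 'A' && ch <= 'Z')
            ch += 'a' - 'A';        /* ASCII-only downcasing for 7-bit bytes */
        else if (enc_is_single_byte && SKETCH_HIGHBIT_SET(ch) && isupper(ch))
            ch = (unsigned char) tolower(ch);   /* locale-aware, single-byte only */
        result[i] = (char) ch;
    }
    result[len] = '\0';
    return result;
}
```

With encoding_max_len greater than 1 (for example 4 for UTF-8), high-bit bytes pass through untouched, so an identifier such as the bytes 0xC3 0x84 followed by "BC" comes back as 0xC3 0x84 followed by "bc". With encoding_max_len equal to 1, high-bit bytes are still handed to tolower(), so the outcome for those bytes depends on the active C locale.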
