Commit 160f2cb

Don't downcase non-ascii identifier chars in multi-byte encodings.
Long-standing code has called tolower() on identifier character bytes with the high bit set. This is clearly an error and produces junk output when the encoding is multi-byte. This patch therefore restricts this activity to cases where there is a character with the high bit set AND the encoding is single-byte. There have been numerous gripes about this, most recently from Martin Schäfer. Backpatch to all live releases.
1 parent 54f6836 commit 160f2cb

src/backend/parser/scansup.c

Lines changed: 5 additions & 3 deletions
@@ -130,25 +130,27 @@ downcase_truncate_identifier(const char *ident, int len, bool warn)
 {
 	char *result;
 	int i;
+	bool enc_is_single_byte;
 
 	result = palloc(len + 1);
+	enc_is_single_byte = pg_database_encoding_max_length() == 1;
 
 	/*
 	 * SQL99 specifies Unicode-aware case normalization, which we don't yet
 	 * have the infrastructure for. Instead we use tolower() to provide a
 	 * locale-aware translation. However, there are some locales where this
 	 * is not right either (eg, Turkish may do strange things with 'i' and
 	 * 'I'). Our current compromise is to use tolower() for characters with
-	 * the high bit set, and use an ASCII-only downcasing for 7-bit
-	 * characters.
+	 * the high bit set, as long as they aren't part of a multi-byte character,
+	 * and use an ASCII-only downcasing for 7-bit characters.
 	 */
 	for (i = 0; i < len; i++)
 	{
 		unsigned char ch = (unsigned char) ident[i];
 
 		if (ch >= 'A' && ch <= 'Z')
 			ch += 'a' - 'A';
-		else if (IS_HIGHBIT_SET(ch) && isupper(ch))
+		else if (enc_is_single_byte && IS_HIGHBIT_SET(ch) && isupper(ch))
 			ch = tolower(ch);
 		result[i] = (char) ch;
 	}
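
To illustrate the effect of the new guard outside the backend, here is a minimal standalone sketch of the patched loop. It is not the PostgreSQL source: `downcase_sketch` and `HIGHBIT_SET` are hypothetical names, the caller passes `enc_is_single_byte` directly instead of calling `pg_database_encoding_max_length()`, and the output buffer is supplied by the caller rather than obtained from `palloc()`. Truncation and the `warn` argument are left out.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the IS_HIGHBIT_SET test used in the patch. */
#define HIGHBIT_SET(ch) (((unsigned char) (ch)) & 0x80)

/*
 * Sketch of the patched downcasing loop (assumed simplification).
 * "out" must have room for len + 1 bytes; the caller decides whether
 * the encoding is single-byte.
 */
static void
downcase_sketch(const char *ident, size_t len, bool enc_is_single_byte,
				char *out)
{
	size_t		i;

	for (i = 0; i < len; i++)
	{
		unsigned char ch = (unsigned char) ident[i];

		if (ch >= 'A' && ch <= 'Z')
			ch += 'a' - 'A';	/* ASCII-only downcasing for 7-bit chars */
		else if (enc_is_single_byte && HIGHBIT_SET(ch) && isupper(ch))
			ch = (unsigned char) tolower(ch);	/* locale-aware, single-byte only */
		/* otherwise: a byte of a multi-byte character, copied untouched */
		out[i] = (char) ch;
	}
	out[len] = '\0';
}
```

With `enc_is_single_byte` false, the two UTF-8 bytes of "Ä" (0xC3 0x84) pass through unchanged while surrounding ASCII letters are still folded. Without the guard, a locale whose single-byte character set treats 0xC3 as an uppercase letter (ISO-8859-1 does, mapping it to 0xE3) could rewrite the lead byte on its own and leave a different or invalid byte sequence, which is presumably the kind of junk output the commit message describes.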
