Commit 875e46a

Documentation update for Standard Collations.
Correct out-of-date text that said the "default" collation is always based on LC_COLLATE and LC_CTYPE. Also reformat into a list to make it easier to understand and compare the available collations, and briefly document the stability characteristics of each one.

Discussion: https://postgr.es/m/4a69d067374d2f6bfb66f5bfb2ab9a020493d49f.camel@j-davis.com
1 parent 1e01374 commit 875e46a

1 file changed: +45 -27 lines changed

doc/src/sgml/charset.sgml

Lines changed: 45 additions & 27 deletions
@@ -788,37 +788,19 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
 <title>Standard Collations</title>
 
 <para>
-On all platforms, the collations named <literal>default</literal>,
-<literal>C</literal>, and <literal>POSIX</literal> are available. Additional
-collations may be available depending on operating system support.
-The <literal>default</literal> collation selects the <symbol>LC_COLLATE</symbol>
-and <symbol>LC_CTYPE</symbol> values specified at database creation time.
-The <literal>C</literal> and <literal>POSIX</literal> collations both specify
-<quote>traditional C</quote> behavior, in which only the ASCII letters
-<quote><literal>A</literal></quote> through <quote><literal>Z</literal></quote>
-are treated as letters, and sorting is done strictly by character
-code byte values.
-</para>
-
-<note>
-<para>
-The <literal>C</literal> and <literal>POSIX</literal> locales may behave
-differently depending on the database encoding.
-</para>
-</note>
-
-<para>
-Additionally, two SQL standard collation names are available:
+On all platforms, the following collations are supported:
 
 <variablelist>
 <varlistentry>
 <term><literal>unicode</literal></term>
 <listitem>
 <para>
-This collation sorts using the Unicode Collation Algorithm with the
-Default Unicode Collation Element Table. It is available in all
-encodings. ICU support is required to use this collation. (This
-collation has the same behavior as the ICU root locale; see <xref
+This SQL standard collation sorts using the Unicode Collation
+Algorithm with the Default Unicode Collation Element Table. It is
+available in all encodings. ICU support is required to use this
+collation, and behavior may change if Postgres is built with a
+different version of ICU. (This collation has the same behavior as
+the ICU root locale; see <xref
 linkend="collation-managing-predefined-icu-und-x-icu"/>.)
 </para>
 </listitem>
@@ -828,15 +810,51 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
 <term><literal>ucs_basic</literal></term>
 <listitem>
 <para>
-This collation sorts by Unicode code point. It is only available for
-encoding <literal>UTF8</literal>. (This collation has the same
+This SQL standard collation sorts using the Unicode code point values
+rather than natural language order, and only the ASCII letters
+<quote><literal>A</literal></quote> through
+<quote><literal>Z</literal></quote> are treated as letters. The
+behavior is efficient and stable across all versions. Only available
+for encoding <literal>UTF8</literal>. (This collation has the same
 behavior as the libc locale specification <literal>C</literal> in
 <literal>UTF8</literal> encoding.)
 </para>
 </listitem>
 </varlistentry>
+
+<varlistentry>
+<term><literal>C</literal> (equivalent to <literal>POSIX</literal>)</term>
+<listitem>
+<para>
+The <literal>C</literal> and <literal>POSIX</literal> collations are
+based on <quote>traditional C</quote> behavior. They sort by byte
+values rather than natural language order, and only the ASCII letters
+<quote><literal>A</literal></quote> through
+<quote><literal>Z</literal></quote> are treated as letters. The
+behavior is efficient and stable across all versions for a given
+database encoding, but behavior may vary between different database
+encodings.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><literal>default</literal></term>
+<listitem>
+<para>
+The <literal>default</literal> collation selects the locale specified
+at database creation time.
+</para>
+</listitem>
+</varlistentry>
 </variablelist>
 </para>
+
+<para>
+Additional collations may be available depending on operating system
+support. The efficiency and stability of these additional collations
+depend on the collation provider, the provider version, and the locale.
+</para>
 </sect3>
 
 <sect3 id="collation-managing-predefined">
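
The new text says that C/POSIX sort by byte values, that ucs_basic sorts by code point and behaves like C in UTF8, and that C/POSIX results may vary between database encodings. The following Python sketch illustrates those three statements outside of PostgreSQL. It is only a model: it assumes that comparing encoded byte strings approximates the byte-wise comparison the C collation performs, and the helper names sort_by_bytes and sort_by_code_point are hypothetical, not part of any PostgreSQL API. KOI8-R is used purely as an example of an encoding whose byte order differs from code point order.

```python
def sort_by_bytes(values, encoding):
    """Order strings by their encoded byte values (a model of C/POSIX)."""
    # Python compares bytes objects lexicographically by unsigned byte value.
    return sorted(values, key=lambda s: s.encode(encoding))


def sort_by_code_point(values):
    """Order strings by Unicode code point (a model of ucs_basic)."""
    # Python compares str objects code point by code point.
    return sorted(values)


latin = ["b", "a", "B", "é", "Z"]
cyrillic = ["ц", "в"]  # U+0446 and U+0432

# In UTF-8, byte order and code point order agree.
utf8_order = sort_by_bytes(latin, "utf-8")
code_point_order = sort_by_code_point(latin)

# The same two strings order differently under a different encoding.
koi8r_order = sort_by_bytes(cyrillic, "koi8_r")
utf8_cyrillic_order = sort_by_bytes(cyrillic, "utf-8")

print(utf8_order)
print(code_point_order)
print(koi8r_order)
print(utf8_cyrillic_order)
```

In this model, uppercase ASCII letters sort before lowercase ones and accented letters sort after both, which is the "not natural language order" behavior the patch describes. The Cyrillic pair swaps places between KOI8-R and UTF-8, which is one way the "may vary between different database encodings" caveat can show up.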
