Skip to content

Commit 2e2d56f

Browse files
committed
Information schema views for collation support
Add the views character_sets, collations, and collation_character_set_applicability.
1 parent 183d3cf commit 2e2d56f

File tree

3 files changed

+292
-4
lines changed

3 files changed

+292
-4
lines changed

doc/src/sgml/information_schema.sgml

Lines changed: 251 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -498,6 +498,140 @@
498498
</para>
499499
</sect1>
500500

501+
<sect1 id="infoschema-character-sets">
502+
<title><literal>character_sets</literal></title>
503+
504+
<para>
505+
The view <literal>character_sets</literal> identifies the character
506+
sets available in the current database. Since PostgreSQL does not
507+
support multiple character sets within one database, this view only
508+
shows one, which is the database encoding.
509+
</para>
510+
511+
<para>
512+
Take note of how the following terms are used in the SQL standard:
513+
<variablelist>
514+
<varlistentry>
515+
<term>character repertoire</term>
516+
<listitem>
517+
<para>
518+
An abstract collection of characters, for
519+
example <literal>UNICODE</literal>, <literal>UCS</literal>, or
520+
<literal>LATIN1</literal>. Not exposed as an SQL object, but
521+
visible in this view.
522+
</para>
523+
</listitem>
524+
</varlistentry>
525+
526+
<varlistentry>
527+
<term>character encoding form</term>
528+
<listitem>
529+
<para>
530+
An encoding of some character repertoire. Most older character
531+
repertoires only use one encoding form, and so there are no
532+
separate names for them (e.g., <literal>LATIN1</literal> is an
533+
encoding form applicable to the <literal>LATIN1</literal>
534+
repertoire). But for example Unicode has the encoding forms
535+
<literal>UTF8</literal>, <literal>UTF16</literal>, etc. (not
536+
all supported by PostgreSQL). Encoding forms are not exposed
537+
as an SQL object, but are visible in this view.
538+
</para>
539+
</listitem>
540+
</varlistentry>
541+
542+
<varlistentry>
543+
<term>character set</term>
544+
<listitem>
545+
<para>
546+
A named SQL object that identifies a character repertoire, a
547+
character encoding, and a default collation. A predefined
548+
character set would typically have the same name as an encoding
549+
form, but users could define other names. For example, the
550+
character set <literal>UTF8</literal> would typically identify
551+
the character repertoire <literal>UCS</literal>, encoding
552+
form <literal>UTF8</literal>, and some default collation.
553+
</para>
554+
</listitem>
555+
</varlistentry>
556+
</variablelist>
557+
558+
You can think of an <quote>encoding</quote> in PostgreSQL either as
559+
a character set or a character encoding form. They will have the
560+
same name, and there can only be one in one database.
561+
</para>
562+
563+
<table>
564+
<title><literal>character_sets</literal> Columns</title>
565+
566+
<tgroup cols="3">
567+
<thead>
568+
<row>
569+
<entry>Name</entry>
570+
<entry>Data Type</entry>
571+
<entry>Description</entry>
572+
</row>
573+
</thead>
574+
575+
<tbody>
576+
<row>
577+
<entry><literal>character_set_catalog</literal></entry>
578+
<entry><literal>sql_identifier</literal></entry>
579+
<entry>Character sets are currently not implemented as schema objects, so this column is null.</entry>
580+
</row>
581+
582+
<row>
583+
<entry><literal>character_set_schema</literal></entry>
584+
<entry><literal>sql_identifier</literal></entry>
585+
<entry>Character sets are currently not implemented as schema objects, so this column is null.</entry>
586+
</row>
587+
588+
<row>
589+
<entry><literal>character_set_name</literal></entry>
590+
<entry><literal>sql_identifier</literal></entry>
591+
<entry>Name of the character set, currently implemented as showing the name of the database encoding</entry>
592+
</row>
593+
594+
<row>
595+
<entry><literal>character_repertoire</literal></entry>
596+
<entry><literal>sql_identifier</literal></entry>
597+
<entry>Character repertoire, showing <literal>UCS</literal> if the encoding is <literal>UTF8</literal>, else just the encoding name</entry>
598+
</row>
599+
600+
<row>
601+
<entry><literal>form_of_use</literal></entry>
602+
<entry><literal>sql_identifier</literal></entry>
603+
<entry>Character encoding form, same as the database encoding</entry>
604+
</row>
605+
606+
<row>
607+
<entry><literal>default_collate_catalog</literal></entry>
608+
<entry><literal>sql_identifier</literal></entry>
609+
<entry>Name of the database containing the default collation (always the current database, if any collation is identified)</entry>
610+
</row>
611+
612+
<row>
613+
<entry><literal>default_collate_schema</literal></entry>
614+
<entry><literal>sql_identifier</literal></entry>
615+
<entry>Name of the schema containing the default collation</entry>
616+
</row>
617+
618+
<row>
619+
<entry><literal>default_collate_name</literal></entry>
620+
<entry><literal>sql_identifier</literal></entry>
621+
<entry>
622+
Name of the default collation. The default collation is
623+
identified as the collation that matches
624+
the <literal>COLLATE</literal> and <literal>CTYPE</literal>
625+
settings of the current database. If there is no such
626+
collation, then this column and the associated schema and
627+
catalog columns are null.
628+
</entry>
629+
</row>
630+
</tbody>
631+
</tgroup>
632+
</table>
633+
</sect1>
634+
501635
<sect1 id="infoschema-check-constraint-routine-usage">
502636
<title><literal>check_constraint_routine_usage</literal></title>
503637

@@ -615,6 +749,123 @@
615749
</table>
616750
</sect1>
617751

752+
<sect1 id="infoschema-collations">
753+
<title><literal>collations</literal></title>
754+
755+
<para>
756+
The view <literal>collations</literal> contains the collations
757+
available in the current database.
758+
</para>
759+
760+
<table>
761+
<title><literal>collations</literal> Columns</title>
762+
763+
<tgroup cols="3">
764+
<thead>
765+
<row>
766+
<entry>Name</entry>
767+
<entry>Data Type</entry>
768+
<entry>Description</entry>
769+
</row>
770+
</thead>
771+
772+
<tbody>
773+
<row>
774+
<entry><literal>collation_catalog</literal></entry>
775+
<entry><literal>sql_identifier</literal></entry>
776+
<entry>Name of the database containing the collation (always the current database)</entry>
777+
</row>
778+
779+
<row>
780+
<entry><literal>collation_schema</literal></entry>
781+
<entry><literal>sql_identifier</literal></entry>
782+
<entry>Name of the schema containing the collation</entry>
783+
</row>
784+
785+
<row>
786+
<entry><literal>collation_name</literal></entry>
787+
<entry><literal>sql_identifier</literal></entry>
788+
<entry>Name of the default collation</entry>
789+
</row>
790+
791+
<row>
792+
<entry><literal>pad_attribute</literal></entry>
793+
<entry><literal>character_data</literal></entry>
794+
<entry>
795+
Always <literal>NO PAD</literal> (The alternative <literal>PAD
796+
SPACE</literal> is not supported by PostgreSQL.)
797+
</entry>
798+
</row>
799+
</tbody>
800+
</tgroup>
801+
</table>
802+
</sect1>
803+
804+
<sect1 id="infoschema-collation-character-set-applicability">
805+
<title><literal>collation_character_set_applicability</literal></title>
806+
807+
<para>
808+
The view <literal>collation_character_set_applicability</literal>
809+
identifies which character set the available collations are
810+
applicable to. In PostgreSQL, there is only one character set per
811+
database (see explanation
812+
in <xref linkend="infoschema-character-sets">), so this view does
813+
not provide much useful information.
814+
</para>
815+
816+
<table>
817+
<title><literal>collation_character_set_applicability</literal> Columns</title>
818+
819+
<tgroup cols="3">
820+
<thead>
821+
<row>
822+
<entry>Name</entry>
823+
<entry>Data Type</entry>
824+
<entry>Description</entry>
825+
</row>
826+
</thead>
827+
828+
<tbody>
829+
<row>
830+
<entry><literal>collation_catalog</literal></entry>
831+
<entry><literal>sql_identifier</literal></entry>
832+
<entry>Name of the database containing the collation (always the current database)</entry>
833+
</row>
834+
835+
<row>
836+
<entry><literal>collation_schema</literal></entry>
837+
<entry><literal>sql_identifier</literal></entry>
838+
<entry>Name of the schema containing the collation</entry>
839+
</row>
840+
841+
<row>
842+
<entry><literal>collation_name</literal></entry>
843+
<entry><literal>sql_identifier</literal></entry>
844+
<entry>Name of the default collation</entry>
845+
</row>
846+
847+
<row>
848+
<entry><literal>character_set_catalog</literal></entry>
849+
<entry><literal>sql_identifier</literal></entry>
850+
<entry>Character sets are currently not implemented as schema objects, so this column is null</entry>
851+
</row>
852+
853+
<row>
854+
<entry><literal>character_set_schema</literal></entry>
855+
<entry><literal>sql_identifier</literal></entry>
856+
<entry>Character sets are currently not implemented as schema objects, so this column is null</entry>
857+
</row>
858+
859+
<row>
860+
<entry><literal>character_set_name</literal></entry>
861+
<entry><literal>sql_identifier</literal></entry>
862+
<entry>Name of the character set</entry>
863+
</row>
864+
</tbody>
865+
</tgroup>
866+
</table>
867+
</sect1>
868+
618869
<sect1 id="infoschema-column-domain-usage">
619870
<title><literal>column_domain_usage</literal></title>
620871

src/backend/catalog/information_schema.sql

Lines changed: 40 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -354,7 +354,23 @@ GRANT SELECT ON attributes TO PUBLIC;
354354
* CHARACTER_SETS view
355355
*/
356356

357-
-- feature not supported
357+
CREATE VIEW character_sets AS
358+
SELECT CAST(null AS sql_identifier) AS character_set_catalog,
359+
CAST(null AS sql_identifier) AS character_set_schema,
360+
CAST(getdatabaseencoding() AS sql_identifier) AS character_set_name,
361+
CAST(CASE WHEN getdatabaseencoding() = 'UTF8' THEN 'UCS' ELSE getdatabaseencoding() END AS sql_identifier) AS character_repertoire,
362+
CAST(getdatabaseencoding() AS sql_identifier) AS form_of_use,
363+
CAST(current_database() AS sql_identifier) AS default_collate_catalog,
364+
CAST(nc.nspname AS sql_identifier) AS default_collate_schema,
365+
CAST(c.collname AS sql_identifier) AS default_collate_name
366+
FROM pg_database d
367+
LEFT JOIN (pg_collation c JOIN pg_namespace nc ON (c.collnamespace = nc.oid))
368+
ON (datcollate = collcollate AND datctype = collctype)
369+
WHERE d.datname = current_database()
370+
ORDER BY char_length(c.collname) DESC, c.collname ASC -- prefer full/canonical name
371+
LIMIT 1;
372+
373+
GRANT SELECT ON character_sets TO PUBLIC;
358374

359375

360376
/*
@@ -425,14 +441,35 @@ GRANT SELECT ON check_constraints TO PUBLIC;
425441
* COLLATIONS view
426442
*/
427443

428-
-- feature not supported
444+
CREATE VIEW collations AS
445+
SELECT CAST(current_database() AS sql_identifier) AS collation_catalog,
446+
CAST(nc.nspname AS sql_identifier) AS collation_schema,
447+
CAST(c.collname AS sql_identifier) AS collation_name,
448+
CAST('NO PAD' AS character_data) AS pad_attribute
449+
FROM pg_collation c, pg_namespace nc
450+
WHERE c.collnamespace = nc.oid
451+
AND collencoding = (SELECT encoding FROM pg_catalog.pg_database WHERE datname = pg_catalog.current_database());
452+
453+
GRANT SELECT ON collations TO PUBLIC;
454+
429455

430456
/*
431457
* 5.16
432458
* COLLATION_CHARACTER_SET_APPLICABILITY view
433459
*/
434460

435-
-- feature not supported
461+
CREATE VIEW collation_character_set_applicability AS
462+
SELECT CAST(current_database() AS sql_identifier) AS collation_catalog,
463+
CAST(nc.nspname AS sql_identifier) AS collation_schema,
464+
CAST(c.collname AS sql_identifier) AS collation_name,
465+
CAST(null AS sql_identifier) AS character_set_catalog,
466+
CAST(null AS sql_identifier) AS character_set_schema,
467+
CAST(getdatabaseencoding() AS sql_identifier) AS character_set_name
468+
FROM pg_collation c, pg_namespace nc
469+
WHERE c.collnamespace = nc.oid
470+
AND collencoding = (SELECT encoding FROM pg_catalog.pg_database WHERE datname = pg_catalog.current_database());
471+
472+
GRANT SELECT ON collation_character_set_applicability TO PUBLIC;
436473

437474

438475
/*

src/include/catalog/catversion.h

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,6 @@
5353
*/
5454

5555
/* yyyymmddN */
56-
#define CATALOG_VERSION_NO 201102084
56+
#define CATALOG_VERSION_NO 201102091
5757

5858
#endif

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy