F.49. pg_tsparser — an extension for text search #

pg_tsparser is a Postgres Pro extension for text search. This extension modifies the default text parsing strategy for words that include:

  • underscores

  • numbers and letters separated by the hyphen character

In addition to separate word parts returned by default, pg_tsparser also returns the whole word.

F.49.1. Installation and Setup #

pg_tsparser is included into the Postgres Pro distribution. To enable pg_tsparser, once Postgres Pro is installed, create the pg_tsparser extension for each database you are planning to use:

CREATE EXTENSION pg_tsparser;

Once pg_tsparser is enabled, you can create your own text search configuration. In addition to pg_tsparser, you can use any available dictionary.

For example, you can create english_ts configuration for the English language, as follows:

CREATE TEXT SEARCH CONFIGURATION english_ts (
    PARSER = tsparser
);

COMMENT ON TEXT SEARCH CONFIGURATION english_ts IS 'text search configuration for english language';

ALTER TEXT SEARCH CONFIGURATION english_ts
    ADD MAPPING FOR email, file, float, host, hword_numpart, int,
    numhword, numword, sfloat, uint, url, url_path, version
    WITH simple;

ALTER TEXT SEARCH CONFIGURATION english_ts
    ADD MAPPING FOR asciiword, asciihword, hword_asciipart,
    word, hword, hword_part
    WITH english_stem;

F.49.2. Examples #

The following examples illustrate the difference in search results returned by pg_tsparser and the default parser:

SELECT to_tsvector('english', 'pg_trgm') as def_parser,
       to_tsvector('english_ts', 'pg_trgm')  as new_parser;
   def_parser    |         new_parser
-----------------+-----------------------------
 'pg':1 'trgm':2 | 'pg':2 'pg_trgm':1 'trgm':3
(1 row)

SELECT to_tsvector('english', '123-abc') as def_parser,
       to_tsvector('english_ts', '123-abc')  as new_parser;
   def_parser    |         new_parser
-----------------+-----------------------------
 '123':1 'abc':2 | '123':2 '123-abc':1 'abc':3
(1 row)

SELECT to_tsvector('english', 'rel-3.2-A') as def_parser,
       to_tsvector('english_ts', 'rel-3.2-A')  as new_parser;
    def_parser    |          new_parser
------------------+-------------------------------
 '-3.2':2 'rel':1 | '3.2':3 'rel':2 'rel-3.2-a':1
(1 row)

See Also

CREATE TEXT SEARCH CONFIGURATION

ALTER TEXT SEARCH CONFIGURATION

F.49.3. Authors #

Postgres Professional, Moscow, Russia

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy