Commit b82323e

This adds mention of my latest tweak to the tsearch2/pg_trgm
integration. It is much better to create a word list of unstemmed words than stemmed ones. Chris K-L
1 parent c2e5631 commit b82323e

1 file changed: +9 -5 lines changed

contrib/pg_trgm/README.pg_trgm

Lines changed: 9 additions & 5 deletions
@@ -100,11 +100,15 @@ Tsearch2 Integration
 The first step is to generate an auxiliary table containing all
 the unique words in the Tsearch2 index:
 
-CREATE TABLE words AS
-SELECT word FROM stat('SELECT vector FROM documents');
-
-Where 'documents' is the table that contains the Tsearch2 index
-column 'vector', of type 'tsvector'.
+CREATE TABLE words AS SELECT word FROM
+stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
+
+Where 'documents' is a table that has a text field 'bodytext'
+that TSearch2 is used to search. The use of the 'simple' dictionary
+with the to_tsvector function, instead of just using the already
+existing vector is to avoid creating a list of already stemmed
+words. This way, only the original, unstemmed words are added
+to the word list.
 
 Next, create a trigram index on the word column:

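The effect of the change above can be illustrated with a short sketch. It assumes PostgreSQL 8.3 or later, where text search is built in; the 'english' configuration is an assumption standing in for whatever stemming configuration built the existing vector, and ts_stat is the built-in counterpart of the tsearch2 stat() function shown in the diff. The 'documents' table and 'bodytext' column are the ones named in the README.

```sql
-- Assumption: PostgreSQL 8.3+ with built-in text search. The 'english'
-- configuration is a stand-in for the stemming configuration that
-- produced the existing tsvector column.

-- Stemmed lexemes, as an existing tsvector column would hold them:
SELECT to_tsvector('english', 'searching documents');
-- lexemes: 'document', 'search'

-- Unstemmed lexemes from the 'simple' configuration:
SELECT to_tsvector('simple', 'searching documents');
-- lexemes: 'documents', 'searching'

-- Word list built from the unstemmed lexemes, so that trigram matching
-- later compares user input against words as they were actually written.
CREATE TABLE words AS
    SELECT word
    FROM ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
```

A misspelled query term is usually closer in trigram terms to the full written word than to its stem, which is why the unstemmed list tends to give better suggestions.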