Commit 1420f3a

Fix ts_rank_cd() to ignore stripped lexemes
Previously, stripped lexemes got a default location and could be considered if mixed with non-stripped lexemes.

BACKWARD INCOMPATIBILITY CHANGE
1 parent bb42e21 commit 1420f3a

4 files changed: 30 additions, 5 deletions

doc/src/sgml/textsearch.sgml

Lines changed: 7 additions & 3 deletions
@@ -889,9 +889,13 @@ SELECT plainto_tsquery('english', 'The Fat & Rats:C');
 </para>
 
 <para>
-This function requires positional information in its input.
-Therefore it will not work on <quote>stripped</> <type>tsvector</>
-values &mdash; it will always return zero.
+This function requires lexeme positional information to perform
+its calculation. Therefore, it ignores any <quote>stripped</>
+lexemes in the <type>tsvector</>. If there are no unstripped
+lexemes in the input, the result will be zero. (See <xref
+linkend="textsearch-manipulate-tsvector"> for more information
+about the <function>strip</> function and positional information
+in <type>tsvector</>s.)
 </para>
 </listitem>
 </varlistentry>

src/backend/utils/adt/tsrank.c

Lines changed: 3 additions & 2 deletions
@@ -658,8 +658,9 @@ get_docrep(TSVector txt, QueryRepresentation *qr, int *doclen)
 }
 else
 {
-    dimt = POSNULL.npos;
-    post = POSNULL.pos;
+    /* ignore words without positions */
+    entry++;
+    continue;
 }
 
 while (cur + dimt >= len)
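
The behavior change in this hunk can be illustrated with a minimal, self-contained C sketch. It is not the PostgreSQL source: `SketchEntry` and `collect_positions` are hypothetical names invented for illustration, and the real `get_docrep()` operates on a `TSVector` and a `QueryRepresentation`. The sketch only models the idea that an entry without positions is now skipped instead of being given a default position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tsvector entry; not the real on-disk layout. */
typedef struct
{
    const char *lexeme;     /* lexeme text */
    int         npos;       /* number of stored positions; 0 means stripped */
    const int  *pos;        /* positions, or NULL when stripped */
} SketchEntry;

/*
 * Copy the positions of every unstripped entry into out[] (capacity cap)
 * and return how many positions were written.  Entries without positions
 * are skipped rather than assigned a default position.
 */
size_t
collect_positions(const SketchEntry *entries, size_t nentries,
                  int *out, size_t cap)
{
    size_t      n = 0;

    assert(entries != NULL || nentries == 0);

    for (size_t i = 0; i < nentries; i++)
    {
        const SketchEntry *entry = &entries[i];

        if (entry->npos == 0)
            continue;           /* ignore words without positions */

        for (int j = 0; j < entry->npos && n < cap; j++)
            out[n++] = entry->pos[j];
    }

    return n;
}
```

In this model, an input whose entries are all stripped contributes no positions at all, which corresponds to the documented result of zero when the input has no unstripped lexemes.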

src/test/regress/expected/tsearch.out

Lines changed: 14 additions & 0 deletions
@@ -596,6 +596,20 @@ S. T. Coleridge (1772-1834)
 0.1
 (1 row)
 
+SELECT ts_rank_cd(strip(to_tsvector('both stripped')),
+                  to_tsquery('both & stripped'));
+ts_rank_cd
+------------
+0
+(1 row)
+
+SELECT ts_rank_cd(to_tsvector('unstripped') || strip(to_tsvector('stripped')),
+                  to_tsquery('unstripped & stripped'));
+ts_rank_cd
+------------
+0
+(1 row)
+
 --headline tests
 SELECT ts_headline('english', '
 Day after day, day after day,

src/test/regress/sql/tsearch.sql

Lines changed: 6 additions & 0 deletions
@@ -165,6 +165,12 @@ Water, water, every where,
 S. T. Coleridge (1772-1834)
 '), to_tsquery('english', 'ocean'));
 
+SELECT ts_rank_cd(strip(to_tsvector('both stripped')),
+                  to_tsquery('both & stripped'));
+
+SELECT ts_rank_cd(to_tsvector('unstripped') || strip(to_tsvector('stripped')),
+                  to_tsquery('unstripped & stripped'));
+
 --headline tests
 SELECT ts_headline('english', '
 Day after day, day after day,
