Commit a93b3b9

Fix bug in the tsvector stats collection function, which caused a crash if
the sample contained just one tsvector containing only one lexeme.
1 parent fb645f6 commit a93b3b9
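
The change below moves the reads of sort_table[num_mcelem - 1] and sort_table[0] inside the existing `if (num_mcelem > 0)` block. A plausible reading is that num_mcelem could be 0 in the crashing case, so the old code indexed the table at -1. The following is a minimal sketch of that guard in isolation, not the PostgreSQL code: `DemoItem` and `demo_freq_bounds` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the TrackItem struct in the patch. */
typedef struct
{
    int frequency;
} DemoItem;

/*
 * Reads the smallest and largest frequency from a table that is sorted by
 * descending frequency. Returns 1 and fills the outputs when there is at
 * least one entry; returns 0 and leaves the outputs untouched otherwise,
 * so sorted[-1] is never read.
 */
static int
demo_freq_bounds(DemoItem **sorted, int num_mcelem, int *minfreq, int *maxfreq)
{
    assert(num_mcelem >= 0);

    if (num_mcelem > 0)
    {
        *minfreq = sorted[num_mcelem - 1]->frequency;
        *maxfreq = sorted[0]->frequency;
        return 1;
    }
    return 0;
}
```

Calling `demo_freq_bounds(NULL, 0, &lo, &hi)` is safe in this sketch because the table is only dereferenced when the count is positive, which mirrors the placement the patch chooses.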

1 file changed: +22 -21 lines changed

src/backend/tsearch/ts_typanalyze.c

Lines changed: 22 additions & 21 deletions
@@ -7,7 +7,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/tsearch/ts_typanalyze.c,v 1.2 2008/09/19 19:03:40 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/tsearch/ts_typanalyze.c,v 1.3 2008/11/27 21:17:39 heikki Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -290,33 +290,34 @@ compute_tsvector_stats(VacAttrStats *stats,
     if (num_mcelem > track_len)
         num_mcelem = track_len;
 
-    /* Grab the minimal and maximal frequencies that will get stored */
-    minfreq = sort_table[num_mcelem - 1]->frequency;
-    maxfreq = sort_table[0]->frequency;
-
-    /*
-     * We want to store statistics sorted on the lexeme value using first
-     * length, then byte-for-byte comparison. The reason for doing length
-     * comparison first is that we don't care about the ordering so long
-     * as it's consistent, and comparing lengths first gives us a chance
-     * to avoid a strncmp() call.
-     *
-     * This is different from what we do with scalar statistics -- they get
-     * sorted on frequencies. The rationale is that we usually search
-     * through most common elements looking for a specific value, so we can
-     * grab its frequency. When values are presorted we can employ binary
-     * search for that. See ts_selfuncs.c for a real usage scenario.
-     */
-    qsort(sort_table, num_mcelem, sizeof(TrackItem *),
-          trackitem_compare_lexemes);
-
     /* Generate MCELEM slot entry */
     if (num_mcelem > 0)
     {
         MemoryContext old_context;
         Datum *mcelem_values;
         float4 *mcelem_freqs;
 
+        /* Grab the minimal and maximal frequencies that will get stored */
+        minfreq = sort_table[num_mcelem - 1]->frequency;
+        maxfreq = sort_table[0]->frequency;
+
+        /*
+         * We want to store statistics sorted on the lexeme value using
+         * first length, then byte-for-byte comparison. The reason for
+         * doing length comparison first is that we don't care about the
+         * ordering so long as it's consistent, and comparing lengths first
+         * gives us a chance to avoid a strncmp() call.
+         *
+         * This is different from what we do with scalar statistics -- they
+         * get sorted on frequencies. The rationale is that we usually
+         * search through most common elements looking for a specific
+         * value, so we can grab its frequency. When values are presorted
+         * we can employ binary search for that. See ts_selfuncs.c for a
+         * real usage scenario.
+         */
+        qsort(sort_table, num_mcelem, sizeof(TrackItem *),
+              trackitem_compare_lexemes);
+
         /* Must copy the target values into anl_context */
         old_context = MemoryContextSwitchTo(stats->anl_context);
 
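
The comment in this hunk describes ordering lexemes by length first and then byte-for-byte, so that a frequency can later be found by binary search. The sketch below illustrates that idea under simplifying assumptions: `DemoLexeme`, `demo_compare_lexemes`, `demo_sort_by_lexeme`, and `demo_lookup_frequency` are hypothetical names, not the definitions from ts_typanalyze.c or ts_selfuncs.c, and the lexemes here are ordinary NUL-terminated C strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a tracked lexeme and its frequency. */
typedef struct
{
    const char *lexeme;
    size_t      length;
    int         frequency;
} DemoLexeme;

/*
 * Orders by length first, then byte-for-byte. Any consistent order would
 * do; comparing lengths first often avoids the strncmp() call entirely.
 */
static int
demo_compare_lexemes(const void *a, const void *b)
{
    const DemoLexeme *x = *(const DemoLexeme *const *) a;
    const DemoLexeme *y = *(const DemoLexeme *const *) b;

    if (x->length != y->length)
        return (x->length < y->length) ? -1 : 1;
    return strncmp(x->lexeme, y->lexeme, x->length);
}

/* Sorts an array of pointers in place using the comparator above. */
static void
demo_sort_by_lexeme(DemoLexeme **table, size_t n)
{
    qsort(table, n, sizeof(DemoLexeme *), demo_compare_lexemes);
}

/*
 * Binary-searches a table already sorted by demo_sort_by_lexeme.
 * Returns the stored frequency, or -1 when the word is absent.
 */
static int
demo_lookup_frequency(DemoLexeme **sorted, size_t n, const char *word)
{
    DemoLexeme   key = {word, strlen(word), 0};
    DemoLexeme  *keyp = &key;
    DemoLexeme **hit;

    hit = bsearch(&keyp, sorted, n, sizeof(DemoLexeme *),
                  demo_compare_lexemes);
    return hit ? (*hit)->frequency : -1;
}
```

The same comparator is used for both sorting and searching, which is what makes the binary search valid: the order does not need to be alphabetical, only consistent.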