
Items on GIN data pages are no longer always 6 bytes; update gincoste… · postgrespro/postgres@17d787a · GitHub

Commit 17d787a

Items on GIN data pages are no longer always 6 bytes; update gincostestimate.
Also improve the comments a bit.
1 parent 588fb50 commit 17d787a
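
The commit says item width now depends on the level of compression, but it does not show the compression scheme. As a rough illustration of how a stored item can shrink below 6 bytes, here is a minimal sketch that assumes a generic variable-byte encoding of the delta between consecutive item pointers. The function names and the encoding details are assumptions for illustration, not PostgreSQL's posting-list code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: store a delta using 7 data bits per byte. The high bit
 * is set on every byte except the last one. Returns the bytes written.
 */
static size_t
sketch_encode_varbyte(uint64_t delta, unsigned char *out)
{
    size_t      n = 0;

    assert(out != NULL);

    while (delta > 0x7F)
    {
        out[n++] = (unsigned char) (0x80 | (delta & 0x7F));
        delta >>= 7;
    }
    out[n++] = (unsigned char) delta;
    return n;
}

/* Inverse of the encoder above. Returns the number of bytes consumed. */
static size_t
sketch_decode_varbyte(const unsigned char *in, uint64_t *delta)
{
    uint64_t    val = 0;
    unsigned    shift = 0;
    size_t      n = 0;

    assert(in != NULL && delta != NULL);

    for (;;)
    {
        unsigned char b = in[n++];

        val |= (uint64_t) (b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            break;
        shift += 7;
    }
    *delta = val;
    return n;
}
```

Under this assumed scheme, a small delta between neighbouring items takes one or two bytes and only large deltas need more, which is consistent with the commit's choice of an average well below the old fixed 6 bytes.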

1 file changed

+16 -17 lines changed

src/backend/utils/adt/selfuncs.c

Lines changed: 16 additions & 17 deletions
@@ -7291,31 +7291,30 @@ gincostestimate(PG_FUNCTION_ARGS)
     *indexStartupCost = (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
 
     /*
-     * Now we compute the number of data pages fetched while the scan
-     * proceeds.
+     * Now compute the number of data pages fetched during the scan.
+     *
+     * We assume every entry to have the same number of items, and that there
+     * is no overlap between them. (XXX: tsvector and array opclasses collect
+     * statistics on the frequency of individual keys; it would be nice to
+     * use those here.)
      */
-
-    /* data pages scanned for each exact (non-partial) matched entry */
     dataPagesFetched = ceil(numDataPages * counts.exactEntries / numEntries);
 
     /*
-     * Estimate number of data pages read, using selectivity estimation and
-     * capacity of data page.
+     * If there is a lot of overlap among the entries, in particular if one
+     * of the entries is very frequent, the above calculation can grossly
+     * under-estimate. As a simple cross-check, calculate a lower bound
+     * based on the overall selectivity of the quals. At a minimum, we must
+     * read one item pointer for each matching entry.
+     *
+     * The width of each item pointer varies, based on the level of
+     * compression. We don't have statistics on that, but an average of
+     * around 3 bytes per item is fairly typical.
      */
     dataPagesFetchedBySel = ceil(*indexSelectivity *
-                                 (numTuples / (BLCKSZ / SizeOfIptrData)));
-
+                                 (numTuples / (BLCKSZ / 3)));
     if (dataPagesFetchedBySel > dataPagesFetched)
-    {
-        /*
-         * At least one of entries is very frequent and, unfortunately, we
-         * couldn't get statistic about entries (only tsvector has such
-         * statistics). So, we obviously have too small estimation of pages
-         * fetched from data tree. Re-estimate it from known capacity of data
-         * pages
-         */
         dataPagesFetched = dataPagesFetchedBySel;
-    }
 
     /* Account for cache effects, the same as above */
     if (outer_scans > 1 || counts.arrayScans > 1)
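
To see how changing the divisor from 6 bytes to 3 bytes per item affects the lower bound, the two estimates can be reproduced outside the planner. This is a minimal sketch: the helper names, the `SKETCH_BLCKSZ` value of 8192, and the sample inputs are assumptions for illustration. Only the two formulas and the "take the larger one" rule come from the diff.

```c
#include <assert.h>

/* Assumed block size for this sketch; the real BLCKSZ is a build-time constant. */
#define SKETCH_BLCKSZ 8192

/*
 * Stand-in for ceil() from <math.h> on non-negative inputs, kept local so
 * the sketch builds without linking the math library.
 */
static double
sketch_ceil(double x)
{
    double      truncated = (double) (long long) x;

    return (truncated < x) ? truncated + 1.0 : truncated;
}

/*
 * Hypothetical helper, not part of selfuncs.c: combine the per-entry
 * estimate with the selectivity-based lower bound, as the patched code does.
 */
static double
sketch_data_pages_fetched(double numDataPages, double exactEntries,
                          double numEntries, double numTuples,
                          double indexSelectivity, int bytesPerItem)
{
    double      byEntries;
    double      bySel;

    assert(numEntries > 0 && bytesPerItem > 0);

    /* Every entry is assumed to hold the same number of items, no overlap. */
    byEntries = sketch_ceil(numDataPages * exactEntries / numEntries);

    /* Lower bound: matching items packed at bytesPerItem bytes each. */
    bySel = sketch_ceil(indexSelectivity *
                        (numTuples / (SKETCH_BLCKSZ / bytesPerItem)));

    return (bySel > byEntries) ? bySel : byEntries;
}
```

For example, `sketch_data_pages_fetched(1000, 1, 10000, 1000000, 0.5, 3)` yields 184 pages, while the same call with 6 bytes per item yields 367. Under these assumed inputs the old constant made the cross-check roughly twice as large.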


Fetched URL: http://github.com/postgrespro/postgres/commit/17d787a3b160eefb2ff4a3fdf12ca1fedc02cbc1
