Skip to content

Commit 33e9915

Browse files
committed
Use fuzzy not exact cost comparison for the final tie-breaker in add_path.
Instead of an exact cost comparison, use a fuzzy comparison with 1e-10 delta after all other path metrics have proved equal. This is to avoid having platform-specific roundoff behaviors determine the choice when two paths are really the same to our cost estimators. Adjust the recently-added test case that made it obvious we had a problem here.
1 parent 09ff76f commit 33e9915

File tree

2 files changed

+38
-22
lines changed

2 files changed

+38
-22
lines changed

src/backend/optimizer/util/pathnode.c

Lines changed: 30 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -125,8 +125,11 @@ compare_fractional_path_costs(Path *path1, Path *path2,
125125
*
126126
* We use fuzzy comparisons so that add_path() can avoid keeping both of
127127
* a pair of paths that really have insignificantly different cost.
128-
* The fuzz factor is 1% of the smaller cost. (XXX does this percentage
129-
* need to be user-configurable?)
128+
*
129+
* The fuzz_factor argument must be 1.0 plus delta, where delta is the
130+
* fraction of the smaller cost that is considered to be a significant
131+
* difference. For example, fuzz_factor = 1.01 makes the fuzziness limit
132+
* be 1% of the smaller cost.
130133
*
131134
* The two paths are said to have "equal" costs if both startup and total
132135
* costs are fuzzily the same. Path1 is said to be better than path2 if
@@ -138,27 +141,27 @@ compare_fractional_path_costs(Path *path1, Path *path2,
138141
* dominates the other across the whole performance spectrum.
139142
*/
140143
static PathCostComparison
141-
compare_path_costs_fuzzily(Path *path1, Path *path2)
144+
compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
142145
{
143146
/*
144147
* Check total cost first since it's more likely to be different; many
145148
* paths have zero startup cost.
146149
*/
147-
if (path1->total_cost > path2->total_cost * 1.01)
150+
if (path1->total_cost > path2->total_cost * fuzz_factor)
148151
{
149152
/* path1 fuzzily worse on total cost */
150-
if (path2->startup_cost > path1->startup_cost * 1.01)
153+
if (path2->startup_cost > path1->startup_cost * fuzz_factor)
151154
{
152155
/* ... but path2 fuzzily worse on startup, so DIFFERENT */
153156
return COSTS_DIFFERENT;
154157
}
155158
/* else path2 dominates */
156159
return COSTS_BETTER2;
157160
}
158-
if (path2->total_cost > path1->total_cost * 1.01)
161+
if (path2->total_cost > path1->total_cost * fuzz_factor)
159162
{
160163
/* path2 fuzzily worse on total cost */
161-
if (path1->startup_cost > path2->startup_cost * 1.01)
164+
if (path1->startup_cost > path2->startup_cost * fuzz_factor)
162165
{
163166
/* ... but path1 fuzzily worse on startup, so DIFFERENT */
164167
return COSTS_DIFFERENT;
@@ -167,12 +170,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2)
167170
return COSTS_BETTER1;
168171
}
169172
/* fuzzily the same on total cost */
170-
if (path1->startup_cost > path2->startup_cost * 1.01)
173+
if (path1->startup_cost > path2->startup_cost * fuzz_factor)
171174
{
172175
/* ... but path1 fuzzily worse on startup, so path2 wins */
173176
return COSTS_BETTER2;
174177
}
175-
if (path2->startup_cost > path1->startup_cost * 1.01)
178+
if (path2->startup_cost > path1->startup_cost * fuzz_factor)
176179
{
177180
/* ... but path2 fuzzily worse on startup, so path1 wins */
178181
return COSTS_BETTER1;
@@ -354,7 +357,11 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
354357

355358
p1_next = lnext(p1);
356359

357-
costcmp = compare_path_costs_fuzzily(new_path, old_path);
360+
/*
361+
* Do a fuzzy cost comparison with 1% fuzziness limit. (XXX does this
362+
* percentage need to be user-configurable?)
363+
*/
364+
costcmp = compare_path_costs_fuzzily(new_path, old_path, 1.01);
358365

359366
/*
360367
* If the two paths compare differently for startup and total cost,
@@ -403,15 +410,24 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
403410
/*
404411
* Same pathkeys and outer rels, and fuzzily
405412
* the same cost, so keep just one; to decide
406-
* which, first check rows and then do an
407-
* exact cost comparison.
413+
* which, first check rows and then do a fuzzy
414+
* cost comparison with very small fuzz limit.
415+
* (We used to do an exact cost comparison,
416+
* but that results in annoying
417+
* platform-specific plan variations due to
418+
* roundoff in the cost estimates.) If things
419+
* are still tied, arbitrarily keep only the
420+
* old path. Notice that we will keep only
421+
* the old path even if the less-fuzzy
422+
* comparison decides the startup and total
423+
* costs compare differently.
408424
*/
409425
if (new_path->rows < old_path->rows)
410426
remove_old = true; /* new dominates old */
411427
else if (new_path->rows > old_path->rows)
412428
accept_new = false; /* old dominates new */
413-
else if (compare_path_costs(new_path, old_path,
414-
TOTAL_COST) < 0)
429+
else if (compare_path_costs_fuzzily(new_path, old_path,
430+
1.0000000001) == COSTS_BETTER1)
415431
remove_old = true; /* new dominates old */
416432
else
417433
accept_new = false; /* old equals or dominates new */

src/test/regress/expected/join.out

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -2798,17 +2798,17 @@ select b.unique1 from
27982798
Sort Key: b.unique1
27992799
-> Nested Loop Left Join
28002800
-> Seq Scan on int4_tbl i2
2801-
-> Nested Loop Left Join
2802-
Join Filter: (b.unique1 = 42)
2803-
-> Nested Loop
2801+
-> Nested Loop
2802+
-> Seq Scan on int4_tbl i1
2803+
-> Nested Loop Left Join
2804+
Join Filter: (b.unique1 = 42)
28042805
-> Nested Loop
2805-
-> Seq Scan on int4_tbl i1
28062806
-> Index Scan using tenk1_thous_tenthous on tenk1 b
28072807
Index Cond: ((thousand = i1.f1) AND (i2.f1 = tenthous))
2808-
-> Index Scan using tenk1_unique1 on tenk1 a
2809-
Index Cond: (unique1 = b.unique2)
2810-
-> Index Only Scan using tenk1_thous_tenthous on tenk1 c
2811-
Index Cond: (thousand = a.thousand)
2808+
-> Index Scan using tenk1_unique1 on tenk1 a
2809+
Index Cond: (unique1 = b.unique2)
2810+
-> Index Only Scan using tenk1_thous_tenthous on tenk1 c
2811+
Index Cond: (thousand = a.thousand)
28122812
(15 rows)
28132813

28142814
select b.unique1 from

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy