
Commit a16ef31

Remove planner's have_dangerous_phv() join-order restriction.

Commit 85e5e22, which added (a forerunner of) this logic, argued that

    Adding the necessary complexity to make this work doesn't seem like it would be repaid in significantly better plans, because in cases where such a PHV exists, there is probably a corresponding join order constraint that would allow a good plan to be found without using the star-schema exception.

The flaw in this claim is that there may be other join-order restrictions that prevent us from finding a join order that doesn't involve a "dangerous" PHV. In particular we now recognize that small join_collapse_limit or from_collapse_limit could prevent it.

Therefore, let's bite the bullet and make the case work. We don't have to extend the executor's support for nestloop parameters as I thought at the time, because we can instead push the evaluation of the placeholder's expression into the left-hand input of the NestLoop node. So there's not really a lot of downside to this solution, and giving the planner more join-order flexibility should have value beyond just avoiding failure.

Having said that, there surely is a nonzero risk of introducing new bugs. Since this failure mode escaped detection for ten years, such cases don't seem common enough to justify a lot of risk. Therefore, let's put this fix into master but leave the back branches alone (for now anyway).

Bug: #18953
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18953-1c9883a9d4afeb30@postgresql.org

1 parent 5861b1f commit a16ef31

8 files changed: +188 -83 lines changed

src/backend/optimizer/path/joinpath.c

Lines changed: 3 additions & 6 deletions
@@ -876,16 +876,13 @@ try_nestloop_path(PlannerInfo *root,
     /*
      * Check to see if proposed path is still parameterized, and reject if the
      * parameterization wouldn't be sensible --- unless allow_star_schema_join
-     * says to allow it anyway. Also, we must reject if have_dangerous_phv
-     * doesn't like the look of it, which could only happen if the nestloop is
-     * still parameterized.
+     * says to allow it anyway.
      */
     required_outer = calc_nestloop_required_outer(outerrelids, outer_paramrels,
                                                   innerrelids, inner_paramrels);
     if (required_outer &&
-        ((!bms_overlap(required_outer, extra->param_source_rels) &&
-          !allow_star_schema_join(root, outerrelids, inner_paramrels)) ||
-         have_dangerous_phv(root, outerrelids, inner_paramrels)))
+        !bms_overlap(required_outer, extra->param_source_rels) &&
+        !allow_star_schema_join(root, outerrelids, inner_paramrels))
     {
         /* Waste no memory when we reject a path here */
         bms_free(required_outer);

src/backend/optimizer/path/joinrels.c

Lines changed: 0 additions & 60 deletions
@@ -565,9 +565,6 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
          * Also, if the lateral reference is only indirect, we should reject
          * the join; whatever rel(s) the reference chain goes through must be
          * joined to first.
-         *
-         * Another case that might keep us from building a valid plan is the
-         * implementation restriction described by have_dangerous_phv().
          */
         lateral_fwd = bms_overlap(rel1->relids, rel2->lateral_relids);
         lateral_rev = bms_overlap(rel2->relids, rel1->lateral_relids);
@@ -584,9 +581,6 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
             /* check there is a direct reference from rel2 to rel1 */
             if (!bms_overlap(rel1->relids, rel2->direct_lateral_relids))
                 return false;   /* only indirect refs, so reject */
-            /* check we won't have a dangerous PHV */
-            if (have_dangerous_phv(root, rel1->relids, rel2->lateral_relids))
-                return false;   /* might be unable to handle required PHV */
         }
         else if (lateral_rev)
         {
@@ -599,9 +593,6 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
             /* check there is a direct reference from rel1 to rel2 */
             if (!bms_overlap(rel2->relids, rel1->direct_lateral_relids))
                 return false;   /* only indirect refs, so reject */
-            /* check we won't have a dangerous PHV */
-            if (have_dangerous_phv(root, rel2->relids, rel1->lateral_relids))
-                return false;   /* might be unable to handle required PHV */
         }

         /*
@@ -1278,57 +1269,6 @@ has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel)
 }


-/*
- * There's a pitfall for creating parameterized nestloops: suppose the inner
- * rel (call it A) has a parameter that is a PlaceHolderVar, and that PHV's
- * minimum eval_at set includes the outer rel (B) and some third rel (C).
- * We might think we could create a B/A nestloop join that's parameterized by
- * C. But we would end up with a plan in which the PHV's expression has to be
- * evaluated as a nestloop parameter at the B/A join; and the executor is only
- * set up to handle simple Vars as NestLoopParams. Rather than add complexity
- * and overhead to the executor for such corner cases, it seems better to
- * forbid the join. (Note that we can still make use of A's parameterized
- * path with pre-joined B+C as the outer rel. have_join_order_restriction()
- * ensures that we will consider making such a join even if there are not
- * other reasons to do so.)
- *
- * So we check whether any PHVs used in the query could pose such a hazard.
- * We don't have any simple way of checking whether a risky PHV would actually
- * be used in the inner plan, and the case is so unusual that it doesn't seem
- * worth working very hard on it.
- *
- * This needs to be checked in two places. If the inner rel's minimum
- * parameterization would trigger the restriction, then join_is_legal() should
- * reject the join altogether, because there will be no workable paths for it.
- * But joinpath.c has to check again for every proposed nestloop path, because
- * the inner path might have more than the minimum parameterization, causing
- * some PHV to be dangerous for it that otherwise wouldn't be.
- */
-bool
-have_dangerous_phv(PlannerInfo *root,
-                   Relids outer_relids, Relids inner_params)
-{
-    ListCell *lc;
-
-    foreach(lc, root->placeholder_list)
-    {
-        PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(lc);
-
-        if (!bms_is_subset(phinfo->ph_eval_at, inner_params))
-            continue;           /* ignore, could not be a nestloop param */
-        if (!bms_overlap(phinfo->ph_eval_at, outer_relids))
-            continue;           /* ignore, not relevant to this join */
-        if (bms_is_subset(phinfo->ph_eval_at, outer_relids))
-            continue;           /* safe, it can be eval'd within outerrel */
-        /* Otherwise, it's potentially unsafe, so reject the join */
-        return true;
-    }
-
-    /* OK to perform the join */
-    return false;
-}
-
-
 /*
  * is_dummy_rel --- has relation been proven empty?
  */

src/backend/optimizer/plan/createplan.c

Lines changed: 43 additions & 3 deletions
@@ -4348,9 +4348,11 @@ create_nestloop_plan(PlannerInfo *root,
     List *joinrestrictclauses = best_path->jpath.joinrestrictinfo;
     List *joinclauses;
     List *otherclauses;
-    Relids outerrelids;
     List *nestParams;
+    List *outer_tlist;
+    bool outer_parallel_safe;
     Relids saveOuterRels = root->curOuterRels;
+    ListCell *lc;

     /*
      * If the inner path is parameterized by the topmost parent of the outer
@@ -4412,9 +4414,47 @@ create_nestloop_plan(PlannerInfo *root,
      * Identify any nestloop parameters that should be supplied by this join
      * node, and remove them from root->curOuterParams.
      */
-    outerrelids = best_path->jpath.outerjoinpath->parent->relids;
-    nestParams = identify_current_nestloop_params(root, outerrelids);
+    nestParams = identify_current_nestloop_params(root,
+                                                  best_path->jpath.outerjoinpath);
+
+    /*
+     * While nestloop parameters that are Vars had better be available from
+     * the outer_plan already, there are edge cases where nestloop parameters
+     * that are PHVs won't be. In such cases we must add them to the
+     * outer_plan's tlist, since the executor's NestLoopParam machinery
+     * requires the params to be simple outer-Var references to that tlist.
+     */
+    outer_tlist = outer_plan->targetlist;
+    outer_parallel_safe = outer_plan->parallel_safe;
+    foreach(lc, nestParams)
+    {
+        NestLoopParam *nlp = (NestLoopParam *) lfirst(lc);
+        TargetEntry *tle;
+
+        if (IsA(nlp->paramval, Var))
+            continue;           /* nothing to do for simple Vars */
+        if (tlist_member((Expr *) nlp->paramval, outer_tlist))
+            continue;           /* already available */
+
+        /* Make a shallow copy of outer_tlist, if we didn't already */
+        if (outer_tlist == outer_plan->targetlist)
+            outer_tlist = list_copy(outer_tlist);
+        /* ... and add the needed expression */
+        tle = makeTargetEntry((Expr *) copyObject(nlp->paramval),
+                              list_length(outer_tlist) + 1,
+                              NULL,
+                              true);
+        outer_tlist = lappend(outer_tlist, tle);
+        /* ... and track whether tlist is (still) parallel-safe */
+        if (outer_parallel_safe)
+            outer_parallel_safe = is_parallel_safe(root,
+                                                   (Node *) nlp->paramval);
+    }
+    if (outer_tlist != outer_plan->targetlist)
+        outer_plan = change_plan_targetlist(outer_plan, outer_tlist,
+                                            outer_parallel_safe);

+    /* And finally, we can build the join plan node */
     join_plan = make_nestloop(tlist,
                               joinclauses,
                               otherclauses,

src/backend/optimizer/util/paramassign.c

Lines changed: 28 additions & 11 deletions
@@ -600,7 +600,7 @@ process_subquery_nestloop_params(PlannerInfo *root, List *subplan_params)

 /*
  * Identify any NestLoopParams that should be supplied by a NestLoop plan
- * node with the specified lefthand rels. Remove them from the active
+ * node with the specified lefthand input path. Remove them from the active
  * root->curOuterParams list and return them as the result list.
  *
  * XXX Here we also hack up the returned Vars and PHVs so that they do not
@@ -626,11 +626,26 @@ process_subquery_nestloop_params(PlannerInfo *root, List *subplan_params)
  * subquery, which'd be unduly expensive.
  */
 List *
-identify_current_nestloop_params(PlannerInfo *root, Relids leftrelids)
+identify_current_nestloop_params(PlannerInfo *root, Path *leftpath)
 {
     List *result;
+    Relids leftrelids = leftpath->parent->relids;
+    Relids outerrelids = PATH_REQ_OUTER(leftpath);
+    Relids allleftrelids;
     ListCell *cell;

+    /*
+     * We'll be able to evaluate a PHV in the lefthand path if it uses the
+     * lefthand rels plus any available required-outer rels. But don't do so
+     * if it uses *only* required-outer rels; in that case it should be
+     * evaluated higher in the tree. For Vars, no such hair-splitting is
+     * necessary since they depend on only one relid.
+     */
+    if (outerrelids)
+        allleftrelids = bms_union(leftrelids, outerrelids);
+    else
+        allleftrelids = leftrelids;
+
     result = NIL;
     foreach(cell, root->curOuterParams)
     {
@@ -653,18 +668,20 @@ identify_current_nestloop_params(PlannerInfo *root, Relids leftrelids)
                                                 leftrelids);
             result = lappend(result, nlp);
         }
-        else if (IsA(nlp->paramval, PlaceHolderVar) &&
-                 bms_is_subset(find_placeholder_info(root,
-                                                     (PlaceHolderVar *) nlp->paramval)->ph_eval_at,
-                               leftrelids))
+        else if (IsA(nlp->paramval, PlaceHolderVar))
         {
             PlaceHolderVar *phv = (PlaceHolderVar *) nlp->paramval;
+            Relids eval_at = find_placeholder_info(root, phv)->ph_eval_at;

-            root->curOuterParams = foreach_delete_current(root->curOuterParams,
-                                                          cell);
-            phv->phnullingrels = bms_intersect(phv->phnullingrels,
-                                               leftrelids);
-            result = lappend(result, nlp);
+            if (bms_is_subset(eval_at, allleftrelids) &&
+                bms_overlap(eval_at, leftrelids))
+            {
+                root->curOuterParams = foreach_delete_current(root->curOuterParams,
+                                                              cell);
+                phv->phnullingrels = bms_intersect(phv->phnullingrels,
+                                                   leftrelids);
+                result = lappend(result, nlp);
+            }
         }
     }
     return result;

src/include/optimizer/paramassign.h

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ extern Param *replace_nestloop_param_placeholdervar(PlannerInfo *root,
 extern void process_subquery_nestloop_params(PlannerInfo *root,
                                              List *subplan_params);
 extern List *identify_current_nestloop_params(PlannerInfo *root,
-                                              Relids leftrelids);
+                                              Path *leftpath);
 extern Param *generate_new_exec_param(PlannerInfo *root, Oid paramtype,
                                       int32 paramtypmod, Oid paramcollation);
 extern int assign_special_exec_param(PlannerInfo *root);

src/include/optimizer/paths.h

Lines changed: 0 additions & 2 deletions
@@ -109,8 +109,6 @@ extern Relids add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
                                         List **pushed_down_joins);
 extern bool have_join_order_restriction(PlannerInfo *root,
                                         RelOptInfo *rel1, RelOptInfo *rel2);
-extern bool have_dangerous_phv(PlannerInfo *root,
-                               Relids outer_relids, Relids inner_params);
 extern void mark_dummy_rel(RelOptInfo *rel);
 extern void init_dummy_sjinfo(SpecialJoinInfo *sjinfo, Relids left_relids,
                               Relids right_relids);

src/test/regress/expected/join.out

Lines changed: 84 additions & 0 deletions
@@ -3946,6 +3946,59 @@ where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;
 (1 row)

 -- variant that isn't quite a star-schema case
+explain (verbose, costs off)
+select ss1.d1 from
+  tenk1 as t1
+  inner join tenk1 as t2
+  on t1.tenthous = t2.ten
+  inner join
+    int8_tbl as i8
+    left join int4_tbl as i4
+      inner join (select 64::information_schema.cardinal_number as d1
+                  from tenk1 t3,
+                       lateral (select abs(t3.unique1) + random()) ss0(x)
+                  where t3.fivethous < 0) as ss1
+      on i4.f1 = ss1.d1
+    on i8.q1 = i4.f1
+  on t1.tenthous = ss1.d1
+where t1.unique1 < i4.f1;
+                                                                                                             QUERY PLAN
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Nested Loop
+   Output: (64)::information_schema.cardinal_number
+   Join Filter: (t1.tenthous = ((64)::information_schema.cardinal_number)::integer)
+   ->  Seq Scan on public.tenk1 t3
+         Output: t3.unique1, t3.unique2, t3.two, t3.four, t3.ten, t3.twenty, t3.hundred, t3.thousand, t3.twothousand, t3.fivethous, t3.tenthous, t3.odd, t3.even, t3.stringu1, t3.stringu2, t3.string4
+         Filter: (t3.fivethous < 0)
+   ->  Nested Loop
+         Output: t1.tenthous, t2.ten
+         ->  Nested Loop
+               Output: t1.tenthous, t2.ten, i4.f1
+               Join Filter: (t1.unique1 < i4.f1)
+               ->  Hash Join
+                     Output: t1.tenthous, t1.unique1, t2.ten
+                     Hash Cond: (t2.ten = t1.tenthous)
+                     ->  Seq Scan on public.tenk1 t2
+                           Output: t2.unique1, t2.unique2, t2.two, t2.four, t2.ten, t2.twenty, t2.hundred, t2.thousand, t2.twothousand, t2.fivethous, t2.tenthous, t2.odd, t2.even, t2.stringu1, t2.stringu2, t2.string4
+                     ->  Hash
+                           Output: t1.tenthous, t1.unique1
+                           ->  Nested Loop
+                                 Output: t1.tenthous, t1.unique1
+                                 ->  Subquery Scan on ss0
+                                       Output: ss0.x, (64)::information_schema.cardinal_number
+                                       ->  Result
+                                             Output: ((abs(t3.unique1))::double precision + random())
+                                 ->  Index Scan using tenk1_thous_tenthous on public.tenk1 t1
+                                       Output: t1.unique1, t1.unique2, t1.two, t1.four, t1.ten, t1.twenty, t1.hundred, t1.thousand, t1.twothousand, t1.fivethous, t1.tenthous, t1.odd, t1.even, t1.stringu1, t1.stringu2, t1.string4
+                                       Index Cond: (t1.tenthous = (((64)::information_schema.cardinal_number))::integer)
+               ->  Seq Scan on public.int4_tbl i4
+                     Output: i4.f1
+                     Filter: (i4.f1 = ((64)::information_schema.cardinal_number)::integer)
+         ->  Seq Scan on public.int8_tbl i8
+               Output: i8.q1, i8.q2
+               Filter: (i8.q1 = ((64)::information_schema.cardinal_number)::integer)
+(33 rows)
+
 select ss1.d1 from
   tenk1 as t1
   inner join tenk1 as t2
@@ -4035,6 +4088,37 @@ select * from
  1 | 2 | 2
 (1 row)

+-- This example demonstrates the folly of our old "have_dangerous_phv" logic
+begin;
+set local from_collapse_limit to 2;
+explain (verbose, costs off)
+select * from int8_tbl t1
+  left join
+    (select coalesce(t2.q1 + x, 0) from int8_tbl t2,
+       lateral (select t3.q1 as x from int8_tbl t3,
+                  lateral (select t2.q1, t3.q1 offset 0) s))
+  on true;
+                            QUERY PLAN
+------------------------------------------------------------------
+ Nested Loop Left Join
+   Output: t1.q1, t1.q2, (COALESCE((t2.q1 + t3.q1), '0'::bigint))
+   ->  Seq Scan on public.int8_tbl t1
+         Output: t1.q1, t1.q2
+   ->  Materialize
+         Output: (COALESCE((t2.q1 + t3.q1), '0'::bigint))
+         ->  Nested Loop
+               Output: COALESCE((t2.q1 + t3.q1), '0'::bigint)
+               ->  Seq Scan on public.int8_tbl t2
+                     Output: t2.q1, t2.q2
+               ->  Nested Loop
+                     Output: t3.q1
+                     ->  Seq Scan on public.int8_tbl t3
+                           Output: t3.q1, t3.q2
+                     ->  Result
+                           Output: NULL::bigint, NULL::bigint
+(16 rows)
+
+rollback;
 -- Test proper handling of appendrel PHVs during useless-RTE removal
 explain (costs off)
 select * from

src/test/regress/sql/join.sql

Lines changed: 29 additions & 0 deletions
@@ -1277,6 +1277,23 @@ where t1.unique2 < 42 and t1.stringu1 > t2.stringu2;

 -- variant that isn't quite a star-schema case

+explain (verbose, costs off)
+select ss1.d1 from
+  tenk1 as t1
+  inner join tenk1 as t2
+  on t1.tenthous = t2.ten
+  inner join
+    int8_tbl as i8
+    left join int4_tbl as i4
+      inner join (select 64::information_schema.cardinal_number as d1
+                  from tenk1 t3,
+                       lateral (select abs(t3.unique1) + random()) ss0(x)
+                  where t3.fivethous < 0) as ss1
+      on i4.f1 = ss1.d1
+    on i8.q1 = i4.f1
+  on t1.tenthous = ss1.d1
+where t1.unique1 < i4.f1;
+
 select ss1.d1 from
   tenk1 as t1
   inner join tenk1 as t2
@@ -1332,6 +1349,18 @@ select * from
   (select 1 as x) ss1 left join (select 2 as y) ss2 on (true),
   lateral (select ss2.y as z limit 1) ss3;

+-- This example demonstrates the folly of our old "have_dangerous_phv" logic
+begin;
+set local from_collapse_limit to 2;
+explain (verbose, costs off)
+select * from int8_tbl t1
+  left join
+    (select coalesce(t2.q1 + x, 0) from int8_tbl t2,
+       lateral (select t3.q1 as x from int8_tbl t3,
+                  lateral (select t2.q1, t3.q1 offset 0) s))
+  on true;
+rollback;
+
 -- Test proper handling of appendrel PHVs during useless-RTE removal
 explain (costs off)
 select * from
