Skip to content

Commit ec63622

Browse files
committed
Fix usage of the parse tree for estimate_num_groups() in set operations
recurse_set_operations() uses the parse tree for the group number estimation, because of the "varno 0" hack. At the same time 2489d76 made root->parse and corresponding parent_root->simple_rte_array[]->subquery distinct copies of the parse tree, while d3d55ce introduced self-join removal replacing relid of removed relation only in one of the copies. The present commit fixes this bug by making recurse_set_operations() call estimate_num_groups() with the copy of the parse tree processed by self-join removal. In future, we may think about maintaining just one copy of the parse tree and/or keeping removed relids as aliases. Reported-by: Zuming Jiang Bug: #18170 Discussion: https://postgr.es/m/flat/18170-f1d17bf9a0d58b24%40postgresql.org Author: Richard Guo, Alexander Korotkov Reviewed-by: Andrei Lepikhov
1 parent a237a07 commit ec63622

File tree

3 files changed

+29
-2
lines changed

3 files changed

+29
-2
lines changed

src/backend/optimizer/prep/prepunion.c

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -324,14 +324,17 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
324324
* anyway; otherwise do statistical estimation.
325325
*
326326
* XXX you don't really want to know about this: we do the estimation
327-
* using the subquery's original targetlist expressions, not the
327+
* using the subroot->parse's original targetlist expressions, not the
328328
* subroot->processed_tlist which might seem more appropriate. The
329329
* reason is that if the subquery is itself a setop, it may return a
330330
* processed_tlist containing "varno 0" Vars generated by
331331
* generate_append_tlist, and those would confuse estimate_num_groups
332332
* mightily. We ought to get rid of the "varno 0" hack, but that
333333
* requires a redesign of the parsetree representation of setops, so
334334
* that there can be an RTE corresponding to each setop's output.
335+
* Note, we use this not subquery's targetlist but subroot->parse's
336+
* targetlist, because it was revised by self-join removal. subquery's
337+
* targetlist might contain the references to the removed relids.
335338
*/
336339
if (pNumGroups)
337340
{
@@ -341,7 +344,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
341344
*pNumGroups = subpath->rows;
342345
else
343346
*pNumGroups = estimate_num_groups(subroot,
344-
get_tlist_exprs(subquery->targetList, false),
347+
get_tlist_exprs(subroot->parse->targetList, false),
345348
subpath->rows,
346349
NULL,
347350
NULL);

src/test/regress/expected/join.out

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6759,6 +6759,23 @@ WHERE c3.id=c2.id AND c3.id*c2.id=c1.id*c1.id;
67596759
Filter: ((id IS NOT NULL) AND ((id * id) IS NOT NULL))
67606760
(3 rows)
67616761

6762+
-- Check the usage of a parse tree by the set operations (bug #18170)
6763+
explain (costs off)
6764+
SELECT c1.code FROM emp1 c1 LEFT JOIN emp1 c2 ON c1.id = c2.id
6765+
WHERE c2.id IS NOT NULL
6766+
EXCEPT ALL
6767+
SELECT c3.code FROM emp1 c3;
6768+
QUERY PLAN
6769+
----------------------------------------------
6770+
HashSetOp Except All
6771+
-> Append
6772+
-> Subquery Scan on "*SELECT* 1"
6773+
-> Seq Scan on emp1 c2
6774+
Filter: (id IS NOT NULL)
6775+
-> Subquery Scan on "*SELECT* 2"
6776+
-> Seq Scan on emp1 c3
6777+
(7 rows)
6778+
67626779
-- We can remove the join even if we find the join can't duplicate rows and
67636780
-- the base quals of each side are different. In the following case we end up
67646781
-- moving quals over to s1 to make it so it can't match any rows.

src/test/regress/sql/join.sql

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2573,6 +2573,13 @@ explain (costs off)
25732573
SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
25742574
WHERE c3.id=c2.id AND c3.id*c2.id=c1.id*c1.id;
25752575

2576+
-- Check the usage of a parse tree by the set operations (bug #18170)
2577+
explain (costs off)
2578+
SELECT c1.code FROM emp1 c1 LEFT JOIN emp1 c2 ON c1.id = c2.id
2579+
WHERE c2.id IS NOT NULL
2580+
EXCEPT ALL
2581+
SELECT c3.code FROM emp1 c3;
2582+
25762583
-- We can remove the join even if we find the join can't duplicate rows and
25772584
-- the base quals of each side are different. In the following case we end up
25782585
-- moving quals over to s1 to make it so it can't match any rows.

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy