Skip to content

Commit 3ced883

Browse files
committed
Simplify query_planner's API by having it return the top-level RelOptInfo.
Formerly, query_planner returned one or possibly two Paths for the topmost join relation, so that grouping_planner didn't see the join RelOptInfo (at least not directly; it didn't have any hesitation about examining cheapest_path->parent, though). However, correct selection of the Paths involved a significant amount of coupling between query_planner and grouping_planner, a problem which has gotten worse over time. It seems best to give up on this API choice and instead return the topmost RelOptInfo explicitly. Then grouping_planner can pull out the Paths it wants from the rel's path list. In this way we can remove all knowledge of grouping behaviors from query_planner. The only real benefit of the old way is that in the case of an empty FROM clause, we never made any RelOptInfos at all, just a Path. Now we have to gin up a dummy RelOptInfo to represent the empty FROM clause. That's not a very big deal though. While at it, simplify query_planner's API a bit more by having the caller set up root->tuple_fraction and root->limit_tuples, rather than passing those values as separate parameters. Since query_planner no longer does anything with either value, requiring it to fill the PlannerInfo fields seemed pretty arbitrary. This patch just rearranges code; it doesn't (intentionally) change any behaviors. Followup patches will do more interesting things.
1 parent 841c29c commit 3ced883

File tree

8 files changed

+247
-277
lines changed

8 files changed

+247
-277
lines changed

src/backend/optimizer/README

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -372,10 +372,10 @@ generated during the optimization process are marked with their sort order
372372

373373
It is also possible to avoid an explicit sort step to implement a user's
374374
ORDER BY clause if the final path has the right ordering already, so the
375-
sort ordering is of interest even at the top level. query_planner() will
375+
sort ordering is of interest even at the top level. grouping_planner() will
376376
look for the cheapest path with a sort order matching the desired order,
377-
and grouping_planner() will compare its cost to the cost of using the
378-
cheapest-overall path and doing an explicit sort.
377+
then compare its cost to the cost of using the cheapest-overall path and
378+
doing an explicit sort on that.
379379

380380
When we are generating paths for a particular RelOptInfo, we discard a path
381381
if it is more expensive than another known path that has the same or better

src/backend/optimizer/plan/planagg.c

Lines changed: 21 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -316,6 +316,7 @@ find_minmax_aggs_walker(Node *node, List **context)
316316
Assert(aggref->agglevelsup == 0);
317317
if (list_length(aggref->args) != 1)
318318
return true; /* it couldn't be MIN/MAX */
319+
319320
/*
320321
* ORDER BY is usually irrelevant for MIN/MAX, but it can change the
321322
* outcome if the aggsortop's operator class recognizes non-identical
@@ -329,6 +330,7 @@ find_minmax_aggs_walker(Node *node, List **context)
329330
*/
330331
if (aggref->aggorder != NIL)
331332
return true;
333+
332334
/*
333335
* We might implement the optimization when a FILTER clause is present
334336
* by adding the filter to the quals of the generated subquery.
@@ -399,9 +401,8 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
399401
TargetEntry *tle;
400402
NullTest *ntest;
401403
SortGroupClause *sortcl;
402-
Path *cheapest_path;
404+
RelOptInfo *final_rel;
403405
Path *sorted_path;
404-
double dNumGroups;
405406
Cost path_cost;
406407
double path_fraction;
407408

@@ -470,37 +471,35 @@ build_minmax_path(PlannerInfo *root, MinMaxAggInfo *mminfo,
470471
* Generate the best paths for this query, telling query_planner that we
471472
* have LIMIT 1.
472473
*/
473-
query_planner(subroot, parse->targetList, 1.0, 1.0,
474-
minmax_qp_callback, NULL,
475-
&cheapest_path, &sorted_path, &dNumGroups);
474+
subroot->tuple_fraction = 1.0;
475+
subroot->limit_tuples = 1.0;
476+
477+
final_rel = query_planner(subroot, parse->targetList,
478+
minmax_qp_callback, NULL);
476479

477480
/*
478-
* Fail if no presorted path. However, if query_planner determines that
479-
* the presorted path is also the cheapest, it will set sorted_path to
480-
* NULL ... don't be fooled. (This is kind of a pain here, but it
481-
* simplifies life for grouping_planner, so leave it be.)
481+
* Get the best presorted path, that being the one that's cheapest for
482+
* fetching just one row. If there's no such path, fail.
482483
*/
484+
if (final_rel->rows > 1.0)
485+
path_fraction = 1.0 / final_rel->rows;
486+
else
487+
path_fraction = 1.0;
488+
489+
sorted_path =
490+
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
491+
subroot->query_pathkeys,
492+
NULL,
493+
path_fraction);
483494
if (!sorted_path)
484-
{
485-
if (cheapest_path &&
486-
pathkeys_contained_in(subroot->sort_pathkeys,
487-
cheapest_path->pathkeys))
488-
sorted_path = cheapest_path;
489-
else
490-
return false;
491-
}
495+
return false;
492496

493497
/*
494498
* Determine cost to get just the first row of the presorted path.
495499
*
496500
* Note: cost calculation here should match
497501
* compare_fractional_path_costs().
498502
*/
499-
if (sorted_path->parent->rows > 1.0)
500-
path_fraction = 1.0 / sorted_path->parent->rows;
501-
else
502-
path_fraction = 1.0;
503-
504503
path_cost = sorted_path->startup_cost +
505504
path_fraction * (sorted_path->total_cost - sorted_path->startup_cost);
506505

src/backend/optimizer/plan/planmain.c

Lines changed: 20 additions & 207 deletions
Original file line numberDiff line numberDiff line change
@@ -20,14 +20,10 @@
2020
*/
2121
#include "postgres.h"
2222

23-
#include "miscadmin.h"
24-
#include "optimizer/cost.h"
2523
#include "optimizer/pathnode.h"
2624
#include "optimizer/paths.h"
2725
#include "optimizer/placeholder.h"
2826
#include "optimizer/planmain.h"
29-
#include "optimizer/tlist.h"
30-
#include "utils/selfuncs.h"
3127

3228

3329
/*
@@ -36,86 +32,58 @@
3632
* which may involve joins but not any fancier features.
3733
*
3834
* Since query_planner does not handle the toplevel processing (grouping,
39-
* sorting, etc) it cannot select the best path by itself. It selects
40-
* two paths: the cheapest path that produces all the required tuples,
41-
* independent of any ordering considerations, and the cheapest path that
42-
* produces the expected fraction of the required tuples in the required
43-
* ordering, if there is a path that is cheaper for this than just sorting
44-
* the output of the cheapest overall path. The caller (grouping_planner)
45-
* will make the final decision about which to use.
35+
* sorting, etc) it cannot select the best path by itself. Instead, it
36+
* returns the RelOptInfo for the top level of joining, and the caller
37+
* (grouping_planner) can choose one of the surviving paths for the rel.
38+
* Normally it would choose either the rel's cheapest path, or the cheapest
39+
* path for the desired sort order.
4640
*
47-
* Input parameters:
4841
* root describes the query to plan
4942
* tlist is the target list the query should produce
5043
* (this is NOT necessarily root->parse->targetList!)
51-
* tuple_fraction is the fraction of tuples we expect will be retrieved
52-
* limit_tuples is a hard limit on number of tuples to retrieve,
53-
* or -1 if no limit
5444
* qp_callback is a function to compute query_pathkeys once it's safe to do so
5545
* qp_extra is optional extra data to pass to qp_callback
5646
*
57-
* Output parameters:
58-
* *cheapest_path receives the overall-cheapest path for the query
59-
* *sorted_path receives the cheapest presorted path for the query,
60-
* if any (NULL if there is no useful presorted path)
61-
* *num_groups receives the estimated number of groups, or 1 if query
62-
* does not use grouping
63-
*
6447
* Note: the PlannerInfo node also includes a query_pathkeys field, which
6548
* tells query_planner the sort order that is desired in the final output
6649
* plan. This value is *not* available at call time, but is computed by
6750
* qp_callback once we have completed merging the query's equivalence classes.
6851
* (We cannot construct canonical pathkeys until that's done.)
69-
*
70-
* tuple_fraction is interpreted as follows:
71-
* 0: expect all tuples to be retrieved (normal case)
72-
* 0 < tuple_fraction < 1: expect the given fraction of tuples available
73-
* from the plan to be retrieved
74-
* tuple_fraction >= 1: tuple_fraction is the absolute number of tuples
75-
* expected to be retrieved (ie, a LIMIT specification)
76-
* Note that a nonzero tuple_fraction could come from outer context; it is
77-
* therefore not redundant with limit_tuples. We use limit_tuples to determine
78-
* whether a bounded sort can be used at runtime.
7952
*/
80-
void
53+
RelOptInfo *
8154
query_planner(PlannerInfo *root, List *tlist,
82-
double tuple_fraction, double limit_tuples,
83-
query_pathkeys_callback qp_callback, void *qp_extra,
84-
Path **cheapest_path, Path **sorted_path,
85-
double *num_groups)
55+
query_pathkeys_callback qp_callback, void *qp_extra)
8656
{
8757
Query *parse = root->parse;
8858
List *joinlist;
8959
RelOptInfo *final_rel;
90-
Path *cheapestpath;
91-
Path *sortedpath;
9260
Index rti;
9361
double total_pages;
9462

95-
/* Make tuple_fraction, limit_tuples accessible to lower-level routines */
96-
root->tuple_fraction = tuple_fraction;
97-
root->limit_tuples = limit_tuples;
98-
99-
*num_groups = 1; /* default result */
100-
10163
/*
10264
* If the query has an empty join tree, then it's something easy like
10365
* "SELECT 2+2;" or "INSERT ... VALUES()". Fall through quickly.
10466
*/
10567
if (parse->jointree->fromlist == NIL)
10668
{
107-
/* We need a trivial path result */
108-
*cheapest_path = (Path *)
109-
create_result_path((List *) parse->jointree->quals);
110-
*sorted_path = NULL;
69+
/* We need a dummy joinrel to describe the empty set of baserels */
70+
final_rel = build_empty_join_rel(root);
71+
72+
/* The only path for it is a trivial Result path */
73+
add_path(final_rel, (Path *)
74+
create_result_path((List *) parse->jointree->quals));
75+
76+
/* Select cheapest path (pretty easy in this case...) */
77+
set_cheapest(final_rel);
11178

11279
/*
11380
* We still are required to call qp_callback, in case it's something
11481
* like "SELECT 2+2 ORDER BY 1".
11582
*/
11683
root->canon_pathkeys = NIL;
11784
(*qp_callback) (root, qp_extra);
118-
return;
85+
86+
return final_rel;
11987
}
12088

12189
/*
@@ -259,165 +227,10 @@ query_planner(PlannerInfo *root, List *tlist,
259227
*/
260228
final_rel = make_one_rel(root, joinlist);
261229

230+
/* Check that we got at least one usable path */
262231
if (!final_rel || !final_rel->cheapest_total_path ||
263232
final_rel->cheapest_total_path->param_info != NULL)
264233
elog(ERROR, "failed to construct the join relation");
265234

266-
/*
267-
* If there's grouping going on, estimate the number of result groups. We
268-
* couldn't do this any earlier because it depends on relation size
269-
* estimates that were set up above.
270-
*
271-
* Then convert tuple_fraction to fractional form if it is absolute, and
272-
* adjust it based on the knowledge that grouping_planner will be doing
273-
* grouping or aggregation work with our result.
274-
*
275-
* This introduces some undesirable coupling between this code and
276-
* grouping_planner, but the alternatives seem even uglier; we couldn't
277-
* pass back completed paths without making these decisions here.
278-
*/
279-
if (parse->groupClause)
280-
{
281-
List *groupExprs;
282-
283-
groupExprs = get_sortgrouplist_exprs(parse->groupClause,
284-
parse->targetList);
285-
*num_groups = estimate_num_groups(root,
286-
groupExprs,
287-
final_rel->rows);
288-
289-
/*
290-
* In GROUP BY mode, an absolute LIMIT is relative to the number of
291-
* groups not the number of tuples. If the caller gave us a fraction,
292-
* keep it as-is. (In both cases, we are effectively assuming that
293-
* all the groups are about the same size.)
294-
*/
295-
if (tuple_fraction >= 1.0)
296-
tuple_fraction /= *num_groups;
297-
298-
/*
299-
* If both GROUP BY and ORDER BY are specified, we will need two
300-
* levels of sort --- and, therefore, certainly need to read all the
301-
* tuples --- unless ORDER BY is a subset of GROUP BY. Likewise if we
302-
* have both DISTINCT and GROUP BY, or if we have a window
303-
* specification not compatible with the GROUP BY.
304-
*/
305-
if (!pathkeys_contained_in(root->sort_pathkeys, root->group_pathkeys) ||
306-
!pathkeys_contained_in(root->distinct_pathkeys, root->group_pathkeys) ||
307-
!pathkeys_contained_in(root->window_pathkeys, root->group_pathkeys))
308-
tuple_fraction = 0.0;
309-
310-
/* In any case, limit_tuples shouldn't be specified here */
311-
Assert(limit_tuples < 0);
312-
}
313-
else if (parse->hasAggs || root->hasHavingQual)
314-
{
315-
/*
316-
* Ungrouped aggregate will certainly want to read all the tuples, and
317-
* it will deliver a single result row (so leave *num_groups 1).
318-
*/
319-
tuple_fraction = 0.0;
320-
321-
/* limit_tuples shouldn't be specified here */
322-
Assert(limit_tuples < 0);
323-
}
324-
else if (parse->distinctClause)
325-
{
326-
/*
327-
* Since there was no grouping or aggregation, it's reasonable to
328-
* assume the UNIQUE filter has effects comparable to GROUP BY. Return
329-
* the estimated number of output rows for use by caller. (If DISTINCT
330-
* is used with grouping, we ignore its effects for rowcount
331-
* estimation purposes; this amounts to assuming the grouped rows are
332-
* distinct already.)
333-
*/
334-
List *distinctExprs;
335-
336-
distinctExprs = get_sortgrouplist_exprs(parse->distinctClause,
337-
parse->targetList);
338-
*num_groups = estimate_num_groups(root,
339-
distinctExprs,
340-
final_rel->rows);
341-
342-
/*
343-
* Adjust tuple_fraction the same way as for GROUP BY, too.
344-
*/
345-
if (tuple_fraction >= 1.0)
346-
tuple_fraction /= *num_groups;
347-
348-
/* limit_tuples shouldn't be specified here */
349-
Assert(limit_tuples < 0);
350-
}
351-
else
352-
{
353-
/*
354-
* Plain non-grouped, non-aggregated query: an absolute tuple fraction
355-
* can be divided by the number of tuples.
356-
*/
357-
if (tuple_fraction >= 1.0)
358-
tuple_fraction /= final_rel->rows;
359-
}
360-
361-
/*
362-
* Pick out the cheapest-total path and the cheapest presorted path for
363-
* the requested pathkeys (if there is one). We should take the tuple
364-
* fraction into account when selecting the cheapest presorted path, but
365-
* not when selecting the cheapest-total path, since if we have to sort
366-
* then we'll have to fetch all the tuples. (But there's a special case:
367-
* if query_pathkeys is NIL, meaning order doesn't matter, then the
368-
* "cheapest presorted" path will be the cheapest overall for the tuple
369-
* fraction.)
370-
*
371-
* The cheapest-total path is also the one to use if grouping_planner
372-
* decides to use hashed aggregation, so we return it separately even if
373-
* this routine thinks the presorted path is the winner.
374-
*/
375-
cheapestpath = final_rel->cheapest_total_path;
376-
377-
sortedpath =
378-
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
379-
root->query_pathkeys,
380-
NULL,
381-
tuple_fraction);
382-
383-
/* Don't return same path in both guises; just wastes effort */
384-
if (sortedpath == cheapestpath)
385-
sortedpath = NULL;
386-
387-
/*
388-
* Forget about the presorted path if it would be cheaper to sort the
389-
* cheapest-total path. Here we need consider only the behavior at the
390-
* tuple fraction point.
391-
*/
392-
if (sortedpath)
393-
{
394-
Path sort_path; /* dummy for result of cost_sort */
395-
396-
if (root->query_pathkeys == NIL ||
397-
pathkeys_contained_in(root->query_pathkeys,
398-
cheapestpath->pathkeys))
399-
{
400-
/* No sort needed for cheapest path */
401-
sort_path.startup_cost = cheapestpath->startup_cost;
402-
sort_path.total_cost = cheapestpath->total_cost;
403-
}
404-
else
405-
{
406-
/* Figure cost for sorting */
407-
cost_sort(&sort_path, root, root->query_pathkeys,
408-
cheapestpath->total_cost,
409-
final_rel->rows, final_rel->width,
410-
0.0, work_mem, limit_tuples);
411-
}
412-
413-
if (compare_fractional_path_costs(sortedpath, &sort_path,
414-
tuple_fraction) > 0)
415-
{
416-
/* Presorted path is a loser */
417-
sortedpath = NULL;
418-
}
419-
}
420-
421-
*cheapest_path = cheapestpath;
422-
*sorted_path = sortedpath;
235+
return final_rel;
423236
}

0 commit comments

Comments
 (0)
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy