
Commit 64fe602

Fixes for Disk-based Hash Aggregation.
Justin Pryzby raised a couple of issues with commit 1f39bce. Fixed.

Also, tweak the way the size of a hash entry and the number of buckets are estimated when calling BuildTupleHashTableExt().

Discussion: https://www.postgresql.org/message-id/20200319064222.GR26184@telsasoft.com
1 parent: 0830d21


2 files changed: +8, -13 lines


src/backend/commands/explain.c

Lines changed: 1 addition & 1 deletion
@@ -2778,7 +2778,7 @@ static void
 show_hashagg_info(AggState *aggstate, ExplainState *es)
 {
     Agg *agg = (Agg *)aggstate->ss.ps.plan;
-    long memPeakKb = (aggstate->hash_mem_peak + 1023) / 1024;
+    int64 memPeakKb = (aggstate->hash_mem_peak + 1023) / 1024;

     Assert(IsA(aggstate, AggState));
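The commit message doesn't say why long becomes int64 here; a plausible reading (an assumption on my part, not stated in the commit) is portability: on LLP64 platforms such as 64-bit Windows, long is only 32 bits wide, while hash_mem_peak is a 64-bit quantity, so EXPLAIN could report a truncated peak. A minimal standalone sketch of the risk:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical peak of 3 TB, in bytes (made up for illustration). */
    int64_t hash_mem_peak = 3LL * 1024 * 1024 * 1024 * 1024;

    /* On LLP64 platforms, long is 32 bits and this conversion truncates;
     * a 64-bit type holds the kilobyte figure exactly. */
    long    as_long  = (long) ((hash_mem_peak + 1023) / 1024);
    int64_t as_int64 = (hash_mem_peak + 1023) / 1024;

    printf("long: %ld, int64: %lld\n", as_long, (long long) as_int64);
    return 0;
}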

src/backend/executor/nodeAgg.c

Lines changed: 7 additions & 12 deletions
@@ -1873,17 +1873,12 @@ hash_agg_update_metrics(AggState *aggstate, bool from_tape, int npartitions)
         aggstate->hash_disk_used = disk_used;
     }

-    /*
-     * Update hashentrysize estimate based on contents. Don't include meta_mem
-     * in the memory used, because empty buckets would inflate the per-entry
-     * cost. An underestimate of the per-entry size is better than an
-     * overestimate, because an overestimate could compound with each level of
-     * recursion.
-     */
+    /* update hashentrysize estimate based on contents */
     if (aggstate->hash_ngroups_current > 0)
     {
         aggstate->hashentrysize =
-            hash_mem / (double)aggstate->hash_ngroups_current;
+            sizeof(TupleHashEntryData) +
+            (hash_mem / (double)aggstate->hash_ngroups_current);
     }
 }
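In words: the old estimate divided only the data memory by the number of groups, while the new one also charges each group for its hash-entry header, so the per-entry cost fed into later spill decisions is slightly higher. The arithmetic with made-up numbers (the 24-byte overhead below is a stand-in for sizeof(TupleHashEntryData), which varies by platform):

#include <stdio.h>

int main(void)
{
    double hash_mem = 4.0 * 1024 * 1024; /* assume 4 MB of group keys and transition values */
    double ngroups = 10000.0;            /* assume 10,000 groups in the table so far */
    double entry_overhead = 24.0;        /* stand-in for sizeof(TupleHashEntryData) */

    double old_estimate = hash_mem / ngroups;                  /* 419.4 bytes per entry */
    double new_estimate = entry_overhead + hash_mem / ngroups; /* 443.4 bytes per entry */

    printf("old: %.1f, new: %.1f bytes/entry\n", old_estimate, new_estimate);
    return 0;
}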

@@ -1899,10 +1894,10 @@ hash_choose_num_buckets(double hashentrysize, long ngroups, Size memory)
     max_nbuckets = memory / hashentrysize;

     /*
-     * Leave room for slop to avoid a case where the initial hash table size
-     * exceeds the memory limit (though that may still happen in edge cases).
+     * Underestimating is better than overestimating. Too many buckets crowd
+     * out space for group keys and transition state values.
      */
-    max_nbuckets *= 0.75;
+    max_nbuckets >>= 1;

     if (nbuckets > max_nbuckets)
         nbuckets = max_nbuckets;
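Taken on its own, the revised clamp behaves roughly like this (a simplified standalone sketch, not a copy of the actual hash_choose_num_buckets(), which may apply further adjustments):

#include <stddef.h>

long
choose_num_buckets_sketch(double hashentrysize, long ngroups, size_t memory)
{
    long nbuckets = ngroups;  /* initial guess: one bucket per expected group */
    long max_nbuckets = (long) (memory / hashentrysize);

    /* Halve instead of multiplying by 0.75: a low bucket cap only costs some
     * hash collisions, while a high one steals memory from the entries
     * themselves (group keys and transition state values). */
    max_nbuckets >>= 1;

    if (nbuckets > max_nbuckets)
        nbuckets = max_nbuckets;
    return nbuckets;
}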
@@ -3548,7 +3543,7 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
      * reasonable.
      */
     for (i = 0; i < aggstate->num_hashes; i++)
-        totalGroups = aggstate->perhash[i].aggnode->numGroups;
+        totalGroups += aggstate->perhash[i].aggnode->numGroups;

     hash_agg_set_limits(aggstate->hashentrysize, totalGroups, 0,
                         &aggstate->hash_mem_limit,
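The one-character change from = to += matters because the loop visits every grouping set: with plain assignment, totalGroups ended up holding only the last set's estimate rather than the sum, understating the input to hash_agg_set_limits(). A trivial illustration with invented estimates:

#include <stdio.h>

int main(void)
{
    /* Hypothetical per-grouping-set group-count estimates. */
    double numGroups[] = { 100.0, 500.0, 25.0 };
    double totalGroups = 0;

    for (int i = 0; i < 3; i++)
        totalGroups += numGroups[i]; /* with '=' this would end at 25 */

    printf("totalGroups = %.0f\n", totalGroups); /* prints 625 */
    return 0;
}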
