Commit 08e1eed

Fix performance problem when building a lossy tidbitmap.

As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can make building a large tidbitmap into an O(N^2) operation. To fix, make sure we remove more than the minimum amount of information per call, and add a fallback path to behave sanely if we're unable to fit the bitmap within the requested amount of memory.

This has been wrong since the tidbitmap code was written, so back-patch to all supported branches.

1 parent ee639d2 commit 08e1eed

File tree

1 file changed: +19 −3 lines changed

src/backend/nodes/tidbitmap.c

Lines changed: 19 additions & 3 deletions

@@ -953,8 +953,11 @@ tbm_lossify(TIDBitmap *tbm)
 	/*
 	 * XXX Really stupid implementation: this just lossifies pages in
 	 * essentially random order.  We should be paying some attention to the
-	 * number of bits set in each page, instead.  Also it might be a good idea
-	 * to lossify more than the minimum number of pages during each call.
+	 * number of bits set in each page, instead.
+	 *
+	 * Since we are called as soon as nentries exceeds maxentries, we should
+	 * push nentries down to significantly less than maxentries, or else we'll
+	 * just end up doing this again very soon.  We shoot for maxentries/2.
 	 */
 	Assert(!tbm->iterating);
 	Assert(tbm->status == TBM_HASH);
@@ -975,7 +978,7 @@ tbm_lossify(TIDBitmap *tbm)
 		/* This does the dirty work ... */
 		tbm_mark_page_lossy(tbm, page->blockno);
 
-		if (tbm->nentries <= tbm->maxentries)
+		if (tbm->nentries <= tbm->maxentries / 2)
 		{
 			/* we have done enough */
 			hash_seq_term(&status);
@@ -988,6 +991,19 @@ tbm_lossify(TIDBitmap *tbm)
 	 * not care whether we visit lossy chunks or not.
 	 */
 	}
+
+	/*
+	 * With a big bitmap and small work_mem, it's possible that we cannot
+	 * get under maxentries.  Again, if that happens, we'd end up uselessly
+	 * calling tbm_lossify over and over.  To prevent this from becoming a
+	 * performance sink, force maxentries up to at least double the current
+	 * number of entries.  (In essence, we're admitting inability to fit
+	 * within work_mem when we do this.)  Note that this test will not fire
+	 * if we broke out of the loop early; and if we didn't, the current
+	 * number of entries is simply not reducible any further.
+	 */
+	if (tbm->nentries > tbm->maxentries / 2)
+		tbm->maxentries = Min(tbm->nentries, (INT_MAX - 1) / 2) * 2;
 }
 
 /*
