Commit ab5b4e2
Speed up AllocSetFreeIndex, which is a significant cost in palloc and pfree, by using a lookup table instead of a naive shift-and-count loop.

Based on code originally posted by Sean Eron Anderson at http://graphics.stanford.edu/%7eseander/bithacks.html. Greg Stark did the research and benchmarking to show that this is what we should use. Jeremy Kerr first noticed that this is a hotspot that could be optimized, though we ended up not using his suggestion of platform-specific bit-searching code.
1 parent f3f45c8 commit ab5b4e2

File tree

1 file changed (+32, −9 lines)

src/backend/utils/mmgr/aset.c

@@ -11,7 +11,7 @@
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/mmgr/aset.c,v 1.79 2009/06/11 14:49:06 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/mmgr/aset.c,v 1.80 2009/07/21 19:53:12 tgl Exp $
  *
  * NOTE:
  *	This is a new (Feb. 05, 1999) implementation of the allocation set
@@ -238,6 +238,17 @@ static MemoryContextMethods AllocSetMethods = {
 #endif
 };
 
+/*
+ * Table for AllocSetFreeIndex
+ */
+#define LT16(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
+
+static const unsigned char LogTable256[256] =
+{
+	0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
+	LT16(5), LT16(6), LT16(6), LT16(7), LT16(7), LT16(7), LT16(7),
+	LT16(8), LT16(8), LT16(8), LT16(8), LT16(8), LT16(8), LT16(8), LT16(8)
+};
 
 /* ----------
  * Debug macros
@@ -266,18 +277,30 @@ static MemoryContextMethods AllocSetMethods = {
 static inline int
 AllocSetFreeIndex(Size size)
 {
-	int			idx = 0;
+	int			idx;
+	unsigned int t,
+				tsize;
 
-	if (size > 0)
+	if (size > (1 << ALLOC_MINBITS))
 	{
-		size = (size - 1) >> ALLOC_MINBITS;
-		while (size != 0)
-		{
-			idx++;
-			size >>= 1;
-		}
+		tsize = (size - 1) >> ALLOC_MINBITS;
+
+		/*
+		 * At this point we need to obtain log2(tsize)+1, ie, the number
+		 * of not-all-zero bits at the right.  We used to do this with a
+		 * shift-and-count loop, but this function is enough of a hotspot
+		 * to justify micro-optimization effort.  The best approach seems
+		 * to be to use a lookup table.  Note that this code assumes that
+		 * ALLOCSET_NUM_FREELISTS <= 17, since we only cope with two bytes
+		 * of the tsize value.
+		 */
+		t = tsize >> 8;
+		idx = t ? LogTable256[t] + 8 : LogTable256[tsize];
+
 		Assert(idx < ALLOCSET_NUM_FREELISTS);
 	}
+	else
+		idx = 0;
 
 	return idx;
 }

0 commit comments