Commit b03d196

Use a non-locking initial test in TAS_SPIN on x86_64.
Testing done in 2011 by Tom Lane concluded that this is a win on Intel Xeons and AMD Opterons, but it was not changed back then, because of an old comment in tas() that suggested it's a huge loss on older Opterons. However, we didn't have separate TAS() and TAS_SPIN() macros back then, so the comment referred to doing a non-locked initial test even on the first access, in the uncontended case. I don't have access to older Opterons, but I'm pretty sure that doing an initial unlocked test is unlikely to be a loss while spinning, even though it might be for the first access.

We probably should do the same on 32-bit x86, but I'm afraid to change it without any testing. Hence just add a note to the x86 implementation suggesting that we probably should do the same there.
1 parent 090d0f2 commit b03d196

File tree

1 file changed

+17
-5
lines changed


src/include/storage/s_lock.h

Lines changed: 17 additions & 5 deletions
@@ -145,6 +145,12 @@ tas(volatile slock_t *lock)
 	 * Use a non-locking test before asserting the bus lock. Note that the
 	 * extra test appears to be a small loss on some x86 platforms and a small
 	 * win on others; it's by no means clear that we should keep it.
+	 *
+	 * When this was last tested, we didn't have separate TAS() and TAS_SPIN()
+	 * macros. Nowadays it probably would be better to do a non-locking test
+	 * in TAS_SPIN() but not in TAS(), like on x86_64, but no-one's done the
+	 * testing to verify that. Without some empirical evidence, better to
+	 * leave it alone.
 	 */
 	__asm__ __volatile__(
 		"	cmpb	$0,%1	\n"
@@ -200,16 +206,22 @@ typedef unsigned char slock_t;
 
 #define TAS(lock) tas(lock)
 
+/*
+ * On Intel EM64T, it's a win to use a non-locking test before the xchg proper,
+ * but only when spinning.
+ *
+ * See also Implementing Scalable Atomic Locks for Multi-Core Intel(tm) EM64T
+ * and IA32, by Michael Chynoweth and Mary R. Lee. As of this writing, it is
+ * available at:
+ * http://software.intel.com/en-us/articles/implementing-scalable-atomic-locks-for-multi-core-intel-em64t-and-ia32-architectures
+ */
+#define TAS_SPIN(lock)	(*(lock) ? 1 : TAS(lock))
+
 static __inline__ int
 tas(volatile slock_t *lock)
 {
 	register slock_t _res = 1;
 
-	/*
-	 * On Opteron, using a non-locking test before the locking instruction
-	 * is a huge loss.  On EM64T, it appears to be a wash or small loss,
-	 * so we needn't bother to try to distinguish the sub-architectures.
-	 */
 	__asm__ __volatile__(
 		"	lock			\n"
 		"	xchgb	%0,%1	\n"
