Abstract
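We propose an iterated version of Nesterov’s first-order smoothing method for the two-person zero-sum game equilibrium problem \({\min_{x\in Q_1} \max_{y\in Q_2} x^{\rm T} A y = \max_{y\in Q_2} \min_{x\in Q_1} x^{\rm T} A y.}\)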
This formulation applies to matrix games as well as sequential games. Our new algorithmic scheme computes an \({\epsilon}\)-equilibrium to this min-max problem in \({\mathcal {O}\left(\frac{\|A\|}{\delta(A)} \, \ln(1{/}\epsilon)\right)}\) first-order iterations, where \({\delta(A)}\) is a certain condition measure of the matrix \({A}\). This improves on previous first-order methods, which required \({\mathcal {O}(1{/}\epsilon)}\) iterations, and it matches the iteration complexity bound of interior-point methods in terms of the algorithm’s dependence on \({\epsilon}\). Unlike interior-point methods, which are inapplicable to large games because of their memory requirements, our algorithm retains the small memory requirements of prior first-order methods. Our scheme supplements Nesterov’s method with an outer loop that lowers the target \({\epsilon}\) between iterations (this target determines the amount of smoothing in the inner loop). Computational experiments on both matrix games and sequential games show a significant speed improvement in practice as well, and the relative speed improvement increases with the desired accuracy, as the complexity bounds suggest.
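To make the notion of an \({\epsilon}\)-equilibrium concrete, the following is a minimal sketch for the matrix-game case only, assuming both players' strategies are probability vectors and the payoff is \({x^{\rm T} A y}\). It is not the authors' algorithm: it only measures the saddle-point gap of a given strategy pair and lists the sequence of targets an outer loop might work through. The names `equilibrium_gap`, `target_schedule` and the geometric shrink factor `gamma` are illustrative assumptions, not taken from the paper.

```python
def equilibrium_gap(A, x, y):
    """Saddle-point gap of (x, y) for min_x max_y x^T A y on simplices.

    (x, y) is an eps-equilibrium exactly when this gap is at most eps.
    A is a list of rows; x and y are probability vectors (assumption).
    """
    m, n = len(A), len(A[0])
    # Value of each pure column strategy against x: (A^T x)_j
    col_values = [sum(A[i][j] * x[i] for i in range(m)) for j in range(n)]
    # Value of each pure row strategy against y: (A y)_i
    row_values = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    # Best response of the maximizer minus best response of the minimizer
    return max(col_values) - min(row_values)


def target_schedule(eps0, eps, gamma=2.0):
    """Targets visited by a hypothetical outer loop that divides the
    current target by gamma until it is no larger than eps.

    The geometric schedule is an assumption made for illustration; it
    yields roughly log(eps0 / eps) / log(gamma) outer rounds.
    """
    targets = []
    t = eps0
    while t > eps:
        t /= gamma
        targets.append(t)
    return targets


if __name__ == "__main__":
    matching_pennies = [[1, -1], [-1, 1]]
    uniform = [0.5, 0.5]
    pure = [1.0, 0.0]
    print(equilibrium_gap(matching_pennies, uniform, uniform))  # exact equilibrium
    print(equilibrium_gap(matching_pennies, pure, uniform))     # exploitable x
    print(target_schedule(1.0, 0.125))
```

Under this assumed schedule the number of outer rounds grows only logarithmically in \({1{/}\epsilon}\), which matches the \({\ln(1{/}\epsilon)}\) factor in the stated bound; the work done inside each round (the smoothed inner loop) is not modeled here.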
Additional information
A short early version of this paper appeared at the National Conference on Artificial Intelligence (AAAI), 2008.
Cite this article
Gilpin, A., Peña, J. & Sandholm, T. First-order algorithm with \({\mathcal{O}({\rm ln}(1{/}\epsilon))}\) convergence for \({\epsilon}\)-equilibrium in two-person zero-sum games. Math. Program. 133, 279–298 (2012). https://doi.org/10.1007/s10107-010-0430-2
DOI: https://doi.org/10.1007/s10107-010-0430-2