Frontiers in Economics and Computation - UK

This seminar series features research in Economics and Computation, Game Theory, and Optimisation, broadly construed.

EigenGame: SVD as a Nash Equilibrium

  • Abstract. We present a novel view of singular value decomposition (SVD) as a Nash equilibrium of a competitive game in which each approximate singular vector is controlled by a player whose goal is to maximize their own utility function. We analyze the properties of this EigenGame and the behavior of its players learning via gradient-based updates. The resulting algorithm, which combines elements of Oja's rule with a generalized Gram-Schmidt orthogonalization, is naturally distributed and hence parallelizable through message passing. EigenGame's updates are biased if computed using minibatches of data, which hinders convergence and more sophisticated parallelism in the stochastic setting. Therefore, in follow-up work, we propose an unbiased stochastic update that is asymptotically equivalent to EigenGame, enjoys greater parallelism allowing computation on datasets of larger sample sizes, and outperforms the original EigenGame in experiments. We demonstrate a) the scalability of the algorithm by conducting principal component analyses of large image datasets, language datasets, and neural network activations, and b) its generality by reusing the same algorithm to perform spectral clustering of a social network. We discuss how this new view of SVD as a differentiable game can lead to further algorithmic developments and insights.

    This talk is based on two recent works, both joint work with Brian McWilliams, Claire Vernade, and Thore Graepel (a minimal sketch of the core update follows below):
    - EigenGame (ICLR '21)
    - EigenGame Unloaded (ICLR '22)
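
  • Update sketch. For intuition, here is a minimal numpy sketch of an EigenGame-style update. It is a hedged, simplified variant (synchronous and full-batch, not the distributed minibatch implementation from the papers): each player i ascends a utility that rewards captured variance (the Oja-like term) and penalizes alignment with players j < i (the generalized Gram-Schmidt terms), then projects back onto the unit sphere. The function name, step size, and iteration count are illustrative assumptions.

    import numpy as np

    def eigengame_round(V, M, lr=1e-3):
        """One synchronous round of EigenGame-style updates (sketch).

        V  : (d, k) matrix whose unit-norm columns are the players' guesses.
        M  : (d, d) symmetric PSD matrix (e.g., X.T @ X / n for data X).
        lr : illustrative step size.
        """
        d, k = V.shape
        V_next = V.copy()
        for i in range(k):
            v = V[:, i]
            grad = 2.0 * (M @ v)               # Oja-like reward: ascend <v, M v>
            for j in range(i):                 # penalties w.r.t. parents j < i
                Mu = M @ V[:, j]
                grad -= 2.0 * ((v @ Mu) / (V[:, j] @ Mu)) * Mu
            grad -= (grad @ v) * v             # Riemannian projection onto the sphere's tangent space
            w = v + lr * grad
            V_next[:, i] = w / np.linalg.norm(w)  # retract back to the unit sphere
        return V_next

    # Illustrative usage: recover the top-2 principal directions of random data.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))
    M = X.T @ X / X.shape[0]
    V = np.linalg.qr(rng.normal(size=(10, 2)))[0]  # random orthonormal start
    for _ in range(2000):
        V = eigengame_round(V, M)

    Under these assumptions the columns of V converge, in order, to approximations of the top eigenvectors of M, i.e., the top right singular vectors of X; the unbiased follow-up work replaces the full-matrix products with minibatch estimates.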

Ian Gemp (Google DeepMind)

27 Apr '22

  • Bio. Ian Gemp is a Senior Research Scientist on the Multiagent team at DeepMind. His research focuses primarily on two questions. How should agents behave in a group, be it a competitive, mixed-motive, or cooperative setting? And should individual agents themselves (including their constituent tools and algorithms) be considered multiagent systems in their own right? He studied mechanical engineering and applied math (BS/MS) at Northwestern University (2011) and obtained his MS/PhD in computer science from the University of Massachusetts at Amherst (2018).

Historical Notes and Further Readings

  • Singular value decomposition (SVD) has a long history and has been studied in a variety of fields, including numerics / applied math [1], AI / machine learning (ML) via optimization [2,3,4], and neuroscience [5,6] in the context of biologically plausible learning mechanisms. SVD even has a (Courant–Fischer) minimax formulation [7], stated below for reference. Minimax formulations are pervasive in machine learning for the purpose of minimizing worst-case (adversarial) outcomes.

    One reason SVD has received so much attention in machine learning in particular is that it can be used to reduce the dimensionality of datasets without discarding too much useful information (e.g., via principal component analysis [8]; a short example is sketched below). SVD can also be used to embed graphs for (spectral) clustering [9] and to solve linear matrix problems, among a myriad of other uses. SVD is a workhorse of classical (pre-deep-learning) ML, and it still influences the design of neural networks today (e.g., autoencoders).
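
    For reference, the Courant–Fischer minimax formulation cited above [7] takes the following standard form (written here in LaTeX; singular values of a matrix A follow by applying it to A^T A):

      % Courant-Fischer: for symmetric M in R^{d x d} with eigenvalues
      % \lambda_1 >= ... >= \lambda_d,
      \[
        \lambda_k(M) \;=\; \max_{\substack{S \subseteq \mathbb{R}^d \\ \dim S = k}}
        \;\min_{\substack{v \in S \\ v \neq 0}} \frac{v^\top M v}{v^\top v},
        \qquad
        \sigma_k(A) \;=\; \sqrt{\lambda_k(A^\top A)}.
      \]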
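
    As a concrete illustration of the dimensionality-reduction use case, here is a minimal PCA-via-SVD sketch in numpy (standard library calls; the dataset and the choice of k are illustrative):

    import numpy as np

    # PCA via SVD: project centered data onto its top-k right singular vectors.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 50))       # illustrative dataset (n samples, d features)
    Xc = X - X.mean(axis=0)               # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = 5
    Z = Xc @ Vt[:k].T                     # (n, k) reduced representation
    retained = (s[:k] ** 2).sum() / (s ** 2).sum()  # fraction of variance kept
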
  • Related Work
  1. R. W. Brockett. "Dynamical systems that sort lists, diagonalize matrices, and solve linear programming problems." Linear Algebra and its applications, 146:79–91, 1991.
  2. M. Ghashami, E. Liberty, J. M. Phillips, and D. P. Woodruff. "Frequent directions: simple and deterministic matrix sketching." SIAM Journal on Computing, 45(5):1762–1792, 2016.
  3. N. Halko, P.-G. Martinsson, and J. A. Tropp. "Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions." SIAM Review, 53(2):217–288, 2011.
  4. Z. Allen-Zhu and Y. Li. "First efficient convergence for streaming k-PCA: a global, gap-free, and near-optimal rate." In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 487–492. IEEE, 2017.
  5. E. Oja. "Simplified neuron model as a principal component analyzer." Journal of Mathematical Biology, 15(3):267–273, 1982.
  6. T. D. Sanger. "Optimal unsupervised learning in a single-layer linear feedforward neural network." Neural Networks, 2(6):459–473, 1989.
  7. G. H. Golub and H. A. Van der Vorst. "Eigenvalue computation in the 20th century." Journal of Computational and Applied Mathematics, 123(1-2):35–65, 2000.
  8. I. T. Jolliffe. "Principal components in regression analysis." In Principal Component Analysis. Springer, 2002.
  9. U. Von Luxburg. "A tutorial on spectral clustering." Statistics and Computing, 17(4):395–416, 2007.