Poster B49 in Poster Session B - Thursday, August 8, 2024, 1:30 – 3:30 pm, Johnson Ice Rink
Neural manifold packing by stochastic gradient descent
Guanming Zhang1, Stefano Martiniani1; 1New York University
In self-supervised learning, different stimulus categories correspond to unique manifolds within an embedded neural state space. Accurate classification can be achieved by separating the manifolds from one another during learning, in a process that is analogous to a packing problem. To theoretically investigate the dynamics of 'neural manifold packing', we consider Stochastic Gradient Descent (SGD) for particle systems in physical dimensions. In this framework, SGD aims to minimize a hinge loss, L, proportional to the particles' overlaps, by performing gradient descent on a batch of randomly selected particles. The resulting stochastic dynamics exhibit a critical packing efficiency, below which the system reaches an 'absorbing' state with zero classification error, L=0, and above which the system settles in an 'active' steady state with L>0. We thus explore the connection between the dynamics of SGD and of a well-characterized absorbing-state model known as Biased Random Organization (BRO), which evolves the particle positions with random kicks. We show that in the limit of small kick sizes and learning rates, BRO and SGD on a hinge loss are equivalent and exhibit the same critical packing efficiency (approximately 0.64). Further, we demonstrate that the behavior of SGD near the critical point is consistent with the Manna universality class. We therefore propose that 'neural manifold packing' by SGD in high dimensions is mean-field, given that Manna universality reduces to mean-field critical behavior in d>4. This work furthers our understanding of self-supervised learning dynamics and opens avenues for designing learning algorithms based on physical principles.
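To make the particle-level setup concrete, the sketch below implements SGD on a pairwise overlap hinge loss for spheres in a periodic box: each step picks a random batch of particles and moves them down the gradient of their overlap energy, terminating if the absorbing state (zero loss) is reached. This is a minimal illustration under our own assumptions; the particle number N, diameter sigma, learning rate lr, batch size, and the quadratic form of the per-pair hinge term are illustrative choices, not values or code from the poster.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, sigma = 512, 3, 0.1     # particle number, dimension, sphere diameter (assumed values)
lr, batch = 0.05, 64          # learning rate and SGD batch size (assumed values)
x = rng.random((N, d))        # positions in a unit periodic box

def hinge_loss(x):
    """Hinge loss proportional to squared pairwise overlaps; nonzero only if r_ij < sigma."""
    dx = x[:, None, :] - x[None, :, :]
    dx -= np.round(dx)                              # minimum-image convention
    r = np.linalg.norm(dx, axis=-1)
    np.fill_diagonal(r, np.inf)                     # no self-interaction
    overlap = np.clip(sigma - r, 0.0, None)
    return 0.25 * np.sum(overlap**2)                # each pair counted twice

def sgd_step(x):
    """One SGD step: gradient of the hinge loss for a random batch of particles."""
    idx = rng.choice(N, size=batch, replace=False)
    dx = x[idx, None, :] - x[None, :, :]
    dx -= np.round(dx)
    r = np.linalg.norm(dx, axis=-1)
    r[np.arange(batch), idx] = np.inf               # exclude self-interaction
    overlap = np.clip(sigma - r, 0.0, None)
    # dL/dx_i = -sum_j (sigma - r_ij) * dx_ij / r_ij over overlapping pairs
    grad = -np.sum((overlap / r)[..., None] * dx, axis=1)
    x[idx] -= lr * grad                             # pushes overlapping particles apart
    x %= 1.0                                        # re-wrap into the periodic box

for step in range(2000):
    sgd_step(x)
    if hinge_loss(x) == 0.0:                        # absorbing state: zero classification error
        print(f"absorbing state reached at step {step}")
        break
```

At this packing fraction (well below the critical value near 0.64 quoted in the abstract) the dynamics should typically fall into an absorbing state; raising N or sigma past the critical packing efficiency would instead leave the system in an active steady state with L>0.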
Keywords: stochastic gradient descent; neural manifolds; self-supervised learning