We present the development and analysis of a reinforcement learning algorithm designed to solve
continuous-space mean field game (MFG) and mean field control (MFC) problems in a unified manner. The
proposed approach pairs the actor-critic (AC) paradigm with a representation of the mean field distribution
via a parameterized score function, which can be efficiently updated in an online fashion, and uses Langevin
dynamics to obtain samples from the resulting distribution. The AC agent and the score function are updated
iteratively until they converge either to the MFG equilibrium or to the MFC optimum of a given mean field
problem, depending on the choice of learning rates. A straightforward modification of the algorithm allows us to
solve mixed mean field control games. The performance of our algorithm is evaluated using linear-quadratic
benchmarks in the asymptotic infinite horizon framework.
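As a rough illustration of the sampling step mentioned above, the following is a minimal sketch of unadjusted Langevin dynamics driven by a parameterized score function. It is not the paper's implementation; the function names, the closed-form Gaussian score used in the example, and all numerical values are hypothetical, chosen only to show how samples can be drawn once a score approximation is available.

```python
import numpy as np

def langevin_sample(score_fn, x0, step_size=1e-2, n_steps=1000, rng=None):
    """Unadjusted Langevin dynamics: draw an approximate sample from the
    distribution whose score (gradient of the log-density) is score_fn.

    Update rule: x_{k+1} = x_k + step_size * score_fn(x_k)
                           + sqrt(2 * step_size) * standard normal noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step_size * score_fn(x) + np.sqrt(2.0 * step_size) * noise
    return x

# Hypothetical example: the score of a 1-D Gaussian "mean field" with
# mean 1.0 and standard deviation 0.5, i.e. s(x) = -(x - 1.0) / 0.25.
score = lambda x: -(x - 1.0) / 0.25
samples = np.array([langevin_sample(score, x0=0.0) for _ in range(500)])
print(samples.mean(), samples.std())  # should be close to 1.0 and 0.5
```

In the algorithm itself the closed-form score above would be replaced by the learned, parameterized score function, updated online between sampling rounds.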