A multi-agent environment can have both a global state, shared by all agents, and a separate state for each individual agent [1].


Cooperative multi-agent RL often requires decentralised policies, which limit the agents’ ability to coordinate their behaviour [2].
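
To make the distinction concrete, here is a minimal toy sketch, not the method of [1] or [2]. All names (`GlobalState`, `AgentState`, `decentralised_action`, `centralised_actions`), the grid-cell setting, and the demand values are hypothetical and chosen only for illustration. The decentralised policy sees only the agent's own state, while the centralised one sees the global state and every agent's state.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GlobalState:
    # Hypothetical global information: demand in each grid cell.
    demand: Tuple[int, ...]


@dataclass(frozen=True)
class AgentState:
    # Hypothetical per-agent information: the cell the agent occupies.
    position: int


def decentralised_action(agent_state: AgentState, num_cells: int) -> int:
    """Toy decentralised policy: uses only the agent's own state.

    It moves the agent one cell to the right, wrapping around.
    """
    return (agent_state.position + 1) % num_cells


def centralised_actions(
    global_state: GlobalState, agent_states: Dict[str, AgentState]
) -> Dict[str, int]:
    """Toy centralised policy: sees the global state and all agents.

    It sends each agent to a distinct cell, highest demand first.
    """
    ranked = sorted(
        range(len(global_state.demand)),
        key=lambda cell: -global_state.demand[cell],
    )
    return {name: cell for name, cell in zip(sorted(agent_states), ranked)}


global_state = GlobalState(demand=(0, 3, 1, 5))
agent_states = {"a": AgentState(position=0), "b": AgentState(position=0)}
num_cells = len(global_state.demand)

# Both agents have the same local state, so they pick the same action.
local = {n: decentralised_action(s, num_cells) for n, s in agent_states.items()}
# The centralised policy can spread the agents over the busiest cells.
joint = centralised_actions(global_state, agent_states)

print(local)  # {'a': 1, 'b': 1}
print(joint)  # {'a': 3, 'b': 1}
```

In this toy setting the two agents share the same local state, so the decentralised policy cannot tell them apart and sends both to the same cell, whereas the centralised policy can assign them to different cells.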


[1] Lin, Kaixiang, Renyu Zhao, Zhe Xu, and Jiayu Zhou. “Efficient large-scale fleet management via multi-agent deep reinforcement learning.” In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1774-1783. 2018.

[2] de Witt, Christian Schroeder, Jakob Foerster, Gregory Farquhar, Philip Torr, Wendelin Boehmer, and Shimon Whiteson. “Multi-agent common knowledge reinforcement learning.” In Advances in Neural Information Processing Systems, pp. 9927-9939. 2019.