References
[1] Hong et al. Agent-Oriented Centralized Critic for Asynchronous Multi-Agent Reinforcement Learning. The Sixteenth Workshop on Adaptive and Learning Agents. 2024.

[2] Amato et al. Modeling and Planning with Macro-Actions in Decentralized POMDPs. Journal of Artificial Intelligence Research 64. 2019.

[3] Xiao et al. Asynchronous Actor-Critic for Multi-Agent Reinforcement Learning. Advances in Neural Information Processing Systems 35. 2022.

[4] Barde et al. A Model-Based Solution to the Offline Multi-Agent Reinforcement Learning Coordination Problem. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems. 2024.

[5] Kumar et al. Conservative Q-Learning for Offline Reinforcement Learning. Advances in Neural Information Processing Systems 33. 2020.

[6] Fujimoto et al. A Minimalist Approach to Offline Reinforcement Learning. Advances in Neural Information Processing Systems 34. 2021.

[7] Zhao et al. Improving Offline-to-Online Reinforcement Learning with Q-Ensembles. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems. 2024.

[8] Lee et al. SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning. Proceedings of the 38th International Conference on Machine Learning. 2021.