37 entries in total
[1]
Abbeel P., 2004, Proceedings of the International Conference on Machine Learning (ICML), P1
[2]
Evolving Robust Policy Coverage Sets in Multi-Objective Markov Decision Processes Through Intrinsically Motivated Self-Play [J]. FRONTIERS IN NEUROROBOTICS, 2018, 12
[3]
[Anonymous], 2022, MountainCarContinuous-v0
[4]
Artetxe M, 2018, Arxiv, DOI arXiv:1710.11041
[5]
Bojarski M, 2016, Arxiv, DOI [arXiv:1604.07316, 10.48550/arXiv.1604.07316]
[6]
BRADLEY RA, 1952, BIOMETRIKA, V39, P324, DOI 10.1093/biomet/39.3-4.324
[7]
Brown Daniel S., 2019, PR MACH LEARN RES, V97
[8]
Fu JS, 2018, Arxiv, DOI [arXiv:1710.11248, 10.48550/arXiv.1710.11248]
[9]
Fujita Y, 2021, J MACH LEARN RES, V22, P1
[10]
Methods for multi-objective optimization: An analysis [J]. INFORMATION SCIENCES, 2015, 293: 338-350