32 entries in total
- [1] Mnih V., Kavukcuoglu K., Silver D., Rusu A.A., Veness J., Bellemare M.G., Graves A., Riedmiller M., Fidjeland A.K., Ostrovski G., et al., Human-level control through deep reinforcement learning, Nature, 518, 7540, pp. 529-533, (2015)
- [2] Argall B.D., Chernova S., Veloso M., Browning B., A survey of robot learning from demonstration, Robot Autonom Syst, 57, 5, pp. 469-483, (2009)
- [3] Silver D., Hubert T., Schrittwieser J., Antonoglou I., Lai M., Guez A., Lanctot M., Sifre L., Kumaran D., Graepel T., et al., A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science, 362, 6419, pp. 1140-1144, (2018)
- [4] Zhu Z., Zhao H., A survey of deep RL and IL for autonomous driving policy learning, IEEE Trans Intell Transp Syst, 23, 9, pp. 14043-14065, (2021)
- [5] Peng P., Barnes M., Wang C., Wang W., Li S., Swanson H.L., Dardick W., Tao S., A meta-analysis on the relation between reading and working memory, Psychol Bull, 144, 1, (2018)
- [6] Sadeghi F., Levine S., CAD2RL: Real single-image flight without a single real image, (2016)
- [7] Liu H., Huang Z., Wu J., Lv C., Improved deep reinforcement learning with expert demonstrations for urban autonomous driving, 2022 IEEE Intelligent Vehicles Symposium (IV), pp. 921-928, (2022)
- [8] Liu H., Liu H.H., Chi C., Zhai Y., Zhan X., Navigation information augmented artificial potential field algorithm for collision avoidance in UAV formation flight, Aerosp Syst, 3, pp. 229-241, (2020)
- [9] Kumar D., Pandey M., An effective and secure data sharing in P2P network using biased contribution index based rumour riding protocol (BCIRR), Opt Mem Neural Netw, 29, 4, pp. 336-353, (2020)
- [10] Kumar D., Dubey A.K., Pandey M., Time and position aware resource search algorithm for the mobile peer-to-peer network using ant colony optimisation, Int J Commun Netw Distrib Syst, 28, 6, pp. 621-654, (2022)