Imitation learning of a model predictive controller for real-time humanoid robot walking

Cited by: 0

Authors:

Porto, Vitor G. B. de A. [1]

Melo, Dicksiano C. [1]

Maximo, Marcos R. O. A. [1]

Afonso, Rubens J. M. [2]

Affiliations:

[1] Aeronautics Institute of Technology, Computer Science Division, Autonomous Computational Systems Lab (LAB-SCA), São José dos Campos, Brazil

[2] Aeronautics Institute of Technology, Electronic Engineering Division, São José dos Campos, Brazil

Source:

Engineering Applications of Artificial Intelligence | 2025 / Vol. 143

Keywords:

Humanoid robot walking; Imitation learning; Neural network; Model predictive control

DOI:

10.1016/j.engappai.2024.109919

Chinese Library Classification:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Bipedal walking is especially challenging for humanoid robots: it requires a robust controller that keeps the robot walking stably and lets it react to disturbances. State-of-the-art algorithms rely on model predictive controllers, but their computational cost is very high, which often makes them impractical to embed in a real robot's hardware. This paper shows how imitation learning can be used to copy the behavior of a model predictive controller into a neural network that predicts the trajectory of the Center of Mass (CoM), as well as the position, rotation, and duration of each step. Our method was tested in a simplified simulation, in a realistic full-body simulation, and on a real humanoid robot. The results show an algorithm whose performance is close to that of the original controller while requiring only a small fraction of its computational cost, with a speedup of 650 times, which enables real-time use on real robots.
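
The general idea of copying a controller into a neural network can be illustrated with a minimal behavior-cloning sketch. It is not the paper's method: `expert_controller` is a hypothetical stand-in for an expensive model predictive controller (here just a saturated linear feedback law on a two-dimensional state), and the network size, learning rate, and step count are arbitrary assumptions. The paper's network predicts the CoM trajectory and step parameters, whereas this toy predicts a single scalar action.

```python
import numpy as np


def expert_controller(x):
    """Hypothetical stand-in for a costly MPC: saturated linear feedback."""
    return np.clip(-(1.5 * x[:, 0] + 0.8 * x[:, 1]), -1.0, 1.0)


def train_imitator(states, actions, hidden=16, lr=0.05, steps=3000, seed=0):
    """Fit a one-hidden-layer tanh network to (state, action) pairs
    with full-batch gradient descent on half the mean squared error."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, size=(states.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, size=hidden)
    b2 = 0.0
    n = len(states)
    losses = []
    for _ in range(steps):
        # Forward pass.
        h = np.tanh(states @ w1 + b1)
        pred = h @ w2 + b2
        err = pred - actions
        losses.append(0.5 * np.mean(err ** 2))
        # Backward pass (gradients of the loss above).
        g_pred = err / n
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum()
        g_h = np.outer(g_pred, w2) * (1.0 - h ** 2)
        g_w1 = states.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Gradient-descent update.
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2

    def policy(x):
        """Cheap learned approximation of the expert."""
        return np.tanh(x @ w1 + b1) @ w2 + b2

    return policy, losses


# Collect demonstrations by querying the expert on sampled states.
data_rng = np.random.default_rng(1)
states = data_rng.uniform(-1.0, 1.0, size=(512, 2))
actions = expert_controller(states)

policy, losses = train_imitator(states, actions)
print("initial loss:", losses[0])
print("final loss:", losses[-1])
```

At run time only `policy` is evaluated, which is a couple of small matrix products; this is the kind of saving that motivates replacing an online optimization with a learned approximation.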


Pages: 21
