Fault-Tolerant Optimal Attitude Tracking Control of Quadrotor Subject to State and Input Constraints Using Safe Reinforcement Learning

Roshanravan, Sajad; Shamaghdari, Saeed

Document Type: Dynamics, Vibrations, and Control

Authors

¹ Ph.D. Student, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran

² Corresponding author: Associate Professor, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran

Abstract

This article presents a method for designing a fault-tolerant optimal attitude tracking control (FTOATC) scheme for a quadrotor UAV subject to component and actuator faults. The method is based on safe reinforcement learning (SRL) and enforces input and state constraints without prior knowledge of the quadrotor dynamics. To this end, it uses a dual neural network (NN) structure consisting of an identifier NN and a critic NN. The identifier NN update law combines a variable forgetting factor that depends on measurement noise with experience replay, which increases convergence speed and robustness to measurement noise and reduces estimation error. The constrained FTOATC problem is shown to be equivalent to an unconstrained optimal stabilization problem for an augmented system, in which the input constraints are guaranteed by a suitable cost function on the input signal and the state constraints by appropriate control barrier functions (CBFs) on the states. Furthermore, fault detection requires no model or filter bank: the residual of the Hamilton-Jacobi-Bellman (HJB) equation is simply compared with a predetermined threshold. The uniform ultimate boundedness (UUB) of the identifier and critic NN weight errors, and consequently the convergence of the control input to a neighborhood of the optimal solution, is proved using Lyapunov theory, and the performance of the method is validated through simulation.
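
To illustrate how a cost function on the input signal can enforce an input bound, the block below gives the nonquadratic input penalty that is common in the constrained-input reinforcement learning literature (for example, the work of Abu-Khalaf and Lewis on saturating actuators). The abstract does not state which penalty the authors use, so this form and its symbols are assumptions for illustration: $\lambda$ is the saturation bound, $R$ a positive-definite diagonal weight, $g(x)$ the input matrix of an input-affine system, and $V^{*}$ the optimal value function.

```latex
% Assumed (not confirmed by the source): standard nonquadratic input penalty
U(u) = 2 \int_{0}^{u} \left( \lambda \tanh^{-1}\!\left( v / \lambda \right) \right)^{\top} R \, \mathrm{d}v ,
\qquad |u_i| \le \lambda .

% Stationarity of the Hamiltonian with this penalty gives a bounded policy
u^{*}(x) = -\lambda \tanh\!\left( \frac{1}{2\lambda} R^{-1} g(x)^{\top} \nabla V^{*}(x) \right) .
```

Because $\tanh$ takes values in $(-1, 1)$, every component of $u^{*}$ stays strictly inside $(-\lambda, \lambda)$, so no separate saturation step is needed in the controller.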

Highlights

Model-free method.
Improved convergence properties of the identifier and critic NNs.
Ensuring input and state constraints.
Guaranteeing system stability at all times.
HJB-based fault detection without requiring any additional filter.
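
The HJB-based fault detection idea can be sketched on a toy problem. The example below is a minimal illustration under loudly stated assumptions, not the authors' implementation: the plant is a hypothetical scalar linear system rather than the quadrotor, the value function is the closed-form Riccati solution standing in for the critic NN, and the names `hjb_residual`, `simulate_residuals`, `detect_fault`, the fault model (a 50% loss of actuator effectiveness), and the threshold value are all invented for illustration.

```python
import numpy as np

# Hypothetical scalar plant x_dot = A*x + B*eff*u (NOT the quadrotor model).
A, B = -1.0, 1.0
Q = np.array([[1.0]])  # state weight in the running cost
R = np.array([[1.0]])  # input weight in the running cost
# Closed-form Riccati solution for this plant: -2p - p^2 + 1 = 0 with p > 0.
P = np.sqrt(2.0) - 1.0


def grad_value(x):
    """Gradient of V(x) = P * x^2; stands in for the critic NN gradient."""
    return 2.0 * P * x


def hjb_residual(x, u, x_dot):
    """Running cost plus the rate of change of V along the observed dynamics."""
    running_cost = x @ Q @ x + u @ R @ u
    return float(running_cost + grad_value(x) @ x_dot)


def simulate_residuals(fault_step, n_steps=100, dt=0.01):
    """Euler simulation; actuator effectiveness drops to 0.5 from fault_step on."""
    x = np.array([2.0])
    residuals = []
    for k in range(n_steps):
        eff = 1.0 if k < fault_step else 0.5
        u = -P * x  # nominal optimal feedback -(B*P/R)*x with B = R = 1
        x_dot = A * x + B * eff * u
        residuals.append(hjb_residual(x, u, x_dot))
        x = x + dt * x_dot
    return residuals


def detect_fault(residuals, threshold):
    """Index of the first sample whose |residual| exceeds the threshold, else None."""
    for k, r in enumerate(residuals):
        if abs(r) > threshold:
            return k
    return None


if __name__ == "__main__":
    print(detect_fault(simulate_residuals(fault_step=50), threshold=0.05))
```

While the plant is healthy, the nominal policy satisfies the HJB equation and the residual stays at floating-point noise level; once the actuator degrades, the observed state derivative no longer matches the one the value function was optimal for, and the residual jumps above the threshold. No observer or filter bank is involved, only the comparison with a fixed threshold.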

DOR

20.1001.1.26455323.1403.20.1.9.1

Volume 20, Issue 1 - Serial Number 75
Serial No. 75, Spring Quarterly
April 2024
Pages 141-161