Sum-of-Squares Optimization Based Approximate Dynamic Programming for Time-Varying Systems and Its Application in Suboptimal Guidance Law Design

Document Type : Dynamics, Vibrations, and Control

Authors

1 Ph.D. Student, Department of Control, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran

2 Corresponding author: Associate Professor, Department of Control, Faculty of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran

Abstract

In this paper, we propose a method for suboptimal control of time-varying polynomial systems and apply it to pursuer guidance law design. Because the engagement equations between the pursuer and the target depend on the range between them, and this range varies during flight, the guidance law designer faces a time-varying system. Methods developed for the control of time-invariant systems are not directly applicable to time-varying systems. Optimal control is one of the conventional approaches to pursuer guidance law design, and approximate dynamic programming is a well-known method for solving the optimal control problem. One challenge in applying this method to nonlinear time-varying systems is the difficulty of solving the Bellman equation. In the proposed method, solving the Bellman equation is relaxed to solving a sum-of-squares optimization problem. It is proved that the control policy designed with this method is globally exponentially stabilizing. Finally, the performance of the proposed method for pursuer guidance is illustrated with numerical simulations.
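The relaxation mentioned in the abstract can be illustrated with a generic sketch. The abstract does not state the formulation, so everything below is an assumption: an input-affine time-varying polynomial system, a quadratic control penalty, and the symbols f, g, q, R, V and u_i are illustrative choices in the spirit of sum-of-squares-based policy iteration, not necessarily the authors' exact setup.

```latex
% Assumed system and cost (illustrative, not taken from the abstract):
%   \dot{x} = f(x,t) + g(x,t)\,u, \qquad
%   J = \int_{t_0}^{\infty} \big( q(x,t) + u^{\top} R\, u \big)\, dt, \quad R \succ 0 .

% Bellman (HJB) equation for the optimal value function V^*(x,t):
0 = \frac{\partial V^*}{\partial t}
  + \nabla_x V^{*\top} f
  + q
  - \tfrac{1}{4}\, \nabla_x V^{*\top} g\, R^{-1} g^{\top} \nabla_x V^* ,
\qquad
u^* = -\tfrac{1}{2}\, R^{-1} g^{\top} \nabla_x V^* .

% Relaxation: for a given polynomial policy u_i, replace the equality by an
% inequality and require its left-hand side to be a sum of squares (SOS),
% which is a sufficient and computationally checkable condition for nonnegativity:
-\Big( \frac{\partial V}{\partial t}
     + \nabla_x V^{\top} \big( f + g\, u_i \big)
     + q + u_i^{\top} R\, u_i \Big) \ \text{is SOS},
\qquad
u_{i+1} = -\tfrac{1}{2}\, R^{-1} g^{\top} \nabla_x V .
```

Under these assumptions, any polynomial V satisfying the SOS condition upper-bounds the cost of the policy u_i, and searching for such a V is a semidefinite program rather than a nonlinear partial differential equation.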

Keywords



Volume 18, Issue 4 - Serial Number 70, Winter Quarterly
December 2022
Pages 89-103
  • Receive Date: 10 July 2022
  • Revise Date: 09 August 2022
  • Accept Date: 24 August 2022
  • Publish Date: 23 October 2022