Aerial combat maneuvering policy learning based on confrontation demonstrations and dynamic quality replay. (May 2022)
- Record Type:
- Journal Article
- Title:
- Aerial combat maneuvering policy learning based on confrontation demonstrations and dynamic quality replay. (May 2022)
- Main Title:
- Aerial combat maneuvering policy learning based on confrontation demonstrations and dynamic quality replay
- Authors:
- Hu, Dongyuan
Yang, Rennong
Zhang, Ying
Yue, Longfei
Yan, Mengda
Zuo, Jialiang
Zhao, Xiaoru
- Abstract:
- Unmanned Combat Aerial Vehicles (UCAVs) are becoming a crucial platform for dangerous missions, thanks to their capacity for autonomous and intelligent decision-making. Recently, many studies have focused on the UCAV air-to-air combat mission and attempted to solve for maneuvering policies using deep reinforcement learning (DRL). Yet previous studies of this sequential decision problem are often limited by the complexity of the combat environment, overdependence on expert knowledge, and low learning efficiency in a large-scale exploration space. To address these shortcomings, the paper aims to formulate counter maneuvers automatically using DRL and proposes a novel dynamic quality replay (DQR) method, which guides the agent to learn tactical policy efficiently from historical data and removes the dependence on traditional expert knowledge. First, to accelerate model convergence, confrontation demonstrations generated with traditional algorithms are used for basic training of the agent before it interacts with the environment. Moreover, the formal training is divided into four stages, in which DQR introduces a common framework for filtering new interaction data for memory replay, comprising classification, evaluation and sampling. Episodes are classified into different classes based on opponent models and confrontation results. DQR evaluates sample quality in each class and samples training batches dynamically according to episode quality. The new method is suitable for both continuous and discrete action spaces, and can make trade-offs between specific demonstrations and exploration. Finally, we designed different initial engagement scenarios to train the agent models, whose basic algorithms are DQN, VPG, DDPG and SAC. In the test experiments, the winning rate of the models reaches 0.6 or more, and the best model, SAC, reaches 0.8 in each stage, which demonstrates the feasibility and efficiency of the method.
- Is Part Of:
- Engineering applications of artificial intelligence. Volume 111(2022)
- Journal:
- Engineering applications of artificial intelligence
- Issue:
- Volume 111(2022)
- Issue Display:
- Volume 111, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 111
- Issue:
- 2022
- Issue Sort Value:
- 2022-0111-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-05
- Subjects:
- Aerial combat -- Intelligent maneuvering decision -- Situational assessment -- Dynamic quality replay -- Deep reinforcement learning
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.engappai.2022.104767
- Languages:
- English
- ISSNs:
- 0952-1976
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21214.xml
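
Illustrative note on the abstract: the three DQR steps it names (classification, evaluation and sampling) can be pictured with the minimal Python sketch below. This is not the authors' implementation. The class name `QualityReplay`, the quality score (mean reward per step) and the shift-to-positive weighting rule are all assumptions made for illustration only, since the record does not give the actual formulas.

```python
import random
from collections import defaultdict


class QualityReplay:
    """Hypothetical episode buffer: classify, evaluate, then sample by quality."""

    def __init__(self, seed=0):
        # (opponent, result) -> list of (quality, transitions)
        self.classes = defaultdict(list)
        self.rng = random.Random(seed)

    def add_episode(self, opponent, result, transitions):
        if not transitions:
            raise ValueError("episode must contain at least one transition")
        # Classification: key the episode by opponent model and outcome.
        # Evaluation: ASSUMED quality score, the mean reward per step.
        quality = sum(t["reward"] for t in transitions) / len(transitions)
        self.classes[(opponent, result)].append((quality, transitions))

    def sample(self, batch_size):
        if not self.classes:
            raise ValueError("buffer is empty")
        keys = sorted(self.classes)
        batch = []
        for _ in range(batch_size):
            # Sampling: pick a class uniformly, then an episode with
            # probability proportional to its (shifted) quality.
            episodes = self.classes[self.rng.choice(keys)]
            low = min(q for q, _ in episodes)
            weights = [q - low + 1e-3 for q, _ in episodes]
            _, transitions = self.rng.choices(episodes, weights=weights, k=1)[0]
            batch.append(self.rng.choice(transitions))
        return batch


buffer = QualityReplay(seed=1)
buffer.add_episode("rule_based", "win", [{"reward": 10.0, "tag": "good"}])
buffer.add_episode("rule_based", "win", [{"reward": 0.0, "tag": "poor"}])
buffer.add_episode("rule_based", "loss", [{"reward": -1.0, "tag": "loss"}])
batch = buffer.sample(32)
```

Sampling a class uniformly before weighting by quality keeps rare outcomes (for example, losses against a particular opponent) represented in each batch; the method described in the abstract may balance classes differently.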