Modeling Deception in Multi-Robot Target-Attacker-Defender Game via Deep Reinforcement Learning
Abstract
Deception is a crucial strategy in adversarial scenarios, yet its application in multi-agent confrontations remains understudied. This paper investigates deception in a multi-robot Target-Attacker-Defender (MR-TAD) game, where Attackers aim to capture Targets while evading Defenders. To model deception effectively, we propose a hierarchical decision-making framework that integrates multi-agent reinforcement learning (MARL) for high-level deceptive strategies and optimal control for low-level motion control. Furthermore, we introduce a novel composite deception-oriented reward function, which combines hitting rewards, belief switch rewards, and position advantage rewards to facilitate the training of deceptive behaviors. Simulation results across varying numbers of robots demonstrate that incorporating deception significantly increases the success rate of Attackers, with an average improvement of over 70% compared to non-deceptive strategies. Additionally, real-world experiments with omnidirectional mobile robots further confirm the effectiveness of the proposed method. This study establishes a generalizable framework for modeling deception in multi-agent systems, with potential applications in various multi-agent scenarios.
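To make the two-level structure concrete, below is a minimal Python sketch of how a high-level MARL policy could hand goals to a low-level controller. All names here (`high_level_policy`, `low_level_controller`) and the saturated proportional control law are illustrative assumptions standing in for the trained network and the optimal controller, not the paper's implementation.

```python
import numpy as np

# Hypothetical high-level step: a learned MARL actor maps an Attacker's
# observation to a discrete goal (which Target to pursue, possibly a
# decoy chosen to deceive the Defenders). A random choice stands in for
# the trained network here.
def high_level_policy(obs: np.ndarray, num_goals: int) -> int:
    return int(np.random.randint(num_goals))

# Hypothetical low-level step: a saturated proportional tracking law
# standing in for the paper's optimal-control motion controller.
def low_level_controller(position: np.ndarray, goal: np.ndarray,
                         gain: float = 1.0, v_max: float = 1.0) -> np.ndarray:
    cmd = gain * (goal - position)   # drive toward the selected goal
    speed = np.linalg.norm(cmd)
    if speed > v_max:                # respect the robot's speed limit
        cmd *= v_max / speed
    return cmd

# One decision cycle: pick a (possibly deceptive) goal at the high
# level, then track it at the low level.
goals = np.array([[5.0, 0.0], [0.0, 5.0]])   # candidate Target positions
position = np.zeros(2)
obs = np.concatenate([position, goals.ravel()])
goal_idx = high_level_policy(obs, num_goals=len(goals))
velocity_cmd = low_level_controller(position, goals[goal_idx])
```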
Contributions
- We formulate the MR-TAD game problem in which agents can switch their goals, and propose a novel hierarchical decision-making scheme to model deception in such games.
- We present a composite deception-oriented reward design that facilitates the training of deceptive behaviors, greatly improving training efficiency and convergence (see the sketch after this list).
- We carry out simulations with varying numbers of robots to demonstrate the scalability and robustness of our method, and we further conduct real-world experiments on omnidirectional mobile robots.
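As a rough illustration of how the three named reward terms (hitting, belief switch, and position advantage) could be combined, here is a minimal Python sketch. The weights and the exact form of each term are assumptions for illustration only, not the values used in the paper.

```python
def composite_deception_reward(
    hit_target: bool,           # Attacker captured its true Target this step
    belief_switched: bool,      # Defenders' inferred belief about the
                                # Attacker's goal flipped this step
    dist_to_goal: float,        # current distance to the true Target
    prev_dist_to_goal: float,   # distance at the previous step
    w_hit: float = 10.0,        # illustrative weights, not from the paper
    w_switch: float = 1.0,
    w_pos: float = 0.1,
) -> float:
    r_hit = 1.0 if hit_target else 0.0          # hitting reward
    r_switch = 1.0 if belief_switched else 0.0  # belief switch reward
    r_pos = prev_dist_to_goal - dist_to_goal    # position advantage:
                                                # progress toward the goal
    return w_hit * r_hit + w_switch * r_switch + w_pos * r_pos
```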
Visualization
The following figures show the overall structure of this work and illustrate the Target-Attacker-Defender game; T, A, and D denote Target, Attacker, and Defender, respectively.
The following figures show the training reward curve and the belief switch process in the 6T-3A-4D (6 Targets, 3 Attackers, 4 Defenders) scenario.