Abstract
Differential games are a class of game-theoretic problems governed by differential equations. They are usually defined in the continuous domain and solved with the calculus of variations, but modelling and solving them is not straightforward. Like game theory in general, differential games often involve social dilemmas and social behaviours, which are difficult to capture with mathematical tools. In this paper, we model deception as a means of increasing the pay-off in differential games. Deception is modelled as a bi-level policy system in which each level is a fuzzy controller: the lower-level policy controls the invasion actions toward each goal, while the higher-level policy controls deception when a successful game is not initially possible. The fuzzy controllers are trained using a novel hierarchical fuzzy actor-critic learning algorithm. A deceitful player plays against multiple opponents. Although the player has one ultimate goal, it can also choose among multiple fake goals; the intention is to find a strategy for switching between the fake goals and the true goal that fools the opponents. The simulation platform is the game of guarding territories, a specific form of pursuit–evasion game. We propose a method that increases the number of defenders with minimal changes to the policies, yielding a universal structure that is not affected by the curse of dimensionality. We investigate the single-invader single-defender game and the single-invader multi-defender game, and we study both a superior invader and agents with the same speed. In all of these situations, a discerning invader capable of deception increases its pay-off, by increasing its chance of invasion, compared with an honest invader.
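The two-level structure described above can be illustrated with a minimal Python sketch. It is not the trained hierarchical fuzzy actor-critic system of the paper: the learned fuzzy controllers are replaced by hand-written rules, and the function names, the point-mass kinematics, and the `margin` threshold are all assumptions made for illustration only.

```python
import math


def heading_to(src, dst):
    """Angle in radians of the straight line from src to dst."""
    return math.atan2(dst[1] - src[1], dst[0] - src[0])


def step(pos, heading, speed, dt):
    """Advance a point-mass agent one time step along a heading."""
    return (pos[0] + speed * dt * math.cos(heading),
            pos[1] + speed * dt * math.sin(heading))


def low_level_policy(invader, goal):
    """Lower level: steer straight at the currently selected goal.

    Stand-in for the learned lower-level fuzzy controller.
    """
    return heading_to(invader, goal)


def high_level_policy(invader, defender, true_goal, fake_goal, margin=1.0):
    """Higher level: choose which goal the lower level should pursue.

    Hypothetical hand-written rule (not the learned policy): commit to
    the true goal only when the invader is closer to it than the
    defender by at least `margin`; otherwise feint toward the fake goal.
    """
    d_invader = math.dist(invader, true_goal)
    d_defender = math.dist(defender, true_goal)
    return true_goal if d_invader + margin < d_defender else fake_goal


def invader_action(invader, defender, true_goal, fake_goal):
    """Compose both levels: select a goal, then steer toward it."""
    goal = high_level_policy(invader, defender, true_goal, fake_goal)
    return goal, low_level_policy(invader, goal)
```

Calling `invader_action` once per time step and passing the returned heading to `step` moves the invader. The point of the split is that the goal-reaching behaviour does not change when the goal-selection rule is replaced or retrained.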
Code/Data Availability
The software and dataset are archived in the Machine Learning and Robotics Laboratory, Carleton University. They are available from the corresponding author on reasonable request.
Funding
This research is funded by the Natural Sciences and Engineering Research Council of Canada (NSERC), grant Nos. RGPIN-2017-06379 and RGPIN-2017-06261.
Contributions
AA: Methodology, Software, Writing - original draft. HS: Supervision, Writing - review and editing. MA: Supervision, Writing - review and commenting.
Ethics declarations
Competing interests
The authors have no financial or proprietary interests in any material discussed in this article.
Ethical Approval
Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).
Consent to Participate
Not applicable (this article does not contain any studies with human participants or animals performed by any of the authors).
Consent for Publication
All authors have approved the manuscript and agree to its publication in the Journal of Intelligent and Robotic Systems.
Cite this article
Asgharnia, A., Schwartz, H. & Atia, M. Learning Deception Using Fuzzy Multi-Level Reinforcement Learning in a Multi-Defender One-Invader Differential Game. Int. J. Fuzzy Syst. 24, 3015–3038 (2022). https://doi.org/10.1007/s40815-022-01352-6