Research article · DOI: 10.1145/3678957.3685721

Participation Role-Driven Engagement Estimation of ASD Individuals in Neurodiverse Group Discussions

Published: 04 November 2024

Abstract

Adults with autism spectrum disorder (ASD) face difficulties communicating with neurotypical people in their daily lives and workplaces, yet research on modeling communication in neurodiverse groups is scarce. To recognize communication difficulties arising from neurodiversity, we first collected a multimodal corpus of decision-making discussions in neurodiverse groups, each comprising one person with ASD and two neurotypical participants. In analyzing the corpus, we investigated eye-gaze and facial-expression exchanges between the ASD individuals and the neurotypical participants during both listening and speaking. We then extended these findings to automatically estimate the engagement of ASD individuals. To capture the effect of contingent behaviors between ASD individuals and neurotypical participants, we developed a transformer-based model that accounts for the participation role by switching the direction of cross-person attention depending on whether the ASD individual is listening or speaking. The proposed approach yields results comparable to the state of the art for engagement estimation in neurotypical group conversations while accounting for the dynamic nature of behavioral influence in face-to-face interactions. The code associated with this study is available at https://github.com/IUI-Lab/switch-attention.
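
The core architectural idea in the abstract, switching the direction of cross-person attention according to the participation role, can be sketched roughly as follows. This is a minimal PyTorch illustration under assumed tensor shapes, with made-up names (`RoleSwitchedCrossAttention`, `is_listening`); it is a sketch of the idea only, not the authors' released implementation (see https://github.com/IUI-Lab/switch-attention for that).

```python
# A minimal sketch of role-switched cross-person attention, written in
# PyTorch with assumed tensor shapes. It illustrates the idea from the
# abstract only; the authors' actual code is in the linked repository.
import torch
import torch.nn as nn


class RoleSwitchedCrossAttention(nn.Module):
    """Cross-person attention whose direction depends on whether the
    target (ASD) participant is currently listening or speaking."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, target: torch.Tensor, partner: torch.Tensor,
                is_listening: bool) -> torch.Tensor:
        # target, partner: (batch, time, dim) behavior feature sequences
        # (e.g., per-frame gaze and facial-expression embeddings).
        if is_listening:
            # While listening, the partner's behavior is treated as the
            # influencing signal: the target's features attend to it.
            out, _ = self.attn(query=target, key=partner, value=partner)
        else:
            # While speaking, the direction is reversed: the partner's
            # reactions attend to the target's ongoing behavior.
            out, _ = self.attn(query=partner, key=target, value=target)
        return out


# Hypothetical usage with random features standing in for real inputs.
if __name__ == "__main__":
    batch, frames, dim = 2, 100, 64
    layer = RoleSwitchedCrossAttention(dim)
    asd_feats = torch.randn(batch, frames, dim)
    partner_feats = torch.randn(batch, frames, dim)
    fused = layer(asd_feats, partner_feats, is_listening=True)
    print(fused.shape)  # torch.Size([2, 100, 64])
```

Which sequence plays query versus key/value in each role is one plausible reading of "changing the direction of cross-person attention"; the released implementation may resolve this differently, for example by fusing attention over several partners rather than one.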

      Published In

      ICMI '24: Proceedings of the 26th International Conference on Multimodal Interaction
      November 2024
      725 pages
      ISBN: 9798400704628
      DOI: 10.1145/3678957

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Adults with autism spectrum disorder
      2. engagement estimation
      3. neurodiverse group communication
      4. participation role

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Funding Sources

      • JST Moonshot R&D
      • JSPS

      Conference

      ICMI '24: International Conference on Multimodal Interaction
      November 4 - 8, 2024
      San Jose, Costa Rica

      Acceptance Rates

      Overall Acceptance Rate 453 of 1,080 submissions, 42%
