AI Agents vs. Agentic AI

This document reviews the distinctions between AI Agents and Agentic AI, providing a conceptual taxonomy, application mapping, and analysis of challenges. It outlines the evolution from modular AI Agents, which are task-specific and limited in autonomy, to Agentic AI systems that enable multi-agent collaboration and dynamic task management. The paper aims to clarify these paradigms to inform the design and deployment of future intelligent systems.

AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges

Ranjan Sapkota∗‡, Konstantinos I. Roumeliotis†, Manoj Karkee∗‡
∗Cornell University, Department of Biological and Environmental Engineering, USA
†Department of Informatics and Telecommunications, University of the Peloponnese, 22131 Tripoli, Greece
‡Corresponding authors: rs2672@cornell.edu, mk2684@cornell.edu

arXiv:2505.10468v2 [cs.AI] 16 May 2025

Abstract—This review critically distinguishes between AI Agents and Agentic AI, offering a structured conceptual taxonomy, application mapping, and challenge analysis to clarify their divergent design philosophies and capabilities. We begin by outlining the search strategy and foundational definitions, characterizing AI Agents as modular systems driven by LLMs and LIMs for narrow, task-specific automation. Generative AI is positioned as a precursor, with AI agents advancing through tool integration, prompt engineering, and reasoning enhancements. In contrast, agentic AI systems represent a paradigmatic shift marked by multi-agent collaboration, dynamic task decomposition, persistent memory, and orchestrated autonomy. Through a sequential evaluation of architectural evolution, operational mechanisms, interaction styles, and autonomy levels, we present a comparative analysis across both paradigms. Application domains such as customer support, scheduling, and data summarization are contrasted with Agentic AI deployments in research automation, robotic coordination, and medical decision support. We further examine unique challenges in each paradigm, including hallucination, brittleness, emergent behavior, and coordination failure, and propose targeted solutions such as ReAct loops, RAG, orchestration layers, and causal modeling. This work aims to provide a definitive roadmap for developing robust, scalable, and explainable AI-driven systems.

Index Terms—AI Agents, Agentic AI, Autonomy, Reasoning, Context Awareness, Multi-Agent Systems, Conceptual Taxonomy, Vision-Language Model

Fig. 1: Global Google search trends showing rising interest in “AI Agents” and “Agentic AI” since November 2022 (ChatGPT Era).

I. INTRODUCTION

Prior to the widespread adoption of AI agents and agentic AI around 2022 (the pre-ChatGPT era), the development of autonomous and intelligent agents was deeply rooted in foundational paradigms of artificial intelligence, particularly multi-agent systems (MAS) and expert systems, which emphasized social action and distributed intelligence [1], [2]. Notably, Castelfranchi [3] laid critical groundwork by introducing ontological categories for social action, structure, and mind, arguing that sociality emerges from individual agents’ actions and cognitive processes in a shared environment, with concepts like goal delegation and adoption forming the basis for cooperation and organizational behavior. Similarly, Ferber [4] provided a comprehensive framework for MAS, defining agents as entities with autonomy, perception, and communication capabilities, and highlighting their applications in distributed problem-solving, collective robotics, and synthetic world simulations. These early works established that individual social actions and cognitive architectures are fundamental to modeling collective phenomena, setting the stage for modern AI agents. This paper builds on these insights to explore how social action modeling, as proposed in [3], [4], informs the design of AI agents capable of complex, socially intelligent interactions in dynamic environments.

These systems were designed to perform specific tasks with predefined rules, limited autonomy, and minimal adaptability to dynamic environments. Agent-like systems were primarily reactive or deliberative, relying on symbolic reasoning, rule-based logic, or scripted behaviors rather than the learning-driven, context-aware capabilities of modern AI agents [5], [6]. For instance, expert systems used knowledge bases and inference engines to emulate human decision-making in domains like medical diagnosis (e.g., MYCIN [7]). Reactive agents, such as those in robotics, followed sense-act cycles based on hardcoded rules, as seen in early autonomous vehicles like the Stanford Cart [8]. Multi-agent systems facilitated coordination among distributed entities, exemplified by auction-based resource allocation in supply chain management [9], [10]. Scripted AI in video games, like NPC behaviors in early RPGs, used predefined decision trees [11]. Furthermore, BDI (Belief-Desire-Intention) architectures enabled goal-directed behavior in software agents, such as those in air traffic control simulations [12], [13]. These early systems lacked the generative capacity, self-learning, and environmental adaptability of modern agentic AI, which leverages deep learning, reinforcement learning, and large-scale data [14].

Recent public and academic interest in AI Agents and Agentic AI reflects this broader transition in system capabilities. As illustrated in Figure 1, Google Trends data demonstrates a significant rise in global search interest for both terms following the emergence of large-scale generative models in late 2022. This shift is closely tied to the evolution of agent design from the pre-2022 era, where AI agents operated in constrained, rule-based environments, to the post-ChatGPT period marked by learning-driven, flexible architectures [15]–[17]. These newer systems enable agents to refine their performance over time and interact autonomously with unstructured, dynamic inputs [18]–[20]. For instance, while pre-modern expert systems required manual updates to static knowledge bases, modern agents leverage emergent neural behaviors to generalize across tasks [17]. The rise in trend activity reflects increasing recognition of these differences. Moreover, applications are no longer confined to narrow domains like simulations or logistics, but now extend to open-world settings demanding real-time reasoning and adaptive control. This momentum, as visualized in Figure 1, underscores the significance of recent architectural advances in scaling autonomous agents for real-world deployment.

The release of ChatGPT in November 2022 marked a pivotal inflection point in the development and public perception of artificial intelligence, catalyzing a global surge in adoption, investment, and research activity [21]. In the wake of this breakthrough, the AI landscape underwent a rapid transformation, shifting from the use of standalone LLMs toward more autonomous, task-oriented frameworks [22]. This evolution progressed through two major post-generative phases: AI Agents and Agentic AI. Initially, the widespread success of ChatGPT popularized Generative Agents, which are LLM-based systems designed to produce novel outputs such as text, images, and code from user prompts [23], [24]. These agents were quickly adopted across applications ranging from conversational assistants (e.g., GitHub Copilot [25]) and content-generation platforms (e.g., Jasper [26]) to creative tools (e.g., Midjourney [27]), revolutionizing domains like digital design, marketing, and software prototyping throughout 2023.

Although the term AI agent was first introduced in 1998 [3], it has since evolved significantly with the rise of generative AI. Building upon this generative foundation, a new class of systems—commonly referred to as AI agents—has emerged. These agents enhance LLMs with capabilities for external tool use, function calling, and sequential reasoning, enabling them to retrieve real-time information and execute multi-step workflows autonomously [28], [29]. Frameworks such as AutoGPT [30] and BabyAGI (https://github.com/yoheinakajima/babyagi) exemplified this transition, showcasing how LLMs could be embedded within feedback loops to dynamically plan, act, and adapt in goal-driven environments [31], [32]. By late 2023, the field had advanced further into the realm of Agentic AI: complex, multi-agent systems in which specialized agents collaboratively decompose goals, communicate, and coordinate toward shared objectives. In line with this evolution, Google introduced the Agent-to-Agent (A2A) protocol in 2025 [33], a proposed standard designed to enable seamless interoperability among agents across different frameworks and vendors. The protocol is built around five core principles: embracing agentic capabilities, building on existing standards, securing interactions by default, supporting long-running tasks, and ensuring modality agnosticism. These guidelines aim to lay the groundwork for a responsive, scalable agentic infrastructure.

Architectures such as CrewAI demonstrate how these agentic frameworks can orchestrate decision-making across distributed roles, facilitating intelligent behavior in high-stakes applications including autonomous robotics, logistics management, and adaptive decision-support [34]–[37].

As the field progresses from Generative Agents toward increasingly autonomous systems, it becomes critically important to delineate the technological and conceptual boundaries between AI Agents and Agentic AI. While both paradigms build upon LLMs and extend the capabilities of generative systems, they embody fundamentally different architectures, interaction models, and levels of autonomy. AI Agents are typically designed as single-entity systems that perform goal-directed tasks by invoking external tools, applying sequential reasoning, and integrating real-time information to complete well-defined functions [17], [38]. In contrast, Agentic AI systems are composed of multiple, specialized agents that coordinate, communicate, and dynamically allocate subtasks within a broader workflow [14], [39]. This architectural distinction underpins profound differences in scalability, adaptability, and application scope.

Understanding and formalizing the taxonomy between these two paradigms (AI Agents and Agentic AI) is scientifically significant for several reasons. First, it enables more precise system design by aligning computational frameworks with problem complexity, ensuring that AI Agents are deployed for modular, tool-assisted tasks, while Agentic AI is reserved for orchestrated multi-agent operations. Moreover, it allows for appropriate benchmarking and evaluation: performance metrics, safety protocols, and resource requirements differ markedly between individual-task agents and distributed agent systems. Additionally, a clear taxonomy reduces development inefficiencies by preventing the misapplication of design principles, such as assuming inter-agent collaboration in a system architected for single-agent execution. Without this clarity, practitioners risk both under-engineering complex scenarios that require agentic coordination and over-engineering simple applications that could be solved with a single AI Agent.

The field of artificial intelligence has seen significant advancements, particularly in the development of AI Agents and Agentic AI. These terms, while related, refer to distinct concepts with different capabilities and applications. This article aims to clarify the differences between AI Agents and Agentic AI, providing researchers with a foundational understanding of these technologies. The objective of this study is to formalize the distinctions, establish a shared vocabulary, and provide a structured taxonomy between AI Agents and Agentic AI that informs the next generation of intelligent agent design across academic and industrial domains, as illustrated in Figure 2.

This review provides a comprehensive conceptual and architectural analysis of the progression from traditional AI Agents
to emergent Agentic AI systems. Rather than organizing the study around formal research questions, we adopt a sequential, layered structure that mirrors the historical and technical evolution of these paradigms. Beginning with a detailed description of our search strategy and selection criteria, we first establish the foundational understanding of AI Agents by analyzing their defining attributes, such as autonomy, reactivity, and tool-based execution. We then explore the critical role of foundational models, specifically LLMs and Large Image Models (LIMs), which serve as the core reasoning and perceptual substrates that drive agentic behavior. Subsequent sections examine how generative AI systems have served as precursors to more dynamic, interactive agents, setting the stage for the emergence of Agentic AI. Through this lens, we trace the conceptual leap from isolated, single-agent systems to orchestrated multi-agent architectures, highlighting their structural distinctions, coordination strategies, and collaborative mechanisms. We further map the architectural evolution by dissecting the core system components of both AI Agents and Agentic AI, offering comparative insights into their planning, memory, orchestration, and execution layers. Building upon this foundation, we review application domains spanning customer support, healthcare, research automation, and robotics, categorizing real-world deployments by system capabilities and coordination complexity. We then assess key challenges faced by both paradigms, including hallucination, limited reasoning depth, causality deficits, scalability issues, and governance risks. To address these limitations, we outline emerging solutions such as retrieval-augmented generation, tool-based reasoning, memory architectures, and simulation-based planning. The review culminates in a forward-looking roadmap that envisions the convergence of modular AI Agents and orchestrated Agentic AI in mission-critical domains. Overall, this paper aims to provide researchers with a structured taxonomy and actionable insights to guide the design, deployment, and evaluation of next-generation agentic systems.

Fig. 2: Mind map of Research Questions relevant to AI Agents and Agentic AI. Each color-coded branch represents a key dimension of comparison: Architecture, Mechanisms, Scope/Complexity, Interaction, and Autonomy.

A. Methodology Overview

This review adopts a structured, multi-stage methodology designed to capture the evolution, architecture, application, and limitations of AI Agents and Agentic AI. The process is visually summarized in Figure 3, which delineates the sequential flow of topics explored in this study. The analytical framework was organized to trace the progression from basic agentic constructs rooted in LLMs to advanced multi-agent orchestration systems. Each step of the review was grounded in rigorous literature synthesis across academic sources and AI-powered platforms, enabling a comprehensive understanding of the current landscape and its emerging trajectories.

The review begins by establishing a foundational understanding of AI Agents, examining their core definitions, design principles, and architectural modules as described in the literature. These include components such as perception, reasoning, and action selection, along with early applications like customer service bots and retrieval assistants. This foundational layer serves as the conceptual entry point into the broader agentic paradigm.

Next, we delve into the role of LLMs as core reasoning components, emphasizing how pre-trained language models underpin modern AI Agents. This section details how LLMs, through instruction fine-tuning and reinforcement learning from human feedback (RLHF), enable natural language interaction, planning, and limited decision-making capabilities. We also identify their limitations, such as hallucinations, static knowledge, and a lack of causal reasoning.

Building on these foundations, the review proceeds to the emergence of Agentic AI, which represents a significant conceptual leap. Here, we highlight the transformation from tool-augmented single-agent systems to collaborative, distributed ecosystems of interacting agents. This shift is driven by the need for systems capable of decomposing goals, assigning subtasks, coordinating outputs, and adapting dynamically to changing contexts—capabilities that surpass what isolated AI Agents can offer.

The next section examines the architectural evolution from AI Agents to Agentic AI systems, contrasting simple, modular agent designs with complex orchestration frameworks. We describe enhancements such as persistent memory, meta-agent coordination, multi-agent planning loops (e.g., ReAct and Chain-of-Thought prompting), and semantic communication protocols. Comparative architectural analysis is supported with examples from platforms like AutoGPT, CrewAI, and LangGraph.

Following the architectural exploration, the review presents an in-depth analysis of application domains where AI Agents and Agentic AI are being deployed. This includes six key
Fig. 3: Methodology pipeline from foundational AI agents to Agentic AI systems, applications, limitations, and solution strategies. Stages shown: Hybrid Literature Search; Foundational Understanding of AI Agents; LLMs as Core Reasoning Components; Emergence of Agentic AI; Architectural Evolution: Agents → Agentic AI; Applications of AI Agents & Agentic AI; Challenges & Limitations (Agents + Agentic AI); Potential Solutions: RAG, Causal Models, Planning.

application areas for each paradigm, ranging from knowledge retrieval, email automation, and report summarization for AI Agents, to research assistants, robotic swarms, and strategic business planning for Agentic AI. Use cases are discussed in the context of system complexity, real-time decision-making, and collaborative task execution.

Subsequently, we address the challenges and limitations inherent to both paradigms. For AI Agents, we focus on issues like hallucination, prompt brittleness, limited planning ability, and lack of causal understanding. For Agentic AI, we identify higher-order challenges such as inter-agent misalignment, error propagation, unpredictability of emergent behavior, explainability deficits, and adversarial vulnerabilities. These problems are critically examined with references to recent experimental studies and technical reports.

Finally, the review outlines potential solutions to overcome these challenges, drawing on recent advances in causal modeling, retrieval-augmented generation (RAG), multi-agent memory frameworks, and robust evaluation pipelines. These strategies are discussed not only as technical fixes but as foundational requirements for scaling agentic systems into high-stakes domains such as healthcare, finance, and autonomous robotics.

Taken together, this methodological structure enables a comprehensive and systematic assessment of the state of AI Agents and Agentic AI. By sequencing the analysis across foundational understanding, model integration, architectural growth, applications, and limitations, the study aims to provide both theoretical clarity and practical guidance to researchers and practitioners navigating this rapidly evolving field.

1) Search Strategy: To construct this review, we implemented a hybrid search methodology combining traditional academic repositories and AI-enhanced literature discovery tools. Specifically, twelve platforms were queried: academic databases such as Google Scholar, IEEE Xplore, ACM Digital Library, Scopus, Web of Science, ScienceDirect, and arXiv; and AI-powered interfaces including ChatGPT, Perplexity.ai, DeepSeek, Hugging Face Search, and Grok. Search queries incorporated Boolean combinations of terms such as “AI Agents,” “Agentic AI,” “LLM Agents,” “Tool-augmented LLMs,” and “Multi-Agent AI Systems.”

Targeted queries such as “Agentic AI + Coordination + Planning” and “AI Agents + Tool Usage + Reasoning” were employed to retrieve papers addressing both conceptual underpinnings and system-level implementations. Literature inclusion was based on criteria such as novelty, empirical evaluation, architectural contribution, and citation impact. The rising global interest in these technologies, as illustrated in Figure 1 using Google Trends data, underscores the urgency of synthesizing this emerging knowledge space.

II. FOUNDATIONAL UNDERSTANDING OF AI AGENTS

AI Agents are autonomous software entities engineered for goal-directed task execution within bounded digital environments [14], [40]. These agents are defined by their ability to perceive structured or unstructured inputs [41], reason over contextual information [42], [43], and initiate actions toward achieving specific objectives, often acting as surrogates for human users or subsystems [44]. Unlike conventional automation scripts, which follow deterministic workflows, AI agents demonstrate reactive intelligence and limited adaptability, allowing them to interpret dynamic inputs and reconfigure outputs accordingly [45]. Their adoption has been reported across a range of application domains, including
Fig. 4: Core characteristics of AI Agents (autonomy, task-specificity, and reactivity), illustrated with symbolic representations for agent design and operational behavior.

customer service automation [46], [47], personal productivity assistance [48], internal information retrieval [49], [50], and decision support systems [51], [52]. A noteworthy example of autonomous AI agents is Anthropic’s “Computer Use” project, where Claude was trained to navigate computers to automate repetitive processes, build and test software, and perform open-ended tasks such as research [53].

1) Overview of Core Characteristics of AI Agents: AI Agents are widely conceptualized as instantiated operational embodiments of artificial intelligence designed to interface with users, software ecosystems, or digital infrastructures in pursuit of goal-directed behavior [54]–[56]. These agents distinguish themselves from general-purpose LLMs by exhibiting structured initialization, bounded autonomy, and persistent task orientation. While LLMs primarily function as reactive prompt followers [57], AI Agents operate within explicitly defined scopes, engaging dynamically with inputs and producing actionable outputs in real-time environments [58].

Figure 4 illustrates the three foundational characteristics that recur across architectural taxonomies and empirical deployments of AI Agents. These include autonomy, task-specificity, and reactivity with adaptation. First, autonomy denotes the agent’s ability to act independently post-deployment, minimizing human-in-the-loop dependencies and enabling large-scale, unattended operation [47], [59]. Second, task-specificity encapsulates the design philosophy of AI agents being specialized for narrowly scoped tasks, allowing high-performance optimization within a defined functional domain such as scheduling, querying, or filtering [60], [61]. Third, reactivity refers to an agent’s capacity to respond to changes in its environment, including user commands, software states, or API responses; when extended with adaptation, this includes feedback loops and basic learning heuristics [17], [62].

Together, these three traits provide a foundational profile for understanding and evaluating AI Agents across deployment scenarios. The remainder of this section elaborates on each characteristic, offering theoretical grounding and illustrative examples.

• Autonomy: A central feature of AI Agents is their ability to function with minimal or no human intervention after deployment [59]. Once initialized, these agents are capable of perceiving environmental inputs, reasoning over contextual data, and executing predefined or adaptive actions in real-time [17]. Autonomy enables scalable deployment in applications where persistent oversight is impractical, such as customer support bots or scheduling assistants [47], [63].
• Task-Specificity: AI Agents are purpose-built for narrow, well-defined tasks [60], [61]. They are optimized to execute repeatable operations within a fixed domain, such as email filtering [64], [65], database querying [66], or calendar coordination [39], [67]. This task specialization allows for efficiency, interpretability, and high precision in automation tasks where general-purpose reasoning is unnecessary or inefficient.
• Reactivity and Adaptation: AI Agents often include basic mechanisms for interacting with dynamic inputs, allowing them to respond to real-time stimuli such as user requests, external API calls, or state changes in software environments [17], [62]. Some systems integrate rudimentary learning [68] through feedback loops [69], [70], heuristics [71], or updated context buffers to refine behavior over time, particularly in settings like personalized recommendations or conversation flow management [72]–[74].
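As a concrete illustration of the reactivity-and-adaptation pattern in the last bullet, the following minimal Python sketch pairs a bounded context buffer with a simple frequency heuristic. All class and method names here are our own illustrative assumptions, not drawn from any framework surveyed above.

```python
from collections import deque

class ReactiveAgent:
    """Toy agent: reacts to each incoming event and adapts via feedback.

    A bounded context buffer stands in for short-term memory, and a
    running topic count is the "rudimentary learning" signal that biases
    future behavior. Purely illustrative, not a production design.
    """

    def __init__(self, buffer_size: int = 5):
        self.context = deque(maxlen=buffer_size)  # bounded memory of recent events
        self.topic_counts: dict[str, int] = {}    # feedback signal for adaptation

    def perceive(self, event: str) -> None:
        """Record a real-time stimulus (user request, API callback, state change)."""
        self.context.append(event)
        topic = event.split(":", 1)[0]
        self.topic_counts[topic] = self.topic_counts.get(topic, 0) + 1

    def act(self) -> str:
        """React to the latest event, biased by accumulated feedback."""
        if not self.context:
            return "idle"
        latest = self.context[-1]
        favourite = max(self.topic_counts, key=self.topic_counts.get)
        return f"handling '{latest}' (most frequent topic so far: {favourite})"

agent = ReactiveAgent()
for event in ["billing: refund status?", "billing: update card", "shipping: where is my order?"]:
    agent.perceive(event)
print(agent.act())
```

The `deque(maxlen=...)` models the "updated context buffers" mentioned above: old events are evicted automatically, so the agent reacts to recent state while the topic counts persist as a lightweight adaptation signal.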
These core characteristics collectively enable AI Agents to
serve as modular, lightweight interfaces between pretrained AI
models and domain-specific utility pipelines. Their architec-
tural simplicity and operational efficiency position them as key
enablers of scalable automation across enterprise, consumer,
and industrial settings. Although still limited in reasoning
depth compared to more general AI systems [75], their high
usability and performance within constrained task boundaries
have made them foundational components in contemporary
intelligent system design.
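To make the "modular, lightweight interface" framing concrete, the sketch below shows, under stated assumptions, an AI Agent that routes a user request to one of several narrow tools. The `stub_model` function is a stand-in for an LLM's function-calling step, and the tool names and keyword routing are hypothetical; a real deployment would substitute an actual model call.

```python
from typing import Callable

# Domain-specific utility pipeline: each tool handles one narrow task.
TOOLS: dict[str, Callable[[str], str]] = {
    "calendar": lambda arg: f"meeting booked for {arg}",
    "email":    lambda arg: f"filtered inbox with rule '{arg}'",
    "database": lambda arg: f"query result for '{arg}'",
}

def stub_model(request: str) -> tuple[str, str]:
    """Stand-in for an LLM planner: map a request to (tool, argument)."""
    for name in TOOLS:
        if name in request.lower():
            return name, request
    return "database", request  # fall back to a default tool

def run_agent(request: str) -> str:
    """One bounded, task-specific step: plan with the model, then call a tool."""
    tool_name, arg = stub_model(request)
    return TOOLS[tool_name](arg)

print(run_agent("add this to my calendar: Friday 10am"))
```

The agent itself is deliberately thin: all "intelligence" sits in the pretrained model (here stubbed), while the agent contributes scoping, tool selection, and execution, which is precisely the interface role described above.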
2) Foundational Models: The Role of LLMs and LIMs:
The foundational progress in AI agents has been significantly
accelerated by the development and deployment of LLMs
and LIMs, which serve as the core reasoning and perception
engines in contemporary agent systems. These models enable Fig. 5: An AI agent–enabled drone autonomously inspects
AI agents to interact intelligently with their environments, an orchard, identifying diseased fruits and damaged branches
understand multimodal inputs, and perform complex reasoning using vision models, and triggers real-time alerts for targeted
tasks that go beyond hard-coded automation. horticultural interventions
LLMs such as GPT-4 [76] and PaLM [77] are trained on
massive datasets of text from books, web content, and dialogue
corpora. These models exhibit emergent capabilities in natural
natural language processing and large language models in
language understanding, question answering, summarization,
generating drone action plans from human-issued queries,
dialogue coherence, and even symbolic reasoning [78], [79].
demonstrating how LLMs support naturalistic interaction and
Within AI agent architectures, LLMs serve as the primary
mission planning. Similarly, Natarajan et al. [92] explore deep
decision-making engine, allowing the agent to parse user
learning and reinforcement learning for scene understand-
queries, plan multi-step solutions, and generate naturalistic
ing, spatial mapping, and multi-agent coordination in aerial
responses. For instance, an AI customer support agent powered
robotics. These studies converge on the critical importance
by GPT-4 can interpret customer complaints, query backend
of AI-driven autonomy, perception, and decision-making in
systems via tool integration, and respond in a contextually
advancing drone-based agents.
appropriate and emotionally aware manner [80], [81].
Large Image Models (LIMs) such as CLIP [82] and BLIP- Importantly, LLMs and LIMs are often accessed via infer-
2 [83] extend the agent’s capabilities into the visual domain. ence APIs provided by cloud-based platforms such as OpenAI
Trained on image-text pairs, LIMs enable perception-based https://openai.com/, HuggingFace https://huggingface.co/, and
tasks including image classification, object detection, and Google Gemini https://gemini.google.com/app. These services
vision-language grounding. These capabilities are increasingly abstract away the complexity of model training and fine-
vital for agents operating in domains such as robotics [84], tuning, enabling developers to rapidly build and deploy agents
autonomous vehicles [85], [86], and visual content moderation equipped with state-of-the-art reasoning and perceptual abil-
[87], [88]. ities. This composability accelerates prototyping and allows
For example, as illustrated in Figure 5 in an autonomous agent frameworks like LangChain [93] and AutoGen [94]
drone agent tasked with inspecting orchards, a LIM can to orchestrate LLM and LIM outputs across task workflows.
identify diseased fruits [89] or damaged branches by inter- In short, foundational models give modern AI agents their
preting live aerial imagery and triggering predefined inter- basic understanding of language and visuals. Language models
vention protocols. Upon detection, the system autonomously help them reason with words, and image models help them
triggers predefined intervention protocols, such as notifying understand pictures-working together, they allow AI to make
horticultural staff or marking the location for targeted treat- smart decisions in complex situations.
ment without requiring human intervention [17], [59]. This 3) Generative AI as a Precursor: A consistent theme in the
workflow exemplifies the autonomy and reactivity of AI agents literature is the positioning of generative AI as the foundational
in agricultural environment and recent literature underscores precursor to agentic intelligence. These systems primarily
the growing sophistication of such drone-based AI agents. operate on pretrained LLMs and LIMs, which are optimized
Chitra et al. [90] provide a comprehensive overview of AI to synthesize novel content text, images, audio, or code
algorithms foundational to embodied agents, highlighting the based on input prompts. While highly expressive, generative
integration of computer vision, SLAM, reinforcement learning, models fundamentally exhibit reactive behavior: they produce
and sensor fusion. These components collectively support real- output only when explicitly prompted and do not pursue goals
time perception and adaptive navigation in dynamic envi- autonomously or engage in self-initiated reasoning [95], [96].
ronments. Kourav et al. [91] further emphasize the role of Key Characteristics of Generative AI:
• Reactivity: As non-autonomous systems, generative models are exclusively input-driven [97], [98]. Their operations are triggered by user-specified prompts, and they lack internal states, persistent memory, or goal-following mechanisms [99]–[101].
• Multimodal Capability: Modern generative systems can produce a diverse array of outputs, including coherent narratives, executable code, realistic images, and even speech transcripts. For instance, models like GPT-4 [76], PaLM-E [102], and BLIP-2 [83] exemplify this capacity, enabling language-to-image, image-to-text, and cross-modal synthesis tasks.
• Prompt Dependency and Statelessness: Although generative systems are stateless in that they do not retain context across interactions unless explicitly provided [103], [104], recent advancements like GPT-4.1 support larger context windows (up to 1 million tokens) and are better able to utilize that context thanks to improved long-text comprehension [105]. Their design also lacks intrinsic feedback loops [106], state management [107], [108], and multi-step planning, a requirement for autonomous decision-making and iterative goal refinement [109], [110].

Despite their remarkable generative fidelity, these systems are constrained by their inability to act upon the environment or manipulate digital tools independently. For instance, they cannot search the internet, parse real-time data, or interact with APIs without human-engineered wrappers or scaffolding layers. As such, they fall short of being classified as true AI Agents, whose architectures integrate perception, decision-making, and external tool-use within closed feedback loops.

The limitations of generative AI in handling dynamic tasks, maintaining state continuity, or executing multi-step plans led to the development of tool-augmented systems, commonly referred to as AI Agents [111]. These systems build upon the language processing backbone of LLMs but introduce additional infrastructure such as memory buffers, tool-calling APIs, reasoning chains, and planning routines to bridge the gap between passive response generation and active task completion. This architectural evolution marks a critical shift in AI system design: from content creation to autonomous utility [112], [113]. The trajectory from generative systems to AI agents underscores a progressive layering of functionality that ultimately supports the emergence of agentic behaviors.

A. Language Models as the Engine for AI Agent Progression

The emergence of the AI agent as a transformative paradigm in artificial intelligence is closely tied to the evolution and repurposing of large-scale language models such as GPT-3 [114], Llama [115], T5 [116], Baichuan 2 [117], and GPT3mix [118]. A substantial and growing body of research confirms that the leap from reactive generative models to autonomous, goal-directed agents is driven by the integration of LLMs as core reasoning engines within dynamic agentic systems. These models, originally trained for natural language processing tasks, are increasingly embedded in frameworks that require adaptive planning [119], [120], real-time decision-making [121], [122], and environment-aware behavior [123].

1) LLMs as Core Reasoning Components: LLMs such as GPT-4 [76], PaLM [77], Claude (https://www.anthropic.com/news/claude-3-5-sonnet), and LLaMA [115] are pre-trained on massive text corpora using self-supervised objectives and fine-tuned using techniques such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) [124], [125]. These models encode rich statistical and semantic knowledge, allowing them to perform tasks like inference, summarization, code generation, and dialogue management. In agentic contexts, however, their capabilities are repurposed not merely to generate responses, but to serve as cognitive substrates: interpreting user goals, generating action plans, selecting tools, and managing multi-turn workflows.

Recent work identifies these models as central to the architecture of contemporary agentic systems. For instance, AutoGPT [30] and BabyAGI (https://github.com/yoheinakajima/babyagi) use GPT-4 as both a planner and executor: the model analyzes high-level objectives, decomposes them into actionable subtasks, invokes external APIs as needed, and monitors progress to determine subsequent actions. In such systems, the LLM operates in a loop of prompt processing, state updating, and feedback-based correction, closely emulating autonomous decision-making.

2) Tool-Augmented AI Agents: Enhancing Functionality: To overcome limitations inherent to generative-only systems, such as hallucination, static knowledge cutoffs, and restricted interaction scopes, researchers have proposed the concept of tool-augmented LLM agents [126] such as Easytool [127], Gentopia [128], and ToolFive [129]. These systems integrate external tools, APIs, and computation platforms into the agent's reasoning pipeline, allowing for real-time information access, code execution, and interaction with dynamic data environments.

Tool Invocation. When an agent identifies a need that cannot be addressed through its internal knowledge, such as querying a current stock price, retrieving up-to-date weather information, or executing a script, it generates a structured function call or API request [130], [131]. These calls are typically formatted as JSON, SQL, or Python dictionaries, depending on the target service, and routed through an orchestration layer that executes the task.

Result Integration. Once a response is received from the tool, the output is parsed and reincorporated into the LLM's context window. This enables the agent to synthesize new reasoning paths, update its task status, and decide on the next step. The ReAct framework [132] exemplifies this architecture by combining reasoning (Chain-of-Thought prompting) and action (tool use), with LLMs alternating between internal cognition and external environment interaction. A prominent example of a tool-augmented AI agent is ChatGPT, which, when unable to answer a query directly, autonomously invokes the Web Search API to retrieve more recent and relevant information, performs reasoning over the retrieved content,
and formulates a response based on its understanding [133].

3) Illustrative Examples and Emerging Capabilities: Tool-augmented LLM agents have demonstrated capabilities across a range of applications. In AutoGPT [30], the agent may plan a product market analysis by sequentially querying the web, compiling competitor data, summarizing insights, and generating a report. In a coding context, tools like GPT-Engineer combine LLM-driven design with local code execution environments to iteratively develop software artifacts [134], [135]. In research domains, systems like Paper-QA [136] utilize LLMs to query vectorized academic databases, grounding answers in retrieved scientific literature to ensure factual integrity.

These capabilities have opened pathways for more robust agent behavior, such as long-horizon planning, cross-tool coordination, and adaptive learning loops. Nevertheless, the inclusion of tools also introduces new challenges in orchestration complexity, error propagation, and context window limitations, all active areas of research. The progression toward AI Agents is inseparable from the strategic integration of LLMs as reasoning engines and their augmentation through structured tool use. This synergy transforms static language models into dynamic cognitive entities capable of perceiving, planning, acting, and adapting, setting the stage for multi-agent collaboration, persistent memory, and scalable autonomy. Figure 6 illustrates a representative case: a news query agent that performs real-time web search, summarizes retrieved documents, and generates an articulate, context-aware answer. Such workflows have been demonstrated in implementations using LangChain, AutoGPT, and OpenAI function-calling paradigms.

Fig. 6: Workflow of an AI Agent performing real-time news search, summarization, and answer generation, as commonly described in the literature (e.g., Author, Year).

III. THE EMERGENCE OF AGENTIC AI FROM AI AGENT FOUNDATIONS

While AI Agents represent a significant leap in artificial intelligence capabilities, particularly in automating narrow tasks through tool-augmented reasoning, recent literature identifies notable limitations that constrain their scalability in complex, multi-step, or cooperative scenarios [137]–[139]. These constraints have catalyzed the development of a more advanced paradigm: Agentic AI. This emerging class of systems extends the capabilities of traditional agents by enabling multiple intelligent entities to collaboratively pursue goals through structured communication [140]–[142], shared memory [143], [144], and dynamic role assignment [14].

1) Conceptual Leap: From Isolated Tasks to Coordinated Systems: AI Agents, as explored in prior sections, integrate LLMs with external tools and APIs to execute narrowly scoped operations such as responding to customer queries, performing document retrieval, or managing schedules. However, as use cases increasingly demand context retention, task interdependence, and adaptability across dynamic environments, the single-agent model proves insufficient [145], [146].

Agentic AI systems represent an emergent class of intelligent architectures in which multiple specialized agents collaborate to achieve complex, high-level objectives. As defined in recent frameworks, these systems are composed of modular agents, each tasked with a distinct subcomponent of a broader goal and coordinated through either a centralized orchestrator or a decentralized protocol [16], [141]. This structure signifies a conceptual departure from the atomic, reactive behaviors typically observed in single-agent architectures, toward a form of system-level intelligence characterized by dynamic inter-agent collaboration.

A key enabler of this paradigm is goal decomposition, wherein a user-specified objective is automatically parsed and divided into smaller, manageable tasks by planning agents [39]. These subtasks are then distributed across the agent network. Multi-step reasoning and planning mechanisms facilitate the dynamic sequencing of these subtasks, allowing the system to adapt in real time to environmental shifts or partial task failures. This ensures robust task execution even under uncertainty [14].

Inter-agent communication is mediated through distributed communication channels, such as asynchronous messaging queues, shared memory buffers, or intermediate output exchanges, enabling coordination without necessitating continuous central oversight [14], [147]. Furthermore, reflective reasoning and memory systems allow agents to store context across multiple interactions, evaluate past decisions, and iteratively refine their strategies [148]. Collectively, these capabilities enable Agentic AI systems to exhibit flexible, adaptive, and collaborative intelligence that exceeds the operational limits of individual agents.

A widely accepted conceptual illustration in the literature delineates the distinction between AI Agents and Agentic AI through the analogy of smart home systems. As depicted in Figure 7, the left side represents a traditional AI Agent in the form of a smart thermostat. This standalone agent receives a user-defined temperature setting and autonomously controls the heating or cooling system to maintain the target temperature. While it demonstrates limited autonomy, such as learning
Fig. 7: Comparative illustration of AI Agent vs. Agentic AI, synthesizing conceptual distinctions found in the literature (e.g., Author, Year). Left: A single-task AI Agent. Right: A multi-agent, collaborative Agentic AI system.
user schedules or reducing energy usage during absence, it operates in isolation, executing a singular, well-defined task without engaging in broader environmental coordination or goal inference [17], [59].

In contrast, the right side of Figure 7 illustrates an Agentic AI system embedded in a comprehensive smart home ecosystem. Here, multiple specialized agents interact synergistically to manage diverse aspects such as weather forecasting, daily scheduling, energy pricing optimization, security monitoring, and backup power activation. These agents are not just reactive modules; they communicate dynamically, share memory states, and collaboratively align actions toward a high-level system goal (e.g., optimizing comfort, safety, and energy efficiency in real time). For instance, a weather forecast agent might signal upcoming heatwaves, prompting early pre-cooling via solar energy before peak pricing hours, as coordinated by an energy management agent. Simultaneously, the system might delay high-energy tasks or activate surveillance systems during occupant absence, integrating decisions across domains. This figure embodies the architectural and functional leap from task-specific automation to adaptive, orchestrated intelligence. The AI Agent acts as a deterministic component with limited scope, while Agentic AI reflects distributed intelligence characterized by goal decomposition, inter-agent communication, and contextual adaptation, hallmarks of modern agentic AI frameworks.

2) Key Differentiators between AI Agents and Agentic AI: To systematically capture the evolution from Generative AI to AI Agents and further to Agentic AI, we structure our comparative analysis around a foundational taxonomy where Generative AI serves as the baseline. While AI Agents and Agentic AI represent increasingly autonomous and interactive systems, both paradigms are fundamentally grounded in generative architectures, especially LLMs and LIMs. Consequently, each comparative table in this subsection includes Generative AI as a reference column to highlight how agentic behavior diverges from and builds upon generative foundations.

A set of fundamental distinctions between AI Agents and Agentic AI, particularly in terms of scope, autonomy, architectural composition, coordination strategy, and operational complexity, are synthesized in Table I, derived from close analysis of prominent frameworks such as AutoGen [94] and ChatDev [149]. These comparisons provide a multi-dimensional view of how single-agent systems transition into coordinated, multi-agent ecosystems. Through the lens of generative capabilities, we trace the increasing sophistication in planning, communication, and adaptation that characterizes the shift toward Agentic AI.

TABLE I: Key Differences Between AI Agents and Agentic AI

Feature | AI Agents | Agentic AI
Definition | Autonomous software programs that perform specific tasks. | Systems of multiple AI agents collaborating to achieve complex goals.
Autonomy Level | High autonomy within specific tasks. | Higher autonomy with the ability to manage multi-step, complex tasks.
Task Complexity | Typically handle single, specific tasks. | Handle complex, multi-step tasks requiring coordination.
Collaboration | Operate independently. | Involve multi-agent collaboration and information sharing.
Learning and Adaptation | Learn and adapt within their specific domain. | Learn and adapt across a wider range of tasks and environments.
Applications | Customer service chatbots, virtual assistants, automated workflows. | Supply chain management, business process optimization, virtual project managers.

While Table I delineates the foundational and operational differences between AI Agents and Agentic AI, a more granular taxonomy is required to understand how these paradigms emerge from and relate to broader generative frameworks. Specifically, the conceptual and cognitive progression from static Generative AI systems to tool-augmented AI Agents, and further to collaborative Agentic AI ecosystems, necessitates an integrated comparative framework. This transition is not merely structural but also functional, encompassing how initiation mechanisms, memory use, learning capacities, and orchestration strategies evolve across the agentic spectrum. Moreover, recent studies suggest the emergence of hybrid paradigms such as "Generative Agents," which blend generative modeling with modular task specialization, further complicating the agentic landscape. In order to capture these nuanced relationships, Table II synthesizes the key conceptual and cognitive dimensions across four archetypes: Generative AI, AI Agents, Agentic AI, and inferred Generative Agents. By positioning Generative AI as a baseline technology, this taxonomy highlights the scientific continuum that spans from passive content generation to interactive task execution and finally to autonomous, multi-agent orchestration. This multi-tiered lens is critical for understanding both the current capabilities and future trajectories of agentic intelligence across applied and theoretical domains.

To further operationalize the distinctions outlined in Table I, Tables III and II extend the comparative lens to encompass a broader spectrum of agent paradigms, including AI Agents, Agentic AI, and emerging Generative Agents. Table III presents key architectural and behavioral attributes that highlight how each paradigm differs in terms of primary capabilities, planning scope, interaction style, learning dynamics, and evaluation criteria. AI Agents are optimized for discrete task execution with limited planning horizons and rely on supervised or rule-based learning mechanisms. In contrast, Agentic AI systems extend this capacity through multi-step planning, meta-learning, and inter-agent communication, positioning them for use in complex environments requiring autonomous goal setting and coordination. Generative Agents, as a more recent construct, inherit LLM-centric pretraining capabilities and excel in producing multimodal content creatively, yet they lack the proactive orchestration and state-persistent behaviors seen in Agentic AI systems.

The second comparison (Table IV) provides a process-driven view across three agent categories: Generative AI, AI Agents, and Agentic AI. This framing emphasizes how functional pipelines evolve from prompt-driven single-model inference in Generative AI, to tool-augmented execution in AI Agents, and finally to orchestrated agent networks in Agentic AI. The structure column underscores this progression: from single LLMs to integrated toolchains and ultimately to distributed multi-agent systems. Access to external data, a key operational requirement for real-world utility, also increases in sophistication, from absent or optional in Generative AI to modular and coordinated in Agentic AI. Collectively, these comparative views reinforce that the evolution from generative to agentic paradigms is marked not just by increasing system complexity but also by deeper integration of autonomy, memory, and decision-making across multiple levels of abstraction.

Furthermore, to provide a deeper multi-dimensional understanding of the evolving agentic landscape, Tables V through IX extend the comparative taxonomy to dissect five critical dimensions: core function and goal alignment, architectural composition, operational mechanism, scope and complexity, and interaction-autonomy dynamics. These dimensions serve not only to reinforce the structural differences between Generative AI, AI Agents, and Agentic AI, but also to introduce an emergent category, Generative Agents, representing modular agents designed for embedded subtask-level generation within broader workflows. Table V situates the three paradigms in terms of their overarching goals and functional intent. While Generative AI centers on prompt-driven content generation, AI Agents emphasize tool-based task execution, and Agentic AI systems orchestrate full-fledged workflows. This functional expansion is mirrored architecturally in Table VI, where the system design transitions from single-model reliance (in Generative AI) to multi-agent orchestration and shared memory utilization in Agentic AI. Table VII then outlines how these paradigms differ in their workflow execution pathways, highlighting the rise of inter-agent coordination and hierarchical communication as key drivers of agentic behavior.

Furthermore, Table VIII explores the increasing scope and operational complexity handled by these systems, ranging from isolated content generation to adaptive, multi-agent collaboration in dynamic environments. Finally, Table IX synthesizes the varying degrees of autonomy, interaction style, and decision-making granularity across the paradigms. These tables collectively establish a rigorous framework to classify and analyze agent-based AI systems, laying the groundwork for principled evaluation and future design of autonomous, intelligent, and collaborative agents operating at scale.
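The contrast drawn above, between discrete single-agent task execution and goal-decomposing, role-assigned multi-agent orchestration, can be made concrete with a small sketch. Everything below is illustrative (it is not an API of AutoGen, ChatDev, or any framework cited here); in a real system each role would be backed by an LLM rather than a stub function.

```python
# Hypothetical sketch: a single-task AI Agent vs. a minimal Agentic AI
# orchestrator. Role names, subtask keys, and memory layout are assumptions.

def ai_agent(task, tool):
    """AI Agent: one bounded task, one tool call, no shared state."""
    return f"result({tool(task)})"

def agentic_system(goal, agents, memory):
    """Agentic AI: decompose a goal into subtasks, assign each to a
    role-specialized agent, and accumulate results in shared memory."""
    subtasks = [(role, f"{goal}/{role}") for role in ("retrieve", "summarize", "report")]
    for role, subtask in subtasks:                 # role assignment + sequencing
        memory[subtask] = agents[role](subtask, memory)  # agents see shared context
    return memory[f"{goal}/report"]

# Stub "agents" standing in for LLM-backed workers:
agents = {
    "retrieve":  lambda t, m: f"docs[{t}]",
    "summarize": lambda t, m: f"summary({m[t.replace('summarize', 'retrieve')]})",
    "report":    lambda t, m: f"report({m[t.replace('report', 'summarize')]})",
}
memory = {}
final = agentic_system("market-analysis", agents, memory)
```

The design point is that the orchestrator, not the user, sequences the subtasks, and each worker reads its predecessor's output from shared memory, mirroring the goal-decomposition and shared-context behaviors attributed to Agentic AI above.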
TABLE II: Taxonomy Summary of AI Agent Paradigms: Conceptual and Cognitive Dimensions

Conceptual Dimension | Generative AI | AI Agent | Agentic AI | Generative Agent (Inferred)
Initiation Type | Prompt-triggered by user or input | Prompt- or goal-triggered with tool use | Goal-initiated or orchestrated task | Prompt or system-level trigger
Goal Flexibility | (None) fixed per prompt | (Low) executes specific goal | (High) decomposes and adapts goals | (Low) guided by subtask goal
Temporal Continuity | Stateless, single-session output | Short-term continuity within task | Persistent across workflow stages | Context-limited to subtask
Learning/Adaptation | Static (pretrained) | (Might in future) Tool selection strategies may evolve | (Yes) Learns from outcomes | Typically static; limited adaptation
Memory Use | No memory or short context window | Optional memory or tool cache | Shared episodic/task memory | Subtask-local or contextual memory
Coordination Strategy | None (single-step process) | Isolated task execution | Hierarchical or decentralized coordination | Receives instructions from system
System Role | Content generator | Tool-using task executor | Collaborative workflow orchestrator | Subtask-level modular generator
TABLE III: Key Attributes of AI Agents, Agentic AI, and Generative Agents

Aspect | AI Agent | Agentic AI | Generative Agent
Primary Capability | Task execution | Autonomous goal setting | Content generation
Planning Horizon | Single-step | Multi-step | N/A (content only)
Learning Mechanism | Rule-based or supervised learning | Reinforcement/meta-learning | Large-scale pretraining
Interaction Style | Reactive | Proactive | Creative
Evaluation Focus | Accuracy, latency | Engagement, adaptability | Coherence, diversity

TABLE IV: Comparison of Generative AI, AI Agents, and Agentic AI

Feature | Generative AI | AI Agent | Agentic AI
Core Function | Content generation | Task-specific execution using tools | Complex workflow automation
Mechanism | Prompt → LLM → Output | Prompt → Tool Call → LLM → Output | Goal → Agent Orchestration → Output
Structure | Single model | LLM + tool(s) | Multi-agent system
External Data Access | None (unless added) | Via external APIs | Coordinated multi-agent access
Key Trait | Reactivity | Tool-use | Collaboration

Each of the comparative tables presented from Table V through Table IX offers a layered analytical lens to isolate the distinguishing attributes of Generative AI, AI Agents, and Agentic AI, thereby grounding the conceptual taxonomy in concrete operational and architectural features. Table V, for instance, addresses the most fundamental layer of differentiation: core function and system goal. While Generative AI is narrowly focused on reactive content production conditioned on user prompts, AI Agents are characterized by their ability to perform targeted tasks using external tools. Agentic AI, by contrast, is defined by its ability to pursue high-level goals through the orchestration of multiple subagents, each addressing a component of a broader workflow. This shift from output generation to workflow execution marks a critical inflection point in the evolution of autonomous systems.

In Table VI, the architectural distinctions are made explicit, especially in terms of system composition and control logic. Generative AI relies on a single model with no built-in capability for tool use or delegation, whereas AI Agents combine language models with auxiliary APIs and interface mechanisms to augment functionality. Agentic AI extends this further by introducing multi-agent systems where collaboration, memory persistence, and orchestration protocols are central to the system's operation. This expansion is crucial for enabling intelligent delegation, context preservation, and dynamic role assignment, capabilities absent in both generative and single-agent systems. Likewise, Table VII dives deeper into how these systems function operationally, emphasizing differences in execution logic and information flow. Unlike Generative AI's linear pipeline (prompt → output), AI Agents implement procedural mechanisms to incorporate tool responses mid-process. Agentic AI introduces recursive task reallocation and cross-agent messaging, thus facilitating emergent decision-making that cannot be captured by static LLM outputs alone.

Table VIII further reinforces these distinctions by mapping each system's capacity to handle task diversity, temporal scale, and operational robustness. Here, Agentic AI emerges as uniquely capable of supporting high-complexity goals that demand adaptive, multi-phase reasoning and execution strategies.

Furthermore, Table IX brings into sharp relief the operational and behavioral distinctions across Generative AI, AI Agents, and Agentic AI, with a particular focus on autonomy levels, interaction styles, and inter-agent coordination. Generative AI systems, typified by models such as GPT-3 [114] and DALL·E (https://openai.com/index/dall-e-3/), remain reactive, generating content solely in response to prompts
TABLE V: Comparison by Core Function and Goal

Feature | Generative AI | AI Agent | Agentic AI | Generative Agent (Inferred)
Primary Goal | Create novel content based on prompt | Execute a specific task using external tools | Automate complex workflow or achieve high-level goals | Perform a specific generative sub-task
Core Function | Content generation (text, image, audio, etc.) | Task execution with external interaction | Workflow orchestration and goal achievement | Sub-task content generation within a workflow
TABLE VI: Comparison by Architectural Components

Component | Generative AI | AI Agent | Agentic AI | Generative Agent (Inferred)
Core Engine | LLM / LIM | LLM | Multiple LLMs (potentially diverse) | LLM
Prompts | Yes (input trigger) | Yes (task guidance) | Yes (system goal and agent tasks) | Yes (sub-task guidance)
Tools/APIs | No (inherently) | Yes (essential) | Yes (available to constituent agents) | Potentially (if sub-task requires)
Multiple Agents | No | No | Yes (essential; collaborative) | No (is an individual agent)
Orchestration | No | No | Yes (implicit or explicit) | No (is part of orchestration)
TABLE VII: Comparison by Operational Mechanism

Mechanism | Generative AI | AI Agent | Agentic AI | Generative Agent (Inferred)
Primary Driver | Reactivity to prompt | Tool calling for task execution | Inter-agent communication and collaboration | Reactivity to input or sub-task prompt
Interaction Mode | User → LLM | User → Agent → Tool | User → System → Agents | System/Agent → Agent → Output
Workflow Handling | Single generation step | Single task execution | Multi-step workflow coordination | Single step within workflow
Information Flow | Input → Output | Input → Tool → Output | Input → Agent1 → Agent2 → ... → Output | Input (from system/agent) → Output
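The "Tool Invocation" and "Result Integration" mechanics described in Section II, and summarized in Table VII's Input → Tool → Output row, can be sketched as follows. The tool registry, call format, and city value are hypothetical, though JSON-formatted function calls routed through a dispatch layer do mirror the structured-call pattern the text describes.

```python
import json

# Hypothetical sketch of a structured tool call: the model emits JSON,
# an orchestration layer dispatches it to a registered tool, and the
# observation is folded back into the model's context.

TOOLS = {  # illustrative registry; real agents bind names to live APIs
    "get_weather": lambda city: {"city": city, "temp_c": 21},
}

def dispatch(call_json):
    call = json.loads(call_json)                 # parse the structured call
    return TOOLS[call["name"]](**call["arguments"])

# A model-produced call (hand-written here; normally generated by the LLM):
call = json.dumps({"name": "get_weather", "arguments": {"city": "Tripoli"}})
observation = dispatch(call)                     # orchestration layer executes it
context_update = "Observation: " + json.dumps(observation)  # result integration
```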
TABLE VIII: Comparison by Scope and Complexity

Aspect | Generative AI | AI Agent | Agentic AI | Generative Agent (Inferred)
Task Scope | Single piece of generated content | Single, specific, defined task | Complex, multi-faceted goal or workflow | Specific sub-task (often generative)
Complexity | Low (relative) | Medium (integrates tools) | High (multi-agent coordination) | Low to Medium (one task component)
Example (Video) | Chatbot | Tavily Search Agent | YouTube-to-Blog Conversion System | Title/Description/Conclusion Generator
TABLE IX: Comparison by Interaction and Autonomy

Feature | Generative AI | AI Agent | Agentic AI | Generative Agent (Inferred)
Autonomy Level | Low (requires prompt) | Medium (uses tools autonomously) | High (manages entire process) | Low to Medium (executes sub-task)
External Interaction | None (baseline) | Via specific tools or APIs | Through multiple agents/tools | Possibly via tools (if needed)
Internal Interaction | N/A | N/A | High (inter-agent) | Receives input from system or agent
Decision Making | Pattern selection | Tool usage decisions | Goal decomposition and assignment strategy | Best sub-task generation
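The autonomy gradient summarized in Table IX ultimately rests on the reason–act loop of frameworks like ReAct [132], described in Section II. A compressed sketch follows; the two-step script and the "ACT"/"FINAL" string protocol are assumptions made purely for illustration, with a stub standing in for the LLM.

```python
# Hypothetical ReAct-style loop: a scripted stub plays the LLM, first
# requesting a tool action, then answering once an observation arrives.

def stub_llm(context):
    if "Observation:" not in context:
        return "ACT search"                       # reasoning step picks an action
    return "FINAL " + context.rsplit("Observation: ", 1)[-1]

def react_loop(question, tool, max_steps=5):
    context = question
    for _ in range(max_steps):                    # bounded thought/action cycle
        step = stub_llm(context)
        if step.startswith("FINAL "):
            return step[len("FINAL "):]           # terminate with an answer
        context += "\nObservation: " + tool(question)   # result integration
    return ""

answer = react_loop("latest headline?", lambda q: "stub-result")
```

Even in this toy form, the loop exhibits the "Medium" autonomy of Table IX's AI Agent column: the model decides when to act, but all state lives in a single transient context rather than in shared, persistent memory.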
without maintaining persistent state or engaging in iterative reasoning. In contrast, AI Agents, such as those constructed with LangChain [93] or MetaGPT [150], exhibit a higher degree of autonomy, capable of initiating external tool invocations and adapting behaviors within bounded tasks. However, their autonomy is typically confined to isolated task execution, lacking long-term state continuity or collaborative interaction. Agentic AI systems mark a significant departure from these paradigms by introducing internal orchestration mechanisms and multi-agent collaboration frameworks. For example, platforms like AutoGen [94] and ChatDev [149] exemplify agentic coordination through task decomposition, role assignment, and recursive feedback loops. In AutoGen, one agent might serve as a planner while another retrieves information and a third synthesizes a report, each communicating through shared memory buffers and governed by an orchestrator agent that monitors dependencies and overall task progression. This structured coordination allows for more complex goal pursuit and flexible behavior in dynamic environments. Such architectures fundamentally shift the locus of intelligence from single-model outputs to emergent system-level behavior, wherein agents learn, negotiate, and update decisions based on evolving task states. Thus, the comparative taxonomy not only highlights increasing levels of operational independence but also illustrates how Agentic AI introduces novel paradigms of communication, memory integration, and decentralized control, paving the way for the next generation of autonomous systems with scalable, adaptive intelligence.

A. Architectural Evolution: From AI Agents to Agentic AI Systems

While both AI Agents and Agentic AI systems are grounded in modular design principles, Agentic AI significantly extends the foundational architecture to support more complex, distributed, and adaptive behaviors. As illustrated in Figure 8, the transition begins with the core subsystems Perception, Reasoning, and Action that define traditional AI Agents. Agentic AI enhances this base by integrating advanced components such as Specialized Agents, Advanced Reasoning & Planning, Persistent Memory, and Orchestration. The figure further emphasizes emergent capabilities including Multi-Agent Collaboration, System Coordination, Shared Context, and Task Decomposition, all encapsulated within a dotted boundary that signifies the shift toward reflective, decentralized, and goal-driven system architectures. This progression marks a fundamental inflection point in intelligent agent design. This section synthesizes findings from empirical frameworks such as LangChain [93], AutoGPT [94], and TaskMatrix [151], highlighting this progression in architectural sophistication.

1) Core Architectural Components of AI Agents: Foundational AI Agents are typically composed of four primary subsystems: perception, reasoning, action, and learning. These

• Perception Module: This subsystem ingests input signals from users (e.g., natural language prompts) or external systems (e.g., APIs, file uploads, sensor streams). It is responsible for preprocessing data into a format interpretable by the agent's reasoning module. For example, in LangChain-based agents [93], [153], the perception layer handles prompt templating, contextual wrapping, and retrieval augmentation via document chunking and embedding search.
• Knowledge Representation and Reasoning (KRR) Module: At the core of the agent's intelligence lies the KRR module, which applies symbolic, statistical, or hybrid logic to input data. Techniques include rule-based logic (e.g., if-then decision trees), deterministic workflow engines, and simple planning graphs. Reasoning in agents like AutoGPT [30] is enhanced with function-calling and prompt chaining to simulate thought processes (e.g., "step-by-step" prompts or intermediate tool invocations).
• Action Selection and Execution Module: This module translates inferred decisions into external actions using an action library. These actions may include sending messages, updating databases, querying APIs, or producing structured outputs. Execution is often managed by middleware like LangChain's "agent executor," which links LLM outputs to tool calls and observes responses for subsequent steps [93].
• Basic Learning and Adaptation: Traditional AI Agents feature limited learning mechanisms, such as heuristic parameter adjustment [154], [155] or history-informed context retention. For instance, agents may use simple memory buffers to recall prior user inputs or apply scoring mechanisms to improve tool selection in future iterations.

Customization of these agents typically involves domain-specific prompt engineering, rule injection, or workflow templates, distinguishing them from hard-coded automation scripts by their ability to make context-aware decisions. Systems like ReAct [132] exemplify this architecture, combining reasoning and action in an iterative framework where agents simulate internal dialogue before selecting external actions.

2) Architectural Enhancements in Agentic AI: Agentic AI systems inherit the modularity of AI Agents but extend their architecture to support distributed intelligence, inter-agent communication, and recursive planning. The literature documents a number of critical architectural enhancements that differentiate Agentic AI from its predecessors [156], [157].

• Ensemble of Specialized Agents: Rather than operating as a monolithic unit, Agentic AI systems consist of multiple agents, each assigned a specialized function, e.g., a summarizer, a retriever, a planner. These agents inter-
subsystems form a closed-loop operational cycle, commonly act via communication channels (e.g., message queues,
referred to as “Understand, Think, Act” from a user interface blackboards, or shared memory). For instance MetaGPT
perspective, or “Input, Processing, Action, Learning” in sys- [150] exemplify this approach by modeling agents after
tems design literature [14], [152]. corporate departments (e.g., CEO, CTO, engineer), where
Agentic AI

AI Agents

Multi-Agent
Collaboration Task-Decomposition

System Coordination
Shared Context

Fig. 8: Illustrating architectural evolution from traditional AI Agents to modern Agentic AI systems. It begins with core
modules Perception, Reasoning, and Action and expands into advanced components including Specialized Agents, Advanced
Reasoning & Planning, Persistent Memory, and Orchestration. The diagram further captures emergent properties such as Multi-
Agent Collaboration, System Coordination, Shared Context, and Task Decomposition, all enclosed within a dotted boundary
signifying layered modularity and the transition to distributed, adaptive agentic AI intelligence.

roles are modular, reusable, and role-bound.
• Advanced Reasoning and Planning: Agentic systems embed recursive reasoning capabilities using frameworks such as ReAct [132], Chain-of-Thought (CoT) prompting [158], and Tree of Thoughts [159]. These mechanisms allow agents to break down a complex task into multiple reasoning stages, evaluate intermediate results, and re-plan actions dynamically. This enables the system to respond adaptively to uncertainty or partial failure.
• Persistent Memory Architectures: Unlike traditional agents, Agentic AI incorporates memory subsystems to persist knowledge across task cycles or agent sessions [160], [161]. Memory types include episodic memory (task-specific history) [162], [163], semantic memory (long-term facts or structured data) [164], [165], and vector-based memory for retrieval-augmented generation (RAG) [166], [167]. For example, AutoGen [94] agents maintain scratchpads for intermediate computations, enabling stepwise task progression.
• Orchestration Layers / Meta-Agents: A key innovation in Agentic AI is the introduction of orchestrators: meta-agents that coordinate the lifecycle of subordinate agents, manage dependencies, assign roles, and resolve conflicts. Orchestrators often include task managers, evaluators, or moderators. In ChatDev [149], for example, a virtual CEO meta-agent distributes subtasks to departmental agents and integrates their outputs into a unified strategic response.

These enhancements collectively enable Agentic AI to support scenarios that require sustained context, distributed labor, multi-modal coordination, and strategic adaptation. Use cases range from research assistants that retrieve, summarize, and draft documents in tandem (e.g., AutoGen pipelines [94]) to smart supply chain agents that monitor logistics, vendor performance, and dynamic pricing models in parallel.

The shift from isolated perception–reasoning–action loops to collaborative and reflective multi-agent workflows marks a key inflection point in the architectural design of intelligent systems. This progression positions Agentic AI as the next stage of AI infrastructure, capable not only of executing predefined workflows but also of constructing, revising, and managing complex objectives across agents with minimal human supervision.

IV. APPLICATION OF AI AGENTS AND AGENTIC AI

To illustrate the real-world utility and operational divergence between AI Agents and Agentic AI systems, this study
Fig. 9: Categorized applications of AI Agents and Agentic AI across eight core functional domains.

synthesizes a range of applications drawn from recent literature, as visualized in Figure 9. We systematically categorize and analyze application domains across two parallel tracks: conventional AI Agent systems and their more advanced Agentic AI counterparts. For AI Agents, four primary use cases are reviewed: (1) Customer Support Automation and Internal Enterprise Search, where single-agent models handle structured queries and response generation; (2) Email Filtering and Prioritization, where agents assist users in managing high-volume communication through classification heuristics; (3) Personalized Content Recommendation and Basic Data Reporting, where user behavior is analyzed for automated insights; and (4) Autonomous Scheduling Assistants, which interpret calendars and book tasks with minimal user input. In contrast, Agentic AI applications encompass broader and more dynamic capabilities, reviewed through four additional categories: (1) Multi-Agent Research Assistants that retrieve, synthesize, and draft scientific content collaboratively; (2) Intelligent Robotics Coordination, including drone and multi-robot systems in fields like agriculture and logistics; (3) Collaborative Medical Decision Support, involving diagnostic, treatment, and monitoring subsystems; and (4) Multi-Agent Game AI and Adaptive Workflow Automation, where decentralized agents interact strategically or handle complex task pipelines.

1) Application of AI Agents:

1) Customer Support Automation and Internal Enterprise Search: AI Agents are widely adopted in enterprise environments for automating customer support and facilitating internal knowledge retrieval. In customer service, these agents leverage retrieval-augmented LLMs interfaced with APIs and organizational knowledge bases to answer user queries, triage tickets, and perform actions like order tracking or return initiation [47]. For internal enterprise search, agents built on vector stores (e.g., Pinecone, Elasticsearch) retrieve semantically relevant documents in response to natural-language queries. Tools such as Salesforce Einstein (https://www.salesforce.com/artificial-intelligence/), Intercom Fin (https://www.intercom.com/fin), and Notion AI (https://www.notion.com/product/ai) demonstrate how structured input processing and summarization capabilities reduce workload and improve enterprise decision-making.

A practical example (Figure 10a) of this dual functionality can be seen in a multinational e-commerce company deploying an AI Agent-based customer support and internal search assistant. For customer support, the AI Agent integrates with the company’s CRM (e.g., Salesforce) and fulfillment APIs to resolve queries such as “Where is my order?” or “How can I return this item?” Within milliseconds, the agent retrieves contextual data from shipping databases and policy repositories, then generates a personalized response using retrieval-augmented generation. For internal enterprise search, employees use the same system to query past meeting notes, sales presentations, or legal documents. When an HR manager types “summarize key benefits policy changes from last year,” the agent queries a Pinecone vector store embedded with enterprise documentation, ranks results by semantic similarity, and returns a concise summary along with source links. These capabilities not only reduce ticket volume and support overhead but also minimize time spent searching for institutional knowledge. The result is a unified, responsive system that enhances both external service delivery and internal operational efficiency using modular AI Agent architectures.
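The retrieve-and-rank step in such an internal search agent can be sketched in a few lines. The snippet below is a toy illustration only: it fakes embeddings with bag-of-words counts over an in-memory list, where a production system would use a learned embedding model and a vector database such as Pinecone; the class, document IDs, and link scheme are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; stands in for a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class EnterpriseSearchAgent:
    """Ranks indexed documents by similarity to a natural-language
    query and returns the top matches with their source links."""
    def __init__(self):
        self.docs = []  # (doc_id, embedding, source_link)

    def index(self, doc_id, text, link):
        self.docs.append((doc_id, embed(text), link))

    def query(self, question, top_k=2):
        scored = [(cosine(embed(question), vec), doc_id, link)
                  for doc_id, vec, link in self.docs]
        scored.sort(reverse=True)
        return [(doc_id, link) for score, doc_id, link in scored[:top_k] if score > 0]

agent = EnterpriseSearchAgent()
agent.index("hr-2024", "key benefits policy changes for employees", "kb://hr/2024")
agent.index("sales-q3", "quarterly sales presentation northeast", "kb://sales/q3")
print(agent.query("summarize key benefits policy changes"))
```

A real deployment would then pass the retrieved passages, not just the links, to the LLM for summarization, which is the retrieval-augmented generation step described above.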
Fig. 10: Applications of AI Agents in enterprise settings: (a) Customer support and internal enterprise search; (b) Email filtering and prioritization; (c) Personalized content recommendation and basic data reporting; and (d) Autonomous scheduling assistants. Each example highlights modular AI Agent integration for automation, intent understanding, and adaptive reasoning across operational workflows and user-facing systems.

2) Email Filtering and Prioritization: Within productivity tools, AI Agents automate email triage through content classification and prioritization. Integrated with systems like Microsoft Outlook and Superhuman, these agents analyze metadata and message semantics to detect urgency, extract tasks, and recommend replies. They apply user-tuned filtering rules, behavioral signals, and intent classification to reduce cognitive overload. Autonomous actions, such as auto-tagging or summarizing threads, enhance efficiency, while embedded feedback loops enable personalization through incremental learning [63].

Figure 10b illustrates a practical implementation of AI Agents in the domain of email filtering and prioritization. In modern workplace environments, users are inundated with high volumes of email, leading to cognitive overload and missed critical communications. AI Agents embedded in platforms like Microsoft Outlook or Superhuman act as intelligent intermediaries that classify, cluster, and triage incoming messages. These agents evaluate metadata (e.g., sender, subject line) and semantic content to detect urgency, extract actionable items, and suggest smart replies. As depicted, the AI agent autonomously categorizes emails into tags such as “Urgent,” “Follow-up,” and “Low Priority,” while also offering context-aware summaries and reply drafts. Through continual feedback loops and usage patterns, the system adapts to user preferences, gradually refining classification thresholds and improving prioritization accuracy. This automation offloads decision fatigue, allowing users to focus on high-value tasks, while maintaining efficient communication management in fast-paced, information-dense environments.
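A minimal sketch of the triage logic described above, assuming a weighted-keyword urgency score over sender metadata and message text, plus a feedback hook for incremental personalization. The cue list, weights, and thresholds are illustrative inventions, not the mechanism any commercial product is known to use.

```python
class EmailTriageAgent:
    """Tags messages by urgency from metadata and content cues, and
    nudges its cue weights when the user corrects a tag."""
    URGENT_CUES = {"urgent": 2.0, "asap": 2.0, "deadline": 1.5, "today": 1.0}

    def __init__(self, threshold=1.5):
        self.weights = dict(self.URGENT_CUES)
        self.threshold = threshold

    def score(self, sender_is_vip: bool, subject: str, body: str) -> float:
        text = f"{subject} {body}".lower()
        s = 1.0 if sender_is_vip else 0.0          # metadata signal
        s += sum(w for cue, w in self.weights.items() if cue in text)
        return s

    def classify(self, sender_is_vip, subject, body):
        s = self.score(sender_is_vip, subject, body)
        if s >= self.threshold:
            return "Urgent"
        return "Follow-up" if s > 0 else "Low Priority"

    def feedback(self, cue, was_urgent):
        # Incremental personalization: strengthen or dampen a single cue.
        self.weights[cue] = self.weights.get(cue, 0.0) + (0.5 if was_urgent else -0.5)

agent = EmailTriageAgent()
print(agent.classify(False, "Deadline today", "Report due"))      # "Urgent"
print(agent.classify(False, "Newsletter", "Monthly digest"))      # "Low Priority"
```

The `feedback` hook is the toy analogue of the incremental learning loop mentioned above: repeated corrections gradually shift which cues cross the urgency threshold for a given user.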
3) Personalized Content Recommendation and Basic Data Reporting: AI Agents support adaptive personalization by analyzing behavioral patterns for news, product, or media recommendations. Platforms like Amazon, YouTube, and Spotify deploy these agents to infer user preferences via collaborative filtering, intent detection, and content ranking. Simultaneously, AI Agents in analytics systems (e.g., Tableau Pulse, Power BI Copilot) enable natural-language data queries and automated report generation by converting prompts to structured database queries and visual summaries, democratizing business intelligence access.

A practical illustration (Figure 10c) of AI Agents in personalized content recommendation and basic data reporting can be found in e-commerce and enterprise analytics systems. Consider an AI agent deployed on a retail platform like Amazon: as users browse, click, and purchase items, the agent continuously monitors interaction patterns such as dwell time, search queries, and purchase sequences. Using collaborative filtering and content-based ranking, the agent infers user intent and dynamically generates personalized product suggestions that evolve over time. For example, after purchasing gardening tools, a user may be recommended compatible soil sensors or relevant books. This level of personalization enhances customer engagement, increases conversion rates, and supports long-term user retention. Simultaneously, within a corporate setting, an AI agent integrated into Power BI Copilot allows non-technical staff to request insights using natural language, for instance, “Compare Q3 and Q4 sales in the Northeast.” The agent translates the prompt into structured SQL queries, extracts patterns from the database, and outputs a concise visual summary or narrative report. This application reduces dependency on data analysts and empowers broader business decision-making through intuitive, language-driven interfaces.
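The prompt-to-query translation can be caricatured with a rule-based stand-in for the LLM step. The sketch below parses a prompt of the form “Compare Q3 and Q4 sales in the Northeast” into SQL and runs it against an in-memory SQLite table; the schema and parsing rules are hypothetical and far simpler than what a production copilot performs.

```python
import re
import sqlite3

def prompt_to_sql(prompt: str) -> str:
    """Toy natural-language-to-SQL translator. A real system would
    delegate this step to an LLM with schema context."""
    quarters = re.findall(r"Q([1-4])", prompt)
    region = re.search(r"in the (\w+)", prompt)
    if not quarters or not region:
        raise ValueError("unsupported prompt")
    qs = ", ".join(quarters)
    return (f"SELECT quarter, SUM(amount) FROM sales "
            f"WHERE region = '{region.group(1)}' AND quarter IN ({qs}) "
            f"GROUP BY quarter ORDER BY quarter")

# Assumed demo schema: sales(quarter, region, amount).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(3, "Northeast", 120.0), (3, "Northeast", 30.0),
                  (4, "Northeast", 200.0), (4, "West", 75.0)])

sql = prompt_to_sql("Compare Q3 and Q4 sales in the Northeast")
print(conn.execute(sql).fetchall())  # [(3, 150.0), (4, 200.0)]
```

The structured rows would then be rendered as a chart or narrative summary, which is the “visual summary or report” stage described above.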
4) Autonomous Scheduling Assistants: AI Agents integrated with calendar systems autonomously manage meeting coordination, rescheduling, and conflict resolution. Tools like x.ai and Reclaim AI interpret vague scheduling commands, access calendar APIs, and identify optimal time slots using learned user preferences. They minimize human input while adapting to dynamic availability constraints. Their ability to interface with enterprise systems and respond to ambiguous instructions highlights the modular autonomy of contemporary scheduling agents.

A practical application of autonomous scheduling agents can be seen in corporate settings, as depicted in Figure 10d, where employees manage multiple overlapping responsibilities across global time zones. Consider an executive assistant AI agent integrated with Google Calendar and Slack that interprets a command like “Find a 45-minute window for a follow-up with the product team next week.” The agent parses the request, checks availability for all participants, accounts for time zone differences, and avoids meeting conflicts or working-hour violations. If it identifies a conflict with a previously scheduled task, it may autonomously propose alternative windows and notify affected attendees via Slack integration. Additionally, the agent learns from historical user preferences, such as avoiding early Friday meetings, and refines its suggestions over time. Tools like Reclaim AI and Clockwise exemplify this capability, offering calendar-aware automation that adapts to evolving workloads. Such assistants reduce coordination overhead, increase scheduling efficiency, and enable smoother team workflows by proactively resolving ambiguity and optimizing calendar utilization.

TABLE X: Representative AI Agents (2023–2025): Applications and Operational Characteristics

Model / Reference | Application Area | Operation as AI Agent
ChatGPT Deep Research Mode, OpenAI (2025) | Research Analysis / Reporting | Synthesizes hundreds of sources into reports; functions as a self-directed research analyst.
Operator, OpenAI (2025) | Web Automation | Navigates websites, fills forms, and completes online tasks autonomously.
Agentspace: Deep Research Agent, Google (2025) | Enterprise Reporting | Generates business intelligence reports using Gemini models.
NotebookLM Plus Agent, Google (2025) | Knowledge Management | Summarizes, organizes, and retrieves data across Google Workspace apps.
Nova Act, Amazon (2025) | Workflow Automation | Automates browser-based tasks such as scheduling, HR requests, and email.
Manus Agent, Monica (2025), https://manus.im/ | Personal Task Automation | Executes trip planning, site building, and product comparisons via browsing.
Harvey, Harvey AI (2025) | Legal Automation | Automates document drafting, legal review, and predictive case analysis.
Otter Meeting Agent, Otter.ai (2025) | Meeting Management | Transcribes meetings and provides highlights, summaries, and action items.
Otter Sales Agent, Otter.ai (2025) | Sales Enablement | Analyzes sales calls, extracts insights, and suggests follow-ups.
ClickUp Brain, ClickUp (2025) | Project Management | Automates task tracking, updates, and project workflows.
Agentforce, Agentforce (2025) | Customer Support | Routes tickets and generates context-aware replies for support teams.
Microsoft Copilot, Microsoft (2024) | Office Productivity | Automates writing, formula generation, and summarization in Microsoft 365.
Project Astra, Google DeepMind (2025) | Multimodal Assistance | Processes text, image, audio, and video for task support and recommendations.
Claude 3.5 Agent (Claude 3.5 Sonnet), Anthropic (2025) | Enterprise Assistance | Uses multimodal input for reasoning, personalization, and enterprise task completion.

2) Applications of Agentic AI:

1) Multi-Agent Research Assistants: Agentic AI systems are increasingly deployed in academic and industrial research pipelines to automate multi-stage knowledge work. Platforms like AutoGen and CrewAI assign specialized roles to multiple agents (retrievers, summarizers, synthesizers, and citation formatters) under a central orchestrator. The orchestrator distributes tasks, manages role dependencies, and integrates outputs into coherent drafts or review summaries. Persistent memory allows
for cross-agent context sharing and refinement over time. These systems are being used for literature reviews, grant preparation, and patent search pipelines, outperforming single-agent systems such as ChatGPT by enabling concurrent sub-task execution and long-context management [94].

For example, a real-world application of agentic AI, as depicted in Figure 11a, is the automated drafting of grant proposals. Consider a university research group preparing a National Science Foundation (NSF) submission. Using an AutoGen-based architecture, distinct agents are assigned: one retrieves prior funded proposals and extracts structural patterns; another scans recent literature to summarize related work; a third agent aligns proposal objectives with NSF solicitation language; and a formatting agent structures the document per compliance guidelines. The orchestrator coordinates these agents, resolving dependencies (e.g., aligning methodology with objectives) and ensuring stylistic consistency across sections. Persistent memory modules store evolving drafts, feedback from collaborators, and funding agency templates, enabling iterative improvement over multiple sessions. Compared to traditional manual processes, this multi-agent system significantly accelerates drafting time, improves narrative cohesion, and ensures regulatory alignment, offering a scalable, adaptive approach to collaborative scientific writing in academia and R&D-intensive industries.

2) Intelligent Robotics Coordination: In robotics and automation, Agentic AI underpins collaborative behavior in multi-robot systems. Each robot operates as a task-specialized agent, such as pickers, transporters, or mappers, while an orchestrator supervises and adapts workflows. These architectures rely on shared spatial memory, real-time sensor fusion, and inter-agent synchronization for coordinated physical actions. Use cases include warehouse automation, drone-based orchard inspection, and robotic harvesting [150]. For instance, agricultural drone swarms may collectively map tree rows, identify diseased fruits, and initiate mechanical interventions. This dynamic allocation enables real-time reconfiguration and autonomy across agents facing uncertain or evolving environments.

For example, in commercial apple orchards (Figure 11b), Agentic AI enables a coordinated multi-robot system to optimize the harvest season. Here, task-specialized robots such as autonomous pickers, fruit classifiers, transport bots, and drone mappers operate as agentic units under a central orchestrator. The mapping drones first survey the orchard and use vision-language models (VLMs) to generate high-resolution yield maps and identify ripe clusters. This spatial data is shared via a centralized memory layer accessible by all agents. Picker robots are assigned to high-density zones, guided by path-planning agents that optimize routes around obstacles and labor zones. Simultaneously, transport agents dynamically shuttle crates between pickers and storage, adjusting tasks in response to picker load levels and terrain changes. All agents communicate asynchronously through a shared protocol, and the orchestrator continuously adjusts task priorities based on weather forecasts or mechanical faults. If one picker fails, nearby units autonomously reallocate workload. This adaptive, memory-driven coordination exemplifies Agentic AI’s potential to reduce labor costs, increase harvest efficiency, and respond to uncertainties in complex agricultural environments, far surpassing the rigid programming of legacy agricultural robots [94], [150].

3) Collaborative Medical Decision Support: In high-stakes clinical environments, Agentic AI enables distributed medical reasoning by assigning tasks such as diagnostics, vital monitoring, and treatment planning to specialized agents. For example, one agent may retrieve patient history, another validates findings against diagnostic guidelines, and a third proposes treatment options. These agents synchronize through shared memory and reasoning chains, ensuring coherent, safe recommendations. Applications include ICU management, radiology triage, and pandemic response. Real-world pilots show improved efficiency and decision accuracy compared to isolated expert systems [92].

For example, in a hospital ICU (Figure 11c), an agentic AI system supports clinicians in managing complex patient cases. A diagnostic agent continuously analyzes vitals and lab data for early detection of sepsis risk. Simultaneously, a history retrieval agent accesses electronic health records (EHRs) to summarize comorbidities and recent procedures. A treatment planning agent cross-references current symptoms with clinical guidelines (e.g., the Surviving Sepsis Campaign), proposing antibiotic regimens or fluid protocols. The orchestrator integrates these insights, ensures consistency, and surfaces conflicts for human review. Feedback from physicians is stored in a persistent memory module, allowing agents to refine their reasoning based on prior interventions and outcomes. This coordinated system enhances clinical workflow by reducing cognitive load, shortening decision times, and minimizing oversight risks. Early deployments in critical care and oncology units have demonstrated increased diagnostic precision and better adherence to evidence-based protocols, offering a scalable solution for safer, real-time collaborative medical support.

4) Multi-Agent Game AI and Adaptive Workflow Automation: In simulation environments and enterprise systems, Agentic AI facilitates decentralized task execution and emergent coordination. Game platforms like AI Dungeon deploy independent NPC agents with goals, memory, and dynamic interactivity to create emergent narratives and social behavior. In enterprise workflows, systems such as MultiOn and Cognosys use agents to manage processes like legal review or incident esca-
Fig. 11: Illustrative Applications of Agentic AI Across Domains: Figure 11 presents four real-world applications of agentic AI
systems. (a) Automated grant writing using multi-agent orchestration for structured literature analysis, compliance alignment,
and document formatting. (b) Coordinated multi-robot harvesting in apple orchards using shared spatial memory and task-
specific agents for mapping, picking, and transport. (c) Clinical decision support in hospital ICUs through synchronized agents
for diagnostics, treatment planning, and EHR analysis, enhancing safety and workflow efficiency. (d) Cybersecurity incident
response in enterprise environments via agents handling threat classification, compliance analysis, and mitigation planning.
In all cases, central orchestrators manage inter-agent communication, shared memory enables context retention, and feedback
mechanisms drive continual learning. These use cases highlight agentic AI’s capacity for scalable, autonomous task coordination
in complex, dynamic environments across science, agriculture, healthcare, and IT security.

lation, where each step is governed by a specialized module. These architectures exhibit resilience, exception handling, and feedback-driven adaptability far beyond rule-based pipelines.

For example, in a modern enterprise IT environment (as depicted in Figure 11d), Agentic AI systems are increasingly deployed to autonomously manage cybersecurity incident response workflows. When a potential threat is detected, such as abnormal access patterns or unauthorized data exfiltration, specialized agents are activated in parallel. One agent performs real-time threat classification using historical breach data and anomaly detection models. A second agent queries relevant log data from network nodes and correlates patterns across systems. A third agent interprets compliance frameworks (e.g., GDPR or HIPAA) to assess the regulatory severity of the event. A fourth agent simulates mitigation strategies and forecasts operational risks. These agents coordinate under a central orchestrator that evaluates collective outputs, integrates temporal reasoning, and issues recommended actions to human analysts. Through shared memory structures and iterative feedback, the system learns from prior incidents, enabling faster and more accurate responses in future cases. Compared to traditional rule-based security systems, this agentic model shortens decision latency, reduces false positives, and supports proactive threat containment in large-scale organizational infrastructures [94].
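The coordination pattern just described, specialist agents reading and writing a shared context under a central orchestrator, can be sketched as follows. The agents, memory keys, and mitigation rules are invented for illustration; real incident-response agents would wrap LLM calls and security tooling rather than hard-coded rules.

```python
class SharedMemory(dict):
    """Blackboard that all specialist agents read from and write to."""

def classify_threat(mem):
    # Toy stand-in for an anomaly-detection model.
    mem["severity"] = "high" if "exfiltration" in mem["event"] else "low"

def check_compliance(mem):
    # Toy stand-in for regulatory interpretation (e.g., GDPR relevance).
    mem["regulations"] = ["GDPR"] if mem["severity"] == "high" else []

def plan_mitigation(mem):
    mem["action"] = ("isolate host and notify DPO"
                     if "GDPR" in mem["regulations"] else "log and monitor")

class Orchestrator:
    """Runs specialist agents over a shared context in dependency
    order and assembles a recommendation for the human analyst."""
    def __init__(self, agents):
        self.agents = agents

    def handle(self, event: str) -> dict:
        mem = SharedMemory(event=event)
        for agent in self.agents:  # classify -> comply -> plan
            agent(mem)
        return dict(mem)

orch = Orchestrator([classify_threat, check_compliance, plan_mitigation])
report = orch.handle("unauthorized data exfiltration from db-7")
print(report["action"])  # "isolate host and notify DPO"
```

The shared-memory dictionary plays the role of the blackboard described in Section III.A; replacing the hard-coded functions with LLM-backed agents changes the implementation of each step but not the orchestration skeleton.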
Voyager Game Learns in Minecraft, in-
V. C HALLENGES AND L IMITATIONS IN AI AGENTS AND Wang et al. (2023) Exploration vents new skills, sets sub-
[168] goals, and adapts strategy
AGENTIC AI in real time.
To systematically understand the operational and theoret- CAMEL Multi-Agent Simulates agent societies
ical limitations of current intelligent systems, we present a Liu et al. (2023) [169] Simulation with communication, ne-
gotiation, and emergent
comparative visual synthesis in Figure 12, which categorizes collaborative behavior.
challenges and potential remedies across both AI Agents and Einstein Copilot Customer Automates full support
Agentic AI paradigms. Figure 12a outlines the four most Salesforce (2024) Ein- Automation workflows, escalates is-
pressing limitations specific to AI Agents namely, lack of stein Copilot sues, and improves via
feedback loops.
causal reasoning, inherited LLM constraints (e.g., hallucina- Copilot Studio Productivity Au- Manages documents,
tions, shallow reasoning), incomplete agentic properties (e.g., (Agentic Mode) tomation meetings, and projects
autonomy, proactivity), and failures in long-horizon planning Microsoft (2025) across Microsoft 365 with
Github Agentic adaptive orchestration.
and recovery. These challenges often arise due to their reliance Copilot
on stateless LLM prompts, limited memory, and heuristic Atera AI Copilot IT Operations Diagnoses/resolves IT is-
reasoning loops. Atera (2025) Atera sues, automates ticketing,
In contrast, Figure 12b identifies eight critical bottlenecks Agentic AI and learns from evolving
infrastructures.
unique to Agentic AI systems, such as inter-agent error cas-
AES Safety Audit Industrial Safety Automates audits,
cades, coordination breakdowns, emergent instability, scala- Agent assesses compliance,
bility limits, and explainability issues. These challenges stem AES (2025) AES and evolves strategies to
from the complexity of orchestrating multiple agents across agentic enhance safety outcomes.
distributed tasks without standardized architectures, robust DeepMind Gato General Robotics Performs varied tasks
(Agentic Mode) across modalities,
communication protocols, or causal alignment frameworks. Reed et al. (2022) dynamically learns,
Figure 13 complements this diagnostic framework by synthesizing ten forward-looking design strategies aimed at mitigating these limitations. These include Retrieval-Augmented Generation (RAG), tool-based reasoning [126], [127], [129], agentic feedback loops (ReAct [132]), role-based multi-agent orchestration, memory architectures, causal modeling, and governance-aware design. Together, these three panels offer a consolidated roadmap for addressing current pitfalls and accelerating the development of safe, scalable, and context-aware autonomous systems.

[Table fragment: GPT-4o + Plugins | OpenAI (2024) | GPT-4o | Agentic | Enterprise Automation | plans and executes; manages complex workflows, integrates external tools, and executes adaptive decisions. [170]]

1) Challenges and Limitations of AI Agents: While AI Agents have garnered considerable attention for their ability to automate structured tasks using LLMs and tool-use interfaces, the literature highlights significant theoretical and practical limitations that inhibit their reliability, generalization, and long-term autonomy [132], [157]. These challenges arise from both the architectural dependence on static, pretrained models and the difficulty of instilling agentic qualities such as causal reasoning, planning, and robust adaptation. The key challenges and limitations of AI Agents (Figure 12a) are summarized in the following five points:

Fig. 12: Illustration of challenges: (a) Key limitations of AI Agents, including causality deficits and shallow reasoning. (b) Amplified coordination and stability challenges in Agentic AI systems.

1) Lack of Causal Understanding: One of the most foundational challenges lies in the agents' inability to reason causally [171], [172]. Current LLMs, which form the cognitive core of most AI Agents, excel at identifying statistical correlations within training data. However, as noted in recent research from DeepMind and conceptual analyses by TrueTheta, they fundamentally lack the capacity for causal modeling: distinguishing between
mere association and cause-effect relationships [173]–[175]. For instance, while an LLM-powered agent might learn that visiting a hospital often co-occurs with illness, it cannot infer whether the illness causes the visit or vice versa, nor can it simulate interventions or hypothetical changes.
This deficit becomes particularly problematic under distributional shifts, where real-world conditions differ from the training regime [176], [177]. Without such grounding, agents remain brittle, failing in novel or high-stakes scenarios. For example, a navigation agent that excels in urban driving may misbehave in snow or construction zones if it lacks an internal causal model of road traction or spatial occlusion.
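The association-versus-intervention gap can be made concrete with a toy structural causal model; the variables, probabilities, and code below are illustrative inventions, not material from the cited studies:

```python
import random

random.seed(0)

def sample(do_visit=None):
    # Structural model: illness causes hospital visits, never the reverse.
    illness = random.random() < 0.10                 # P(illness) = 0.10
    if do_visit is None:                             # observational regime
        visit = random.random() < (0.90 if illness else 0.05)
    else:                                            # intervention: force the visit
        visit = do_visit
    return illness, visit

obs = [sample() for _ in range(100_000)]
p_ill_given_visit = (sum(ill for ill, v in obs if v)
                     / sum(1 for _, v in obs if v))  # conditioning on seeing a visit

itv = [sample(do_visit=True) for _ in range(100_000)]
p_ill_do_visit = sum(ill for ill, _ in itv) / len(itv)  # do(visit = 1)

print(f"P(illness | visit=1)     ~= {p_ill_given_visit:.2f}")  # high: association
print(f"P(illness | do(visit=1)) ~= {p_ill_do_visit:.2f}")     # ~0.10: forcing visits
                                                               # does not cause illness
```

A purely correlational learner sees only the first quantity; an agent without the structural model cannot recover the second.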
2) Inherited Limitations from LLMs: AI Agents, particularly those powered by LLMs, inherit a number of intrinsic limitations that impact their reliability, adaptability, and overall trustworthiness in practical deployments [178]–[180]. One of the most prominent issues is the tendency to produce hallucinations: plausible but factually incorrect outputs. In high-stakes domains such as legal consultation or scientific research, these hallucinations can lead to severe misjudgments and erode user trust [181], [182]. Compounding this is the well-documented prompt sensitivity of LLMs, where even minor variations in phrasing can lead to divergent behaviors. This brittleness hampers reproducibility, necessitating meticulous manual prompt engineering and often requiring domain-specific tuning to maintain consistency across interactions [183].
Furthermore, while recent agent frameworks adopt reasoning heuristics like Chain-of-Thought (CoT) [158], [184] and ReAct [132] to simulate deliberative processes, these approaches remain shallow in semantic comprehension. Agents may still fail at multi-step inference, misalign task objectives, or make logically inconsistent conclusions despite the appearance of structured reasoning [132]. Such shortcomings underscore the absence of genuine understanding and generalizable planning capabilities.
Another key limitation lies in computational cost and latency. Each cycle of agentic decision-making, particularly in planning or tool-calling, may require several LLM invocations. This not only increases runtime latency but also scales resource consumption, creating practical bottlenecks in real-world deployments and cloud-based inference systems. Furthermore, LLMs have a static knowledge cutoff and cannot dynamically integrate new information unless explicitly augmented via retrieval or tool plugins. They also reproduce the biases of their training datasets, which can manifest as culturally insensitive or skewed responses [185], [186]. Without rigorous auditing and mitigation strategies, these issues pose serious ethical and operational risks, particularly when agents are deployed in sensitive or user-facing contexts.
3) Incomplete Agentic Properties: A major limitation of current AI Agents is their inability to fully satisfy the canonical agentic properties defined in foundational literature, such as autonomy, proactivity, reactivity, and social ability [142], [180]. While many systems marketed as "agents" leverage LLMs to perform useful tasks, they often fall short of these fundamental criteria in practice. Autonomy, for instance, is typically partial at best. Although agents can execute tasks with minimal oversight once initialized, they remain heavily reliant on external scaffolding such as human-defined prompts, planning heuristics, or feedback loops to func-
tion effectively [187]. Self-initiated task generation, self-monitoring, or autonomous error correction are rare or absent, limiting their capacity for true independence.
Proactivity is similarly underdeveloped. Most AI Agents require explicit user instruction to act and lack the capacity to formulate or reprioritize goals dynamically based on contextual shifts or evolving objectives [188]. As a result, they behave reactively rather than strategically, constrained by the static nature of their initialization. Reactivity itself is constrained by architectural bottlenecks. Agents do respond to environmental or user input, but response latency caused by repeated LLM inference calls [189], [190], coupled with narrow contextual memory windows [160], [191], inhibits real-time adaptability.
Perhaps the most underexplored capability is social ability. True agentic systems should communicate and coordinate with humans or other agents over extended interactions, resolving ambiguity, negotiating tasks, and adapting to social norms. However, existing implementations exhibit brittle, template-based dialogue that lacks long-term memory integration or nuanced conversational context. Agent-to-agent interaction is often hardcoded or limited to scripted exchanges, hindering collaborative execution and emergent behavior [101], [192]. Collectively, these deficiencies reveal that while AI Agents demonstrate functional intelligence, they remain far from meeting the formal benchmarks of intelligent, interactive, and adaptive agents. Bridging this gap is essential for advancing toward more autonomous, socially capable AI systems.
4) Limited Long-Horizon Planning and Recovery: A persistent limitation of current AI Agents lies in their inability to perform robust long-horizon planning, especially in complex, multi-stage tasks. This constraint stems from their foundational reliance on stateless prompt-response paradigms, where each decision is made without an intrinsic memory of prior reasoning steps unless externally managed. Although augmentations such as the ReAct framework [132] or Tree-of-Thoughts [159] introduce pseudo-recursive reasoning, they remain fundamentally heuristic and lack true internal models of time, causality, or state evolution. Consequently, agents often falter in tasks requiring extended temporal consistency or contingency planning. For example, in domains such as clinical triage or financial portfolio management, where decisions depend on prior context and dynamically unfolding outcomes, agents may exhibit repetitive behaviors such as endlessly querying tools or fail to adapt when sub-tasks fail or return ambiguous results. The absence of systematic recovery mechanisms or error detection leads to brittle workflows and error propagation. This shortfall severely limits agent deployment in mission-critical environments where reliability, fault tolerance, and sequential coherence are essential.
5) Reliability and Safety Concerns: AI Agents are not yet safe or verifiable enough for deployment in critical infrastructure [193]. The absence of causal reasoning leads to unpredictable behavior under distributional shift [172], [194]. Furthermore, evaluating the correctness of an agent's plan, especially when the agent fabricates intermediate steps or rationales, remains an unsolved problem in interpretability [110], [195]. Safety guarantees, such as formal verification, are not yet available for open-ended, LLM-powered agents. While AI Agents represent a major step beyond static generative models, their limitations in causal reasoning, adaptability, robustness, and planning restrict their deployment in high-stakes or dynamic environments. Most current systems rely on heuristic wrappers and brittle prompt engineering rather than grounded agentic cognition. Bridging this gap will require future systems to integrate causal models, dynamic memory, and verifiable reasoning mechanisms. These limitations also set the stage for the emergence of Agentic AI systems, which attempt to address these bottlenecks through multi-agent collaboration, orchestration layers, and persistent system-level context.

2) Challenges and Limitations of Agentic AI: Agentic AI systems represent a paradigm shift from isolated AI agents to collaborative, multi-agent ecosystems capable of decomposing and executing complex goals [14]. These systems typically consist of orchestrated or communicating agents that interact via tools, APIs, and shared environments [18], [39]. While this architectural evolution enables more ambitious automation, it introduces a range of amplified and novel challenges that compound existing limitations of individual LLM-based agents. The current challenges and limitations of Agentic AI are as follows:

1) Amplified Causality Challenges: One of the most critical limitations in Agentic AI systems is the magnification of causality deficits already observed in single-agent architectures. Unlike traditional AI Agents that operate in relatively isolated environments, Agentic AI systems involve complex inter-agent dynamics, where each agent's action can influence the decision space of others. Without a robust capacity for modeling cause-effect relationships, these systems struggle to coordinate effectively and adapt to unforeseen environmental shifts. A key manifestation of this challenge is inter-agent distributional shift, where the behavior of one agent alters the operational context for others. In the absence of causal reasoning, agents are unable to anticipate the downstream impact of their outputs, resulting in coordination breakdowns or redundant computations [196]. Furthermore, these systems are particularly vulnerable to error cascades: a faulty or hallucinated output from one agent can propagate through the system, compounding inaccuracies and corrupting subsequent decisions. For example, if a verification agent erroneously validates false information, downstream agents such as summarizers or decision-makers may unknowingly build upon that misinformation, compromising the integrity of the entire system. This fragility underscores the urgent need for integrating causal inference and intervention modeling into the design of multi-agent workflows, especially in high-stakes or dynamic environments where systemic robustness is essential.
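A minimal numerical sketch of this cascade (the agent model and the 2% corruption rate are invented for illustration) shows how small per-agent error rates compound with pipeline depth:

```python
import random

random.seed(1)

def agent_step(valid, error_rate):
    """One agent stage: pass the claim on, corrupting it with some probability."""
    return valid if random.random() > error_rate else not valid

def pipeline(n_agents, error_rate, trials=20_000):
    """Fraction of runs in which a true claim survives n sequential agents."""
    ok = 0
    for _ in range(trials):
        valid = True
        for _ in range(n_agents):
            valid = agent_step(valid, error_rate)
        ok += valid
    return ok / trials

# A small per-agent error rate compounds as the chain deepens.
for n in (1, 5, 10):
    print(f"{n:2d} agents -> P(claim still correct) = {pipeline(n, 0.02):.3f}")
```

Even with each agent 98% reliable in isolation, the probability that the final output is still correct decays steadily with every additional stage.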
2) Communication and Coordination Bottlenecks: A fundamental challenge in Agentic AI lies in achieving efficient communication and coordination across multiple autonomous agents. Unlike single-agent systems, Agentic AI involves distributed agents that must collectively pursue a shared objective, necessitating precise alignment, synchronized execution, and robust communication protocols. However, current implementations fall short in these aspects. One major issue is goal alignment and shared context, where agents often lack a unified semantic understanding of overarching objectives. This hampers sub-task decomposition, dependency management, and progress monitoring, especially in dynamic environments requiring causal awareness and temporal coherence.
In addition, protocol limitations significantly hinder inter-agent communication. Most systems rely on natural language exchanges over loosely defined interfaces, which are prone to ambiguity, inconsistent formatting, and contextual drift. These communication gaps lead to fragmented strategies, delayed coordination, and degraded system performance. Furthermore, resource contention emerges as a systemic bottleneck when agents simultaneously access shared computational, memory, or API resources. Without centralized orchestration or intelligent scheduling mechanisms, these conflicts can result in race conditions, execution delays, or outright system failures. Collectively, these bottlenecks illustrate the immaturity of current coordination frameworks in Agentic AI, and highlight the pressing need for standardized communication protocols, semantic task planners, and global resource managers to ensure scalable, coherent multi-agent collaboration.
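One commonly proposed remedy for such ambiguity is to constrain inter-agent traffic to a typed message schema that is validated before routing; the sketch below uses field and intent names of our own invention, not an established protocol:

```python
from dataclasses import dataclass, field
import time
import uuid

@dataclass
class AgentMessage:
    sender: str
    recipient: str
    intent: str        # e.g. "request", "inform", "result"
    task_id: str       # ties the message to a tracked sub-task
    payload: dict      # structured content, never free-form prose
    msg_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: float = field(default_factory=time.time)

ALLOWED_INTENTS = {"request", "inform", "result", "error"}

def validate(msg: AgentMessage) -> list[str]:
    """Return protocol violations; an empty list means the message is routable."""
    problems = []
    if msg.intent not in ALLOWED_INTENTS:
        problems.append(f"unknown intent: {msg.intent!r}")
    if not msg.task_id:
        problems.append("missing task_id: cannot track dependencies")
    if not isinstance(msg.payload, dict):
        problems.append("payload must be structured, not free text")
    return problems

msg = AgentMessage(sender="planner", recipient="retriever",
                   intent="request", task_id="T-17",
                   payload={"query": "latest shipment ETA"})
print(validate(msg))   # [] -> routable
```

Rejecting malformed messages at the router is a cheap guard against the formatting drift described above.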
3) Emergent Behavior and Predictability: One of the most critical limitations of Agentic AI lies in managing emergent behavior: complex system-level phenomena that arise from the interactions of autonomous agents. While such emergence can potentially yield adaptive and innovative solutions, it also introduces significant unpredictability and safety risks [152], [197]. A key concern is the generation of unintended outcomes, where agent interactions result in behaviors that were not explicitly programmed or foreseen by system designers. These behaviors may diverge from task objectives, generate misleading outputs, or even enact harmful actions, particularly in high-stakes domains like healthcare, finance, or critical infrastructure. As the number of agents and the complexity of their interactions grow, so too does the likelihood of system instability. This includes phenomena such as infinite planning loops, action deadlocks, and contradictory behaviors emerging from asynchronous or misaligned agent decisions. Without centralized arbitration mechanisms, conflict resolution protocols, or fallback strategies, these instabilities compound over time, making the system fragile and unreliable. The stochasticity and opacity of large language model-based agents further exacerbate this issue, as their internal decision logic is not easily interpretable or verifiable. Consequently, ensuring the predictability and controllability of emergent behavior remains a central challenge in designing safe and scalable Agentic AI systems.
4) Scalability and Debugging Complexity: As Agentic AI systems scale in both the number of agents and the diversity of specialized roles, maintaining system reliability and interpretability becomes increasingly complex [198], [199]. A central limitation stems from the black-box chains of reasoning characteristic of LLM-based agents. Each agent may process inputs through opaque internal logic, invoke external tools, and communicate with other agents, all of which occur through multiple layers of prompt engineering, reasoning heuristics, and dynamic context handling. Tracing the root cause of a failure thus requires unwinding nested sequences of agent interactions, tool invocations, and memory updates, making debugging non-trivial and time-consuming.
Another significant constraint is the system's non-compositionality. Unlike traditional modular systems, where adding components can enhance overall functionality, introducing additional agents in an Agentic AI architecture often increases cognitive load, noise, and coordination overhead. Poorly orchestrated agent networks can result in redundant computation, contradictory decisions, or degraded task performance. Without robust frameworks for agent role definition, communication standards, and hierarchical planning, the scalability of Agentic AI does not necessarily translate into greater intelligence or robustness. These limitations highlight the need for systematic architectural controls and traceability tools to support the development of reliable, large-scale agentic ecosystems.
5) Trust, Explainability, and Verification: Agentic AI systems pose heightened challenges in explainability and verifiability due to their distributed, multi-agent architecture. While interpreting the behavior of a single LLM-powered agent is already non-trivial, this complexity is multiplied when multiple agents interact asynchronously through loosely defined communication protocols. Each agent may possess its own memory, task objective, and reasoning path, resulting in compounded opacity where tracing the causal chain of a final decision or failure becomes exceedingly difficult. The lack of shared, transparent logs or interpretable reasoning paths across agents makes it nearly impossible to determine why a particular sequence of actions occurred or which agent initiated a misstep.
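The traceability gap described here motivates append-only, per-agent decision logs. A minimal sketch (the log format, agent names, and injected fault are hypothetical) shows how such a ledger can answer which agent first produced a suspect value:

```python
import time

AUDIT_LOG = []  # shared, append-only decision ledger

def log_step(agent, action, inputs, output):
    """Record every agent decision so failures can be traced to their origin."""
    AUDIT_LOG.append({"ts": time.time(), "agent": agent,
                      "action": action, "inputs": inputs, "output": output})
    return output

# Simulated three-agent chain with a fault injected at the verifier.
claim = log_step("retriever", "fetch", {"query": "Q4 revenue"}, "revenue = 4.1M")
claim = log_step("verifier", "check", {"claim": claim}, "revenue = 7.9M")  # corrupted
report = log_step("summarizer", "draft", {"claim": claim}, "Q4 revenue was 7.9M")

def first_mention(log, needle):
    """Earliest agent whose recorded output contains the suspect value."""
    for entry in log:
        if needle in str(entry["output"]):
            return entry["agent"]
    return None

print(first_mention(AUDIT_LOG, "7.9M"))  # -> verifier: origin of the misstep
```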
Compounding this opacity is the absence of formal verification tools tailored for Agentic AI. Unlike traditional software systems, where model checking and formal proofs offer bounded guarantees, there exists no widely adopted methodology to verify that a multi-agent LLM system will perform reliably across all input distributions or operational contexts. This lack of verifiability presents a significant barrier to adoption in safety-critical domains such as autonomous vehicles, finance, and healthcare, where explainability and assurance are non-negotiable. To advance Agentic AI safely, future research must address the foundational gaps in causal traceability, agent accountability, and formal safety guarantees.
6) Security and Adversarial Risks: Agentic AI architectures introduce a significantly expanded attack surface compared to single-agent systems, exposing them to complex adversarial threats. One of the most critical vulnerabilities lies in the presence of a single point of compromise. Since Agentic AI systems are composed of interdependent agents communicating over shared memory or messaging protocols, the compromise of even one agent through prompt injection, model poisoning, or adversarial tool manipulation can propagate malicious outputs or corrupted state across the entire system. For example, a fact-checking agent fed with tampered data could unintentionally legitimize false claims, which are then integrated into downstream reasoning by summarization or decision-making agents.
Moreover, inter-agent dynamics themselves are susceptible to exploitation. Attackers can induce race conditions, deadlocks, or resource exhaustion by manipulating the coordination logic between agents. Without rigorous authentication, access control, and sandboxing mechanisms, malicious agents or corrupted tool responses can derail multi-agent workflows or cause erroneous escalation in task pipelines. These risks are exacerbated by the absence of standardized security frameworks for LLM-based multi-agent systems, leaving most current implementations defenseless against sophisticated multi-stage attacks. As Agentic AI moves toward broader adoption, especially in high-stakes environments, embedding secure-by-design principles and adversarial robustness becomes an urgent research imperative.
7) Ethical and Governance Challenges: The distributed and autonomous nature of Agentic AI systems introduces profound ethical and governance concerns, particularly in terms of accountability, fairness, and value alignment. In multi-agent settings, accountability gaps emerge when multiple agents interact to produce an outcome, making it difficult to assign responsibility for errors or unintended consequences. This ambiguity complicates legal liability, regulatory compliance, and user trust, especially in domains such as healthcare, finance, or defense. Furthermore, bias propagation and amplification present a unique challenge: agents individually trained on biased data may reinforce each other's skewed decisions through interaction, leading to systemic inequities that are more pronounced than in isolated models. These emergent biases can be subtle and difficult to detect without longitudinal monitoring or audit mechanisms.
Additionally, misalignment and value drift pose serious risks in long-horizon or dynamic environments. Without a unified framework for shared value encoding, individual agents may interpret overarching objectives differently or optimize for local goals that diverge from human intent. Over time, this misalignment can lead to behavior that is inconsistent with ethical norms or user expectations. Current alignment methods, which are mostly designed for single-agent systems, are inadequate for managing value synchronization across heterogeneous agent collectives. These challenges highlight the urgent need for governance-aware agent architectures, incorporating principles such as role-based isolation, traceable decision logging, and participatory oversight mechanisms to ensure ethical integrity in autonomous multi-agent systems.
8) Immature Foundations and Research Gaps: Despite rapid progress and high-profile demonstrations, Agentic AI remains in a nascent research stage with unresolved foundational issues that limit its scalability, reliability, and theoretical grounding. A central concern is the lack of standard architectures. There is currently no widely accepted blueprint for how to design, monitor, or evaluate multi-agent systems built on LLMs. This architectural fragmentation makes it difficult to compare implementations, replicate experiments, or generalize findings across domains. Key aspects such as agent orchestration, memory structures, and communication protocols are often implemented ad hoc, resulting in brittle systems that lack interoperability and formal guarantees.
Equally critical is the absence of causal foundations, as scalable causal discovery and reasoning remain unsolved challenges [200]. Without the ability to represent and reason about cause-effect relationships, Agentic AI systems are inherently limited in their capacity to generalize safely beyond narrow training regimes [177], [201]. This shortfall affects their robustness under distributional shifts, their capacity for proactive intervention, and their ability to simulate counterfactuals or hypothetical plans, core requirements for intelligent coordination and decision-making.
The gap between functional demos and principled design thus underscores an urgent need for foundational research in multi-agent system theory, causal inference integration, and benchmark development. Only by addressing these deficiencies can the field progress from prototype pipelines to trustworthy, general-purpose agentic frameworks suitable for deployment in high-stakes environments.
Fig. 13: Ten emerging architectural and algorithmic solutions, such as RAG, tool use, memory, orchestration, and reflexive mechanisms, addressing reliability, scalability, and explainability across both paradigms. (The ten panels: Retrieval-Augmented Generation (RAG); Tool-Augmented Reasoning (Function Calling); Agentic Loop: Reasoning, Action, Observation; Memory Architectures (Episodic, Semantic, Vector); Multi-Agent Orchestration with Role Specialization; Programmatic Prompt Engineering Pipelines; Reflexive and Self-Critique Mechanisms; Causal Modeling and Simulation-Based Planning; Monitoring, Auditing, and Explainability Pipelines; Governance-Aware Architectures (Accountability + Role Isolation).)
VI. POTENTIAL SOLUTIONS AND FUTURE ROADMAP

The potential solutions (as illustrated in Figure 13) to these challenges and limitations of AI agents and Agentic AI are summarized in the following points:
1) Retrieval-Augmented Generation (RAG): For AI Agents, Retrieval-Augmented Generation mitigates hallucinations and expands static LLM knowledge by grounding outputs in real-time data [202]. By embedding user queries and retrieving semantically relevant documents from vector databases like FAISS or Pinecone, agents can generate contextually valid responses rooted in external facts. This is particularly effective in domains such as enterprise search and customer support, where accuracy and up-to-date knowledge are essential.
In Agentic AI systems, RAG serves as a shared grounding mechanism across agents. For example, a summarizer agent may rely on the retriever agent to access the latest scientific papers before generating a synthesis. Persistent, queryable memory allows distributed agents to operate on a unified semantic layer, mitigating inconsistencies due to divergent contextual views. When implemented across a multi-agent system, RAG helps maintain shared truth, enhances goal alignment, and reduces inter-agent misinformation propagation.
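A minimal version of this retrieval-grounding step can be sketched without a real vector database, using bag-of-words cosine similarity as a stand-in for FAISS or Pinecone embeddings (the documents and query are invented):

```python
import math
import re
from collections import Counter

DOCS = [  # stand-in for an indexed document store
    "Refund requests must be filed within 30 days of purchase.",
    "Enterprise plans include priority support and SSO.",
    "The API rate limit is 100 requests per minute.",
]

def vec(text):
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    q = vec(query)
    return sorted(DOCS, key=lambda d: cosine(q, vec(d)), reverse=True)[:k]

def grounded_answer(query):
    context = retrieve(query)[0]
    # A real agent would hand this context to the LLM as grounding evidence;
    # here we only show which document the answer would be rooted in.
    return f"[answer grounded in: {context!r}]"

print(grounded_answer("how many API requests per minute are allowed?"))
```

Swapping the bag-of-words vectors for learned embeddings and the list for an approximate-nearest-neighbor index yields the production pattern.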
2) Tool-Augmented Reasoning (Function Calling): AI Agents benefit significantly from function calling, which extends their ability to interact with real-world systems [166], [203]. Agents can query APIs, run local scripts, or access structured databases, thus transforming LLMs from static predictors into interactive problem-solvers [131], [161]. This allows them to dynamically retrieve weather forecasts, schedule appointments, or execute Python-based calculations, all beyond the capabilities of pure language modeling.
For Agentic AI, function calling supports agent-level autonomy and role differentiation. Agents within a team may use APIs to invoke domain-specific actions such as querying clinical databases or generating visual charts based on assigned roles. Function calls become part of an orchestrated pipeline, enabling fluid delegation across agents [204]. This structured interaction reduces ambiguity in task handoff and fosters clearer behavioral boundaries, especially when integrated with validation protocols or observation mechanisms [14], [18].
3) Agentic Loop: Reasoning, Action, Observation: AI Agents often suffer from single-pass inference limitations. The ReAct pattern introduces an iterative loop where agents reason about tasks, act by calling tools or APIs, and then observe results before continuing. This feedback loop allows for more deliberate, context-sensitive behaviors. For example, an agent may verify retrieved data before drafting a summary, thereby reducing hallucination and logical errors. In Agentic AI, this pattern is critical for collaborative coherence. ReAct enables agents to evaluate dependencies dynamically, reasoning over intermediate states, re-invoking tools if needed, and adjusting decisions as the environment evolves. This loop becomes more complex in multi-agent settings where each agent's observation must be reconciled against others' outputs. Shared memory and consistent logging are essential here, ensuring that the reflective capacity of the system is not fragmented across agents [132].
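The reason-act-observe cycle can be reduced to a plain loop around a stubbed tool call, with a verification step before the final answer; the tool, state fields, and outputs below are illustrative stand-ins, not a real framework API:

```python
def reason(state):
    """Decide the next step from current observations (stub for an LLM call)."""
    if "forecast" not in state:
        return ("call_tool", "weather_api")
    if not state.get("verified"):
        return ("verify", None)
    return ("finish", f"Tomorrow: {state['forecast']}")

def act(action, arg, state):
    if action == "call_tool":                 # Act: invoke an external tool (stubbed)
        state["forecast"] = "18C, light rain"
    elif action == "verify":                  # Observe: check the result before answering
        state["verified"] = bool(state.get("forecast"))
    return state

def run_react(max_steps=10):
    state = {}
    for _ in range(max_steps):                # Reason -> Act -> Observe, repeated
        action, arg = reason(state)
        if action == "finish":
            return arg
        state = act(action, arg, state)
    return None

print(run_react())  # -> Tomorrow: 18C, light rain
```

The bounded step count is the loop's recovery valve: it prevents the infinite planning cycles discussed earlier.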
4) Memory Architectures (Episodic, Semantic, Vector): AI Agents face limitations in long-horizon planning and session continuity. Memory architectures address this by persisting information across tasks [205]. Episodic memory allows agents to recall prior actions and feedback, semantic memory encodes structured domain knowledge, and vector memory enables similarity-based retrieval [206]. These elements are key for personalization and adaptive decision-making in repeated interactions.
Agentic AI systems require even more sophisticated memory models due to distributed state management. Each agent may maintain local memory while accessing shared global memory to facilitate coordination. For example, a planner agent might use vector-based memory to recall prior workflows, while a QA agent references semantic memory for fact verification. Synchronizing memory access and updates across agents enhances consistency, enables context-aware communication, and supports long-horizon system-level planning.
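The three memory types can be sketched in a single toy class; the structure and method names are our own, and a real system would back the vector store with learned embeddings rather than bags of words:

```python
import math
import re
from collections import Counter, deque

class AgentMemory:
    """Toy composite memory: episodic log, semantic facts, vector-style recall."""

    def __init__(self, episodic_span=100):
        self.episodic = deque(maxlen=episodic_span)  # recent events, in order
        self.semantic = {}                           # structured key-value facts
        self.vectors = []                            # (bag-of-words, text) pairs

    def remember_event(self, event):                 # episodic memory
        self.episodic.append(event)

    def store_fact(self, key, value):                # semantic memory
        self.semantic[key] = value

    def index(self, text):                           # "vector" store
        self.vectors.append((Counter(re.findall(r"[a-z0-9]+", text.lower())), text))

    def recall(self, query):                         # similarity-based retrieval
        q = Counter(re.findall(r"[a-z0-9]+", query.lower()))
        def sim(bag):
            dot = sum(q[t] * bag[t] for t in q)
            norm = (math.sqrt(sum(v * v for v in q.values()))
                    * math.sqrt(sum(v * v for v in bag.values())))
            return dot / norm if norm else 0.0
        return max(self.vectors, key=lambda pair: sim(pair[0]))[1] if self.vectors else None

mem = AgentMemory()
mem.remember_event("user asked for a shipping quote")   # episodic
mem.store_fact("user.tier", "enterprise")               # semantic
mem.index("workflow: quote -> approval -> invoice")     # vector
mem.index("workflow: refund -> audit -> payout")
print(mem.recall("how do I issue a refund?"))
```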
5) Multi-Agent Orchestration with Role Specialization: In AI Agents, task complexity is often handled via modular prompt templates or conditional logic. However, as task diversity increases, a single agent may become overloaded [207], [208]. Role specialization, splitting tasks into subcomponents (e.g., planner, summarizer), allows lightweight orchestration even within single-agent systems by simulating compartmentalized reasoning. In Agentic AI, orchestration is central. A meta-agent or orchestrator distributes tasks among specialized agents, each with distinct capabilities. Systems like MetaGPT and ChatDev exemplify this: agents emulate roles such as CEO, engineer, or reviewer, and interact through structured messaging. This modular approach enhances interpretability, scalability, and fault isolation, ensuring that failures in one agent do not cascade without containment mechanisms from the orchestrator.
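The orchestration-with-containment idea can be reduced to a dispatch table plus exception isolation; the roles, tasks, and simulated failure below are invented for illustration:

```python
def planner(task):
    """Decompose the task into role-prefixed sub-tasks (stub for a planner agent)."""
    return ["research " + task, "summarize " + task]

def researcher(subtask):
    return f"notes({subtask})"

def summarizer(subtask):
    raise RuntimeError("flaky worker")   # simulated agent failure

ROLES = {"research": researcher, "summarize": summarizer}

def orchestrate(task):
    results, failures = [], []
    for sub in planner(task):
        role = sub.split()[0]
        try:
            results.append(ROLES[role](sub))    # delegate to the specialist
        except Exception as exc:                # containment: one failed agent
            failures.append((sub, str(exc)))    # does not sink the workflow
    return results, failures

results, failures = orchestrate("market trends")
print(results)    # work from healthy agents survives
print(failures)   # the failed sub-task is reported, not propagated
```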
6) Reflexive and Self-Critique Mechanisms: AI Agents often fail silently or propagate errors. Reflexive mechanisms introduce the capacity for self-evaluation [209], [210]. After completing a task, agents can critique their own outputs using a secondary reasoning pass, increasing robustness and reducing error rates. For example, a legal assistant agent might verify that its drafted clause matches prior case law before submission. For Agentic AI, reflexivity extends beyond self-critique to inter-agent evaluation. Agents can review each other's outputs, e.g., a verifier agent auditing a summarizer's work. Reflexion-like mechanisms ensure collaborative quality control and enhance trustworthiness [211]. Such patterns also support iterative improvement and adaptive replanning, particularly when integrated with memory logs or feedback queues [212], [213].
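A draft-critique-revise loop in the spirit of the legal-assistant example can be sketched as follows; the rule-based critic stands in for a second LLM pass, and the clause text is invented:

```python
def draft_clause(request):
    """First pass (stub for an LLM draft); deliberately omits the governing law."""
    return f"The parties agree to {request}."

def critic(text):
    """Secondary reasoning pass: return a list of objections (empty = accept)."""
    issues = []
    if "governing law" not in text.lower():
        issues.append("no governing-law clause")
    return issues

def revise(text, issues):
    if "no governing-law clause" in issues:
        text += " This agreement is subject to the governing law of the agreed jurisdiction."
    return text

def reflexive_draft(request, max_rounds=3):
    text = draft_clause(request)
    for _ in range(max_rounds):          # draft -> critique -> revise until accepted
        issues = critic(text)
        if not issues:
            return text
        text = revise(text, issues)
    return text

print(reflexive_draft("arbitrate disputes within 60 days"))
```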
7) Programmatic Prompt Engineering Pipelines: Manual prompt tuning introduces brittleness and reduces reproducibility in AI Agents. Programmatic pipelines automate this process using task templates, context fillers, and retrieval-augmented variables [214], [215]. These dynamic prompts are structured based on task type, agent role, or user query, improving generalization and reducing failure modes associated with prompt variability. In Agentic AI, prompt pipelines enable scalable, role-consistent communication. Each agent type (e.g., planner, retriever, summarizer) can generate or consume structured prompts tailored to its function. By automating message formatting, dependency tracking, and semantic alignment, programmatic prompting prevents coordination drift and ensures consistent reasoning across diverse agents in real time [14], [166].
8) Causal Modeling and Simulation-Based Planning: AI Agents often operate on statistical correlations rather than causal models, leading to poor generalization under distribution shifts. Embedding causal inference allows agents to distinguish between correlation and causation, simulate interventions, and plan more robustly. For instance, in supply chain scenarios, a causally aware agent can simulate the downstream impact of shipment delays. In Agentic AI, causal reasoning is vital for safe coordination and error recovery. Agents must anticipate how their actions impact others, requiring causal graphs, simulation environments, or Bayesian inference layers. For example, a planning agent may simulate different strategies and communicate likely outcomes to others, fostering strategic alignment and avoiding unintended emergent behaviors.
9) Monitoring, Auditing, and Explainability Pipelines: AI Agents lack transparency, complicating debugging and trust. Logging systems that record prompts, tool calls, memory updates, and outputs enable post-hoc analysis and performance tuning. These records help developers trace faults, refine behavior, and ensure compliance with usage guidelines, especially critical in enterprise or legal domains. For Agentic AI, logging and explainability are exponentially more important. With multiple agents interacting asynchronously, audit trails are essential for identifying which agent caused an error and under what conditions. Explainability pipelines that integrate across agents (e.g., timeline visualizations or dialogue replays) are key to ensuring safety, especially in regulatory or multi-stakeholder environments.
10) Governance-Aware Architectures (Accountability and Role Isolation): AI Agents currently lack built-in safeguards for ethical compliance or error attribution. Governance-aware designs introduce role-based access control, sandboxing, and identity resolution to ensure agents act within scope and their decisions can be audited or revoked. These structures reduce risks in sensitive applications such as healthcare or finance. In Agentic AI, governance must scale across roles, agents, and workflows. Role isolation prevents rogue agents from exceeding authority, while accountability mechanisms assign responsibility for decisions and trace causality across agents. Compliance protocols, ethical alignment checks, and agent authentication ensure safety in collaborative settings, paving the way for trustworthy AI ecosystems.
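Role isolation and accountability can be enforced mechanically with scoped capabilities plus a decision ledger; the permission names and roles below are invented for illustration:

```python
PERMISSIONS = {                      # role -> allowed capabilities
    "retriever":  {"read_docs"},
    "summarizer": {"read_docs", "write_report"},
    "executor":   {"read_docs", "write_report", "call_payment_api"},
}

LEDGER = []                          # accountability: every attempt is recorded

class ScopeViolation(Exception):
    pass

def perform(agent, action):
    allowed = action in PERMISSIONS.get(agent, set())
    LEDGER.append({"agent": agent, "action": action, "allowed": allowed})
    if not allowed:                  # role isolation: refuse out-of-scope calls
        raise ScopeViolation(f"{agent} may not {action}")
    return f"{agent} performed {action}"

print(perform("summarizer", "write_report"))
try:
    perform("summarizer", "call_payment_api")          # rogue attempt
except ScopeViolation as err:
    print("blocked:", err)
print([entry for entry in LEDGER if not entry["allowed"]])  # audit trail of refusals
```

Because refusals are logged alongside successes, the ledger doubles as the traceable decision record that the governance discussion above calls for.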
causality across agents. Compliance protocols, ethical its core.
alignment checks, and agent authentication ensure safety A transformative direction for future AI systems is intro-
in collaborative settings paving the way for trustworthy duced by the Absolute Zero: Reinforced Self-play Reasoning
AI ecosystems. with Zero Data (AZR) framework, which reimagines the learn-
A transformative direction for future AI systems is introduced by the Absolute Zero: Reinforced Self-play Reasoning with Zero Data (AZR) framework, which reimagines the learning paradigm for AI Agents and Agentic AI by removing dependency on external datasets [216]. Traditionally, both AI Agents and Agentic AI architectures have relied on human-annotated data, static knowledge corpora, or preconfigured environments, factors that constrain scalability and adaptability in open-world contexts. AZR addresses this limitation by enabling agents to autonomously generate, validate, and solve their own tasks, using verifiable feedback mechanisms (e.g., code execution) to ground learning. This self-evolving mechanism opens the door to truly autonomous reasoning agents capable of self-directed learning and adaptation in dynamic, data-scarce environments.

In the context of Agentic AI, where multiple specialized agents collaborate within orchestrated workflows, AZR lays the groundwork for agents to not only specialize but also co-evolve. For instance, scientific research pipelines could consist of agents that propose hypotheses, run simulations, validate findings, and revise strategies, entirely through self-play and verifiable reasoning, without continuous human oversight. By integrating the AZR paradigm, such systems can maintain persistent growth, knowledge refinement, and task flexibility across time. Ultimately, AZR highlights a future in which AI agents transition from static, pretrained tools to intelligent, self-improving ecosystems, positioning both AI Agents and Agentic AI at the forefront of next-generation artificial intelligence.

AI Agents are projected to evolve significantly through enhanced modular intelligence focused on five key domains, as depicted in Figure 14: proactive reasoning, tool integration, causal inference, continual learning, and trust-centric operations. The first transformative milestone involves transitioning from reactive to Proactive Intelligence, where agents initiate tasks based on learned patterns, contextual cues, or latent goals rather than awaiting explicit prompts. This advancement depends heavily on robust Tool Integration, enabling agents to dynamically interact with external systems, such as databases, APIs, or simulation environments, to fulfill complex user tasks. Equally critical is the development of Causal Reasoning, which will allow agents to move beyond statistical correlation, supporting inference of cause-effect relationships essential for tasks involving diagnosis, planning, or prediction. To maintain relevance over time, agents must adopt frameworks for Continuous Learning, incorporating feedback loops and episodic memory to adapt their behavior across sessions and environments. Lastly, to build user confidence, agents must prioritize Trust & Safety mechanisms through verifiable output logging, bias detection, and ethical guardrails, especially as their autonomy increases. Together, these pathways will redefine AI Agents from static tools into adaptive cognitive systems capable of autonomous yet controllable operation in dynamic digital environments.
Agentic AI, as a natural extension of these foundations, emphasizes collaborative intelligence through multi-agent coordination, contextual persistence, and domain-specific orchestration. Future systems (Figure 14, right side) will exhibit Multi-Agent Scaling, enabling specialized agents to work in parallel under distributed control for complex problem-solving, mirroring team-based human workflows. This necessitates a layer of Unified Orchestration, where meta-agents or orchestrators dynamically assign roles, monitor task dependencies, and mediate conflicts among subordinate agents. Sustained performance over time depends on Persistent Memory architectures, which preserve semantic, episodic, and shared knowledge so that agents can coordinate longitudinal tasks and retain state awareness. Simulation Planning is expected to become a core feature, allowing agent collectives to test hypothetical strategies, forecast consequences, and optimize outcomes before real-world execution. Moreover, Ethical Governance frameworks will be essential to ensure responsible deployment, defining accountability, oversight, and value alignment across autonomous agent networks. Finally, tailored Domain-Specific Systems will emerge in fields like law, medicine, and supply chains, leveraging contextual specialization to outperform generic agents. This future positions Agentic AI not merely as a coordination layer on top of AI Agents, but as a new paradigm for collective machine intelligence with adaptive planning, recursive reasoning, and collaborative cognition at its core.

VII. CONCLUSION

In this study, we presented a comprehensive literature-based evaluation of the evolving landscape of AI Agents and Agentic AI, offering a structured taxonomy that highlights foundational concepts, architectural evolution, application domains, and key limitations. Beginning with a foundational understanding, we characterized AI Agents as modular, task-specific entities with constrained autonomy and reactivity. Their operational scope is grounded in the integration of LLMs and LIMs, which serve as core reasoning modules for perception, language understanding, and decision-making. We identified generative AI as a functional precursor, emphasizing its limitations in autonomy and goal persistence, and examined how LLMs drive the progression from passive generation to interactive task completion through tool augmentation.

This study then explored the conceptual emergence of Agentic AI systems as a transformative evolution from isolated agents to orchestrated, multi-agent ecosystems. We analyzed key differentiators such as distributed cognition, persistent memory, and coordinated planning that distinguish Agentic AI from conventional agent models. This was followed by a detailed breakdown of architectural evolution, highlighting the transition from monolithic, rule-based frameworks to modular, role-specialized networks facilitated by orchestration layers and reflective memory architectures.
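The orchestration-layer idea recurring above, a meta-level component that assigns tasks to role-specialized agents and mediates their dependencies, can be sketched minimally as follows. The worker functions, task table, and dependency scheme are hypothetical stand-ins for LLM-backed agents, not an existing framework API.

```python
# Toy sketch of a unified orchestration layer: a meta-level loop assigns
# each task to a role-specialized worker and feeds dependent tasks the
# output of their prerequisites. Workers here are plain functions
# standing in for LLM-backed agents.

WORKERS = {
    "summarizer": lambda text: f"summary({text})",
    "verifier": lambda text: f"verified({text})",
}

# Each task: (task_name, role, prerequisite_task_or_None)
TASKS = [
    ("summarize", "summarizer", None),
    ("verify", "verifier", "summarize"),  # verifier audits the summary
]

def orchestrate(document: str) -> dict:
    results: dict = {}
    for name, role, dep in TASKS:
        payload = results[dep] if dep else document  # mediate dependencies
        results[name] = WORKERS[role](payload)       # dispatch by role
    return results

out = orchestrate("Q3 report")
print(out["verify"])  # the verified summary of the input document
```

Real orchestrators additionally handle parallel dispatch, conflict mediation, and shared persistent memory, but the assign-by-role, resolve-dependencies loop is the structural core.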
[Figure 14: mindmap with two branches. AI Agents branch: Proactive Intelligence, Tool Integration, Causal Reasoning, Continuous Learning, Trust & Safety. Agentic AI branch: Multi-Agent Scaling, Unified Orchestration, Persistent Memory, Simulation Planning, Ethical Governance, Domain-Specific Systems.]

Fig. 14: Mindmap visualization of the future roadmap for AI Agents and Agentic AI.

This study then surveyed application domains in which these paradigms are deployed. For AI Agents, we illustrated their role in automating customer support, internal enterprise search, email prioritization, and scheduling. For Agentic AI, we demonstrated use cases in collaborative research, robotics, medical decision support, and adaptive workflow automation, supported by practical examples and industry-grade systems. Finally, this study provided a deep analysis of the challenges and limitations affecting both paradigms. For AI Agents, we discussed hallucinations, shallow reasoning, and planning constraints, while for Agentic AI, we addressed amplified causality issues, coordination bottlenecks, emergent behavior, and governance concerns. These insights offer a roadmap for the future development and deployment of trustworthy, scalable agentic systems.

ACKNOWLEDGEMENT

This work was supported by the National Science Foundation and the United States Department of Agriculture, National Institute of Food and Agriculture, through the “Artificial Intelligence (AI) Institute for Agriculture” Program under Awards AWD003473 and AWD004595, Accession Number 1029004, “Robotic Blossom Thinning with Soft Manipulators”.

DECLARATIONS

The authors declare no conflicts of interest.

STATEMENT ON AI WRITING ASSISTANCE

ChatGPT and Perplexity were utilized to enhance grammatical accuracy and refine sentence structure; all AI-generated revisions were thoroughly reviewed and edited for relevance. Additionally, ChatGPT-4o was employed to generate realistic visualizations.

REFERENCES

[1] E. Oliveira, K. Fischer, and O. Stepankova, “Multi-agent systems: which research for which applications,” Robotics and Autonomous Systems, vol. 27, no. 1-2, pp. 91–106, 1999.
[2] Z. Ren and C. J. Anumba, “Multi-agent systems in construction–state of the art and prospects,” Automation in Construction, vol. 13, no. 3, pp. 421–434, 2004.
[3] C. Castelfranchi, “Modelling social action for ai agents,” Artificial intelligence, vol. 103, no. 1-2, pp. 157–182, 1998.
[4] J. Ferber and G. Weiss, Multi-agent systems: an introduction to distributed artificial intelligence, vol. 1. Addison-wesley Reading, 1999.
[5] R. Calegari, G. Ciatto, V. Mascardi, and A. Omicini, “Logic-based technologies for multi-agent systems: a systematic literature review,” Autonomous Agents and Multi-Agent Systems, vol. 35, no. 1, p. 1, 2021.
[6] R. C. Cardoso and A. Ferrando, “A review of agent-based programming for multi-agent systems,” Computers, vol. 10, no. 2, p. 16, 2021.
[7] E. Shortliffe, Computer-based medical consultations: MYCIN, vol. 2. Elsevier, 2012.
[8] H. P. Moravec, “The stanford cart and the cmu rover,” Proceedings of the IEEE, vol. 71, no. 7, pp. 872–884, 1983.
[9] B. Dai and H. Chen, “A multi-agent and auction-based framework and approach for carrier collaboration,” Logistics Research, vol. 3, pp. 101–120, 2011.
[10] J. Grosset, A.-J. Fougères, M. Djoko-Kouam, and J.-M. Bonnin, “Multi-agent simulation of autonomous industrial vehicle fleets: Towards dynamic task allocation in v2x cooperation mode,” Integrated Computer-Aided Engineering, vol. 31, no. 3, pp. 249–266, 2024.
[11] R. A. Agis, S. Gottifredi, and A. J. García, “An event-driven behavior trees extension to facilitate non-player multi-agent coordination in video games,” Expert Systems with Applications, vol. 155, p. 113457, 2020.
[12] A. Guerra-Hernández, A. El Fallah-Seghrouchni, and H. Soldano, “Learning in bdi multi-agent systems,” in International Workshop on Computational Logic in Multi-Agent Systems, pp. 218–233, Springer, 2004.
[13] A. Saadi, R. Maamri, and Z. Sahnoun, “Behavioral flexibility in belief-desire-intention (bdi) architectures,” Multiagent and grid systems, vol. 16, no. 4, pp. 343–377, 2020.
[14] D. B. Acharya, K. Kuppan, and B. Divya, “Agentic ai: Autonomous intelligence for complex goals–a comprehensive survey,” IEEE Access, 2025.
[15] M. Z. Pan, M. Cemri, L. A. Agrawal, S. Yang, B. Chopra, R. Tiwari, K. Keutzer, A. Parameswaran, K. Ramchandran, D. Klein, et al., “Why do multiagent systems fail?,” in ICLR 2025 Workshop on Building Trust in Language Models and Applications, 2025.
[16] L. Hughes, Y. K. Dwivedi, T. Malik, M. Shawosh, M. A. Albashrawi, I. Jeon, V. Dutot, M. Appanderanda, T. Crick, R. De’, et al., “Ai agents and agentic systems: A multi-expert analysis,” Journal of Computer Information Systems, pp. 1–29, 2025.
[17] Z. Deng, Y. Guo, C. Han, W. Ma, J. Xiong, S. Wen, and Y. Xiang, “Ai agents under threat: A survey of key security challenges and future pathways,” ACM Computing Surveys, vol. 57, no. 7, pp. 1–36, 2025.
[18] M. Gridach, J. Nanavati, K. Z. E. Abidine, L. Mendes, and C. Mack, “Agentic ai for scientific discovery: A survey of progress, challenges, and future directions,” arXiv preprint arXiv:2503.08979, 2025.
[19] T. Song, M. Luo, X. Zhang, L. Chen, Y. Huang, J. Cao, Q. Zhu, D. Liu, B. Zhang, G. Zou, et al., “A multiagent-driven robotic ai chemist enabling autonomous chemical research on demand,” Journal of the American Chemical Society, vol. 147, no. 15, pp. 12534–12545, 2025.
[20] M. M. Karim, D. H. Van, S. Khan, Q. Qu, and Y. Kholodov, “Ai agents meet blockchain: A survey on secure and scalable collaboration for multi-agents,” Future Internet, vol. 17, no. 2, p. 57, 2025.
[21] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, et al., “Improving language understanding by generative pre-training,” arxiv, 2018.
[22] J. Sánchez Cuadrado, S. Pérez-Soler, E. Guerra, and J. De Lara, “Automating the development of task-oriented llm-based chatbots,” in Proceedings of the 6th ACM Conference on Conversational User Interfaces, pp. 1–10, 2024.
[23] Y. Lu, A. Aleta, C. Du, L. Shi, and Y. Moreno, “Llms and generative agent-based models for complex systems research,” Physics of Life Reviews, 2024.
[24] A. Zhang, Y. Chen, L. Sheng, X. Wang, and T.-S. Chua, “On generative agents in recommendation,” in Proceedings of the 47th international ACM SIGIR conference on research and development in Information Retrieval, pp. 1807–1817, 2024.
[25] S. Peng, E. Kalliamvakou, P. Cihon, and M. Demirer, “The impact of ai on developer productivity: Evidence from github copilot,” arXiv preprint arXiv:2302.06590, 2023.
[26] J. Li, V. Lavrukhin, B. Ginsburg, R. Leary, O. Kuchaiev, J. M. Cohen, H. Nguyen, and R. T. Gadde, “Jasper: An end-to-end convolutional neural acoustic model,” arXiv preprint arXiv:1904.03288, 2019.
[27] A. Jaruga-Rozdolska, “Artificial intelligence as part of future practices in the architect’s work: Midjourney generative tool as part of a process of creating an architectural form,” Architectus, no. 3 (71), pp. 95–104, 2022.
[28] K. Basu, “Bridging knowledge gaps in llms via function calls,” in Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 5556–5557, 2024.
[29] Z. Liu, T. Hoang, J. Zhang, M. Zhu, T. Lan, J. Tan, W. Yao, Z. Liu, Y. Feng, R. RN, et al., “Apigen: Automated pipeline for generating verifiable and diverse function-calling datasets,” Advances in Neural Information Processing Systems, vol. 37, pp. 54463–54482, 2024.
[30] H. Yang, S. Yue, and Y. He, “Auto-gpt for online decision making: Benchmarks and additional opinions,” arXiv preprint arXiv:2306.02224, 2023.
[31] I. Hettiarachchi, “Exploring generative ai agents: Architecture, applications, and challenges,” Journal of Artificial Intelligence General science (JAIGS) ISSN: 3006-4023, vol. 8, no. 1, pp. 105–127, 2025.
[32] A. Das, S.-C. Chen, M.-L. Shyu, and S. Sadiq, “Enabling synergistic knowledge sharing and reasoning in large language models with collaborative multi-agents,” in 2023 IEEE 9th International Conference on Collaboration and Internet Computing (CIC), pp. 92–98, IEEE, 2023.
[33] R. Surapaneni, J. Miku, M. Vakoc, and T. Segal, “Announcing the agent2agent protocol (a2a) - google developers blog,” 4 2025.
[34] Z. Duan and J. Wang, “Exploration of llm multi-agent application implementation based on langgraph+ crewai,” arXiv preprint arXiv:2411.18241, 2024.
[35] R. Sapkota, Y. Cao, K. I. Roumeliotis, and M. Karkee, “Vision-language-action models: Concepts, progress, applications and challenges,” arXiv preprint arXiv:2505.04769, 2025.
[36] R. Sapkota, K. I. Roumeliotis, R. H. Cheppally, M. F. Calero, and M. Karkee, “A review of 3d object detection with vision-language models,” arXiv preprint arXiv:2504.18738, 2025.
[37] R. Sapkota and M. Karkee, “Object detection with multimodal large vision-language models: An in-depth review,” Available at SSRN 5233953, 2025.
[38] B. Memarian and T. Doleck, “Human-in-the-loop in artificial intelligence in education: A review and entity-relationship (er) analysis,” Computers in Human Behavior: Artificial Humans, vol. 2, no. 1, p. 100053, 2024.
[39] P. Bornet, J. Wirtz, T. H. Davenport, D. De Cremer, B. Evergreen, P. Fersht, R. Gohel, S. Khiyara, P. Sund, and N. Mullakara, Agentic Artificial Intelligence: Harnessing AI Agents to Reinvent Business, Work and Life. Irreplaceable Publishing, 2025.
[40] F. Sado, C. K. Loo, W. S. Liew, M. Kerzel, and S. Wermter, “Explainable goal-driven agents and robots-a comprehensive review,” ACM Computing Surveys, vol. 55, no. 10, pp. 1–41, 2023.
[41] J. Heer, “Agency plus automation: Designing artificial intelligence into interactive systems,” Proceedings of the National Academy of Sciences, vol. 116, no. 6, pp. 1844–1850, 2019.
[42] G. Papagni, J. de Pagter, S. Zafari, M. Filzmoser, and S. T. Koeszegi, “Artificial agents’ explainability to support trust: considerations on timing and context,” Ai & Society, vol. 38, no. 2, pp. 947–960, 2023.
[43] P. Wang and H. Ding, “The rationality of explanation or human capacity? understanding the impact of explainable artificial intelligence on human-ai trust and decision performance,” Information Processing & Management, vol. 61, no. 4, p. 103732, 2024.
[44] E. Popa, “Human goals are constitutive of agency in artificial intelligence (ai),” Philosophy & Technology, vol. 34, no. 4, pp. 1731–1750, 2021.
[45] M. Chacon-Chamorro, L. F. Giraldo, N. Quijano, V. Vargas-Panesso, C. González, J. S. Pinzón, R. Manrique, M. Ríos, Y. Fonseca, D. Gómez-Barrera, et al., “Cooperative resilience in artificial intelligence multiagent systems,” IEEE Transactions on Artificial Intelligence, 2025.
[46] M. Adam, M. Wessel, and A. Benlian, “Ai-based chatbots in customer service and their effects on user compliance,” Electronic Markets, vol. 31, no. 2, pp. 427–445, 2021.
[47] D. Leocádio, L. Guedes, J. Oliveira, J. Reis, and N. Melão, “Customer service with ai-powered human-robot collaboration (hrc): A literature review,” Procedia Computer Science, vol. 232, pp. 1222–1232, 2024.
[48] T. Cao, Y. Q. Khoo, S. Birajdar, Z. Gong, C.-F. Chung, Y. Moghaddam, A. Xu, H. Mehta, A. Shukla, Z. Wang, et al., “Designing towards productivity: A centralized ai assistant concept for work,” The Human Side of Service Engineering, p. 118, 2024.
[49] Y. Huang and J. X. Huang, “Exploring chatgpt for next-generation information retrieval: Opportunities and challenges,” in Web Intelligence, vol. 22, pp. 31–44, SAGE Publications Sage UK: London, England, 2024.
[50] N. Holtz, S. Wittfoth, and J. M. Gómez, “The new era of knowledge retrieval: Multi-agent systems meet generative ai,” in 2024 Portland International Conference on Management of Engineering and Technology (PICMET), pp. 1–10, IEEE, 2024.
[51] F. Poszler and B. Lange, “The impact of intelligent decision-support systems on humans’ ethical decision-making: A systematic literature review and an integrated framework,” Technological Forecasting and Social Change, vol. 204, p. 123403, 2024.
[52] F. Khemakhem, H. Ellouzi, H. Ltifi, and M. B. Ayed, “Agent-based intelligent decision support systems: a systematic review,” IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 1, pp. 20–34, 2020.
[53] S. Ringer, “Introducing computer use, a new claude 3.5 sonnet, and claude 3.5 haiku anthropic,” 10 2024.
[54] R. V. Florian, “Autonomous artificial intelligent agents,” Center for Cognitive and Neural Studies (Coneural), Cluj-Napoca, Romania, 2003.
[55] T. Hellström, N. Kaiser, and S. Bensch, “A taxonomy of embodiment in the ai era,” Electronics, vol. 13, no. 22, p. 4441, 2024.
[56] M. Wischnewski, “Attributing mental states to non-embodied autonomous systems: A systematic review,” in Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pp. 1–8, 2025.
[57] K. Greshake, S. Abdelnabi, S. Mishra, C. Endres, T. Holz, and M. Fritz, “Not what you’ve signed up for: Compromising real-world llm-integrated applications with indirect prompt injection,” in Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security, pp. 79–90, 2023.
[58] Y. Talebirad and A. Nadiri, “Multi-agent collaboration: Harnessing the power of intelligent llm agents,” arXiv preprint arXiv:2306.03314, 2023.
[59] A. I. Hauptman, B. G. Schelble, N. J. McNeese, and K. C. Madathil, “Adapt and overcome: Perceptions of adaptive autonomous agents for human-ai teaming,” Computers in Human Behavior, vol. 138, p. 107451, 2023.
[60] N. Krishnan, “Advancing multi-agent systems through model context protocol: Architecture, implementation, and applications,” arXiv preprint arXiv:2504.21030, 2025.
[61] H. Padigela, C. Shah, and D. Juyal, “Ml-dev-bench: Comparative analysis of ai agents on ml development workflows,” arXiv preprint arXiv:2502.00964, 2025.
[62] M. Raees, I. Meijerink, I. Lykourentzou, V.-J. Khan, and K. Papangelis, “From explainable to interactive ai: A literature review on current trends in human-ai interaction,” International Journal of Human-Computer Studies, p. 103301, 2024.
[63] P. Formosa, “Robot autonomy vs. human autonomy: social robots, artificial intelligence (ai), and the nature of autonomy,” Minds and Machines, vol. 31, no. 4, pp. 595–616, 2021.
[64] C. S. Eze and L. Shamir, “Analysis and prevention of ai-based phishing email attacks,” Electronics, vol. 13, no. 10, p. 1839, 2024.
[65] D. Singh, V. Patel, D. Bose, and A. Sharma, “Enhancing email marketing efficacy through ai-driven personalization: Leveraging natural language processing and collaborative filtering algorithms,” International Journal of AI Advancements, vol. 9, no. 4, 2020.
[66] R. Khan, S. Sarkar, S. K. Mahata, and E. Jose, “Security threats in agentic ai system,” arXiv preprint arXiv:2410.14728, 2024.
[67] C. G. Endacott, “Enacting machine agency when ai makes one’s day: understanding how users relate to ai communication technologies for scheduling,” Journal of Computer-Mediated Communication, vol. 29, no. 4, p. zmae011, 2024.
[68] Z. Pawlak and A. Skowron, “Rudiments of rough sets,” Information sciences, vol. 177, no. 1, pp. 3–27, 2007.
[69] P. Ponnusamy, A. Ghias, Y. Yi, B. Yao, C. Guo, and R. Sarikaya, “Feedback-based self-learning in large-scale conversational ai agents,” AI magazine, vol. 42, no. 4, pp. 43–56, 2022.
[70] A. Zagalsky, D. Te’eni, I. Yahav, D. G. Schwartz, G. Silverman, D. Cohen, Y. Mann, and D. Lewinsky, “The design of reciprocal learning between human and artificial intelligence,” Proceedings of the ACM on Human-Computer Interaction, vol. 5, no. CSCW2, pp. 1–36, 2021.
[71] W. J. Clancey, “Heuristic classification,” Artificial intelligence, vol. 27, no. 3, pp. 289–350, 1985.
[72] S. Kapoor, B. Stroebl, Z. S. Siegel, N. Nadgir, and A. Narayanan, “Ai agents that matter,” arXiv preprint arXiv:2407.01502, 2024.
[73] X. Huang, J. Lian, Y. Lei, J. Yao, D. Lian, and X. Xie, “Recommender ai agent: Integrating large language models for interactive recommendations,” arXiv preprint arXiv:2308.16505, 2023.
[74] A. M. Baabdullah, A. A. Alalwan, R. S. Algharabat, B. Metri, and N. P. Rana, “Virtual agents and flow experience: An empirical examination of ai-powered chatbots,” Technological Forecasting and Social Change, vol. 181, p. 121772, 2022.
[75] K. I. Roumeliotis, N. D. Tselikas, and D. K. Nasiopoulos, “Llms for product classification in e-commerce: A zero-shot comparative study of gpt and claude models,” Natural Language Processing Journal, vol. 11, p. 100142, 6 2025.
[76] J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, F. L. Aleman, D. Almeida, J. Altenschmidt, S. Altman, S. Anadkat, et al., “Gpt-4 technical report,” arXiv preprint arXiv:2303.08774, 2023.
[77] A. Chowdhery, S. Narang, J. Devlin, M. Bosma, G. Mishra, A. Roberts, P. Barham, H. W. Chung, C. Sutton, S. Gehrmann, et al., “Palm: Scaling language modeling with pathways,” Journal of Machine Learning Research, vol. 24, no. 240, pp. 1–113, 2023.
[78] H. Honda and M. Hagiwara, “Question answering systems with deep learning-based symbolic processing,” IEEE Access, vol. 7, pp. 152368–152378, 2019.
[79] N. Karanikolas, E. Manga, N. Samaridi, E. Tousidou, and M. Vassilakopoulos, “Large language models versus natural language understanding and generation,” in Proceedings of the 27th Pan-Hellenic Conference on Progress in Computing and Informatics, pp. 278–290, 2023.
[80] A. S. George, A. H. George, T. Baskar, and A. G. Martin, “Revolutionizing business communication: Exploring the potential of gpt-4 in corporate settings,” Partners Universal International Research Journal, vol. 2, no. 1, pp. 149–157, 2023.
[81] K. I. Roumeliotis, N. D. Tselikas, and D. K. Nasiopoulos, “Think before you classify: The rise of reasoning large language models for consumer complaint detection and classification,” Electronics 2025, Vol. 14, Page 1070, vol. 14, p. 1070, 3 2025.
[82] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, et al., “Learning transferable visual models from natural language supervision,” in International conference on machine learning, pp. 8748–8763, PmLR, 2021.
[83] J. Li, D. Li, S. Savarese, and S. Hoi, “Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,” in International conference on machine learning, pp. 19730–19742, PMLR, 2023.
[84] S. Sontakke, J. Zhang, S. Arnold, K. Pertsch, E. Bıyık, D. Sadigh, C. Finn, and L. Itti, “Roboclip: One demonstration is enough to learn robot policies,” Advances in Neural Information Processing Systems, vol. 36, pp. 55681–55693, 2023.
[85] M. Elhenawy, H. I. Ashqar, A. Rakotonirainy, T. I. Alhadidi, A. Jaber, and M. A. Tami, “Vision-language models for autonomous driving: Clip-based dynamic scene understanding,” Electronics, vol. 14, no. 7, p. 1282, 2025.
[86] S. Park, M. Lee, J. Kang, H. Choi, Y. Park, J. Cho, A. Lee, and D. Kim, “Vlaad: Vision and language assistant for autonomous driving,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 980–987, 2024.
[87] S. H. Ahmed, S. Hu, and G. Sukthankar, “The potential of vision-language models for content moderation of children’s videos,” in 2023 International Conference on Machine Learning and Applications (ICMLA), pp. 1237–1241, IEEE, 2023.
[88] S. H. Ahmed, M. J. Khan, and G. Sukthankar, “Enhanced multimodal content moderation of children’s videos using audiovisual fusion,” arXiv preprint arXiv:2405.06128, 2024.
[89] K. I. Roumeliotis, R. Sapkota, M. Karkee, N. D. Tselikas, and D. K. Nasiopoulos, “Plant disease detection through multimodal large language models and convolutional neural networks,” 4 2025.
[90] P. Chitra and A. Saleem Raja, “Artificial intelligence (ai) algorithm and models for embodied agents (robots and drones),” in Building Embodied AI Systems: The Agents, the Architecture Principles, Challenges, and Application Domains, pp. 417–441, Springer, 2025.
[91] S. Kourav, K. Verma, and M. Sundararajan, “Artificial intelligence algorithm models for agents of embodiment for drone applications,” in Building Embodied AI Systems: The Agents, the Architecture Principles, Challenges, and Application Domains, pp. 79–101, Springer, 2025.
[92] G. Natarajan, E. Elango, B. Sundaravadivazhagan, and S. Rethinam, “Artificial intelligence algorithms and models for embodied agents: Enhancing autonomy in drones and robots,” in Building Embodied AI Systems: The Agents, the Architecture Principles, Challenges, and Application Domains, pp. 103–132, Springer, 2025.
[93] K. Pandya and M. Holia, “Automating customer service using langchain: Building custom open-source gpt chatbot for organizations,” arXiv preprint arXiv:2310.05421, 2023.
[94] Q. Wu, G. Bansal, J. Zhang, Y. Wu, B. Li, E. Zhu, L. Jiang, X. Zhang, S. Zhang, J. Liu, et al., “Autogen: Enabling next-gen llm applications via multi-agent conversation,” arXiv preprint arXiv:2308.08155, 2023.
[95] L. Gabora and J. Bach, “A path to generative artificial selves,” in EPIA Conference on Artificial Intelligence, pp. 15–29, Springer, 2023.
[96] G. Pezzulo, T. Parr, P. Cisek, A. Clark, and K. Friston, “Generating meaning: active inference and the scope and limits of passive ai,” Trends in Cognitive Sciences, vol. 28, no. 2, pp. 97–112, 2024.
[97] J. Li, M. Zhang, N. Li, D. Weyns, Z. Jin, and K. Tei, “Generative ai for self-adaptive systems: State of the art and research roadmap,” ACM Transactions on Autonomous and Adaptive Systems, vol. 19, no. 3, pp. 1–60, 2024.
[98] W. O’Grady and M. Lee, “Natural syntax, artificial intelligence and language acquisition,” Information, vol. 14, no. 7, p. 418, 2023.
[99] X. Liu, J. Wang, J. Sun, X. Yuan, G. Dong, P. Di, W. Wang, and D. Wang, “Prompting frameworks for large language models: A survey,” arXiv preprint arXiv:2311.12785, 2023.
[100] E. T. Rolls, “The memory systems of the human brain and generative artificial intelligence,” Heliyon, vol. 10, no. 11, 2024.
[101] K. Alizadeh, S. I. Mirzadeh, D. Belenko, S. Khatamifard, M. Cho, C. C. Del Mundo, M. Rastegari, and M. Farajtabar, “Llm in a flash: Efficient large language model inference with limited memory,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 12562–12584, 2024.
[102] D. Driess, F. Xia, M. S. Sajjadi, C. Lynch, A. Chowdhery, A. Wahid, J. Tompson, Q. Vuong, T. Yu, W. Huang, et al., “Palm-e: An embodied multimodal language model,” 2023.
[103] P. Denny, J. Leinonen, J. Prather, A. Luxton-Reilly, T. Amarouche, B. A. Becker, and B. N. Reeves, “Prompt problems: A new programming exercise for the generative ai era,” in Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1, pp. 296–302, 2024.
[104] C. Chen, S. Lee, E. Jang, and S. S. Sundar, “Is your prompt detailed enough? exploring the effects of prompt coaching on users’ perceptions, engagement, and trust in text-to-image generative ai tools,” in Proceedings of the Second International Symposium on Trustworthy Autonomous Systems, pp. 1–12, 2024.
[105] OpenAI, “Introducing gpt-4.1 in the api,” 4 2025.
[106] A. Pan, E. Jones, M. Jagadeesan, and J. Steinhardt, “Feedback loops with language models drive in-context reward hacking,” arXiv preprint arXiv:2402.06627, 2024.
[107] K. Nabben, “Ai as a constituted system: accountability lessons from an llm experiment,” Data & policy, vol. 6, p. e57, 2024.
[108] P. J. Pesch, “Potentials and challenges of large language models (llms) in the context of administrative decision-making,” European Journal of Risk Regulation, pp. 1–20, 2025.
[109] C. Wang, Y. Deng, Z. Lyu, L. Zeng, J. He, S. Yan, and B. An, “Q*: Improving multi-step reasoning for llms with deliberative planning,” arXiv preprint arXiv:2406.14283, 2024.
[110] H. Wei, Z. Zhang, S. He, T. Xia, S. Pan, and F. Liu, “Plangen-llms: A modern survey of llm planning capabilities,” arXiv preprint arXiv:2502.11221, 2025.
[111] A. Bandi, P. V. S. R. Adapa, and Y. E. V. P. K. Kuchi, “The power of generative ai: A review of requirements, models, input–output formats, evaluation metrics, and challenges,” Future Internet, vol. 15, no. 8, p. 260, 2023.
[112] Y. Liu, H. Du, D. Niyato, J. Kang, Z. Xiong, Y. Wen, and D. I. Kim, “Generative ai in data center networking: Fundamentals, perspectives, and case study,” IEEE Network, 2025.
[113] C. Guo, F. Cheng, Z. Du, J. Kiessling, J. Ku, S. Li, Z. Li, M. Ma, T. Molom-Ochir, B. Morris, et al., “A survey: Collaborative hardware and software design in the era of large language models,” IEEE Circuits and Systems Magazine, vol. 25, no. 1, pp. 35–57, 2025.
[114] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal,
[118] K. M. Yoo, D. Park, J. Kang, S.-W. Lee, and W. Park, “Gpt3mix: Leveraging large-scale language models for text augmentation,” arXiv preprint arXiv:2104.08826, 2021.
[119] D. Zhou, X. Xue, X. Lu, Y. Guo, P. Ji, H. Lv, W. He, Y. Xu, Q. Li, and L. Cui, “A hierarchical model for complex adaptive system: From adaptive agent to ai society,” ACM Transactions on Autonomous and Adaptive Systems, 2024.
[120] H. Hao, Y. Wang, and J. Chen, “Empowering scenario planning with artificial intelligence: A perspective on building smart and resilient cities,” Engineering, 2024.
[121] Y. Wang, J. Zhu, Z. Cheng, L. Qiu, Z. Tong, and J. Huang, “Intelligent optimization method for real-time decision-making in laminated cooling configurations through reinforcement learning,” Energy, vol. 291, p. 130434, 2024.
[122] X. Xiang, J. Xue, L. Zhao, Y. Lei, C. Yue, and K. Lu, “Real-time integration of fine-tuned large language model for improved decision-making in reinforcement learning,” in 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, IEEE, 2024.
[123] Z. Li, H. Zhang, C. Peng, and R. Peiris, “Exploring large language model-driven agents for environment-aware spatial interactions and conversations in virtual reality role-play scenarios,” in 2025 IEEE Conference Virtual Reality and 3D User Interfaces (VR), pp. 1–11, IEEE, 2025.
[124] T. R. McIntosh, T. Susnjak, T. Liu, P. Watters, and M. N. Halgamuge, “The inadequacy of reinforcement learning from human feedback-radicalizing large language models via semantic vulnerabilities,” IEEE Transactions on Cognitive and Developmental Systems, 2024.
[125] S. Lee, G. Lee, W. Kim, J. Kim, J. Park, and K. Cho, “Human strategy learning-based multi-agent deep reinforcement learning for online team sports game,” IEEE Access, 2025.
[126] Z. Shi, S. Gao, L. Yan, Y. Feng, X. Chen, Z. Chen, D. Yin, S. Verberne, and Z. Ren, “Tool learning in the wild: Empowering language models as automatic tool agents,” in Proceedings of the ACM on Web Conference 2025, pp. 2222–2237, 2025.
[127] S. Yuan, K. Song, J. Chen, X. Tan, Y. Shen, R. Kan, D. Li, and D. Yang, “Easytool: Enhancing llm-based agents with concise tool instruction,” arXiv preprint arXiv:2401.06201, 2024.
[128] B. Xu, X. Liu, H. Shen, Z. Han, Y. Li, M. Yue, Z. Peng, Y. Liu, Z. Yao, and D. Xu, “Gentopia: A collaborative platform for tool-augmented llms,” arXiv preprint arXiv:2308.04030, 2023.
[129] H. Lu, X. Li, X. Ji, Z. Kan, and Q. Hu, “Toolfive: Enhancing tool-augmented llms via tool filtering and verification,” in ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1–5, IEEE, 2025.
[130] Y. Song, F. Xu, S. Zhou, and G. Neubig, “Beyond browsing: Api-based web agents,” arXiv preprint arXiv:2410.16464, 2024.
[131] V. Tupe and S. Thube, “Ai agentic workflows and enterprise apis: Adapting api architectures for the age of ai agents,” arXiv preprint arXiv:2502.17443, 2025.
[132] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, and Y. Cao, “React: Synergizing reasoning and acting in language models,” in International Conference on Learning Representations (ICLR), 2023.
[133] OpenAI, “Introducing chatgpt search,” 10 2024.
[134] L. Ning, Z. Liang, Z. Jiang, H. Qu, Y. Ding, W. Fan, X.-y. Wei, S. Lin, H. Liu, P. S. Yu, et al., “A survey of webagents: Towards next-generation ai agents for web automation with large foundation models,” arXiv preprint arXiv:2503.23350, 2025.
[135] M. W. U. Rahman, R. Nevarez, L. T. Mim, and S. Hariri, “Multi-agent actor-critic generative ai for query resolution and analysis,” IEEE
A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., “Language mod- Transactions on Artificial Intelligence, 2025.
els are few-shot learners,” Advances in neural information processing [136] J. Lála, O. O’Donoghue, A. Shtedritski, S. Cox, S. G. Rodriques,
systems, vol. 33, pp. 1877–1901, 2020. and A. D. White, “Paperqa: Retrieval-augmented generative agent for
[115] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, scientific research,” arXiv preprint arXiv:2312.07559, 2023.
T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, et al., “Llama: [137] Z. Wu, C. Yu, C. Chen, J. Hao, and H. H. Zhuo, “Models as agents:
Open and efficient foundation language models,” arXiv preprint Optimizing multi-step predictions of interactive local models in model-
arXiv:2302.13971, 2023. based multi-agent reinforcement learning,” in Proceedings of the AAAI
[116] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Conference on Artificial Intelligence, vol. 37, pp. 10435–10443, 2023.
Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning [138] Z. Feng, R. Xue, L. Yuan, Y. Yu, N. Ding, M. Liu, B. Gao, J. Sun, and
with a unified text-to-text transformer,” Journal of machine learning G. Wang, “Multi-agent embodied ai: Advances and future directions,”
research, vol. 21, no. 140, pp. 1–67, 2020. arXiv preprint arXiv:2505.05108, 2025.
[117] A. Yang, B. Xiao, B. Wang, B. Zhang, C. Bian, C. Yin, C. Lv, D. Pan, [139] A. Feriani and E. Hossain, “Single and multi-agent deep reinforcement
D. Wang, D. Yan, et al., “Baichuan 2: Open large-scale language learning for ai-enabled wireless networks: A tutorial,” IEEE Commu-
models,” arXiv preprint arXiv:2309.10305, 2023. nications Surveys & Tutorials, vol. 23, no. 2, pp. 1226–1252, 2021.
[140] R. Zhang, S. Tang, Y. Liu, D. Niyato, Z. Xiong, S. Sun, S. Mao, [161] S. Agashe, J. Han, S. Gan, J. Yang, A. Li, and X. E. Wang, “Agent s:
and Z. Han, “Toward agentic ai: generative information retrieval An open agentic framework that uses computers like a human,” arXiv
inspired intelligent communications and networking,” arXiv preprint preprint arXiv:2410.08164, 2024.
arXiv:2502.16866, 2025. [162] C. DeChant, “Episodic memory in ai agents poses risks that should be
[141] U. M. Borghoff, P. Bottoni, and R. Pareschi, “Human-artificial interac- studied and mitigated,” arXiv preprint arXiv:2501.11739, 2025.
tion in the age of agentic ai: a system-theoretical approach,” Frontiers [163] A. M. Nuxoll and J. E. Laird, “Enhancing intelligent agents with
in Human Dynamics, vol. 7, p. 1579166, 2025. episodic memory,” Cognitive Systems Research, vol. 17, pp. 34–48,
[142] E. Miehling, K. N. Ramamurthy, K. R. Varshney, M. Riemer, D. Boun- 2012.
effouf, J. T. Richards, A. Dhurandhar, E. M. Daly, M. Hind, P. Sat- [164] G. Sarthou, A. Clodic, and R. Alami, “Ontologenius: A long-term
tigeri, et al., “Agentic ai needs a systems theory,” arXiv preprint semantic memory for robotic agents,” in 2019 28th IEEE International
arXiv:2503.00237, 2025. Conference on Robot and Human Interactive Communication (RO-
[143] W. Xu, Z. Liang, K. Mei, H. Gao, J. Tan, and Y. Zhang, “A-mem: MAN), pp. 1–8, IEEE, 2019.
Agentic memory for llm agents,” arXiv preprint arXiv:2502.12110, [165] A.-e.-h. Munir and W. M. Qazi, “Artificial subjectivity: Personal se-
2025. mantic memory model for cognitive agents,” Applied Sciences, vol. 12,
[144] C. Riedl and D. De Cremer, “Ai for collective intelligence,” Collective no. 4, p. 1903, 2022.
Intelligence, vol. 4, no. 2, p. 26339137251328909, 2025. [166] A. Singh, A. Ehtesham, S. Kumar, and T. T. Khoei, “Agentic retrieval-
[145] L. Peng, D. Li, Z. Zhang, T. Zhang, A. Huang, S. Yang, and Y. Hu, augmented generation: A survey on agentic rag,” arXiv preprint
“Human-ai collaboration: Unraveling the effects of user proficiency arXiv:2501.09136, 2025.
and ai agent capability in intelligent decision support systems,” Inter- [167] R. Akkiraju, A. Xu, D. Bora, T. Yu, L. An, V. Seth, A. Shukla, P. Gun-
national Journal of Industrial Ergonomics, vol. 103, p. 103629, 2024. decha, H. Mehta, A. Jha, et al., “Facts about building retrieval aug-
[146] H. Shirado, K. Shimizu, N. A. Christakis, and S. Kasahara, “Realism mented generation-based chatbots,” arXiv preprint arXiv:2407.07858,
drives interpersonal reciprocity but yields to ai-assisted egocentrism in 2024.
a coordination experiment,” in Proceedings of the 2025 CHI Conference [168] G. Wang, Y. Xie, Y. Jiang, A. Mandlekar, C. Xiao, Y. Zhu, L. Fan,
on Human Factors in Computing Systems, pp. 1–21, 2025. and A. Anandkumar, “Voyager: An open-ended embodied agent with
[147] Y. Xiao, G. Shi, and P. Zhang, “Towards agentic ai networking in large language models,” arXiv preprint arXiv:2305.16291, 2023.
6g: A generative foundation model-as-agent approach,” arXiv preprint [169] G. Li, H. Hammoud, H. Itani, D. Khizbullin, and B. Ghanem, “Camel:
arXiv:2503.15764, 2025. Communicative agents for” mind” exploration of large language model
[148] P. R. Lewis and Ş. Sarkadi, “Reflective artificial intelligence,” Minds society,” Advances in Neural Information Processing Systems, vol. 36,
and Machines, vol. 34, no. 2, p. 14, 2024. pp. 51991–52008, 2023.
[149] C. Qian, W. Liu, H. Liu, N. Chen, Y. Dang, J. Li, C. Yang, W. Chen, [170] S. Reed, K. Zolna, E. Parisotto, S. G. Colmenarejo, A. Novikov,
Y. Su, X. Cong, et al., “Chatdev: Communicative agents for software G. Barth-Maron, M. Gimenez, Y. Sulsky, J. Kay, J. T. Springenberg,
development,” arXiv preprint arXiv:2307.07924, 2023. et al., “A generalist agent,” arXiv preprint arXiv:2205.06175, 2022.
[150] S. Hong, X. Zheng, J. Chen, Y. Cheng, J. Wang, C. Zhang,
[171] C. K. Thomas, C. Chaccour, W. Saad, M. Debbah, and C. S. Hong,
Z. Wang, S. K. S. Yau, Z. Lin, L. Zhou, et al., “Metagpt: Meta
“Causal reasoning: Charting a revolutionary course for next-generation
programming for multi-agent collaborative framework,” arXiv preprint
ai-native wireless networks,” IEEE Vehicular Technology Magazine,
arXiv:2308.00352, vol. 3, no. 4, p. 6, 2023.
2024.
[151] Y. Liang, C. Wu, T. Song, W. Wu, Y. Xia, Y. Liu, Y. Ou, S. Lu,
[172] Z. Tang, R. Wang, W. Chen, K. Wang, Y. Liu, T. Chen, and L. Lin,
L. Ji, S. Mao, et al., “Taskmatrix. ai: Completing tasks by connecting
“Towards causalgpt: A multi-agent approach for faithful knowledge
foundation models with millions of apis,” Intelligent Computing, vol. 3,
reasoning via promoting causal consistency in llms,” arXiv preprint
p. 0063, 2024.
arXiv:2308.11914, 2023.
[152] H. Hexmoor, J. Lammens, G. Caicedo, and S. C. Shapiro, Behaviour
based AI, cognitive processes, and emergent behaviors in autonomous [173] Z. Gekhman, J. Herzig, R. Aharoni, C. Elkind, and I. Szpektor,
agents, vol. 1. WIT Press, 2025. “Trueteacher: Learning factual consistency evaluation with large lan-
guage models,” arXiv preprint arXiv:2305.11171, 2023.
[153] H. Zhang, Z. Li, F. Liu, Y. He, Z. Cao, and Y. Zheng, “Design
and implementation of langchain-based chatbot,” in 2024 International [174] A. Wu, K. Kuang, M. Zhu, Y. Wang, Y. Zheng, K. Han, B. Li, G. Chen,
Seminar on Artificial Intelligence, Computer Technology and Control F. Wu, and K. Zhang, “Causality for large language models,” arXiv
Engineering (ACTCE), pp. 226–229, IEEE, 2024. preprint arXiv:2410.15319, 2024.
[154] E. Ephrati and J. S. Rosenschein, “A heuristic technique for multi-agent [175] S. Ashwani, K. Hegde, N. R. Mannuru, D. S. Sengar, M. Jindal,
planning,” Annals of Mathematics and Artificial Intelligence, vol. 20, K. C. R. Kathala, D. Banga, V. Jain, and A. Chadha, “Cause and effect:
pp. 13–67, 1997. can large language models truly understand causality?,” in Proceedings
[155] S. Kupferschmid, J. Hoffmann, H. Dierks, and G. Behrmann, “Adapting of the AAAI Symposium Series, vol. 4, pp. 2–9, 2024.
an ai planning heuristic for directed model checking,” in International [176] J. Richens and T. Everitt, “Robust agents learn causal world models,”
SPIN Workshop on Model Checking of Software, pp. 35–52, Springer, in The Twelfth International Conference on Learning Representations,
2006. 2024.
[156] W. Chen, Y. Su, J. Zuo, C. Yang, C. Yuan, C. Qian, C.-M. Chan, [177] A. Chan, R. Salganik, A. Markelius, C. Pang, N. Rajkumar,
Y. Qin, Y. Lu, R. Xie, et al., “Agentverse: Facilitating multi-agent D. Krasheninnikov, L. Langosco, Z. He, Y. Duan, M. Carroll, et al.,
collaboration and exploring emergent behaviors in agents,” arXiv “Harms from increasingly agentic algorithmic systems,” in Proceed-
preprint arXiv:2308.10848, vol. 2, no. 4, p. 6, 2023. ings of the 2023 ACM Conference on Fairness, Accountability, and
[157] T. Schick, J. Dwivedi-Yu, R. Dessı̀, R. Raileanu, M. Lomeli, E. Ham- Transparency, pp. 651–666, 2023.
bro, L. Zettlemoyer, N. Cancedda, and T. Scialom, “Toolformer: [178] A. Plaat, M. van Duijn, N. van Stein, M. Preuss, P. van der Putten,
Language models can teach themselves to use tools,” Advances in and K. J. Batenburg, “Agentic large language models, a survey,” arXiv
Neural Information Processing Systems, vol. 36, pp. 68539–68551, preprint arXiv:2503.23037, 2025.
2023. [179] J. Qiu, K. Lam, G. Li, A. Acharya, T. Y. Wong, A. Darzi, W. Yuan, and
[158] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, E. J. Topol, “Llm-based agentic systems in medicine and healthcare,”
D. Zhou, et al., “Chain-of-thought prompting elicits reasoning in large Nature Machine Intelligence, vol. 6, no. 12, pp. 1418–1420, 2024.
language models,” Advances in neural information processing systems, [180] G. A. Gabison and R. P. Xian, “Inherent and emergent liability issues
vol. 35, pp. 24824–24837, 2022. in llm-based agentic systems: a principal-agent perspective,” arXiv
[159] S. Yao, D. Yu, J. Zhao, I. Shafran, T. Griffiths, Y. Cao, and preprint arXiv:2504.03255, 2025.
K. Narasimhan, “Tree of thoughts: Deliberate problem solving with [181] M. Dahl, V. Magesh, M. Suzgun, and D. E. Ho, “Large legal fictions:
large language models,” Advances in neural information processing Profiling legal hallucinations in large language models,” Journal of
systems, vol. 36, pp. 11809–11822, 2023. Legal Analysis, vol. 16, no. 1, pp. 64–93, 2024.
[160] J. Guo, N. Li, J. Qi, H. Yang, R. Li, Y. Feng, S. Zhang, and M. Xu, [182] Y. A. Latif, “Hallucinations in large language models and their
“Empowering working memory for large language model agents,” influence on legal reasoning: Examining the risks of ai-generated
arXiv preprint arXiv:2312.17259, 2023. factual inaccuracies in judicial processes,” Journal of Computational
Intelligence, Machine Reasoning, and Decision-Making, vol. 10, no. 2, augmented generation for knowledge-intensive nlp tasks,” Advances in
pp. 10–20, 2025. neural information processing systems, vol. 33, pp. 9459–9474, 2020.
[183] S. Tonmoy, S. Zaman, V. Jain, A. Rani, V. Rawte, A. Chadha, and [203] Y. Ma, Z. Gou, J. Hao, R. Xu, S. Wang, L. Pan, Y. Yang, Y. Cao,
A. Das, “A comprehensive survey of hallucination mitigation tech- A. Sun, H. Awadalla, et al., “Sciagent: Tool-augmented language
niques in large language models,” arXiv preprint arXiv:2401.01313, models for scientific reasoning,” arXiv preprint arXiv:2402.11451,
vol. 6, 2024. 2024.
[184] Z. Zhang, Y. Yao, A. Zhang, X. Tang, X. Ma, Z. He, Y. Wang, [204] K. Dev, S. A. Khowaja, K. Singh, E. Zeydan, and M. Debbah,
M. Gerstein, R. Wang, G. Liu, et al., “Igniting language intelligence: “Advanced architectures integrated with agentic ai for next-generation
The hitchhiker’s guide from chain-of-thought reasoning to language wireless networks,” arXiv preprint arXiv:2502.01089, 2025.
agents,” ACM Computing Surveys, vol. 57, no. 8, pp. 1–39, 2025. [205] A. Boyle and A. Blomkvist, “Elements of episodic memory: in-
[185] Y. Wan and K.-W. Chang, “White men lead, black women help? sights from artificial agents,” Philosophical Transactions B, vol. 379,
benchmarking language agency social biases in llms,” arXiv preprint no. 1913, p. 20230416, 2024.
arXiv:2404.10508, 2024. [206] Y. Du, W. Huang, D. Zheng, Z. Wang, S. Montella, M. Lapata, K.-F.
[186] A. Borah and R. Mihalcea, “Towards implicit bias detection and mitiga- Wong, and J. Z. Pan, “Rethinking memory in ai: Taxonomy, operations,
tion in multi-agent llm interactions,” arXiv preprint arXiv:2410.02584, topics, and future directions,” arXiv preprint arXiv:2505.00675, 2025.
2024. [207] K.-T. Tran, D. Dao, M.-D. Nguyen, Q.-V. Pham, B. O’Sullivan, and
[187] X. Liu, H. Yu, H. Zhang, Y. Xu, X. Lei, H. Lai, Y. Gu, H. Ding, H. D. Nguyen, “Multi-agent collaboration mechanisms: A survey of
K. Men, K. Yang, et al., “Agentbench: Evaluating llms as agents,” llms,” arXiv preprint arXiv:2501.06322, 2025.
arXiv preprint arXiv:2308.03688, 2023. [208] K. Tallam, “From autonomous agents to integrated systems, a
[188] G. He, G. Demartini, and U. Gadiraju, “Plan-then-execute: An empir- new paradigm: Orchestrated distributed intelligence,” arXiv preprint
ical study of user trust and team performance when using llm agents arXiv:2503.13754, 2025.
as a daily assistant,” in Proceedings of the 2025 CHI Conference on [209] Y. Lee, “Critique of artificial reason: Ontology of human and artificial
Human Factors in Computing Systems, pp. 1–22, 2025. intelligence,” Journal of Ecohumanism, vol. 4, no. 3, pp. 397–415,
2025.
[189] Z. Ke, F. Jiao, Y. Ming, X.-P. Nguyen, A. Xu, D. X. Long, M. Li,
[210] L. Ale, S. A. King, N. Zhang, and H. Xing, “Enhancing generative
C. Qin, P. Wang, S. Savarese, et al., “A survey of frontiers in llm
ai reliability via agentic ai in 6g-enabled edge computing,” Nature
reasoning: Inference scaling, learning to reason, and agentic systems,”
Reviews Electrical Engineering, pp. 1–3, 2025.
arXiv preprint arXiv:2504.09037, 2025.
[211] N. Shinn, F. Cassano, A. Gopinath, K. Narasimhan, and S. Yao,
[190] M. Luo, X. Shi, C. Cai, T. Zhang, J. Wong, Y. Wang, C. Wang,
“Reflexion: Language agents with verbal reinforcement learning,”
Y. Huang, Z. Chen, J. E. Gonzalez, et al., “Autellix: An efficient
Advances in Neural Information Processing Systems, vol. 36, pp. 8634–
serving engine for llm agents as general programs,” arXiv preprint
8652, 2023.
arXiv:2502.13965, 2025.
[212] F. Kamalov, D. S. Calonge, L. Smail, D. Azizov, D. R. Thadani,
[191] K. Hatalis, D. Christou, J. Myers, S. Jones, K. Lambert, A. Amos- T. Kwong, and A. Atif, “Evolution of ai in education: Agentic
Binks, Z. Dannenhauer, and D. Dannenhauer, “Memory matters: The workflows,” arXiv preprint arXiv:2504.20082, 2025.
need to improve long-term memory in llm-agents,” in Proceedings of [213] A. Sulc, T. Hellert, R. Kammering, H. Hoschouer, and J. S.
the AAAI Symposium Series, vol. 2, pp. 277–280, 2023. John, “Towards agentic ai on particle accelerators,” arXiv preprint
[192] H. Jin, X. Han, J. Yang, Z. Jiang, Z. Liu, C.-Y. Chang, H. Chen, and arXiv:2409.06336, 2024.
X. Hu, “Llm maybe longlm: Self-extend llm context window without [214] J. Yang, C. Jimenez, A. Wettig, K. Lieret, S. Yao, K. Narasimhan,
tuning,” arXiv preprint arXiv:2401.01325, 2024. and O. Press, “Swe-agent: Agent-computer interfaces enable automated
[193] M. Yu, F. Meng, X. Zhou, S. Wang, J. Mao, L. Pang, T. Chen, K. Wang, software engineering,” Advances in Neural Information Processing
X. Li, Y. Zhang, et al., “A survey on trustworthy llm agents: Threats Systems, vol. 37, pp. 50528–50652, 2024.
and countermeasures,” arXiv preprint arXiv:2503.09648, 2025. [215] S. Barua, “Exploring autonomous agents through the lens of large
[194] H. Chi, H. Li, W. Yang, F. Liu, L. Lan, X. Ren, T. Liu, and B. Han, language models: A review,” arXiv preprint arXiv:2404.04442, 2024.
“Unveiling causal reasoning in large language models: Reality or [216] A. Zhao, Y. Wu, Y. Yue, T. Wu, Q. Xu, M. Lin, S. Wang, Q. Wu,
mirage?,” Advances in Neural Information Processing Systems, vol. 37, Z. Zheng, and G. Huang, “Absolute zero: Reinforced self-play reason-
pp. 96640–96670, 2024. ing with zero data,” arXiv preprint arXiv:2505.03335, 2025.
[195] H. Wang, A. Zhang, N. Duy Tai, J. Sun, T.-S. Chua, et al., “Ali-agent:
Assessing llms’ alignment with human values via agent-based evalu-
ation,” Advances in Neural Information Processing Systems, vol. 37,
pp. 99040–99088, 2024.
[196] L. Hammond, A. Chan, J. Clifton, J. Hoelscher-Obermaier, A. Khan,
E. McLean, C. Smith, W. Barfuss, J. Foerster, T. Gavenčiak,
et al., “Multi-agent risks from advanced ai,” arXiv preprint
arXiv:2502.14143, 2025.
[197] D. Trusilo, “Autonomous ai systems in conflict: Emergent behavior and
its impact on predictability and reliability,” Journal of Military Ethics,
vol. 22, no. 1, pp. 2–17, 2023.
[198] M. Puvvadi, S. K. Arava, A. Santoria, S. S. P. Chennupati, and H. V.
Puvvadi, “Coding agents: A comprehensive survey of automated bug
fixing systems and benchmarks,” in 2025 IEEE 14th International
Conference on Communication Systems and Network Technologies
(CSNT), pp. 680–686, IEEE, 2025.
[199] C. Newton, J. Singleton, C. Copland, S. Kitchen, and J. Hudack,
“Scalability in modeling and simulation systems for multi-agent, ai, and
machine learning applications,” in Artificial Intelligence and Machine
Learning for Multi-Domain Operations Applications III, vol. 11746,
pp. 534–552, SPIE, 2021.
[200] H. D. Le, X. Xia, and Z. Chen, “Multi-agent causal discovery using
large language models,” arXiv preprint arXiv:2407.15073, 2024.
[201] Y. Shavit, S. Agarwal, M. Brundage, S. Adler, C. O’Keefe, R. Camp-
bell, T. Lee, P. Mishkin, T. Eloundou, A. Hickey, et al., “Practices for
governing agentic ai systems,” Research Paper, OpenAI, 2023.
[202] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal,
H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al., “Retrieval-