Glossary

AI Agent Terminology

Everything you need to understand the world of AI agents, from fundamentals to advanced concepts.

Agent Evaluation

The systematic testing and measurement of AI agent performance against defined benchmarks, scenarios, and quality metrics.

Agent Handoff

The transfer of an ongoing task or conversation from one AI agent to another, including the relevant context needed for the receiving agent to continue seamlessly.

Agent Memory

The mechanisms by which an AI agent stores and retrieves information across interactions, enabling it to maintain context, learn from past actions, and build knowledge over time.

Agent Observability

The ability to understand what an AI agent is doing and why, through traces, logs, metrics, and visualizations of the agent's decision-making process.

Agent Orchestration

The coordination of multiple AI agents working together on a complex task, including routing, handoffs, shared memory, and workflow management.

Agent Planning

The ability of an AI agent to decompose a complex goal into a sequence of actionable steps and execute them in the right order, adapting the plan as new information emerges.

Agent Routing

The process of directing incoming requests or subtasks to the most appropriate specialized agent based on the content, intent, or requirements of the task.

Agent Runtime

The execution environment that runs AI agents, managing the loop of observation, reasoning, and action along with tool execution, memory, and error handling.

Agent Safety

The set of practices, mechanisms, and design patterns that ensure AI agents behave reliably, don't cause harm, and operate within defined boundaries.

Agentic AI

An approach to AI systems where models operate with agency — making autonomous decisions, using tools, and pursuing goals over multiple steps without constant human direction.

AI Agent

An autonomous software system that uses a large language model to perceive its environment, make decisions, and take actions to achieve specified goals.

Autonomous Agent

An AI agent that can operate independently over extended periods, making decisions, executing tasks, and recovering from errors without human intervention.

Chain of Thought

A prompting technique where an AI model is guided to break down complex problems into intermediate reasoning steps before arriving at a final answer.

Context Window

The maximum amount of text (measured in tokens) that a language model can process in a single interaction, including both input and output.

Embedding

A numerical vector representation of text, images, or other data that captures semantic meaning in a high-dimensional space, enabling similarity comparisons.

Fine-Tuning

The process of further training a pre-trained language model on a specific dataset to improve its performance on particular tasks or domains.

Function Calling

A capability of language models that allows them to generate structured function calls with typed parameters, enabling reliable interaction with external APIs and tools.

Guardrails

Safety mechanisms that constrain AI agent behavior, preventing harmful actions, enforcing policies, and ensuring outputs meet quality and compliance standards.

Hallucination

When an AI model generates information that sounds plausible but is factually incorrect, fabricated, or not grounded in the provided context.

Human-in-the-Loop

A design pattern where an AI agent pauses and requests human approval or input before taking high-stakes or irreversible actions.

Large Language Model (LLM)

A neural network trained on vast amounts of text data that can understand and generate human language, serving as the reasoning engine for AI agents.

Multi-Agent System

An architecture where multiple specialized AI agents collaborate to accomplish complex tasks, each handling a specific part of the workflow.

Prompt Engineering

The practice of designing and optimizing the instructions given to a language model to achieve desired outputs, including system prompts, few-shot examples, and formatting guidelines.

Prompt Injection

An attack where malicious input attempts to override an AI agent's instructions, causing it to ignore its system prompt and follow attacker-controlled instructions instead.

Retrieval-Augmented Generation (RAG)

A technique that enhances LLM responses by retrieving relevant documents from an external knowledge base and including them in the model's context.

Semantic Search

A search technique that finds results based on meaning rather than exact keyword matching, using vector embeddings to understand the intent behind queries.

Structured Output

A technique for constraining LLM responses to follow a specific format or schema, such as JSON, XML, or typed objects, ensuring reliable downstream processing.

Token

The basic unit of text that language models process — roughly corresponding to a word or word fragment — used to measure input length, output length, and API costs.

Tool Use

The ability of an AI agent to invoke external tools — APIs, databases, code interpreters, web browsers — to gather information or take actions in the real world.

Vector Database

A specialized database that stores and efficiently searches high-dimensional vector embeddings, enabling semantic similarity search for AI applications.