A New Discipline

Agent Engineering

Designing, evaluating, and orchestrating non-deterministic AI systems with explicit specification, feedback loops, and operational guardrails.

Agent Engineering exists because models don't read minds and production constraints are real. It is the discipline of specifying, designing, building, and orchestrating systems that behave reliably even when every input is an edge case. The work is about making non-deterministic AI dependable in the real world.

Core Disciplines

Code Engineering

Context Engineering

Evaluation Engineering

Orchestration Engineering

Memory Engineering

What is Agent Engineering?

Agent Engineering is the discipline of designing, evaluating, and orchestrating non-deterministic AI systems with explicit specification, feedback loops, and operational guardrails.

Agentic AI emerges when context, memory, evaluation, orchestration, tooling, and infrastructure are intentionally designed as a modular, future-proof system. Intelligent specification, human-in-the-loop oversight, and strategic orchestration are foundational.

We focus on what actually works in production: real constraints, real tradeoffs, and real systems.

Core Engineering Themes

Theme 1

Code Engineering

How agents write, modify, and reason over code in real workflows

Theme 2

Eval Engineering

Behavioral testing, evaluation frameworks, and reliability guardrails

Theme 3

Memory Engineering

State, retrieval strategies, and personalization over time

Theme 4

Context & Skills Engineering

Context construction, compression, grounding, and MCP workflows

Theme 5

Harness Engineering

Execution environments, tools, policies, and sandboxes

Theme 6

Multi-Agent Engineering

Coordination, task decomposition, and agent collaboration

The Agent Engineering Mindset

Non-Determinism

Embrace unpredictability as a feature, not a flaw

  • Design for variance, retries, and fallbacks
  • Measure behavior, not just outputs
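One way to picture "design for variance, retries, and fallbacks" is a small wrapper that retries a model call and then falls back to the next model. This is a minimal sketch, not a prescribed implementation: `call_with_fallbacks`, `models`, and `is_valid` are hypothetical names, and each model is assumed to be a plain callable that takes a prompt and returns text.

```python
from typing import Callable, Optional, Sequence


def call_with_fallbacks(
    prompt: str,
    models: Sequence[Callable[[str], str]],  # hypothetical: ordered from preferred to last resort
    is_valid: Callable[[str], bool],         # hypothetical: the success criterion for an output
    max_attempts: int = 3,
) -> Optional[str]:
    """Try each model in order, allowing up to max_attempts calls per model."""
    for model in models:
        for _ in range(max_attempts):
            try:
                output = model(prompt)
            except Exception:
                continue  # a raised error counts as one failed attempt
            if is_valid(output):
                return output
    return None  # every model and every retry was exhausted
```

Returning `None` rather than raising leaves the caller free to decide what exhaustion means, for example escalating to a human.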

Intelligent Specification

Models do not read minds

  • Explicit specs, constraints, and success criteria
  • Better planning yields better agents

Every Input is an Edge Case

Ship to learn, not to be perfect

  • Production feedback loops over static tests
  • Real-world inputs define reliability

Strategic Orchestration

Allocate compute and review efficiently

  • Budget-aware routing and escalation
  • Human review where it matters most
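Budget-aware routing with escalation could look roughly like the sketch below. It assumes, purely for illustration, that each tier reports a confidence score between 0 and 1 and has a fixed cost per call; `Tier`, `route`, and the `"human_review"` label are hypothetical names, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Tier:
    name: str
    cost: float                               # hypothetical cost units per call
    run: Callable[[str], Tuple[str, float]]   # returns (answer, confidence in [0, 1])


def route(
    task: str,
    tiers: Sequence[Tier],        # assumed to be ordered cheapest first
    budget: float,
    min_confidence: float = 0.8,
) -> Tuple[Optional[str], str, float]:
    """Escalate through tiers until one is confident or the budget runs out."""
    spent = 0.0
    best: Optional[str] = None
    for tier in tiers:
        if spent + tier.cost > budget:
            break  # the next tier is unaffordable, so stop escalating
        answer, confidence = tier.run(task)
        spent += tier.cost
        if confidence >= min_confidence:
            return answer, tier.name, spent
        best = answer  # keep the latest draft for the human reviewer
    return best, "human_review", spent
```

The result is a tuple of the answer, who handled it, and what was spent, so the cost of each escalation stays visible to whatever monitors the system.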

The Reviewer Framework

Validation loops and automated gates

  • Reviewer agents for quality control
  • Automated PR gates and safety checks
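A validation loop with automated gates can be sketched as a draft that is checked, revised, and rechecked for a bounded number of rounds. The names here (`review_loop`, `checks`, `revise`) are hypothetical, and each check is assumed to return a list of issue strings, with an empty list meaning the gate passed.

```python
from typing import Callable, List, Sequence, Tuple

Check = Callable[[str], List[str]]  # returns issues found; an empty list means pass


def collect_issues(draft: str, checks: Sequence[Check]) -> List[str]:
    """Run every gate and gather all reported issues."""
    return [issue for check in checks for issue in check(draft)]


def review_loop(
    draft: str,
    checks: Sequence[Check],
    revise: Callable[[str, List[str]], str],  # hypothetical: the agent that fixes issues
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Revise until all gates pass or max_rounds revisions have been used."""
    for _ in range(max_rounds):
        issues = collect_issues(draft, checks)
        if not issues:
            return draft, True
        draft = revise(draft, issues)
    # Gate the final revision too, so an unchecked draft is never approved.
    return draft, not collect_issues(draft, checks)
```

Bounding the rounds matters: a reviewer and a reviser that disagree indefinitely should end in a failed gate rather than an endless loop.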

Agent Networking

Agents coordinate and communicate

  • Parallel and sequential workflows
  • Prevent conflicts and overlap
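As a rough illustration of parallel and sequential workflows, the sketch below uses Python's standard `asyncio` module. Agents are assumed to be async callables that take and return text, and `claim` is a hypothetical in-memory helper showing one simple way to keep two agents from working on the same resource.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Agent = Callable[[str], Awaitable[str]]  # hypothetical agent interface


async def run_parallel(agents: Dict[str, Agent], task: str) -> Dict[str, str]:
    """Give every agent the same task at once and collect results by name."""
    names = list(agents)
    results = await asyncio.gather(*(agents[name](task) for name in names))
    return dict(zip(names, results))


async def run_sequential(agents: List[Agent], task: str) -> str:
    """Pipe the output of each agent into the next one."""
    result = task
    for agent in agents:
        result = await agent(result)
    return result


def claim(claims: Dict[str, str], resource: str, agent_name: str) -> bool:
    """Record agent_name as owner of resource; refuse if another agent owns it."""
    owner = claims.setdefault(resource, agent_name)
    return owner == agent_name
```

Both coroutines can be driven with `asyncio.run(...)`. A real system would likely need a shared, persistent claim store instead of a local dictionary.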

Advanced Program Tracks

Advanced

Agent Optimization

Optimize agents across prompts, RAG, protocols, memory, and context

Advanced

Agent Experience (AX)

Design machine-readable interfaces and feedback loops that shape AX

Advanced

Agentic Coding

How development evolves as agents become first-class SDLC contributors

Advanced

Agentic Business Models

Rethink SaaS and explore models for agent-native businesses

The Agent Engineering Manifesto

Models Don't Read Minds

Principle

Agent engineering starts with explicit specification and clear intent

Agentic Systems Are Modular

Principle

Context, memory, evaluation, orchestration, and tooling must be designed together

Guardrails + Feedback Loops

Principle

Operational oversight and iterative feedback make agents production-ready

Why Agent Engineering is Different

Determinism

Traditional

Predictable outputs with fixed logic

Agent Engineering

Non-deterministic behavior that must be managed and observed

Specification

Traditional

Requirements live in code and tickets

Agent Engineering

Explicit specs and guardrails define agent behavior

Orchestration

Traditional

Single systems and linear workflows

Agent Engineering

Strategic routing across tools, models, and reviewers

Evaluation

Traditional

Predefined test cases

Agent Engineering

Behavioral evals and continuous feedback loops
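The difference between a predefined test case and a behavioral eval can be made concrete with a pass rate: the same prompt is run many times and the share of acceptable outputs is measured. This is only a sketch under simple assumptions; `pass_rate`, `agent`, and `check` are hypothetical names, and the agent is treated as a callable from prompt to text.

```python
from typing import Callable


def pass_rate(
    agent: Callable[[str], str],   # hypothetical: a non-deterministic agent
    prompt: str,
    check: Callable[[str], bool],  # hypothetical: the behavior being evaluated
    runs: int = 20,
) -> float:
    """Run the same prompt repeatedly and return the fraction of passing outputs."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if check(agent(prompt)))
    return passed / runs
```

A release gate might then require the pass rate to stay above a chosen threshold instead of requiring one exact output.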

The Agent Engineering Ecosystem

Key domains shaping production-grade agentic systems

Code

Agentic Coding

Coding agents, pair programming, code review, and spec-driven workflows

Tools

Agent & Tooling

Frameworks, orchestration platforms, SDKs, and multi-agent systems

Models

Models & Foundations

Frontier model providers, developer platforms, and applied AI tooling

Dev

Agent Dev Tools

Testing tools, evaluation frameworks, and MCP tooling

Ops

AgentOps & Traceability

Tracing, observability, model serving, inference engines, and sandboxes

Ent

Enterprise & Security

IAM for agents, tool-use guardrails, and compliance automation

Who Builds Agent Engineering

Agent engineering is a cross-functional discipline. Builders, ML engineers, platform teams, product leaders, and founders all contribute to making agents reliable in production.

Software Engineer / ML Engineer

Agent Engineer
Traditional Focus

Writes deterministic code for fixed logic and builds ML models

Agent Engineering Responsibilities

Writes prompts and builds tools for agents to use, traces why an agent made specific tool calls, and refines the underlying models. Designs agent scaffolds with tools, memory, and reflection loops.

Key Tasks
  • Write prompts that drive agent behavior (often hundreds or thousands of lines)
  • Build tools and APIs for agents to interact with
  • Trace agent decision-making and tool call sequences
  • Refine models and prompts based on production insights

Product Manager

Agent Engineer
Traditional Focus

Manages user stories, backlogs, and product roadmaps

Agent Engineering Responsibilities

Writes prompts, defines agent scope, and ensures the agent solves the right problem. Understands in depth the 'job to be done' that the agent replicates, and defines evaluations that test whether the agent performs as intended.

Key Tasks
  • Write prompts that shape agent behavior and scope
  • Define high-level intent and goal specifications
  • Ensure the agent solves the right problem
  • Define evaluations that test agent performance

Platform Engineer

Agent Engineer
Traditional Focus

Manages CI/CD pipelines, uptime, and infrastructure

Agent Engineering Responsibilities

Builds agent infrastructure and robust runtimes that handle durable execution, human-in-the-loop pauses, and memory management.

Key Tasks
  • Build agent infrastructure for durable execution
  • Design human-in-the-loop workflow systems
  • Create robust runtimes with memory management
  • Develop UI/UX for agent interactions with streaming and interrupt handling
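Durable execution with a human-in-the-loop pause can be sketched with a checkpoint file: progress is saved after every step, and a step that needs approval stops the run until it is approved. Everything named here (`run_durable`, the JSON checkpoint layout, `needs_approval`, `approved`) is a hypothetical illustration, and state is assumed to be JSON-serializable.

```python
import json
from pathlib import Path
from typing import Callable, Collection, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict], Dict]]  # (step name, function from state to new state)


def run_durable(
    steps: List[Step],
    checkpoint: Path,
    initial_state: Dict,
    needs_approval: Collection[str] = (),
    approved: Collection[str] = (),
) -> Tuple[str, Dict]:
    """Run steps in order, resuming from the checkpoint file if one exists."""
    if checkpoint.exists():
        saved = json.loads(checkpoint.read_text())
        index, state = saved["next_step"], saved["state"]
    else:
        index, state = 0, dict(initial_state)
    while index < len(steps):
        name, step = steps[index]
        if name in needs_approval and name not in approved:
            # Pause before the step; a later call with approval resumes here.
            checkpoint.write_text(json.dumps({"next_step": index, "state": state}))
            return "paused", state
        state = step(state)
        index += 1
        checkpoint.write_text(json.dumps({"next_step": index, "state": state}))
    return "done", state
```

Because progress is written after each step, a crashed or paused run picks up at the next unfinished step instead of starting over.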

Data Scientist

Agent Engineer
Traditional Focus

Builds ML models, analyzes data, and creates predictive insights

Agent Engineering Responsibilities

Measures agent reliability and identifies opportunities for improvement. Builds systems (evals, A/B testing, monitoring) to measure agent performance and reliability, and analyzes usage patterns and errors.

Key Tasks
  • Build evaluation systems to measure agent performance
  • Run A/B tests and monitor agent reliability
  • Analyze usage patterns and conduct error analysis
  • Identify opportunities for improvement based on production data
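For the A/B testing task above, one common way to compare two agent variants is a two-proportion z-test on their success counts. The sketch below is an illustration using only the standard library, not a method the source prescribes; `two_proportion_z_test` and its parameters are hypothetical names, and it assumes independent runs with enough samples for the normal approximation to be reasonable.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    successes_a: int, runs_a: int, successes_b: int, runs_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in success rates, B minus A."""
    rate_a = successes_a / runs_a
    rate_b = successes_b / runs_b
    pooled = (successes_a + successes_b) / (runs_a + runs_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / runs_a + 1 / runs_b))
    if std_error == 0:
        return 0.0, 1.0  # both variants always pass or always fail; no evidence of a difference
    z = (rate_b - rate_a) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value
```

A small p-value suggests the variants really differ in reliability, though production decisions usually also weigh cost, latency, and the kinds of errors each variant makes.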

The best teams treat specification, evaluation, and orchestration as shared responsibilities. They combine human oversight with automated guardrails so agents stay aligned as they scale.

Production-Ready Agentic AI

Building agentic AI for production is fundamentally different from traditional software. Every input is an edge case, and reliability comes from explicit specification, feedback loops, and operational guardrails.

Agent Engineering is the differentiator as models become more capable and commoditized. The teams that win will orchestrate context, memory, evaluation, and tooling as a unified system.