Pipeline Enforcement and Expert Agents

How quality gates enforce ACD constraints and how expert validation agents extend the pipeline beyond standard tooling.


The pipeline is the enforcement mechanism for agentic continuous delivery (ACD). Standard quality gates handle mechanical checks. Expert validation agents handle the judgment calls that standard tools cannot make.

For the framework overview, see ACD. For the artifacts the pipeline enforces, see Agent Delivery Contract.

How Quality Gates Enforce ACD

The Pipeline Verification and Deployment stages of the ACD workflow are where the Pipeline Reference Architecture does the heavy lifting. Each pipeline stage enforces a specific ACD constraint:

Pre-commit gates (linting, type checking, secret scanning, SAST) catch the mechanical errors agents produce most often: style violations, type mismatches, and accidentally embedded secrets. These run in seconds and give the agent immediate feedback.
CI Stage 1 (build + unit tests) validates the acceptance criteria. If human-defined tests fail, the agent’s implementation is wrong regardless of how plausible the code looks.
CD Stage 1 (contract + schema tests) enforces the system constraints artifact at integration boundaries. Agent-generated code is particularly prone to breaking implicit contracts between modules or services.
CD Stage 2 (mutation testing, performance benchmarks, security integration tests) catches the subtle correctness issues that agents introduce: code that passes tests but violates non-functional requirements or leaves untested edge cases.
Acceptance tests validate the user-facing behavior artifact in a production-like environment. This is where the BDD scenarios become automated verification.
Production verification (canary deployment, health checks, SLO monitors with auto-rollback) provides the final safety net. If agent-generated code degrades production metrics, the deployment rolls back automatically.
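The stage-by-stage enforcement above can be sketched as an ordered list of gates that stops at the first failure. This is only an illustration: the stage identifiers, the `STAGES` table, and the `first_failure` helper are hypothetical names, and each check is a placeholder callable standing in for whatever tooling a real pipeline invokes.

```python
from typing import Callable, Dict, Optional, Tuple

# Ordered (stage, what it enforces) pairs, paraphrased from the list above.
# The stage identifiers are illustrative, not names from any real pipeline tool.
STAGES = [
    ("pre-commit", "mechanical correctness: lint, types, secrets, SAST"),
    ("ci-stage-1", "acceptance criteria: build and unit tests"),
    ("cd-stage-1", "system constraints at integration boundaries"),
    ("cd-stage-2", "non-functional requirements and untested edge cases"),
    ("acceptance", "user-facing behavior in a production-like environment"),
    ("production-verification", "production health: canary, SLOs, auto-rollback"),
]


def first_failure(
    checks: Dict[str, Callable[[], bool]]
) -> Optional[Tuple[str, str]]:
    """Run each stage's check in order and stop at the first one that fails.

    `checks` maps a stage identifier to a zero-argument callable returning
    True on pass. Returns (stage, enforced constraint) for the failing stage,
    or None when every stage passes.
    """
    for stage, enforces in STAGES:
        if not checks[stage]():
            return stage, enforces
    return None
```

Stopping at the first failing gate keeps the fast, cheap checks at the front of the sequence, so an agent gets feedback from pre-commit gates before the slower stages run.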

The Pre-Feature Baseline

The pre-feature baseline lists the gates that must be active before any feature work begins. These gates are a prerequisite for ACD: unless they pass on every commit, agent-generated changes bypass the minimum safety net.
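One way to make this prerequisite explicit is a small check that refuses to start feature work while any baseline gate is inactive. The sketch below assumes a set of gate names borrowed from the pre-commit and CI Stage 1 descriptions above; these names are placeholders, and the authoritative list is whatever the pre-feature baseline itself defines.

```python
from typing import Iterable, Set

# ASSUMPTION: placeholder gate names for illustration only. Replace with the
# gates listed in the actual pre-feature baseline.
REQUIRED_BASELINE_GATES = frozenset(
    {"lint", "type-check", "secret-scan", "sast", "build", "unit-tests"}
)


def missing_baseline_gates(active_gates: Iterable[str]) -> Set[str]:
    """Return the required gates that are not currently active."""
    return set(REQUIRED_BASELINE_GATES) - set(active_gates)


def feature_work_allowed(active_gates: Iterable[str]) -> bool:
    """Feature work may begin only when no required gate is missing."""
    return not missing_baseline_gates(active_gates)
```

Reporting the missing gates by name, rather than returning only a boolean, tells the team exactly which part of the safety net still has to be put in place.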

See the pipeline patterns for concrete architectures that implement these gates.

Expert Validation Agents

Standard quality gates cover what conventional tooling can verify: linting, type checking, test execution, vulnerability scanning. But ACD introduces validation needs that standard tools cannot address. No conventional tool can verify that test code faithfully implements a human-defined test specification. No conventional tool can verify that an agent-generated implementation matches the architectural intent in a feature description.

Expert validation agents fill this gap. These are AI agents dedicated to a specific validation concern, running as pipeline gates alongside standard tools. The following are examples, not an exhaustive list - teams should create expert agents for whatever validation concerns their pipeline requires:

Example Agent	What It Validates	Catches	Artifact It Enforces
Test fidelity agent	Test code exercises the scenarios, edge cases, and assertions defined in the test specification	Agent-generated tests that omit edge cases or weaken assertions	Acceptance Criteria
Implementation coupling agent	Test code verifies observable behavior, not internal implementation details	Tests that break when implementation is refactored without any behavior change	Acceptance Criteria
Architectural conformance agent	Implementation follows the constraints in the feature description	Code that crosses a module boundary or uses a prohibited dependency	Feature Description
Intent alignment agent	The combined change addresses the problem stated in the intent description	Implementations that are technically correct but solve the wrong problem	Intent Description
Constraint compliance agent	Code respects system constraints that static analysis cannot check	Violations of logging standards, feature flag requirements, or audit rules	System Constraints
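To illustrate how an expert agent might plug into a pipeline as a gate, the sketch below defines a common result shape (pass/fail plus reasoning) and a minimal agent interface. All names here (`Change`, `GateResult`, `ExpertAgent`, `StubTestFidelityAgent`, `run_expert_gates`) are hypothetical. The stub uses a deterministic name match purely so the example runs; a real test fidelity agent would delegate the judgment to an AI model rather than compare strings.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Change:
    """Hypothetical, simplified view of a change under review."""
    spec_scenarios: List[str]  # scenario names from the test specification
    test_names: List[str]      # scenario names covered by the test code


@dataclass(frozen=True)
class GateResult:
    agent: str
    passed: bool
    reasoning: str  # explanation surfaced to the agent or human reviewer


class ExpertAgent(Protocol):
    """Interface every expert validation gate is assumed to implement."""
    name: str

    def validate(self, change: Change) -> GateResult: ...


class StubTestFidelityAgent:
    """Deterministic stand-in for a test fidelity agent (not a real model call)."""
    name = "test-fidelity"

    def validate(self, change: Change) -> GateResult:
        missing = [s for s in change.spec_scenarios if s not in change.test_names]
        if missing:
            return GateResult(
                self.name, False,
                "Scenarios without a matching test: " + ", ".join(missing),
            )
        return GateResult(
            self.name, True, "Every specified scenario has a matching test."
        )


def run_expert_gates(agents: List[ExpertAgent], change: Change) -> List[GateResult]:
    """Run every expert agent against the change and collect the results."""
    return [agent.validate(change) for agent in agents]
```

Keeping the result shape identical across agents means the pipeline can treat a test fidelity agent and an architectural conformance agent the same way: block on any failed result and surface its reasoning.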

Adopting Expert Agents: The Same Replacement Cycle

Do not deploy expert agents and immediately reduce human review. Expert validation agents need calibration before they can replace human judgment. An agent that flags too many false positives trains the team to ignore it. An agent that misses real issues creates false confidence. Run expert agents in parallel with human review for at least 20 cycles before any reduction in human coverage.

Expert validation agents are new automated checks. Adopt them using the same replacement cycle that drives every brownfield CD migration:

Identify a manual validation currently performed by a human reviewer. For example, checking whether test code actually tests what the specification requires.
Automate the check by deploying an expert agent as a pipeline gate. The agent runs on every change and produces a pass/fail result with reasoning.
Validate by running the expert agent in parallel with the existing human review. Compare results across at least 20 review cycles. If the agent matches the human decision in at least 90% of cases and catches at least one issue the human missed, proceed to the removal step.
Remove the manual check once the expert agent has proven at least as effective as the human review it replaces.
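The validation criteria in the cycle above reduce to simple arithmetic over recorded review cycles. The following sketch assumes each parallel-run cycle is logged as a `ReviewCycle` record; that record type and the function names are hypothetical, while the thresholds (20 cycles, 90% agreement, at least one issue the human missed) come from the text.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReviewCycle:
    """One change reviewed in parallel by the expert agent and a human."""
    agent_passed: bool
    human_passed: bool
    # True when the agent flagged a real issue that the human review missed
    # (assumed to be confirmed after the fact by the team).
    agent_caught_missed_issue: bool = False


MIN_CYCLES = 20       # at least 20 parallel review cycles
MIN_AGREEMENT = 0.90  # agent matches the human decision in at least 90% of cases


def agreement_rate(cycles: Sequence[ReviewCycle]) -> float:
    """Fraction of cycles where agent and human reached the same decision."""
    if not cycles:
        return 0.0
    matches = sum(1 for c in cycles if c.agent_passed == c.human_passed)
    return matches / len(cycles)


def ready_to_remove_manual_check(cycles: Sequence[ReviewCycle]) -> bool:
    """Apply all three criteria from the Validate step."""
    return (
        len(cycles) >= MIN_CYCLES
        and agreement_rate(cycles) >= MIN_AGREEMENT
        and any(c.agent_caught_missed_issue for c in cycles)
    )
```

For example, 19 cycles of agreement plus one cycle where the agent correctly rejected a change the human approved gives 20 cycles at 95% agreement with one confirmed catch, which satisfies all three criteria.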

Expert validation agents run on every change, immediately, eliminating the batching that manual review imposes. Humans steer; agents validate at pipeline speed.

With the pipeline and expert agents in place, the next question is what goes wrong and how to measure progress. See Pitfalls and Metrics.

Related Content

Agentic Architecture Patterns - multi-agent pipeline patterns and hook design for enforcement workflows
ACD - the framework overview, eight constraints, and workflow
Agent Delivery Contract - the artifacts the pipeline enforces
Pipeline Reference Architecture - the full quality gate sequence
Replacing Manual Validations - the replacement cycle for adopting automated checks
Pitfalls and Metrics - what goes wrong and how to measure progress
AI Adoption Roadmap - the prerequisite sequence, especially Harden Guardrails and Reduce Delivery Friction
