Systemic Defect Fixes

A catalog of defect sources across the delivery value stream with earliest detection points, AI shift-left opportunities, and systemic prevention strategies.

Defects do not appear randomly. They originate from specific, predictable sources in the delivery value stream. This reference catalogs those sources so teams can shift detection left, automate where possible, and apply AI where it adds real value to the feedback loop.

The goal is systems thinking: detect issues as early as possible in the value stream so feedback informs continuous improvement in how we work, not just reactive fixes to individual defects.

  • AI shifts detection earlier than current automation alone
  • Dark cells = current automation is sufficient; AI adds no additional value
  • No marker = AI assists at the current detection point but does not shift it earlier

How to Use This Catalog

  1. Pick your pain point. Find the category where your team loses the most time to defects or rework. Start there, not at the top.
  2. Focus on the Systemic Prevention column. Automated detection catches defects faster, but systemic prevention eliminates entire categories. Prioritize the prevention fix for each issue you selected.
  3. Measure before and after. Track defect escape rate by category and time-to-detection. If the systemic fix is working, both metrics improve within weeks.
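
One way to make step 3 concrete is to compute both metrics from a defect log. The sketch below is illustrative only: the `Defect` record, its field names, and the definition of escape rate (defects first found in production divided by all defects in the category) are assumptions, not something this catalog prescribes.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass(frozen=True)
class Defect:
    """Hypothetical defect record; adapt the fields to your tracker's export."""
    category: str
    introduced_at: datetime
    detected_at: datetime
    escaped_to_production: bool


def escape_rate_by_category(defects):
    """Share of defects per category that were first found in production."""
    totals = defaultdict(int)
    escaped = defaultdict(int)
    for d in defects:
        totals[d.category] += 1
        if d.escaped_to_production:
            escaped[d.category] += 1
    return {category: escaped[category] / totals[category] for category in totals}


def median_hours_to_detection(defects):
    """Median hours between introduction and detection, per category."""
    hours = defaultdict(list)
    for d in defects:
        elapsed = d.detected_at - d.introduced_at
        hours[d.category].append(elapsed.total_seconds() / 3600)
    return {category: median(values) for category, values in hours.items()}
```

Running both functions on the defects logged before and after a systemic fix gives a like-for-like comparison per category. The median is used here rather than the mean so that a single long-lived defect does not dominate the trend.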

Categories

Category                      What it covers
Product & Discovery           Wrong features, misaligned requirements, accessibility gaps - defects born before coding begins
Integration & Boundaries      Interface mismatches, behavioral assumptions, race conditions at service boundaries
Knowledge & Communication     Implicit domain knowledge, ambiguous requirements, tribal knowledge loss, divergent mental models
Change & Complexity           Unintended side effects, technical debt, feature interactions, configuration drift
Testing & Observability Gaps  Untested edge cases, missing contract tests, insufficient monitoring, environment parity
Process & Deployment          Long-lived branches, manual steps, large batches, inadequate rollback, work stacking
Data & State                  Schema migration failures, null assumptions, concurrency issues, cache invalidation
Dependency & Infrastructure   Third-party breaking changes, environment differences, network partition handling
Security & Compliance         Vulnerabilities, secrets in source, auth gaps, injection, regulatory requirements, audit trails
Performance & Resilience      Regressions, resource leaks, capacity limits, missing timeouts, graceful degradation


Product & Discovery Defects

Defects that originate before a single line of code is written - the most expensive category because they compound through every downstream phase.

Integration & Boundaries Defects

Defects at system boundaries that are invisible to unit tests and often survive until production. Contract testing and deliberate boundary design are the primary defenses.
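
As a rough illustration of contract testing, the sketch below checks a provider payload against the fields a consumer actually reads. The `ORDER_CONTRACT` fields and the `contract_violations` helper are hypothetical; dedicated contract-testing tools cover far more (versioning, provider verification, request matching), so treat this as the minimal idea only.

```python
from typing import Any

# Hypothetical contract: the fields the consumer reads and the types it expects.
ORDER_CONTRACT: dict[str, type] = {
    "order_id": str,
    "total_cents": int,
    "currency": str,
}


def contract_violations(payload: dict[str, Any], contract: dict[str, type]) -> list[str]:
    """Return human-readable violations; an empty list means the payload conforms.

    Extra fields in the payload are allowed, so the provider can add fields
    without breaking this consumer.
    """
    violations = []
    for field, expected in contract.items():
        if field not in payload:
            violations.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            actual = type(payload[field]).__name__
            violations.append(f"{field}: expected {expected.__name__}, got {actual}")
    return violations
```

Run in the provider's pipeline against a recorded or generated response, a check like this surfaces an interface mismatch before deployment rather than in production.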

Knowledge & Communication Defects

Defects that emerge from gaps between what people know and what the code expresses - the hardest to detect with automated tools and the easiest to prevent with team practices.

Change & Complexity Defects

Defects caused by the act of changing existing code. The larger the change and the longer it lives outside trunk, the higher the risk.

Testing & Observability Gap Defects

Defects that survive because the safety net has holes. The fix is not more testing - it is better-targeted testing and observability that closes the specific gaps.

Process & Deployment Defects

Defects caused by the delivery process itself. Manual steps, large batches, and slow feedback loops create the conditions for failure.

Data & State Defects

Data defects are particularly dangerous because they can corrupt persistent state. Unlike code defects, data corruption often cannot be fixed by deploying a new version.
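
Because a bad write cannot be undone by a redeploy, one defensive pattern is to validate a migration's result inside the same transaction and roll back if the check fails. The sketch below assumes SQLite (where schema changes are transactional), a hypothetical `users` table with an `email` column, and a connection opened with `isolation_level=None` so that transactions are controlled explicitly.

```python
import sqlite3


def add_normalized_email(conn: sqlite3.Connection) -> None:
    """Add and backfill users.email_normalized, committing only if every row is valid.

    Assumes conn was opened with isolation_level=None, so the BEGIN/COMMIT/ROLLBACK
    statements below are the only transaction control in play.
    """
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE users ADD COLUMN email_normalized TEXT")
        conn.execute("UPDATE users SET email_normalized = lower(trim(email))")
        # Surface the null assumption explicitly instead of discovering it later.
        rows = conn.execute(
            "SELECT COUNT(*) FROM users "
            "WHERE email_normalized IS NULL OR email_normalized = ''"
        ).fetchall()
        bad_rows = rows[0][0]
        if bad_rows:
            raise ValueError(f"{bad_rows} row(s) have no usable email; aborting")
        conn.execute("COMMIT")
    except Exception:
        # Undo both the backfill and the schema change.
        conn.execute("ROLLBACK")
        raise
```

Not every database rolls back schema changes, so on other engines the same idea may need a separate validation step before the migration runs.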

Dependency & Infrastructure Defects

Defects that originate outside your codebase but break your system. The fix is to treat external dependencies as untrusted boundaries.
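
Treating a dependency as an untrusted boundary can be as simple as parsing its response into a validated internal type and rejecting anything unexpected at the edge. The sketch below is a minimal example; the exchange-rate payload shape, `ExchangeRate`, and `UpstreamDataError` are all hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


class UpstreamDataError(ValueError):
    """Raised when a third-party response does not match what we rely on."""


@dataclass(frozen=True)
class ExchangeRate:
    currency: str
    rate: float


def parse_exchange_rate(raw: str) -> ExchangeRate:
    """Convert an untrusted JSON response into a validated internal value."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise UpstreamDataError("response is not valid JSON") from exc
    if not isinstance(data, dict):
        raise UpstreamDataError("expected a JSON object")

    currency = data.get("currency")
    rate = data.get("rate")
    if not isinstance(currency, str) or len(currency) != 3:
        raise UpstreamDataError("currency must be a 3-letter code")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(rate, bool) or not isinstance(rate, (int, float)) or rate <= 0:
        raise UpstreamDataError("rate must be a positive number")
    return ExchangeRate(currency=currency.upper(), rate=float(rate))
```

Code past this function only ever sees `ExchangeRate`, so a breaking change upstream fails loudly at the boundary instead of propagating bad values inward.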

Security & Compliance Defects

Security and compliance defects are silent until they are catastrophic. The gap between what the code does and what policy requires is invisible without deliberate, automated verification at every stage.
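
As one small example of automated verification, a pre-commit or pipeline step can scan changed text for secret-like strings. The sketch below is deliberately simplistic and the two patterns are assumptions picked for illustration (an AWS-style access key ID prefix and a quoted password assignment); real scanners ship much larger rule sets and entropy checks.

```python
import re

# Illustrative patterns only; not a complete or authoritative rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]+['\"]"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Failing the commit or build when `find_secrets` returns anything turns a silent policy gap into immediate feedback at the earliest stage.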

Performance & Resilience Defects

Performance defects degrade gradually, often hiding behind averages until a threshold tips and the system fails under real load. Detection requires baselines, budgets, and automated enforcement - not periodic manual testing.
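
To show why averages hide problems and what automated enforcement of a budget might look like, the sketch below checks a 95th-percentile latency against a budget. The nearest-rank percentile method, the function names, and the sample numbers in the usage note are assumptions for illustration.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def check_latency_budget(samples_ms: list[float], p95_budget_ms: float) -> tuple[bool, float]:
    """Return (within_budget, observed_p95) so a pipeline step can fail the build."""
    p95 = percentile(samples_ms, 95)
    return p95 <= p95_budget_ms, p95
```

For example, with 90 requests at 100 ms and 10 requests at 2000 ms, the mean is 290 ms and would pass a 500 ms budget, while the p95 is 2000 ms and fails it. A check of this kind run on every build, against a stored baseline, replaces periodic manual testing.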