Test Architecture
A test architecture that lets your pipeline deploy confidently, regardless of external system availability, is a core CD capability. The child pages cover each test type.
A CD pipeline’s job is to force every artifact to prove it is worthy of delivery. That proof only works when test changes ship with the code they validate. If a developer adds a feature but the corresponding tests arrive in a later commit, the pipeline approved an artifact it never actually verified. That is not a CD pipeline. It is a CI pipeline with a deploy step. Tests and production code must always travel together through the pipeline as a single unit of change.
The test pyramid says: write many fast unit tests at the base, fewer integration tests in the middle, and only a handful of end-to-end tests at the top. The underlying principle is sound - lower-level tests are faster, more deterministic, and cheaper to maintain.
The shape communicates that preference: favor fast, deterministic tests that you fully control. Tests at the base are cheap to write, fast to run, and reliable; tests at the top are slow, expensive, and depend on systems outside your control. The more weight you put at the base, the faster and more reliable your pipeline becomes - to a point. There is also an engineering goal: achieve the most functional coverage with the fewest tests, because every test costs money to maintain and adds time to the pipeline.
The testing trophy, popularized by Kent C. Dodds, rebalances the pyramid by putting component tests at the center. Where the pyramid emphasizes unit tests at the base, the trophy argues that component tests give you the most confidence per test because they exercise realistic user behavior through a component’s public interface while still using test doubles for external dependencies.
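A minimal component-test sketch, assuming a hypothetical `OrderService` with an injected payment gateway (the names are illustrative, not from this site). The test exercises the whole service through its public interface and stays deterministic because the external dependency is a test double:

```python
from unittest.mock import Mock

# Hypothetical service under test: it depends on an external payment
# gateway, injected so tests can substitute a double for the real thing.
class OrderService:
    def __init__(self, gateway):
        self.gateway = gateway

    def place_order(self, amount):
        # The only external call; a stub stands in for it during tests.
        result = self.gateway.charge(amount)
        status = "confirmed" if result["ok"] else "failed"
        return {"status": status, "amount": amount}

# Component tests: assert on observable behavior through the public
# interface, never on the service's internals.
def test_successful_order_is_confirmed():
    gateway = Mock()
    gateway.charge.return_value = {"ok": True}
    service = OrderService(gateway)
    assert service.place_order(50)["status"] == "confirmed"

def test_declined_charge_fails_the_order():
    gateway = Mock()
    gateway.charge.return_value = {"ok": False}
    assert OrderService(gateway).place_order(50)["status"] == "failed"
```

Because no network or real gateway is involved, tests like these run in milliseconds and can safely block merge.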
The trophy also makes static analysis explicit as the foundation. Linting, type checking, and formatting catch entire categories of defects for free - no test code to write or maintain.
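As a sketch of that foundation (the function name is illustrative, not from this site): type annotations let a checker such as mypy reject a whole class of defects before any test is written or run.

```python
# Static analysis removes this defect class for free: a type checker
# flags the bad call below without executing the code.
def shipping_cost(weight_kg: float, rate_per_kg: float) -> float:
    return weight_kg * rate_per_kg

cost = shipping_cost(2.5, 4.0)   # well-typed

# shipping_cost("2.5", 4.0)      # rejected by the type checker, not at runtime
```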
Both models agree on the principle: keep end-to-end tests few and focused, and maximize fast, deterministic coverage. The trophy simply shifts where that coverage concentrates. For teams building component-heavy applications, the trophy distribution often produces better results than a strict pyramid.
Teams often miss this underlying principle and treat either shape as a metric. They count tests by type and debate ratios - “do we have enough unit tests?” or “do we have too many integration tests?” - when the real question is:
Can our pipeline determine that a change is safe to deploy without depending on any system we do not control?
A pipeline that answers yes can deploy at any time - even when a downstream service is down, a third-party API is slow, or a partner team hasn’t shipped yet. That independence is what CD requires, and it is the reason the pyramid favors the base.
Meeting that requirement is the core responsibility of a test architecture.
Most teams that struggle with CD have inverted the pyramid - too many slow, flaky end-to-end tests and too few fast, focused ones. Manual gates block every release. The pipeline cannot give a fast, reliable answer, so deployments become high-ceremony events.
A test architecture is the deliberate structure of how different test types work together across your pipeline to give you deployment confidence. Use the table below to decide what type of test to write and where it runs. This is not a comprehensive list. It shows how common tests impact pipeline design and how teams should structure their suites. See the Pipeline Reference Architecture for a complete quality gate sequence.
| Pipeline Stage | What You Need to Verify | Test Type | Speed | Deterministic? | Blocks Deploy? |
|---|---|---|---|---|---|
| CI | A function or method behaves correctly | Unit | Milliseconds | Yes | ■ Yes |
| CI | A complete component or service works through its public interface | Component | Milliseconds to seconds | Yes | ■ Yes |
| CI | Your code correctly interacts with external system interfaces | Contract | Milliseconds to seconds | Yes | ■ Yes |
| CI | Code quality, security, and style compliance | Static Analysis | Seconds | Yes | ■ Yes |
| CI | UI meets WCAG accessibility standards | Static Analysis + Component | Seconds | Yes | ■ Yes |
| Acceptance Testing | Deployed artifact meets acceptance criteria | Deploy, Smoke, Load, Resilience, Compliance, etc. | Minutes | No | ■ Yes - gates production |
| Post-deploy (production) | Critical user journeys work in production | E2E smoke | Seconds to minutes | No | No - triggers rollback |
| Post-deploy (production) | Production health and SLOs | Synthetic monitoring | Continuous | No | No - triggers alerts |
| On demand/scheduled | Contract test doubles still match real external systems | Integration | Seconds to minutes | No | No - triggers review |
| Continuous | Unexpected behavior, edge cases, real-world workflows | Exploratory Testing | Varies | No | Never |
| Continuous | Real users can accomplish goals effectively | Usability Testing | Varies | No | Never |
The critical insight: everything that blocks merge is deterministic and under your control. Acceptance tests gate production promotion after verifying the deployed artifact. Everything that involves real external systems runs post-deployment. This is what gives you the independence to deploy any time, regardless of the state of the world around you.
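As an illustration of how a contract test can block merge while staying deterministic, here is a minimal sketch (the `/rates` endpoint and all names are hypothetical): the test runs against a canned response that encodes the agreed interface, and the scheduled integration tests are what verify that this canned shape still matches the real system.

```python
import json

# Hypothetical client-side handling of a partner API response. The agreed
# contract: the /rates endpoint returns {"currency": str, "rate": number}.
def parse_rate_response(body: str) -> float:
    payload = json.loads(body)
    return float(payload["rate"])

# A canned response encoding the contract. Deterministic, so the test can
# block merge; an on-demand integration test against the real system
# confirms this recorded shape still matches reality.
CANNED_RATE_RESPONSE = json.dumps({"currency": "EUR", "rate": 1.08})

def test_client_honours_the_rates_contract():
    assert parse_rate_response(CANNED_RATE_RESPONSE) == 1.08
```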
The table maps to two distinct phases of your pipeline, each with different goals and constraints.
Pre-merge (before code lands on trunk): Run unit, component, and contract tests. These must all be deterministic and fast. Target: under 10 minutes total. This is the quality gate that every change must pass. If pre-merge tests are slow, developers batch up changes or skip local runs, both of which undermine continuous integration.
Post-merge (after code lands on trunk, before or after deployment): Re-run the full deterministic suite against the integrated trunk. Then run acceptance tests, E2E smoke tests, and synthetic monitoring post-deploy. Integration tests run separately in a test environment, on demand or on a schedule. Target: under 60 minutes for the full post-merge cycle.
Why re-run pre-merge tests post-merge? Two changes can each pass pre-merge independently but conflict when combined on trunk. The post-merge run catches these integration effects.
If a post-merge failure occurs, the team fixes it immediately. Trunk must always be releasable.
This post-merge re-run is what teams traditionally call regression testing: running all previous tests against the current artifact to confirm that existing behavior still works after a change. In CD, regression testing is not a separate test type or a special suite. Every test in the pipeline is a regression test. The deterministic suite runs on every commit, and the full suite runs post-merge. If all tests pass, the artifact has been regression-tested.
Every test should answer a behavioral question - “given inputs x and y, will the result be z?” - not verify the sequence of internal calls that produce z. Avoid white-box testing that asserts on internals.

Additional concepts drawn from Ham Vocke, The Practical Test Pyramid, and Toby Clemson, Testing Strategies in a Microservice Architecture.
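The black-box principle can be sketched as follows (`apply_discount` is an illustrative name, not from this site): the tests pin down the result for given inputs and leave the internals free to change.

```python
# Unit under test: internal structure may be refactored at will,
# as long as the observable result holds.
def apply_discount(price, percent):
    return round(price * (1 - percent / 100), 2)

# Black-box unit tests: "given inputs x and y, is the result z?"
# No assertions on which internal helpers ran or in what order.
def test_given_x_and_y_result_is_z():
    assert apply_discount(200, 15) == 170.0

def test_zero_discount_returns_original_price():
    assert apply_discount(99.99, 0) == 99.99
```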
The child pages:

- **Component Tests**: Deterministic tests that verify a complete frontend component or backend service through its public interface, using test doubles for all external dependencies.
- **Contract Tests**: Deterministic tests that verify interface boundaries with external systems using test doubles. Also called narrow integration tests. Validated by integration tests running against real systems.
- **End-to-End Tests**: Tests that exercise two or more real components up to the full system. Non-deterministic by nature; never a pre-merge gate.
- **Test Suite Speed**: Why test suite speed matters for developer effectiveness and how cognitive limits set the targets.
- **Integration Tests**: Tests that exercise real external dependencies to validate that contract test doubles still match reality. Non-deterministic; never a pre-merge gate.
- **Static Analysis**: Code analysis tools that evaluate non-running code for security vulnerabilities, complexity, and best practice violations.
- **Test Doubles**: Patterns for isolating dependencies in tests: stubs, mocks, fakes, spies, and dummies.
- **Unit Tests**: Fast, deterministic tests that verify a unit of behavior through its public interface, asserting on what the code does rather than how it works.
- **Glossary**: Definitions for testing terms as they are used on this site.