Integration and Pipeline Problems

Code integration, merging, pipeline speed, and feedback loop problems.

Symptoms related to how code gets integrated, how the pipeline processes changes, and how fast the team gets feedback.

1 - Every Change Rebuilds the Entire Repository

A single repository with multiple applications and no selective build tooling. Any commit triggers a full rebuild of everything.

What you are seeing

The CI build takes 45 minutes for every commit because the pipeline rebuilds every application and runs every test regardless of what changed. The team chose a monorepo for good reasons - code sharing is simpler, cross-cutting changes are atomic, and dependency management is more coherent - but the pipeline has no awareness of what actually changed. Changing a comment in Service A triggers a full rebuild of Services B, C, D, and E.

Developers have adapted by batching changes to reduce the number of CI runs they wait through. One CI run per hour instead of one per commit. The batching reintroduces the integration problems the monorepo was supposed to solve: multiple changes combined in a single commit lose the ability to bisect failures to any individual change.

The build system treats the entire repository as a single unit. Service owners have added scripts to skip unmodified services, but the scripts are fragile and not consistently maintained. The CI system was not designed for selective builds, so every workaround is an unsupported hack on top of an ill-fitting tool.

Common causes

Missing deployment pipeline

Pipelines that understand which services changed - using build tools that model the dependency graph or change detection based on file paths - can selectively build and test only what was affected by a commit. Without this investment, pipelines treat the monorepo as a single unit and rebuild everything.

Tools like Nx, Bazel, or Turborepo provide dependency graph awareness for monorepos. A pipeline built on these tools builds only what needs to be rebuilt and runs only the tests that could be affected by the change. Feedback loops shorten from 45 minutes to 5.
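The core idea behind these tools can be sketched in a few lines. This is a minimal, illustrative sketch of path-based change detection, not how Nx or Bazel actually work internally: it assumes a hypothetical monorepo layout (`services/<name>/...`, shared code under `libs/`) and a hand-maintained dependency map, where real tools derive the graph automatically.

```python
# Minimal path-based change detection for a monorepo pipeline (illustrative).
# Assumes a hypothetical layout: services/<name>/... plus shared libs/ folders.
# Real tools (Nx, Bazel, Turborepo) derive this dependency graph automatically.

# Which shared libraries each service depends on (hand-maintained, hypothetical).
DEPENDS_ON = {
    "service-a": {"libs/auth"},
    "service-b": {"libs/auth", "libs/billing"},
    "service-c": set(),
}

def affected_services(changed_paths):
    """Return the set of services that must be rebuilt for these changed files."""
    affected = set()
    for path in changed_paths:
        for service, deps in DEPENDS_ON.items():
            # Direct change inside the service's own directory.
            if path.startswith(f"services/{service}/"):
                affected.add(service)
            # Change to a shared library the service depends on.
            elif any(path.startswith(dep + "/") for dep in deps):
                affected.add(service)
    return affected

print(sorted(affected_services(["libs/auth/token.py", "services/service-c/main.py"])))
# → ['service-a', 'service-b', 'service-c']
```

A change to `libs/auth` rebuilds only the two services that depend on it; a comment change in `service-c` rebuilds only `service-c`. Everything else is skipped.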

Read more: Missing deployment pipeline

Manual deployments

When deployment is manual, there is no automated mechanism to determine which services changed and which need to be deployed. Manual review decides what to deploy, which is slow and inconsistent. The inconsistency leads to either over-deploying (deploying everything to be safe) or under-deploying (missing services that changed).

Automated deployment pipelines with change detection deploy exactly the services that changed, with evidence of what changed and why.

Read more: Manual deployments

How to narrow it down

  1. Does the pipeline build and test only the services affected by a change? If every commit triggers a full rebuild, change detection is not implemented. Start with Missing deployment pipeline.
  2. How long does a typical CI run take? If it takes more than 10 minutes regardless of what changed, the pipeline is not leveraging the monorepo’s dependency information. Start with Missing deployment pipeline.
  3. Can the team deploy a single service from the monorepo without triggering deployments of all services? If not, deployment automation does not understand the monorepo structure. Start with Manual deployments.

Ready to fix this? The most common cause is Missing deployment pipeline. Start with its How to Fix It section for week-by-week steps.

2 - Feedback Takes Hours Instead of Minutes

The time from making a change to knowing whether it works is measured in hours, not minutes. Developers batch changes to avoid waiting.

What you are seeing

A developer makes a change and wants to know if it works. They push to CI and wait 45 minutes for the pipeline. Or they open a PR and wait two days for a review. Or they deploy to staging and wait for a manual QA pass that happens next week. By the time feedback arrives, the developer has moved on to something else.

The slow feedback changes developer behavior. They batch multiple changes into a single commit to avoid waiting multiple times. They skip local verification and push larger, less certain changes. They start new work before the previous change is validated, juggling multiple incomplete tasks.

When feedback finally arrives and something is wrong, the developer must context-switch back. The mental model from the original change has faded. Debugging takes longer because the developer is working from memory rather than from active context. If multiple changes were batched, the developer must untangle which one caused the failure.

Common causes

Inverted Test Pyramid

When most tests are slow E2E tests, the test feedback loop is measured in tens of minutes rather than seconds. Unit tests provide feedback in seconds; E2E tests take minutes or hours. A team with a fast unit test suite can verify a change in under a minute. A team that relies on E2E tests cannot get feedback faster than those tests can run.

Read more: Inverted Test Pyramid

Integration Deferred

When the team does not integrate frequently (at least daily), the feedback loop for integration problems is as long as the branch lifetime. A developer working on a two-week branch does not discover integration conflicts until they merge. Daily integration catches conflicts within hours. Continuous integration catches them within minutes.

Read more: Integration Deferred

Manual Testing Only

When there are no automated tests, the only feedback comes from manual verification. A developer makes a change and must either test it manually themselves (slow) or wait for someone else to test it (slower). Automated tests provide feedback in the pipeline without requiring human effort or scheduling.

Read more: Manual Testing Only

Long-Lived Feature Branches

When pull requests wait days for review, the code review feedback loop dominates total cycle time. A developer finishes a change in two hours, then waits two days for review. The review feedback loop is 24 times longer than the development time. Long-lived branches produce large PRs, and large PRs take longer to review. Fast feedback requires fast reviews, which requires small PRs, which requires short-lived branches.

Read more: Long-Lived Feature Branches

Manual Regression Testing Gates

When every change must pass through a manual QA gate, the feedback loop includes human scheduling. The QA team has a queue. The change waits in line. When the tester gets to it, days have passed. Automated testing in the pipeline replaces this queue with instant feedback.

Read more: Manual Regression Testing Gates

How to narrow it down

  1. How fast can the developer verify a change locally? If the local test suite takes more than a few minutes, the test strategy is the bottleneck. Start with Inverted Test Pyramid.
  2. How frequently does the team integrate to main? If developers work on branches for days before integrating, the integration feedback loop is the bottleneck. Start with Integration Deferred.
  3. Are there automated tests at all? If the only feedback is manual testing, the lack of automation is the bottleneck. Start with Manual Testing Only.
  4. How long do PRs wait for review? If review turnaround is measured in days, the review process is the bottleneck. Start with Long-Lived Feature Branches.
  5. Is there a manual QA gate in the pipeline? If changes wait in a QA queue, the manual gate is the bottleneck. Start with Manual Regression Testing Gates.

Ready to fix this? The most common cause is Inverted Test Pyramid. Start with its How to Fix It section for week-by-week steps.


3 - Merging Is Painful and Time-Consuming

Integration is a dreaded, multi-day event. Teams delay merging because it is painful, which makes the next merge even worse.

What you are seeing

A developer has been working on a feature branch for two weeks. They open a pull request and discover dozens of conflicts across multiple files. Other developers have changed the same areas of the codebase. Resolving the conflicts takes a full day. Some conflicts are straightforward (two people edited adjacent lines), but others are semantic (two people changed the same function’s behavior in different ways). The developer must understand both changes to merge correctly.

After resolving conflicts, the tests fail. The merged code compiles but does not work because the two changes are logically incompatible. The developer spends another half-day debugging the interaction. By the time the branch is merged, the developer has spent more time integrating than they spent building the feature.

The team knows merging is painful, so they delay it. The delay makes the next merge worse because more code has diverged. The cycle repeats until someone declares a “merge day” and the team spends an entire day resolving accumulated drift.

Common causes

Long-Lived Feature Branches

When branches live for weeks or months, they accumulate divergence from the main line. The longer the branch lives, the more changes happen on main that the branch does not include. At merge time, all of that divergence must be reconciled at once. A branch that is one day old has almost no conflicts. A branch that is two weeks old may have dozens.

Read more: Long-Lived Feature Branches

Integration Deferred

When the team does not practice continuous integration (integrating to main at least daily), each developer’s work diverges independently. The build may be green on each branch but broken when branches combine. CI means integrating continuously, not running a build server. Without frequent integration, merge pain is inevitable.

Read more: Integration Deferred

Monolithic Work Items

When work items are too large to complete in a day or two, developers must stay on a branch for the duration. A story that takes a week forces a week-long branch. Breaking work into smaller increments that can be integrated daily eliminates the divergence window that causes painful merges.

Read more: Monolithic Work Items

How to narrow it down

  1. How long do branches typically live before merging? If branches live longer than two days, the branch lifetime is the primary driver of merge pain. Start with Long-Lived Feature Branches.
  2. Does the team integrate to main at least once per day? If developers work in isolation for days before integrating, they are not practicing continuous integration regardless of whether a CI server exists. Start with Integration Deferred.
  3. How large are the typical work items? If stories take a week or more, the work decomposition forces long branches. Start with Monolithic Work Items.

Ready to fix this? The most common cause is Long-Lived Feature Branches. Start with its How to Fix It section for week-by-week steps.

4 - Each Language Has Its Own Ad Hoc Pipeline

Services in five languages with five build tools and no shared pipeline patterns. Each service is a unique operational snowflake.

What you are seeing

The Java service has a Jenkins pipeline set up four years ago. The Python service has a GitHub Actions workflow written by a consultant. The Go service has a Makefile. The Node.js service deploys from a developer’s laptop. The Ruby service has no deployment automation at all. Each service’s pipeline is a separate discipline, maintained by whoever last touched it.

Onboarding a new engineer requires learning five different deployment systems. Fixing a security vulnerability in the dependency scanning step requires five separate changes across five pipeline definitions, each with different syntax. A compliance requirement that all services log deployment events requires five separate implementations, each time reinventing the pattern.

The team knows consolidation would help but cannot agree on a standard. The Java developers prefer their workflow. The Python developers prefer theirs. The effort to migrate any service to a common pattern feels risky because the current approach, however ad hoc, is known to work.

Common causes

Missing deployment pipeline

Without an organizational standard for pipeline design, each team or individual who sets up a service makes an independent choice based on personal familiarity. Establishing a standard pipeline pattern - even a minimal one - gives new services a starting point and gives existing services a target to migrate toward. Each service that adopts the standard is one fewer ad hoc pipeline to maintain separately.
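One way to see what "a standard pattern" buys you is to express it as shared stage structure with per-service details isolated in a small config. The service names and commands below are hypothetical stand-ins; the point is that every service runs the same stages, so an engineer who knows one pipeline knows them all.

```python
# Sketch of a standard pipeline pattern: one fixed stage sequence for every
# service, with per-service details confined to a small config. Service names
# and commands are hypothetical.

STANDARD_STAGES = ["build", "test", "scan_dependencies", "deploy"]

SERVICES = {
    "java-api":   {"build": "mvn package",    "test": "mvn test"},
    "py-worker":  {"build": "pip install .",  "test": "pytest"},
    "go-gateway": {"build": "go build ./...", "test": "go test ./..."},
}

def pipeline_for(service):
    """Expand the standard stages into concrete commands for one service."""
    cfg = SERVICES[service]
    # Stages every service shares verbatim; only build/test vary per language.
    defaults = {"scan_dependencies": "scan-tool .", "deploy": f"deploy-tool {service}"}
    return [(stage, cfg.get(stage, defaults.get(stage))) for stage in STANDARD_STAGES]

for stage, command in pipeline_for("py-worker"):
    print(f"{stage}: {command}")
```

A fix to the dependency scanning step now happens once, in the shared `defaults`, instead of five times across five pipeline definitions.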

Read more: Missing deployment pipeline

Knowledge silos

Each pipeline is understood only by the person who built it. Changes require that person. Debugging requires that person. When that person leaves, the pipeline becomes a black box that nobody wants to touch. The knowledge of “how the Ruby service deploys” is not shared across the team.

When pipeline patterns are standardized and documented, any team member can understand, debug, and improve any service’s pipeline. The knowledge is in the pattern, not in the person.

Read more: Knowledge silos

Manual deployments

Services that start with manual deployment accumulate automation piecemeal, in whatever form the person adding automation prefers. Without a standard, each automation effort produces a different result. The accumulation of five different automation approaches is harder to maintain than one standard approach applied to five services.

Read more: Manual deployments

How to narrow it down

  1. Does the team have a standard pipeline pattern that all services follow? If each service has a unique pipeline structure, start with establishing the standard. Start with Missing deployment pipeline.
  2. Can any engineer on the team deploy any service? If deploying a specific service requires the person who set it up, the pipeline knowledge is siloed. Start with Knowledge silos.
  3. Are there services with no deployment automation at all? Start with those services. Start with Manual deployments.

Ready to fix this? The most common cause is Missing deployment pipeline. Start with its How to Fix It section for week-by-week steps.

5 - Pull Requests Sit for Days Waiting for Review

Pull requests queue up and wait. Authors have moved on by the time feedback arrives.

What you are seeing

A developer opens a pull request and waits. Hours pass. A day passes. They ping someone in chat. Eventually, comments arrive, but the author has moved on to something else and has to reload context to respond. Another round of comments. Another wait. The PR finally merges two or three days after it was opened.

The team has five or more open PRs at any time. Some are days old. Developers start new work while they wait, which creates more PRs, which creates more review load, which slows reviews further.

Common causes

Long-Lived Feature Branches

When developers work on branches for days, the resulting PRs are large. Large PRs take longer to review because reviewers need more time to understand the scope of the change. A 300-line PR is daunting. A 50-line PR takes 10 minutes. The branch length drives the PR size, which drives the review delay.

Read more: Long-Lived Feature Branches

Knowledge Silos

When only specific individuals can review certain areas of the codebase, those individuals become bottlenecks. Their review queue grows while other team members who could review are not considered qualified. The constraint is not review capacity in general but review capacity for specific code areas concentrated in too few people.

Read more: Knowledge Silos

Push-Based Work Assignment

When work is assigned to individuals, reviewing someone else’s code feels like a distraction from “my work.” Every developer has their own assigned stories to protect. Helping a teammate finish their work by reviewing their PR competes with the developer’s own assignments. The incentive structure deprioritizes collaboration.

Read more: Push-Based Work Assignment

How to narrow it down

  1. Are PRs larger than 200 lines on average? If yes, the reviews are slow because the changes are too large to review quickly. Start with Long-Lived Feature Branches and the work decomposition that feeds them.
  2. Are reviews waiting on specific individuals? If most PRs are assigned to or waiting on one or two people, the team has a knowledge bottleneck. Start with Knowledge Silos.
  3. Do developers treat review as lower priority than their own coding work? If yes, the team’s norms do not treat review as a first-class activity. Start with Push-Based Work Assignment and establish a team working agreement that reviews happen before starting new work.

Ready to fix this? The most common cause is Long-Lived Feature Branches. Start with its How to Fix It section for week-by-week steps.

6 - The Team Resists Merging to the Main Branch

Developers feel unsafe committing to trunk. Feature branches persist for days or weeks before merge.

What you are seeing

The team agreed to try trunk-based development, but three sprints later everyone still has long-lived feature branches and “merge to trunk when the feature is done” is the informal rule. Branches live for days or weeks. When developers finally merge, there are conflicts that take hours to resolve. Everyone agrees this is a problem, but nobody knows how to break the cycle.

The core objection is safety: “I’m not going to push half-finished code to main.” This is a reasonable concern in the current environment. The main branch has no automated test suite that would catch regressions quickly. There is no feature flag infrastructure to let partially built features live in production in a dormant state. Trunk-based development feels reckless because the prerequisites for it are not in place.

The team is not wrong to feel unsafe. They are wrong to believe long-lived branches are safer. The longer a branch lives, the larger the eventual merge, the more conflicts, and the more risk concentrated into the merge event. The fear of merging to trunk is rational, but the response makes the underlying problem worse.

Common causes

Manual testing only

Without a fast automated test suite, merging to trunk means accepting unknown risk. Developers protect themselves by deferring the merge until they have done sufficient manual verification - which takes days. Teams with a fast automated suite that runs in minutes find the resistance dissolves. When a broken commit is caught in five minutes, committing to trunk stops feeling reckless and starts feeling like the obvious way to work.

Read more: Manual testing only

Manual regression testing gates

When a manual QA phase gates each release, trunk is never truly releasable. Merging to trunk does not mean the code is production-ready - it still has to pass manual testing. This reduces the psychological pressure to keep trunk releasable. The team does not feel the cost of a broken trunk immediately because it is not the signal they monitor.

When trunk is the thing that gates production, a broken trunk is a fire drill - every minute it is broken is a minute the team cannot ship. That urgency is what makes developers take frequent integration seriously. Without it, the resistance to committing to trunk has no natural counter-pressure.

Read more: Manual regression testing gates

Long-lived feature branches

Feature branch habits are self-reinforcing. Teams with ingrained feature branch practices have calibrated their workflows, tools, and feedback loops to the batching model. Switching to trunk-based development requires changing all of those workflows simultaneously, which is disorienting.

The habits that make long-lived branches feel safe - waiting to merge until the feature is complete, doing final testing on the branch, getting full review before touching trunk - are the same habits that keep the resistance alive. Small, deliberate workflow changes - reviewing smaller units, integrating while work is in progress, getting feedback from the pipeline rather than a gated review - reduce the resistance step by step rather than requiring an all-at-once mindset shift.

Read more: Long-lived feature branches

Monolithic work items

Large work items cannot be integrated to trunk incrementally without deliberate design. A story that takes three weeks requires either keeping a branch for three weeks, or learning to hide in-progress work behind feature flags, dark launch patterns, or abstraction layers. Without those techniques, large items force long-lived branches.

Decomposing work into smaller items that can be integrated to trunk in a day or two makes trunk-based development natural rather than effortful.
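The feature-flag technique mentioned above is simple in its smallest form. This is a minimal sketch with hypothetical flag and function names; real teams usually read flags from configuration or a flag service rather than a module-level dict.

```python
# Minimal feature-flag sketch: in-progress code merges to trunk and deploys,
# but stays dormant until the flag is turned on. Flag and function names are
# hypothetical; real flags normally come from config or a flag service.

FLAGS = {"new_checkout_flow": False}  # off in production while work continues

def is_enabled(flag):
    return FLAGS.get(flag, False)

def legacy_checkout(cart):
    return f"legacy checkout of {len(cart)} items"

def new_checkout(cart):
    # Half-finished code can live here safely: no user reaches it while
    # the flag is off, yet it is integrated to trunk every day.
    return f"new checkout of {len(cart)} items"

def checkout(cart):
    if is_enabled("new_checkout_flow"):
        return new_checkout(cart)
    return legacy_checkout(cart)

print(checkout(["book", "pen"]))  # → legacy checkout of 2 items
```

With the flag off, merging daily is safe: the new path is dark. When the work is complete, shipping it is a flag flip, not a merge event.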

Read more: Monolithic work items

How to narrow it down

  1. Does the team have an automated test suite that runs in under 10 minutes? If not, the feedback loop needed to make frequent trunk commits safe does not exist. Start with Manual testing only.
  2. Is trunk always releasable? If releases require a manual QA phase regardless of trunk state, there is no incentive to keep trunk releasable. Start with Manual regression testing gates.
  3. Do work items typically take more than two days to complete? If items take longer than two days, integrating to trunk daily requires techniques for hiding in-progress work. Start with Monolithic work items.

Ready to fix this? The most common cause is Long-lived feature branches. Start with its How to Fix It section for week-by-week steps.

7 - Pipelines Take Too Long

Pipelines take 30 minutes or more. Developers stop waiting and lose the feedback loop.

What you are seeing

A developer pushes a commit and waits. Thirty minutes pass. An hour. The pipeline is still running. The developer context-switches to another task, and by the time the pipeline finishes (or fails), they have moved on mentally. If the build fails, they must reload context, figure out what went wrong, fix it, push again, and wait another 30 minutes.

Developers stop running the full test suite locally because it takes too long. They push and hope. Some developers batch multiple changes into a single push to avoid waiting multiple times, which makes failures harder to diagnose. Others skip the pipeline entirely for small changes and merge with only local verification.

The pipeline was supposed to provide fast feedback. Instead, it provides slow feedback that developers work around rather than rely on.

Common causes

Inverted Test Pyramid

When most of the test suite consists of end-to-end or integration tests rather than unit tests, the pipeline is dominated by slow, resource-intensive test execution. E2E tests launch browsers, spin up services, and wait for network responses. A test suite with thousands of unit tests (that run in seconds) and a small number of targeted E2E tests is fast. A suite with hundreds of E2E tests and few unit tests is slow by construction.

Read more: Inverted Test Pyramid

Snowflake Environments

When pipeline environments are not standardized or reproducible, builds include extra time for environment setup, dependency installation, and configuration. Caching is unreliable because the environment state is unpredictable. A pipeline that spends 15 minutes downloading dependencies because there is no reliable cache layer is slow for infrastructure reasons, not test reasons.
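The usual fix for unreliable dependency caching is to key the cache on the exact contents of the lockfile. This sketch shows the idea with a hypothetical lockfile; CI systems such as GitHub Actions provide the same mechanism natively (e.g. hashing lockfiles into cache keys).

```python
# Reproducible cache keys: hash the dependency lockfile so the pipeline can
# reuse an installed-dependency cache whenever dependencies are unchanged.
# The lockfile contents shown are hypothetical.

import hashlib

def cache_key(lockfile_contents: bytes, prefix: str = "deps") -> str:
    """Derive a stable cache key from the exact lockfile bytes."""
    digest = hashlib.sha256(lockfile_contents).hexdigest()[:16]
    return f"{prefix}-{digest}"

# Same lockfile -> same key -> cache hit; any dependency change -> new key.
key1 = cache_key(b"requests==2.31.0\n")
key2 = cache_key(b"requests==2.31.0\n")
key3 = cache_key(b"requests==2.32.0\n")
print(key1 == key2, key1 == key3)  # → True False
```

When the environment is reproducible and the key is deterministic, the 15-minute dependency download happens once per lockfile change instead of once per build.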

Read more: Snowflake Environments

Tightly Coupled Monolith

When the codebase has no clear module boundaries, every change triggers a full rebuild and a full test run. The pipeline cannot selectively build or test only the affected components because the dependency graph is tangled. A change to one module might affect any other module, so the pipeline must verify everything.

Read more: Tightly Coupled Monolith

Manual Regression Testing Gates

When the pipeline includes a manual testing phase, the wall-clock time from push to green includes human wait time. A pipeline that takes 10 minutes to build and test but then waits two days for manual sign-off is not a 10-minute pipeline. It is a two-day pipeline with a 10-minute automated prefix.

Read more: Manual Regression Testing Gates

How to narrow it down

  1. What percentage of pipeline time is spent running tests? If test execution dominates and most tests are E2E or integration tests, the test strategy is the bottleneck. Start with Inverted Test Pyramid.
  2. How much time is spent on environment setup and dependency installation? If the pipeline spends significant time on infrastructure before any tests run, the build environment is the bottleneck. Start with Snowflake Environments.
  3. Can the pipeline build and test only the changed components? If every change triggers a full rebuild, the architecture prevents selective testing. Start with Tightly Coupled Monolith.
  4. Does the pipeline include any manual steps? If a human must approve or act before the pipeline completes, the human is the bottleneck. Start with Manual Regression Testing Gates.

Ready to fix this? The most common cause is Inverted Test Pyramid. Start with its How to Fix It section for week-by-week steps.

8 - The Team Is Caught Between Shipping Fast and Not Breaking Things

A cultural split between shipping speed and production stability. Neither side sees how CD resolves the tension.

What you are seeing

The team is divided. Developers want to ship often and trust that fast feedback will catch problems. Operations and on-call engineers want stability and fewer changes to reason about during incidents. Both positions are defensible. The conflict is real and recurs in every conversation about deployment frequency, change windows, and testing requirements.

The team has reached an uncomfortable equilibrium. Developers batch changes to deploy less often, which partially satisfies the stability concern but creates larger, riskier releases. Operations accepts the change window constraints, which gives them predictability but means the team cannot respond quickly to urgent fixes. Nobody is getting what they actually want.

What neither side sees is that the conflict is a symptom of the current deployment system, not an inherent tradeoff. Deployments are risky because they are large and infrequent. They are large and infrequent because of the process and tooling around them. A system that makes deployments small, fast, automated, and reversible changes the equation: frequent small changes are less risky than infrequent large ones.

Common causes

Manual deployments

Manual deployments are slow and error-prone, which makes the stability concern rational. When deployments require hours of careful manual execution, limiting their frequency does reduce overall human error exposure. The stability faction’s instinct is correct given the current deployment mechanism.

Automated deployments that execute the same steps identically every time eliminate most human error from the deployment process. When the deployment mechanism is no longer a variable, the speed-vs-stability argument shifts from “how often should we deploy” to “how good is the code we are deploying” - a question both sides can agree on.

Read more: Manual deployments

Missing deployment pipeline

Without a pipeline with automated tests, health checks, and rollback capability, the stability concern is valid. Each deployment is a manual, unverified process that could go wrong in novel ways. A pipeline that enforces quality gates before production and detects problems immediately after deployment changes the risk profile of frequent deployments fundamentally.

When the team can deploy with high confidence and roll back automatically if something goes wrong, the frequency of deployments stops being a risk factor. The risk per deployment is low when each deployment is small, tested, and reversible.
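The deploy-check-rollback loop described above fits in a few lines. This is a sketch of the control flow only: the deploy, health-check, and rollback callables are hypothetical stand-ins for whatever the team’s actual tooling does.

```python
# Sketch of a deployment step with a post-deploy health check and automatic
# rollback. The deploy/healthy/rollback callables are hypothetical stand-ins
# for real tooling; only the control flow is the point here.

def deploy_with_rollback(version, deploy, healthy, rollback, previous_version):
    """Deploy `version`; roll back to `previous_version` if health checks fail."""
    deploy(version)
    if healthy():
        return f"deployed {version}"
    rollback(previous_version)
    return f"rolled back to {previous_version}"

# Simulated run: the new version fails its health check.
log = []
result = deploy_with_rollback(
    "v42",
    deploy=lambda v: log.append(f"deploy {v}"),
    healthy=lambda: False,  # pretend the new version is unhealthy
    rollback=lambda v: log.append(f"rollback {v}"),
    previous_version="v41",
)
print(result)  # → rolled back to v41
```

Because the rollback is automatic and immediate, a bad deployment costs minutes, not an incident. That is what lets deployment frequency stop being a risk factor.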

Read more: Missing deployment pipeline

Pressure to skip testing

When testing is perceived as an obstacle to shipping speed, teams cut tests to go faster. This worsens stability, which intensifies the stability faction’s resistance to more frequent deployments. The speed-vs-stability tension is partly created by the belief that quality and speed are in opposition - a belief reinforced by the experience of shipping faster by skipping tests and then dealing with the resulting production incidents.

Read more: Pressure to skip testing

Deadline-driven development

When velocity is measured by features shipped to a deadline, every hour spent on test infrastructure, deployment automation, or operational excellence is an hour not spent on the deadline. The incentive structure creates the tension by rewarding speed while penalizing the investment that would make speed safe.

Read more: Deadline-driven development

How to narrow it down

  1. Is the deployment process automated and consistent? If deployments are manual and variable, the stability concern is about process risk, not just code risk. Start with Manual deployments.
  2. Does the team have automated testing and fast rollback? Without these, deploying frequently is genuinely riskier than deploying infrequently. Start with Missing deployment pipeline.
  3. Does management pressure the team to ship faster by cutting testing? If yes, the tension is being created from above rather than within the team. Start with Pressure to skip testing.

Ready to fix this? The most common cause is Manual deployments. Start with its How to Fix It section for week-by-week steps.