A service catalog tells you what services exist. A dependency map tells you what happens when one of them changes. The difference between those two questions is the difference between knowing you have a payment service and knowing what happens to your order flow when someone renames a field in its checkout response.
The gap between inventory and impact
Most platform teams arrive at the same moment: the catalog is humming, Backstage is set up, CODEOWNERS is populated, and someone on the payments squad renames a field in their OpenAPI spec. Two days later the on-call engineer is paged because order-processor is returning 500s on every checkout. The postmortem eventually finds the renamed field. The action item reads: "improve communication between squads."
Communication wasn't the problem. The dependency was undocumented — not in Backstage, not in any CODEOWNERS entry, and not in anyone's mental model. The field rename was a perfectly reasonable internal refactor. The consumer was a service owned by a different squad that had been quietly calling the old field name for eight months.
Consider a concrete version of this pattern: a platform team running 60 services across five squads. The payments squad adds a new checkout endpoint and, as part of a cleanup PR, renames chargeAmount to amount in the response body. Their OpenAPI spec is updated, CI passes, the change ships on a Tuesday. On Thursday morning, the order processing squad gets paged — their service is 500-ing on every checkout. The root cause takes 90 minutes to trace: order-processor was deserializing the checkout response by field name, the field name no longer exists, and nobody on either squad knew the dependency existed because it was never declared anywhere.
This is the gap between a service catalog and a dependency map. A catalog answers "who owns service X?" A dependency map answers "if I change contract Y in service X, which other services break, degrade, or remain unaffected?" The catalog couldn't have prevented that incident. A dependency map would have surfaced the consumer at PR time.
Why runtime instrumentation isn't the answer
The most common approach to closing this gap is runtime service mesh telemetry: instrument everything, collect call graphs from production traffic, build the dependency map from observed behavior. Istio, Linkerd, and distributed tracing platforms can all reconstruct call topology from live traffic.
There are three problems with using runtime data as your primary dependency map:
- It's reactive, not predictive. The dependency graph reflects what has happened in production, not what will happen when you deploy a breaking change. If a consumer calls a deprecated endpoint infrequently — say, a reconciliation batch job that runs on the first of each month — it may not appear in recent traffic samples. You won't discover it until the job runs post-deploy.
- It captures call paths, not contract obligations. A service can call GET /payments/{id} successfully today even while consuming a field in the response body that you're about to remove. Runtime telemetry records the call path; it doesn't track which specific fields in a response body the consumer extracts and depends on. Those are two different dependency types, and only schema-level analysis captures the second.
- It requires production risk to build the map. By definition, you have to deploy a change before you know its downstream impact. A runtime-derived graph is only as complete as your observed traffic: undocumented consumers that haven't called recently are invisible until they break.
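To make the first problem concrete, here is a minimal Python sketch of how a traffic window can hide an infrequent consumer. The call log, the dates, the 14-day window, and the reconciliation-job name are all illustrative assumptions, not output from any real mesh or tracing tool.

```python
from datetime import date, timedelta

# Hypothetical observed calls: (caller, callee, day the call was last seen).
observed_calls = [
    ("order-processor", "payment-api", date(2024, 5, 20)),
    ("order-processor", "payment-api", date(2024, 5, 21)),
    ("reconciliation-job", "payment-api", date(2024, 5, 1)),  # monthly batch job
]

# Hypothetical edges declared in schema files, independent of traffic.
declared_edges = {
    ("order-processor", "payment-api"),
    ("reconciliation-job", "payment-api"),
}


def runtime_edges(calls, as_of, window_days):
    """Return the caller/callee pairs seen inside the sampling window."""
    cutoff = as_of - timedelta(days=window_days)
    return {(caller, callee) for caller, callee, seen in calls if seen >= cutoff}


recent = runtime_edges(observed_calls, as_of=date(2024, 5, 22), window_days=14)

# Edges that exist in the declared contracts but not in recent traffic.
invisible_at_runtime = declared_edges - recent
print(invisible_at_runtime)
```

With these sample values, the monthly job falls outside the window and only shows up in the declared edges.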
Schema-first static analysis: same graph, zero production risk
The alternative is static schema analysis: parse the contract declarations that already live in your repos and build the dependency graph from them before anything runs in production.
Every serious platform team already has the raw material. OpenAPI specs describe REST contract shapes. AsyncAPI files describe event schemas on Kafka or Pub/Sub topics. gRPC proto files describe the full RPC surface of each service. Internal SDK packages often ship type declaration files that represent the contract between a shared library and its callers.
Static analysis reads these files and constructs a directed graph: every node is a service, every edge is a declared contract dependency. When order-processor imports payment-api's generated OpenAPI client and deserializes the chargeAmount field, that dependency is a graph edge — captured in the schema import declaration, visible before any code runs.
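As a rough illustration of that graph, the sketch below stores declared dependencies as consumer/producer pairs and walks them to find everything downstream of a service. The storefront and receipt-mailer services and both function names are hypothetical; a real tool would derive the pairs from parsed schema files rather than a hard-coded list.

```python
from collections import defaultdict, deque

# Hypothetical declared dependencies as (consumer, producer) pairs.
DECLARED = [
    ("order-processor", "payment-api"),
    ("storefront", "order-processor"),
    ("receipt-mailer", "payment-api"),
]


def build_graph(pairs):
    """Map each producer to the set of services that consume its contract."""
    consumers_of = defaultdict(set)
    for consumer, producer in pairs:
        consumers_of[producer].add(consumer)
    return consumers_of


def downstream(graph, service):
    """Breadth-first walk: every service that directly or transitively depends on `service`."""
    seen, queue = set(), deque([service])
    while queue:
        current = queue.popleft()
        for consumer in graph.get(current, ()):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen


graph = build_graph(DECLARED)
affected = downstream(graph, "payment-api")
print(sorted(affected))
```

Keying the graph by producer keeps the common question ("who consumes this?") a single lookup, at the cost of a reverse index if you also need "what does this service depend on?".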
The critical property: this graph exists before anything deploys. When the payments engineer opens a PR that renames chargeAmount to amount, static analysis walks the graph, finds every consumer that has a declared reference to chargeAmount, and classifies the change. The impact report appears in the PR check before merge. The Thursday page and the 90-minute root-cause trace never happen.
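A PR-time check along these lines might look like the following sketch. It assumes field references have already been resolved per consumer; the CheckoutResponse schema name, the receipt-mailer service, and the report shape are assumptions made for illustration.

```python
# Hypothetical resolved references: (consumer, producer, schema) -> fields the consumer reads.
FIELD_REFS = {
    ("order-processor", "payment-api", "CheckoutResponse"): {"chargeAmount", "currency"},
    ("receipt-mailer", "payment-api", "CheckoutResponse"): {"currency"},
}


def impact_of_rename(producer, schema, old_field, refs):
    """Split consumers of a schema into those that read the renamed field and those that do not."""
    report = {"breaking": [], "unaffected": []}
    for (consumer, ref_producer, ref_schema), fields in sorted(refs.items()):
        if ref_producer != producer or ref_schema != schema:
            continue
        bucket = "breaking" if old_field in fields else "unaffected"
        report[bucket].append(consumer)
    return report


report = impact_of_rename("payment-api", "CheckoutResponse", "chargeAmount", FIELD_REFS)

# A CI check would fail (or require acknowledgement) when any consumer breaks.
check_passed = not report["breaking"]
print(report, check_passed)
```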
What a dependency map needs to be useful
Not all static analysis tools produce maps that are actionable in a platform engineering context. A useful dependency map has four properties that separate it from a theoretical graph:
Multi-schema coverage. Your services don't all speak OpenAPI. A payments service might expose a REST API, a gRPC billing RPC, and publish three Kafka event topics. A dependency map that only covers one schema type is blind to the others. Useful maps parse OpenAPI 3.x, AsyncAPI 2.x and 3.x, and proto3, plus at minimum the package.json or go.mod dependency declarations that stand in for internal SDK contracts.
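One way to picture multi-schema coverage is a routing step that decides which parser a file belongs to. The sketch below is a heuristic, not a validator: it relies on the top-level openapi and asyncapi version keys and the proto3 syntax declaration, and it only handles YAML or pretty-printed JSON where the key starts a line. The function name and the returned labels are hypothetical.

```python
import re
from pathlib import PurePosixPath


def detect_schema_kind(filename, text):
    """Guess which parser a contract file should be routed to."""
    name = PurePosixPath(filename).name.lower()
    if name.endswith(".proto"):
        is_proto3 = re.search(r'^\s*syntax\s*=\s*"proto3"\s*;', text, re.M)
        return "proto3" if is_proto3 else "proto-other"
    if name in ("package.json", "go.mod"):
        return "package-manifest"
    # OpenAPI 3.x documents carry a top-level `openapi: 3.x.y` key.
    if re.search(r'^\s*"?openapi"?\s*:\s*["\']?3\.', text, re.M):
        return "openapi-3"
    # AsyncAPI documents carry a top-level `asyncapi: 2.x.y` or `3.x.y` key.
    match = re.search(r'^\s*"?asyncapi"?\s*:\s*["\']?([23])\.', text, re.M)
    if match:
        return "asyncapi-" + match.group(1)
    return "unknown"


print(detect_schema_kind("specs/payment-api.yaml", "openapi: 3.0.3\n"))
print(detect_schema_kind("events/payments.yaml", 'asyncapi: "2.6.0"\n'))
print(detect_schema_kind("billing.proto", 'syntax = "proto3";\n'))
```

Files that come back as "unknown" are worth logging rather than dropping, since they mark contract surface the map cannot see yet.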
Consumer-side field-level resolution. An edge in the graph has two ends: the producer that defines the contract, and the consumer that depends on it. Static analysis has to resolve both sides — not just "who publishes this schema" but "which consuming services reference which specific fields from it." Field-level resolution is what separates a map that catches renames from one that only catches removed endpoints.
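Consumer-side resolution depends heavily on the consumer's language and client library. As one narrow illustration, the sketch below uses Python's standard ast module to collect string keys read from a response variable in consumer code. The consumer source, the variable name, and the assumption that the consumer reads fields by dictionary subscript are all hypothetical, and the slice handling assumes Python 3.9 or later.

```python
import ast

# Hypothetical consumer code that reads fields from a checkout response.
CONSUMER_SOURCE = '''
def handle_checkout(response):
    total = response["chargeAmount"]
    currency = response["currency"]
    return total, currency
'''


def referenced_fields(source, variable):
    """Collect string keys used as `variable["key"]` anywhere in the source."""
    fields = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id == variable
            and isinstance(node.slice, ast.Constant)  # Python 3.9+ slice layout
            and isinstance(node.slice.value, str)
        ):
            fields.add(node.slice.value)
    return fields


fields = referenced_fields(CONSUMER_SOURCE, "response")
print(sorted(fields))
```

Consumers that use generated clients or typed models would need a different extractor (attribute access, struct tags, and so on); the point is only that the field list is recoverable without running the service.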
Team ownership as a first-class node attribute. An impact result that says "order-processor will break" is useful. One that also says "order-processor is owned by the Checkout Platform squad, on-call is @squad-checkout-oncall, their SLA tier is P1" is actionable. The map should carry ownership metadata so the impacted team can be notified in the same CI check output, without anyone needing to open Backstage separately.
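A minimal sketch of ownership carried on the node itself, using the example values from the paragraph above. The ServiceNode type, its attribute names, and the output format are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceNode:
    """A graph node that carries ownership metadata alongside the service name."""
    name: str
    owner: str
    oncall: str
    sla_tier: str


def format_impact_line(node, reason):
    """Render one line of CI check output that names the impacted team."""
    return (
        f"{node.name} will break ({reason}); owner: {node.owner}, "
        f"on-call: {node.oncall}, SLA tier: {node.sla_tier}"
    )


node = ServiceNode(
    name="order-processor",
    owner="Checkout Platform squad",
    oncall="@squad-checkout-oncall",
    sla_tier="P1",
)
line = format_impact_line(node, "reads removed field chargeAmount")
print(line)
```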
Semantic schema diff as input. The map isn't useful in isolation — it becomes useful when combined with a proposed change diff. The workflow is: PR opens → diff is extracted from changed schema files → diff is evaluated against the dependency graph → impact report is generated. This requires the analysis engine to understand schema semantics, not just string diffs. A field rename has a different impact profile than a type change, which has a different profile than a field removal. The diff engine has to classify these correctly to score risk accurately.
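The classification step can be sketched as a comparison of two property maps. Rename detection here is a deliberately simple heuristic (exactly one removed field and exactly one added field of the same type); it is an assumption for illustration, not a description of how any particular diff engine works, and the orderId field is hypothetical.

```python
def classify_changes(old_props, new_props):
    """Classify field-level changes between two {field_name: type} maps."""
    removed = {f: t for f, t in old_props.items() if f not in new_props}
    added = {f: t for f, t in new_props.items() if f not in old_props}
    changes = []

    for old_name, old_type in sorted(removed.items()):
        same_type_removed = [f for f, t in removed.items() if t == old_type]
        candidates = sorted(f for f, t in added.items() if t == old_type)
        # Heuristic: an unambiguous one-for-one swap of the same type is a rename.
        if len(same_type_removed) == 1 and len(candidates) == 1:
            changes.append(("renamed", old_name, candidates[0]))
            del added[candidates[0]]
        else:
            changes.append(("removed", old_name, None))

    for name in sorted(added):
        changes.append(("added", name, None))

    for name in sorted(old_props.keys() & new_props.keys()):
        if old_props[name] != new_props[name]:
            changes.append(("type_changed", name, new_props[name]))

    return changes


old = {"chargeAmount": "number", "currency": "string", "orderId": "string"}
new = {"amount": "number", "currency": "string", "orderId": "integer"}
changes = classify_changes(old, new)
print(changes)
```

A string diff would report the same PR as a handful of changed lines; the classified form is what lets a risk score treat the rename and the type change differently.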
The catalog remains the right tool — for its question
We're not saying your service catalog is the wrong investment. The catalog is still the right place to document service purpose, SLAs, runbooks, and team contacts. Backstage in particular is well-suited for the human-navigable service inventory layer.
What the catalog is not designed to answer — and shouldn't be expected to answer — is "what will break if I make this change?" That question requires graph traversal against a current dependency snapshot, evaluated against a specific diff. A static documentation page can't do that regardless of how carefully it's maintained.
The practical integration: use your catalog as the source of truth for team ownership metadata, and feed that metadata into the nodes of your dependency map. When the map surfaces an impacted consumer, it resolves the owning team from catalog data and surfaces the contact information directly in the impact report. Both tools become more valuable in combination than either is alone.
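As a sketch of that integration, the snippet below joins impacted services against catalog entities that have already been loaded into dictionaries. The entity shape follows the metadata.name and spec.owner fields used in Backstage catalog descriptors; the sample entities, the function names, and the "unowned" fallback are assumptions.

```python
# Hypothetical catalog entities, already parsed from their descriptor files.
CATALOG_ENTITIES = [
    {
        "kind": "Component",
        "metadata": {"name": "order-processor"},
        "spec": {"owner": "checkout-platform"},
    },
    {
        "kind": "Component",
        "metadata": {"name": "payment-api"},
        "spec": {"owner": "payments"},
    },
]


def owners_from_catalog(entities):
    """Build a service-name -> owning-team lookup from catalog entities."""
    owners = {}
    for entity in entities:
        if entity.get("kind") != "Component":
            continue
        name = entity.get("metadata", {}).get("name")
        owner = entity.get("spec", {}).get("owner")
        if name and owner:
            owners[name] = owner
    return owners


def attach_owners(impacted_services, owners):
    """Pair each impacted service with its owner, flagging gaps in the catalog."""
    return [(service, owners.get(service, "unowned")) for service in impacted_services]


owners = owners_from_catalog(CATALOG_ENTITIES)
enriched = attach_owners(["order-processor", "receipt-mailer"], owners)
print(enriched)
```

Services that come back as "unowned" point at catalog gaps, which is useful feedback in the other direction.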
Getting started without a complete schema inventory
The most common objection to static dependency mapping is: "we don't have complete OpenAPI specs for all our services." This is usually true. It's also not a blocker — and treating it as one is the reason many teams defer this indefinitely.
Prioritize the services where schemas already exist and where the impact of breaking changes is highest: typically your core transactional services (payment, order, auth) and any service with more than three downstream consumers. A partial graph covering 60–70% of your services will catch the highest-risk changes. The remaining gap is not a failure state — it's a prioritized backlog of schema documentation work, each item backed by a concrete business case: "this service's consumers are invisible, which means any contract change is an undeclared blast radius."
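The prioritization above can be expressed as a small ranking function. The service list, the consumer counts, and the ordering rule (core transactional services first, then services with more than three downstream consumers, then by consumer count) are illustrative assumptions based on the guidance in this section.

```python
# Hypothetical inventory: which services have schemas and how many consumers each has.
SERVICES = [
    {"name": "payment-api", "has_schema": True, "consumers": 5, "core": True},
    {"name": "order-processor", "has_schema": True, "consumers": 2, "core": True},
    {"name": "auth-service", "has_schema": False, "consumers": 6, "core": True},
    {"name": "inventory", "has_schema": False, "consumers": 4, "core": False},
    {"name": "internal-wiki", "has_schema": False, "consumers": 0, "core": False},
]


def schema_coverage(services):
    """Fraction of services that already have a parseable schema."""
    return sum(1 for s in services if s["has_schema"]) / len(services)


def schema_backlog(services):
    """Services without schemas, highest-risk first."""
    missing = [s for s in services if not s["has_schema"]]
    ranked = sorted(
        missing,
        key=lambda s: (not s["core"], s["consumers"] <= 3, -s["consumers"], s["name"]),
    )
    return [s["name"] for s in ranked]


coverage = schema_coverage(SERVICES)
backlog = schema_backlog(SERVICES)
print(coverage, backlog)
```

Tracking the coverage number over time gives the week-over-week signal without waiting for the backlog to reach zero.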
The goal isn't a perfect graph before you start. It's a graph that improves pre-deploy visibility week over week, progressively reducing the frequency of "unknown downstream consumer" as a postmortem root cause. That's a measurable outcome, and it starts on day one of partial coverage, not when you've achieved 100%.
Buildpathio parses your existing OpenAPI, AsyncAPI, and proto files to build a dependency graph that updates on every PR. No runtime agents required.