Backstage and Developer Portals
The developer portal is not a convenience layer on top of CI/CD — it is the front door, the ownership registry, and the golden-path launcher that makes everything else discoverable. Backstage established the pattern; the question is whether your organisation has the engineering capacity to operate it, or whether a commercial alternative gets you 80% of the value with 20% of the maintenance burden.
Core Properties
| Property | Value |
|---|---|
| Origin | Spotify internal tool (~2016), open-sourced March 2020 |
| CNCF status | Incubating (accepted Sept 2020, promoted March 2022) |
| Language | TypeScript / React (frontend) + Node.js (backend) |
| Primary database | PostgreSQL (SQLite for local dev only) |
| Plugin count | 230+ official + community plugins (as of 2024) |
| Public adopters | 3,400+ organisations, 270+ publicly listed |
| Core features | Software Catalog, Scaffolder, TechDocs, Search, Plugins |
| Hosting options | Self-hosted (K8s), Roadie (managed), Spotify Portal (commercial) |
When to Use / Avoid
Use When
- You have 50+ engineers across multiple teams where service ownership is becoming opaque — “who owns this?” is a daily question.
- You want a catalogue-first foundation before investing in golden-path templates and runbooks — start with just catalog-info.yaml files.
- Your team has at least 2-3 TypeScript/React engineers willing to own the portal as a product, not a side project.
- You need deep customisation through plugin development — org-specific workflows, custom scorecards, internal tooling embedded in the portal.
- You are already operating in a CNCF/Kubernetes-native ecosystem and want integration with ArgoCD, Kubernetes dashboards, GitHub Actions, PagerDuty.
Avoid When
- Your team is under 30 engineers — the catalogue won’t pay for its operational overhead yet.
- You have no TypeScript capacity — Backstage is fundamentally a TypeScript/React monorepo that you fork and maintain. Without that skill, it becomes a perpetual project.
- You need something live in 4 weeks — Backstage takes 6-12 months to reach genuine production quality with real coverage. Use Port or OpsLevel for fast time-to-value.
- Your primary need is service maturity scoring / DORA metrics — Cortex or OpsLevel solve this more directly without the operational weight.
- You want a no-code configuration model — Backstage’s entire surface area is code.
Backstage Architecture
Backstage is not an off-the-shelf application. It is a framework — a React + Node.js monorepo you fork, customise, and operate. This distinction matters enormously when planning adoption.
```mermaid
graph TD
    Browser[Browser / Developer] --> FE[Frontend — React SPA]
    FE --> BE[Backend — Node.js]
    BE --> DB[(PostgreSQL)]
    BE --> P1[Catalog Plugin]
    BE --> P2[Scaffolder Plugin]
    BE --> P3[TechDocs Plugin]
    BE --> P4[Search Plugin]
    BE --> P5[Custom Plugins]
    P1 --> EP[Entity Providers<br/>GitHub / GitLab / AWS / Custom]
```
The frontend is a React single-page application composed of plugin UI contributions — each plugin registers its own routes, sidebar items, and entity tabs. The backend is a Node.js application that hosts plugin backends via the New Backend System (post-v1.24, 2023): plugins are declared via backend.add() rather than being wired together manually. The Plugin Registry is the mechanism that makes this composable.
Crucially, Backstage does not manage infrastructure — it is a metadata layer. It reads catalog-info.yaml files from your SCM, surfaces relationships, and provides a UI for navigating and acting on them. The actual infrastructure runs elsewhere.
Source: Backstage Architecture Overview — frontend/backend plugin composition and the New Backend System module pattern.
The New Backend System (post-2023)
Pre-2023, wiring up Backstage backend plugins required significant boilerplate — manual service injection, bespoke registration code. The New Backend System unified this into a declarative pattern:
```typescript
import { createBackend } from '@backstage/backend-defaults';

/**
 * Minimal Backstage backend setup using the New Backend System.
 * Each backend.add() call registers a plugin or module.
 * No manual dependency injection needed — the framework resolves it.
 */
const backend = createBackend();

backend.add(import('@backstage/plugin-catalog-backend'));
backend.add(import('@backstage/plugin-catalog-backend-module-github'));
backend.add(import('@backstage/plugin-scaffolder-backend'));
backend.add(import('@backstage/plugin-techdocs-backend'));
backend.add(import('@backstage/plugin-search-backend'));
backend.add(import('@backstage/plugin-auth-backend'));
backend.add(import('@backstage/plugin-auth-backend-module-guest-provider'));

backend.start();
```
Extension points let plugins expose customisation hooks (custom Scaffolder actions, custom catalog processors) without requiring consumers to fork plugin internals.
The Software Catalog — The Actual Product
The Software Catalog is not just a feature — it is the control plane that everything else in Backstage depends on. Scaffolder templates, TechDocs, permissions, and third-party plugins all reason over catalog entities. Stale or incomplete catalog data silently undermines every other investment.
Entity Kinds
```mermaid
graph LR
    Domain --> System
    System --> Component
    System --> Resource
    Component --> API
    User --> Group
    Group -->|ownsAll| Component
    Group -->|ownsAll| API
```
| Kind | What it represents | Key spec fields |
|---|---|---|
| Component | A deployable unit — service, website, library, ML model | type, lifecycle, owner, system |
| API | An interface contract — OpenAPI, gRPC, GraphQL, AsyncAPI | type, lifecycle, owner, definition |
| Resource | Infrastructure — databases, S3 buckets, GKE clusters | type, owner, system |
| System | A collection of related components + resources | owner, domain |
| Domain | A business domain grouping systems | owner |
| User | An individual person | profile, memberOf |
| Group | A team or org unit | type, parent, children, members |
The catalog-info.yaml Model in Depth
Every entity is declared in a catalog-info.yaml file, co-located with the source code it describes. The format deliberately mirrors Kubernetes manifests.
```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Handles checkout payment processing via Stripe and Adyen
  annotations:
    # Links Backstage to the GitHub repo for auto-discovery
    github.com/project-slug: acme/payment-service
    # Links the TechDocs build to this entity
    backstage.io/techdocs-ref: dir:.
    # Links to the PagerDuty service for on-call info
    pagerduty.com/service-id: P1234AB
  tags:
    - payments
    - critical-path
  links:
    - url: https://grafana.acme.com/d/payments
      title: Payments Dashboard
      icon: dashboard
spec:
  type: service
  lifecycle: production # sandbox | development | production | deprecated
  owner: group:payments-team
  system: checkout-system
  dependsOn:
    - component:order-service
    - resource:payments-postgres-db
  consumesApis:
    - stripe-payment-api
    - adyen-payment-api
  providesApis:
    - internal-payments-api
```
After the catalog processes this entity, its derived relations look like:
- ownedBy — payments-team owns this component
- partOf — this component is part of checkout-system
- consumesApi — this component calls stripe-payment-api
- dependsOn — runtime dependency on order-service
Why the model is the actual product: Once catalog-info.yaml files exist across all services with accurate owner, lifecycle, system, and dependency data, the following become possible automatically: blast-radius analysis for incidents, change-owner notifications, deprecated-dependency detection, and permission policies scoped to ownership. Without catalog quality, none of these work.
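To make the blast-radius claim concrete, here is a minimal sketch of the traversal involved, assuming catalog dependsOn relations have been loaded into a plain adjacency map (the entity names and the `DependencyGraph` shape are illustrative, not a Backstage API):

```typescript
// Edges point from a component to what it depends on. The blast radius of a
// failing entity is every component that transitively depends on it.
type DependencyGraph = Map<string, string[]>; // component -> dependsOn targets

function blastRadius(graph: DependencyGraph, failed: string): Set<string> {
  // Invert the edges: who depends directly on each entity?
  const dependents = new Map<string, string[]>();
  for (const [component, deps] of graph) {
    for (const dep of deps) {
      const list = dependents.get(dep) ?? [];
      list.push(component);
      dependents.set(dep, list);
    }
  }
  // BFS from the failed entity through the inverted edges.
  const impacted = new Set<string>();
  const queue = [failed];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const dependent of dependents.get(current) ?? []) {
      if (!impacted.has(dependent)) {
        impacted.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return impacted;
}

// Example mirroring the payment-service entity above:
const graph: DependencyGraph = new Map([
  ["payment-service", ["order-service", "payments-postgres-db"]],
  ["checkout-frontend", ["payment-service"]],
  ["order-service", []],
]);
```

An incident on order-service would flag payment-service directly and checkout-frontend transitively — which is exactly the query that fails silently when owner and dependency data in the catalog is stale.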
Source: Backstage Descriptor Format — full entity kinds and relations specification.
Entity Providers: Keeping the Catalog Fresh
Static catalog-info.yaml registration works for bootstrapping but does not scale. Entity providers run on a schedule and pull entities automatically:
- GitHub Discovery — crawls a GitHub org for all repos containing catalog-info.yaml and registers them automatically. Supports filtering by visibility and archival status.
- GitLab Discovery — same pattern; supports GitLab groups and subgroups.
- AWS S3 — discovers entities from S3 bucket prefixes.
- Custom providers — for internal registries, CMDB systems, cloud asset inventories.
The key operational question is: where does the truth live? The answer should always be “in the repo, in catalog-info.yaml, owned by the team that owns the service.” Providers that pull from external CMDBs or spreadsheets create a second source of truth and tend to drift.
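The filtering step that a GitHub-style discovery provider performs can be sketched as follows — keep only active repos that actually carry a catalog-info.yaml (the `Repo` shape and field names here are illustrative, not the provider's real API):

```typescript
// Simplified model of a repo as seen by a discovery crawl.
interface Repo {
  name: string;
  archived: boolean;
  visibility: "public" | "private" | "internal";
  files: string[]; // top-level file names
}

// Archived repos are skipped; only repos carrying a descriptor are registered.
function discoverCatalogRepos(repos: Repo[]): string[] {
  return repos
    .filter((r) => !r.archived)
    .filter((r) => r.files.includes("catalog-info.yaml"))
    .map((r) => r.name);
}
```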
Core Features
Software Templates (Scaffolder)
The Scaffolder is Backstage’s golden-path materialiser. A template defines the steps to create a new service: scaffold a repo from a skeleton, create a catalog-info.yaml, set up CI, configure secrets, register the entity in the catalog. Templates are YAML with embedded actions:
```yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: python-service-template
  title: Python Microservice
  description: Scaffolds a new Python service with FastAPI, CI, and observability defaults
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service Information
      required: [name, owner]
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]*$'
        owner:
          title: Owner Team
          type: string
          ui:field: OwnerPicker
  steps:
    - id: fetch-template
      name: Fetch Skeleton
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
    - id: publish
      name: Create GitHub Repo
      action: publish:github
      input:
        repoUrl: github.com?owner=acme&repo=${{ parameters.name }}
    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['publish'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
```
The Scaffolder enforces catalog discipline at creation time — every new service starts with a valid catalog-info.yaml including owner, lifecycle, and system placement. This is the single highest-leverage thing you can do to improve catalog quality.
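The parameters block of a template is JSON Schema, so a scaffolder run rejects invalid input before any step executes. A hand-rolled sketch of the two constraints the template above declares — required fields plus the name pattern (the function and error strings are illustrative, not scaffolder internals):

```typescript
// Mirrors the template's pattern: '^[a-z][a-z0-9-]*$'
const NAME_PATTERN = /^[a-z][a-z0-9-]*$/;

function validateParameters(input: { name?: string; owner?: string }): string[] {
  const errors: string[] = [];
  if (!input.name) {
    errors.push("name is required");
  } else if (!NAME_PATTERN.test(input.name)) {
    errors.push("name must match ^[a-z][a-z0-9-]*$");
  }
  if (!input.owner) {
    errors.push("owner is required");
  }
  return errors;
}
```

Rejecting `Payment_Service` at creation time is cheaper than cleaning up a misnamed repo, catalog entry, and CI pipeline afterwards.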
TechDocs
TechDocs implements docs-as-code: Markdown files live in the service repo alongside code, an MkDocs pipeline generates static HTML, and Backstage serves it in context alongside the catalog entity. Documentation is versioned with the service, not floating in a wiki that drifts.
The pipeline: markdown in repo → TechDocs builder (MkDocs) → static files published to cloud storage (GCS / S3) → TechDocs plugin serves them in the portal.
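Concretely, the repo side of that pipeline is just an mkdocs.yml next to a docs/ folder. A minimal sketch (site name and nav entries are illustrative):

```yaml
# mkdocs.yml at the repo root, alongside the docs/ directory
site_name: payment-service
plugins:
  - techdocs-core   # Backstage's MkDocs plugin bundle
nav:
  - Overview: index.md
  - Runbook: runbook.md
```

Combined with the backstage.io/techdocs-ref: dir:. annotation in catalog-info.yaml, this is all a team needs to get rendered docs on their entity page.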
Search
Backstage’s pluggable search indexes catalog entities, TechDocs pages, and any plugin that registers a search collator. The default backend uses Lunr (in-memory, no external dep) for development; production deployments should switch to Elasticsearch or OpenSearch for adequate performance above ~10K entities.
Plugins
The plugin ecosystem is where Backstage’s value compounds. A plugin is a frontend + backend pair — frontend registers UI contributions (entity tabs, sidebar items, cards), backend provides API endpoints. Notable plugins with direct production value:
| Plugin | What it adds |
|---|---|
| Kubernetes | Live pod status, rollout progress on entity pages |
| ArgoCD | Deployment sync status, application health per service |
| GitHub Actions | Recent workflow runs surfaced on entity page |
| PagerDuty | On-call schedule, recent incidents per service |
| Datadog / Grafana | Service dashboards embedded in entity view |
| Lighthouse | Automated accessibility audits linked to website entities |
| Cost Insights | Cloud cost surfaced per team / service |
| Tech Insights | Service maturity scorecards (Roadie-pioneered, now OSS) |
Backstage vs Commercial Alternatives
```mermaid
flowchart TD
    Q1{Do you have TypeScript<br/>engineers to own it?} -->|Yes| Q2{Need deep custom<br/>plugins or white-label?}
    Q1 -->|No| COM[Commercial portal<br/>Port / OpsLevel / Cortex]
    Q2 -->|Yes| BS[Self-hosted Backstage]
    Q2 -->|No — but want OSS base| RD[Roadie — hosted Backstage]
    COM --> Q3{Primary need?}
    Q3 -->|Fast onboarding<br/>no-code data model| PORT[Port]
    Q3 -->|Service maturity<br/>scorecards| CX[Cortex]
    Q3 -->|Lightweight catalog<br/>+ quick DX| OL[OpsLevel]
```
| Dimension | Backstage (self-hosted) | Roadie | Port | Cortex | OpsLevel |
|---|---|---|---|---|---|
| Setup time | 6–12 months to production quality | 2–4 weeks | 3–6 months | 4–8 weeks | 4–6 weeks |
| Engineering cost | 4+ FTE engineers ongoing | ~0.5 FTE | 1–2 FTE | ~0.5 FTE | ~0.5 FTE |
| Customisation | Unlimited (plugin code) | High (plugin UI, no infra) | High (Blueprints, no-code) | Medium | Medium |
| Data model | Fixed entity kinds + custom extensions | Same as Backstage | Fully custom Blueprints | Service-centric | Service-centric |
| Scorecards | Via Tech Insights plugin | Yes (Tech Insights) | Yes | First-class feature | Yes |
| Scaffolding | Yes (Scaffolder) | Yes | Yes (self-service actions) | No | Yes (Actions) |
| Price (200 devs) | ~$150K/yr engineering time | ~$52K/yr ($22/dev/mo) | ~$72K/yr ($30+/dev/mo) | ~$156K/yr ($65/dev/mo) | Lower |
| Lock-in risk | None (OSS) | Low (Backstage-based) | High (proprietary) | High (proprietary) | High (proprietary) |
| Best fit | Large eng orgs with platform team | Mid-size, want OSS without ops | Fast onboarding, no-code | Maturity scoring focus | Simple catalog + quick wins |
Source: Tasrie IT: Port vs Backstage vs Cortex comparison (2026) and Roadie: 7 Best Developer Portals — cost estimates, setup timelines, and capability comparison.
Adoption Maturity Curve
The single most common Backstage failure mode is attempting to deploy it with a full plugin suite and custom development on day one before any catalog quality exists. The model that works:
```mermaid
graph LR
    P1[Phase 1<br/>Catalog only] --> P2[Phase 2<br/>Scaffolder]
    P2 --> P3[Phase 3<br/>TechDocs + Plugins]
    P3 --> P4[Phase 4<br/>Custom plugins]
    P1 --- N1["catalog-info.yaml in every repo<br/>Entity providers for auto-discovery<br/>Owner + lifecycle + system fields"]
    P2 --- N2["Golden-path templates for new services<br/>Scaffolder enforces catalog standards<br/>Templates create repo + CI + catalog entry"]
    P3 --- N3["TechDocs co-located with code<br/>Kubernetes / ArgoCD / PagerDuty plugins<br/>Tech Insights scorecards"]
    P4 --- N4["Org-specific workflows as plugins<br/>Internal tooling embedded in portal<br/>Cost Insights, custom dashboards"]
```
Phase 1 is the foundation and the hardest. Getting catalog-info.yaml into every active repo with accurate owner, lifecycle, system, and at least one dependency relation requires a sustained campaign — incentives, automation, PR templates, linting. Without this, Phase 2 and beyond are built on sand.
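The linting part of that campaign can be a small CI check run against each repo's parsed catalog-info.yaml. A sketch (field names follow the descriptor format; the specific checks are one possible policy, not a Backstage feature):

```typescript
// Minimal parsed shape of a catalog-info.yaml for linting purposes.
interface CatalogInfo {
  kind: string;
  metadata: { name?: string; description?: string };
  spec: { owner?: string; lifecycle?: string; system?: string; dependsOn?: string[] };
}

// Returns a list of Phase 1 completeness problems; empty means the entity passes.
function catalogLint(entity: CatalogInfo): string[] {
  const problems: string[] = [];
  if (!entity.metadata.name) problems.push("metadata.name is missing");
  if (!entity.spec.owner) problems.push("spec.owner is missing");
  if (!entity.spec.lifecycle) problems.push("spec.lifecycle is missing");
  if (entity.kind === "Component" && !entity.spec.system) {
    problems.push("spec.system is missing — component not placed in a system");
  }
  return problems;
}
```

Failing the PR when this list is non-empty turns catalog quality from a campaign into a default.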
Phase 2 pays for Phase 1. Once Scaffolder templates exist, every new service starts with a correct catalog-info.yaml. Organic growth fills in the catalog without manual campaigns.
Phase 3 compounds the value. Plugins that surface Kubernetes state, ArgoCD sync status, and PagerDuty incidents in the context of a catalog entity are genuinely useful — developers stop switching between six dashboards. TechDocs co-located with code reduces documentation drift.
Phase 4 is where Backstage outcompetes commercial portals. Custom plugins for org-specific workflows — deployment approvals, internal certificate management, cost attribution, feature flag management — cannot be bought off-the-shelf. This is the moat.
Production Operations
Infrastructure Requirements
A production Backstage instance needs:
- PostgreSQL — required for all production deployments. SQLite is development-only. Each plugin gets its own schema within a shared Postgres instance (pluginDivisionMode: schema) or separate databases. The Catalog plugin alone can grow to millions of rows at scale.
- Authentication — OAuth / OIDC backed by your identity provider (Google Workspace, Okta, Azure AD, GitHub). Guest mode is for local dev only.
- Object storage — required for TechDocs static file hosting (GCS or S3). Without it, TechDocs pages are re-generated on every request.
- Search backend — Lunr is memory-resident and unsuitable above ~5K indexed documents. Switch to Elasticsearch or OpenSearch early.
- Container deployment — Backstage runs as a Docker container (frontend served by backend), deployed to Kubernetes. Plan for at least 2 replicas behind a load balancer.
Scaling Characteristics
A Backstage instance becomes performance-sensitive above roughly 15K entities in the catalog. Known scaling failure modes:
- Provider contention — multiple entity providers running simultaneously cause catalog processing queue saturation. Stagger provider schedules.
- Identity auth overhead — at 14K+ user entities, IdentityAuthInjectorFetchMiddleware can time out on slow identity provider responses.
- Multi-region read/write split — at global scale (multiple regions), teams split catalog read and write paths: primary writes go to Postgres in one region, read replicas serve catalog queries in others.
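Staggering provider schedules is just spreading start offsets evenly across the shared refresh interval so full-mutation runs never coincide. A sketch of the arithmetic (the function is illustrative; real deployments set these offsets in each provider's scheduler config):

```typescript
// Spread N providers evenly across one refresh interval.
// Returns the minute offset at which each provider's run should start.
function staggeredOffsets(providerCount: number, intervalMinutes: number): number[] {
  return Array.from({ length: providerCount }, (_, i) =>
    Math.floor((i * intervalMinutes) / providerCount),
  );
}
```

Four providers on a shared 60-minute refresh would start at minutes 0, 15, 30, and 45 instead of all hammering the catalog processing queue at once.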
Operational Cost Honesty
Running a mature Backstage instance with real adoption is a full-time product. Spotify’s own deployment has a dedicated team. External adopters consistently report needing 2-4 engineers minimum for: plugin upgrades (Backstage releases weekly), catalog quality campaigns, auth configuration, custom plugin development, and user support. The perpetual “we’re rolling out Backstage” syndrome is real — organisations that don’t staff it properly stall at Phase 1 indefinitely.
How Real Systems Use Backstage
Spotify — Origin and Scale
Backstage originated at Spotify around 2016 as an internal tool to solve a specific problem: ~2,000 engineers across ~2,000 microservices, and nobody could answer “who owns this?” or “what depends on this?” in under 5 minutes. The original system unified infrastructure tooling, services, documentation, and team pages into a single coherent frontend. Open-sourced in March 2020 via Stefan Ålund’s blog post “What the Heck Is Backstage Anyway?” — donated to CNCF in September 2020. Spotify reports 99% internal adoption. The internal instance serves as the canonical demonstration of what Phase 4 looks like: custom plugins for internal tooling, golden-path templates that create fully instrumented services, and a catalog that is the authoritative record of every service in production.
Source: Stefan Ålund, What the Heck Is Backstage Anyway?, Spotify Engineering, March 2020
American Airlines — “Runway” in 6 Minutes
American Airlines built their internal developer portal “Runway” on top of Backstage, starting development in May 2020 — almost immediately after Backstage’s open-source release. The headline metric: teams can deploy applications with public ingress in under 6 minutes using Runway’s golden-path templates. The portal covers the airline’s polyglot service estate. This is the case study that demonstrates the Scaffolder’s real-world value — not just discoverability, but accelerated time-to-production via templates that encode compliance, networking, and CI requirements up front.
Expedia Group — 5,000 Developers, 20,000 Services
Expedia Group adopted Backstage in 2020 as the foundation of their Developer Experience platform, scaling it to 5,000+ developers across 15+ brands (Expedia, Hotels.com, Vrbo, etc.) managing approximately 20,000 microservices. Their approach used GitHub Discovery for automatic catalog population across the entire organisation’s repos. Expedia’s scale validates the entity provider approach — manually registering 20,000 entities would be operationally impossible; the GitHub provider crawls and registers continuously. The case study also shows Phase 3 adoption: embedded Kubernetes plugin for pod health, CI/CD status surfaces directly on service pages.
Source: Roadie: Expedia Case Study — GitHub Discovery at 20K+ service scale.
Mercedes-Benz — Enterprise IDP at Automotive Scale
Mercedes-Benz Tech Innovation uses Backstage as the developer portal for their internal platform serving thousands of engineers across automotive software and cloud services adjacent to the MB.OS in-car operating system. Their implementation covers Software Catalog, TechDocs, Scaffolding, and custom plugins specific to the automotive software lifecycle — components that map to vehicle software domains alongside cloud services. The Mercedes-Benz case study is significant for a non-web-native, traditional enterprise context: platform engineering applied where the software estate includes both cloud-native services and automotive embedded software, demonstrating that the catalog model generalises beyond pure web services.
Source: CNCF Case Study: Mercedes-Benz
Netflix — Targeted Plugin Integration
Netflix did not adopt Backstage as its primary developer portal — their internal platform predates it. However, Netflix engineering built an internal Backstage plugin to surface real-time canary analysis results from their Atlas metrics system directly in entity pages. This illustrates the integration pattern for organisations with existing internal platforms: Backstage as the UI aggregation layer, not a wholesale replacement. Netflix’s canary analysis plugin pulls Atlas time-series data and renders it in context of a deployment entity — a workflow that would require significant custom development in any commercial portal.
LinkedIn — Catalog at Professional Network Scale
LinkedIn is a publicly listed Backstage adopter operating at a scale that stresses every component. LinkedIn’s engineering org spans thousands of services across their data-intensive platform (feeds, job matching, search). Their Backstage adoption focuses on the catalog as an ownership and dependency registry — understanding blast radius for their massive service graph, where a single foundational service can have hundreds of downstream dependents. At this scale, the entity graph becomes a critical operational tool during incidents, not just a discovery interface during development.
Anti-Patterns
Vanity portal syndrome. Deploying Backstage because the industry expects it, without a catalog quality campaign. The portal goes live with 40 entities, a handful of plugins, and 5% adoption. Nobody uses it; the platform team claims success by pointing at the deployment.
Stale catalog. Registering services manually without entity providers. Repos get created and never registered; registered services go stale as teams change and ownership shifts. A catalog where 30% of entries have incorrect owners is worse than no catalog — it actively misleads incident responders.
Plugin sprawl. Installing 40 plugins at launch because they are available. Each plugin adds frontend bundle weight and backend processing overhead. Start with 3-5 plugins that solve real daily pain. Earn the right to add more.
Treating Backstage as a CMS. Teams use TechDocs to create elaborate portals full of PDFs and embedded SharePoint links, rather than co-located Markdown. The result is indistinguishable from Confluence — a documentation graveyard, not a developer interface.
Perpetual rollout. The most common failure mode at scale. The platform team announces “we’re rolling out Backstage” and spends 18 months never finishing Phase 1. Root cause: no dedicated team, no adoption incentives, no scorecard measuring catalog completeness, no exec sponsor who cares. The fix is treating catalog completeness as an engineering metric with a named owner and a quarterly target.
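A catalog-completeness metric of the kind that fix requires is easy to compute once entities are queryable — for example, the fraction of registered components whose owner, lifecycle, and system fields are all set (the field subset chosen here is one reasonable policy, not a standard):

```typescript
// Minimal view of an entity for completeness scoring.
interface EntityFields {
  owner?: string;
  lifecycle?: string;
  system?: string;
}

// Fraction of entities with all three fields populated, in [0, 1].
function catalogCompleteness(entities: EntityFields[]): number {
  if (entities.length === 0) return 0;
  const complete = entities.filter(
    (e) => Boolean(e.owner) && Boolean(e.lifecycle) && Boolean(e.system),
  ).length;
  return complete / entities.length;
}
```

Published weekly per team, this number is what gives "finish Phase 1" a named owner and a quarterly target.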
References
- 📄 Stefan Ålund, What the Heck Is Backstage Anyway?, Spotify Engineering (2020) — Original open-source announcement with Spotify’s problem statement and vision
- 📄 Pia Nilsson, How We Use Backstage at Spotify (2020) — Spotify’s internal adoption journey and golden-path realisation
- 📖 Backstage Documentation — Descriptor Format — Complete entity kinds, relations, and catalog-info.yaml reference
- 📖 Backstage Architecture Overview — Plugin composition, New Backend System, frontend/backend split
- 🔗 CNCF Project: Backstage — CNCF status, metrics, governance
- 🔗 Roadie: Expedia Group Case Study — 5,000 developers, 20,000 microservices, GitHub Discovery at scale
- 🔗 CNCF Case Study: Mercedes-Benz — Enterprise IDP for automotive software at scale
- 🔗 Roadie — Hosted Backstage — Managed Backstage at ~$22/developer/month
- 🔗 Tasrie IT: Port vs Backstage vs Cortex comparison (2026) — Feature and cost comparison across portal options
- 🔗 The New Stack: Five Years In, Backstage Is Just Getting Started — 2025 state of adoption and community trajectory
- 🎥 QCon London 2024: Everything Is a Plugin — Backstage Architecture at Spotify and Beyond — Plugin architecture deep dive from the Backstage maintainer team
- 🔗 Port — Agentic Internal Developer Portal — Blueprints-based no-code alternative
- 🔗 Cortex — Service Maturity Scoring — Scorecard-first developer portal
- 🔗 OpsLevel — Prescriptive catalog and maturity tracking
- 🔗 Backstage Backend System Architecture — New Backend System module pattern reference