---
manifest_version: 1
service: MediRoutes Atlas — AI Snapshot API
last_updated: 2026-04-24
---

# MediRoutes Atlas — AI Snapshot API

You are connected to the **MediRoutes Atlas AI server**. Atlas is a read-only health and topology dashboard for the MediRoutes Azure environment — ~115 applications, their dependencies, metrics, restarts, and Application Insights telemetry, all aggregated into one snapshot. This server exposes that snapshot as JSON for AI agents to read.

You can use this API to:
- Find resources that are unhealthy or generating errors right now.
- Pull the top exceptions on a resource, with stack traces, to find the offending code in the linked source repo.
- Reason about blast radius via dependencies / dependents.
- Compare current state across resources or resource groups.

This API is **read-only**. There are no write endpoints.

## Authentication

Every endpoint under `/api/ai/*` requires a Personal API Token (PAT) in the `X-API-Key` header:

```
X-API-Key: atlas_pat_<base64url(32 bytes)>
```

The user generates the token in the main Atlas UI (Settings → AI → API Tokens) and gives it to you alongside the URL of this manifest. Tokens are per-user and revocable. If you receive `401 Invalid or revoked API token`, the user must generate a new one.

This manifest endpoint (`/` and `/manifest.md`) and `/healthz` are the only routes that do NOT require the header.
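
As an illustration of the header requirement, here is a minimal sketch using only the Python standard library. The base URL `https://atlas.example.invalid` and the helper names `build_request` and `get_json` are placeholders invented for this example; substitute the host that served this manifest.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the host that served this manifest.
BASE_URL = "https://atlas.example.invalid"


def build_request(path, token, base_url=BASE_URL):
    """Build a GET request that carries the PAT in the X-API-Key header."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"X-API-Key": token}, method="GET")


def get_json(path, token, base_url=BASE_URL):
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_request(path, token, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `get_json("/api/ai/status", token)` would then exercise the token without pulling any resource data.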

## Endpoints

| Method | Path | Returns | When to use |
|---|---|---|---|
| GET | `/api/ai/index` | Lightweight sitemap: every resource as `{key, displayName, type, resourceGroup, health, criticality, hasRepository, detailUrl}` | First call to discover what resources exist. Cheap (~10 KB). Poll this if you only need to know "what's new." |
| GET | `/api/ai/snapshot` | Full environment dump: every resource with metrics + restarts + App Insights telemetry + dependencies + dependents | One shot to ingest everything. Larger payload (~1 MB for ~115 resources). |
| GET | `/api/ai/resource/:key` | Deep detail for one resource (same shape as a single entry from `/snapshot`, but fetched on demand) | Drill into one resource without pulling the whole snapshot. Use the `key` field from `/index`. |
| GET | `/api/ai/status` | `{generatedAt, ttlSeconds, cacheStatus, resourceCount}` — current snapshot freshness | Check freshness without pulling data. |

All four are GET-only and idempotent.
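
A freshness check against the `/api/ai/status` payload might look like the sketch below. It assumes `generatedAt` is an ISO 8601 timestamp and that `ttlSeconds` is a number; neither format is confirmed by this manifest, so verify both against a real response. The function names are illustrative.

```python
from datetime import datetime, timezone


def snapshot_age_seconds(status, now):
    """Return the snapshot age in seconds, or None if generatedAt is null."""
    generated = status.get("generatedAt")
    if not generated:
        return None
    # Assumption: ISO 8601 string, possibly with a trailing "Z" for UTC.
    stamp = datetime.fromisoformat(generated.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)
    return (now - stamp).total_seconds()


def is_expired(status, now):
    """Treat the snapshot as expired when its age exceeds ttlSeconds.

    Missing or null values are treated as expired, to stay on the safe side.
    """
    age = snapshot_age_seconds(status, now)
    ttl = status.get("ttlSeconds")
    if age is None or ttl is None:
        return True
    return age > ttl
```

Pass `datetime.now(timezone.utc)` as `now` in real use; taking it as a parameter keeps the check easy to test.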

## Response shape — the highlights

You'll learn the full shape from the responses themselves; these are the fields that matter most:

**On every resource (`/index` and `/snapshot`):**
- `key` — opaque string identifier. Pass to `/api/ai/resource/:key`.
- `displayName`, `type`, `resourceGroup`, `region`, `tags` — basic identity.
- `criticality.level` — one of `critical | important | standard | unknown`. Derived from role + health + dependent count. Read `criticality.reason` for the human-readable why.
- `health.status` — one of `healthy | degraded | critical | unknown`.
- `repository.url` + `repository.platform` — link to the source repo (e.g. `azure-devops`, `github`). Use this to find code for a fix.

**On full snapshot entries only:**
- `metrics` — 1h window: `errorCount`, `errorRatePercent`, `avgResponseTimeMs`, `cpuAvgPercent`, `memoryAvgPercent`, `requestCount`. Any may be `null` if not available.
- `restarts` — 24h window: `count` plus `events[]` with timestamps + reasons.
- `applicationInsights.failedRequests.topExceptions[]` — up to 5 exception groups per resource, each with `type`, `count`, `sampleMessage`, `sampleStackTrace` (truncated to ~8 KB), `sampleOperation`, `firstSeen`, `lastSeen`. **This is the field to read when finding bugs.**
- `applicationInsights.workspaceIngestion` — workspace + daily cap status. Watch for `ingestionStatus: 'RespectQuota'` (data being dropped).
- `dependencies[]` — resources this one depends on.
- `dependents[]` — resources that depend on this one.

**On `/snapshot` top-level:**
- `summary.{resourceCount, healthyCount, degradedCount, criticalCount, totalErrorsLastHour}`.
- `cacheStatus` — `fresh | cached | stale-refreshing`.

Predictable nulls: when a field is unavailable (e.g., App Insights not configured for a resource), it's set to `null` rather than omitted. Treat a missing field and a `null` field the same way: check that the value is non-null before using it.
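
One way to apply the non-null rule consistently is a small nested accessor. The helper name `dig` is invented for this sketch; it returns a default whenever any step along the path is `null`, absent, or not an object.

```python
def dig(obj, *path, default=None):
    """Walk nested dicts by key; return default if any step is null or absent."""
    current = obj
    for key in path:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
        if current is None:
            return default
    return current
```

For example, `dig(resource, "applicationInsights", "failedRequests", "topExceptions", default=[])` yields an empty list for a resource whose `applicationInsights` is `null`, so downstream loops need no special casing.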

## Common workflows

### Find unhealthy resources

1. `GET /api/ai/snapshot`.
2. Filter `resources[]` where `health.status === 'critical' || health.status === 'degraded'`.
3. Sort by `criticality.level` (critical → important → standard).
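
The steps above can be sketched as follows, operating on an already-decoded `/api/ai/snapshot` response. Placing `unknown` criticality last is an assumption of this sketch, not something the API specifies.

```python
# Sort order from step 3; "unknown" last is an assumption of this sketch.
LEVEL_ORDER = {"critical": 0, "important": 1, "standard": 2, "unknown": 3}
UNHEALTHY = ("critical", "degraded")


def unhealthy_resources(snapshot):
    """Return degraded/critical resources, most critical first."""
    resources = snapshot.get("resources") or []
    bad = [
        r for r in resources
        if (r.get("health") or {}).get("status") in UNHEALTHY
    ]

    def rank(resource):
        level = (resource.get("criticality") or {}).get("level")
        return LEVEL_ORDER.get(level, LEVEL_ORDER["unknown"])

    # sorted() is stable, so ties keep their snapshot order.
    return sorted(bad, key=rank)
```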

### Find the code for a failing exception

1. `GET /api/ai/snapshot` (or `/api/ai/resource/:key` if you already know the resource).
2. For a resource with errors, read `applicationInsights.failedRequests.topExceptions[]`. Each entry has `type`, `sampleMessage`, `sampleStackTrace`, and `sampleOperation` (the HTTP route that triggered it).
3. Read `repository.url` for the repo URL. Combine the top frame of the stack trace with the repo URL to identify the file and method to inspect.
4. The `sampleOperation` (e.g. `PUT /api/integrations/foo`) is the route. Find the matching controller in the repo.
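
A rough sketch of this workflow for a single resource entry is shown below. The stack trace format is not specified by this manifest, so the `top_frame` heuristic (first line starting with `at `, otherwise the first non-empty line) is an assumption that suits .NET, Java, and JavaScript style traces and may need adjusting. Both function names are illustrative.

```python
def top_frame(stack_trace):
    """Best-effort guess at the top frame of a sample stack trace."""
    lines = [line.strip() for line in (stack_trace or "").splitlines() if line.strip()]
    for line in lines:
        if line.startswith("at "):  # assumption: frames are prefixed with "at "
            return line
    return lines[0] if lines else None


def exception_leads(resource):
    """Pair each top exception with the repo URL, route, and top frame."""
    insights = resource.get("applicationInsights") or {}
    failed = insights.get("failedRequests") or {}
    repo_url = (resource.get("repository") or {}).get("url")
    leads = []
    for exc in failed.get("topExceptions") or []:
        leads.append({
            "type": exc.get("type"),
            "count": exc.get("count"),
            "route": exc.get("sampleOperation"),
            "topFrame": top_frame(exc.get("sampleStackTrace")),
            "repositoryUrl": repo_url,
        })
    return leads
```

Each lead then carries what is needed to open the repository and search for the controller that serves `route`.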

### Triage by impact

1. `GET /api/ai/index`.
2. For each resource where `criticality.level === 'critical'`, check `health.status` to see whether it is unhealthy right now.
3. The intersection (critical importance + bad health) is the priority queue.
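
The intersection can be computed directly from the index entries, as in the sketch below. It takes the list of entries as input because the top-level wrapper of the `/api/ai/index` response is not described here, and it treats `degraded` and `critical` as "unhealthy", matching the filter used in the first workflow.

```python
def priority_queue(index_entries):
    """Return keys of resources that are critical in importance AND unhealthy."""
    queue = []
    for entry in index_entries or []:
        level = (entry.get("criticality") or {}).get("level")
        status = (entry.get("health") or {}).get("status")
        if level == "critical" and status in ("critical", "degraded"):
            queue.append(entry.get("key"))
    return queue
```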

### Cross-resource pattern detection

1. `GET /api/ai/snapshot`.
2. Aggregate `applicationInsights.failedRequests.topExceptions[].type` across resources.
3. Same exception type appearing on multiple resources usually means a shared library/service bug.
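
A minimal aggregation sketch for the steps above follows. The function name and the `min_resources` threshold are illustrative choices, not part of the API.

```python
from collections import defaultdict


def shared_exception_types(snapshot, min_resources=2):
    """Map each exception type to the resource keys it appears on.

    Only types seen on at least min_resources distinct resources are kept.
    """
    seen = defaultdict(set)
    for resource in snapshot.get("resources") or []:
        insights = resource.get("applicationInsights") or {}
        failed = insights.get("failedRequests") or {}
        for exc in failed.get("topExceptions") or []:
            exc_type = exc.get("type")
            if exc_type:
                seen[exc_type].add(str(resource.get("key")))
    return {
        exc_type: sorted(keys)
        for exc_type, keys in seen.items()
        if len(keys) >= min_resources
    }
```

Because each resource reports at most 5 exception groups, a type missing from this result may still exist on a resource below its top 5.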

## Limits & quirks

- **Cache TTL:** the snapshot is cached server-side for 10 minutes (configurable via `AI_SNAPSHOT_TTL_SECONDS`). Calls during a refresh return the previous cached value.
- **Time windows:** metrics are over the last 1 hour; restarts over the last 24 hours. These are fixed in v1.
- **Unknown health on VMSS resources:** the snapshot doesn't query VM Scale Set telemetry yet. Don't infer "VMSS is broken" from `health.status: unknown`.
- **No write operations.** Don't expect to mutate config, restart resources, or create incidents through this API. Future versions may add write operations under a different `manifest_version`.
- **Rate limiting:** none enforced today, but every call to `/snapshot` may trigger Azure Monitor / App Insights queries on cache miss. Don't poll `/snapshot` more than once per minute.
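
A client can enforce the one-minute floor on its own side. The sketch below wraps any fetch function (for example, one that calls `/api/ai/snapshot`) and reuses the last result until 60 seconds have passed. The class name `ThrottledSnapshot` and the injectable `clock` are choices made for this example, not part of the API.

```python
import time

MIN_INTERVAL_SECONDS = 60.0  # client-side floor for /snapshot polling


class ThrottledSnapshot:
    """Call fetch() at most once per MIN_INTERVAL_SECONDS; otherwise reuse."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch      # zero-argument callable returning the snapshot
        self._clock = clock      # injectable so tests need not sleep
        self._last_at = None
        self._last = None

    def get(self):
        now = self._clock()
        due = self._last_at is None or now - self._last_at >= MIN_INTERVAL_SECONDS
        if due:
            self._last = self._fetch()
            self._last_at = now
        return self._last
```

Since the server caches for 10 minutes by default, polling at the one-minute floor will mostly return the same snapshot; checking `/api/ai/status` first is cheaper.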

## Versioning

This manifest's `manifest_version` field at the top will increment when the API surface or response shapes change in ways that could break clients. v1 is the initial release. If you see `manifest_version: 2` and don't recognize new fields/endpoints, fall back to the v1 surface and show the user a "manifest version mismatch" warning.
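
The version check can be done by reading the front matter of this manifest, as sketched below. The parser handles only the simple `key: value` front matter shown at the top of this file rather than full YAML, and the function names and warning text format are illustrative.

```python
import re

SUPPORTED_MANIFEST_VERSION = 1


def manifest_version(manifest_text):
    """Read manifest_version from the leading '---' block; None if absent."""
    lines = manifest_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of front matter
            break
        match = re.fullmatch(r"manifest_version:\s*(\d+)", stripped)
        if match:
            return int(match.group(1))
    return None


def version_warning(manifest_text):
    """Return a warning string on mismatch, or None when the version matches."""
    found = manifest_version(manifest_text)
    if found == SUPPORTED_MANIFEST_VERSION:
        return None
    return (
        "manifest version mismatch: expected "
        f"{SUPPORTED_MANIFEST_VERSION}, found {found}"
    )
```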

## Errors

- `401 Missing X-API-Key header` — you didn't send the header. Add it.
- `401 Invalid or revoked API token` — token is wrong or revoked. Ask the user for a new one.
- `404 Resource not found` — the `:key` you passed to `/api/ai/resource/:key` doesn't match any resource. Use `/api/ai/index` to list valid keys.
- `503 AI tokens feature is not provisioned yet` — DB migration not run. Tell the user to contact whoever maintains Atlas.
- `500` with `detail` — server-side issue. Surface the detail to the user.
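
The list above translates into a small dispatch table. In the sketch below, the returned action labels are invented for illustration, and matching on the word "Missing" assumes the error message text is available to the caller exactly as listed; how the message is carried in the response body is not specified here.

```python
def next_step(status, message=""):
    """Map an error response to a suggested action label."""
    if status == 401:
        if "Missing" in (message or ""):
            return "add-x-api-key-header"
        return "ask-user-for-new-token"
    if status == 404:
        return "list-valid-keys-from-index"
    if status == 503:
        return "ask-user-to-contact-atlas-maintainer"
    if status == 500:
        return "surface-detail-to-user"
    return "unexpected-status"
```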
