Health Checks

ZinTrust ships health endpoints suitable for container orchestrators (Kubernetes, ECS, Nomad) and load balancers.

These routes are registered by registerHealthRoutes(...) in routes/health.ts:

GET /health — "overall" service health (includes a database ping)
GET /health/live — liveness (process is running)
GET /health/ready — readiness (dependencies are reachable and responding)

Which endpoint to use

Use the endpoints with intent:

/health/live answers: "Should this process be restarted?" It does not touch external dependencies.
/health/ready answers: "Should this instance receive traffic?" It probes dependencies.
/health is a simpler dependency-aware health check and is also used in some examples/containers.

In Kubernetes terms:

livenessProbe → /health/live
readinessProbe → /health/ready

Endpoint behavior (what ZinTrust actually does)

`GET /health`

Implementation summary:

Resolves the DB instance via useDatabase().
If the adapter supports it, it will call db.isConnected() and then db.connect() if needed.
Performs a DB liveness ping using QueryBuilder.ping(db).
Returns:
- 200 with { status: 'healthy', database: 'connected', ... } on success
- 503 with { status: 'unhealthy', database: 'disconnected', error: ... } on failure

Response shape (success):

json

{
  "status": "healthy",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "uptime": 123.456,
  "database": "connected",
  "environment": "development"
}

Response shape (failure):

json

{
  "status": "unhealthy",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "database": "disconnected",
  "error": "Service unavailable"
}

Notes:

environment is computed from Env.NODE_ENV ?? 'development'.
uptime uses process.uptime() when available, otherwise 0 (useful in edge runtimes).
On failure, the log line is emitted via Logger.error('Health check failed:', error).

`GET /health/live`

Implementation summary:

Returns process liveness only.
Never probes DB / cache.
Always returns 200.

Response shape:

json

{
  "status": "alive",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "uptime": 123.456
}

`GET /health/ready`

Implementation summary:

Uses appConfig.environment (not Env.NODE_ENV) for the environment field.
Probes the DB via QueryBuilder.ping(db) (including the same connect-if-needed logic as /health).
Optionally probes cache:
- Calls RuntimeHealthProbes.pingKvCache(2000).
- If it returns a number, cache is included under dependencies.
- If it returns null, the cache dependency is omitted.
Returns:
- 200 + status: 'ready' on success
- 503 + status: 'not_ready' on failure

Response shape (success):

json

{
  "status": "ready",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "environment": "development",
  "dependencies": {
    "database": { "status": "ready", "responseTime": 5 },
    "cache": { "status": "ready", "responseTime": 12 }
  }
}

Response shape (failure):

json

{
  "status": "not_ready",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "environment": "production",
  "dependencies": {
    "database": { "status": "unavailable", "responseTime": 100 },
    "cache": { "status": "unavailable", "responseTime": 100 }
  },
  "error": "Service unavailable"
}

Notes:

On failure, the route logs Logger.error('Readiness check failed:', error).
Cache is only included on failure when RuntimeHealthProbes.getCacheDriverName() === 'kv'.

Production behavior (error redaction)

ZinTrust intentionally reduces error detail in production-like environments.

/health: treats both production and prod as production.
/health/ready: treats production as production.

In production, error becomes a generic "Service unavailable". In non-production, error is the thrown error message.

This reduces information disclosure to unauthenticated callers and keeps probes safe to expose.

Example probe commands

bash

curl -sS http://localhost:7777/health | jq
curl -sS http://localhost:7777/health/live | jq
curl -sS http://localhost:7777/health/ready | jq

Kubernetes example

yaml

livenessProbe:
	httpGet:
		path: /health/live
		port: 3000
	initialDelaySeconds: 5
	periodSeconds: 10

readinessProbe:
	httpGet:
		path: /health/ready
		port: 3000
	initialDelaySeconds: 5
	periodSeconds: 10

Troubleshooting

If /health or /health/ready returns 503:

Check DB connectivity and credentials (and whether your adapter supports connect() / isConnected()).
Verify migrations have run (a DB can be reachable but unusable).
If you expect cache probing, confirm your cache driver is configured to kv so it will be probed/reported.

If probes are flapping:

Increase readiness probe timeouts in your orchestrator.
Consider warming connections at boot (see startup config validation and boot-time checks).

Health Checks ​

Which endpoint to use ​

Endpoint behavior (what ZinTrust actually does) ​

GET /health ​

GET /health/live ​

GET /health/ready ​

Production behavior (error redaction) ​

Example probe commands ​

Kubernetes example ​

Troubleshooting ​

Health Checks

Which endpoint to use

Endpoint behavior (what ZinTrust actually does)

`GET /health`

`GET /health/live`

`GET /health/ready`

Production behavior (error redaction)

Example probe commands

Kubernetes example

Troubleshooting