Health Checks
ZinTrust ships health endpoints suitable for container orchestrators (Kubernetes, ECS, Nomad) and load balancers.
These routes are registered by registerHealthRoutes(...) in routes/health.ts:
- GET /health: "overall" service health (includes a database ping)
- GET /health/live: liveness (process is running)
- GET /health/ready: readiness (dependencies are reachable and responding)
Which endpoint to use
Use the endpoints with intent:
- /health/live answers: "Should this process be restarted?" It does not touch external dependencies.
- /health/ready answers: "Should this instance receive traffic?" It probes dependencies.
- /health is a simpler dependency-aware health check and is also used in some examples/containers.
In Kubernetes terms:
- livenessProbe → /health/live
- readinessProbe → /health/ready
Endpoint behavior (what ZinTrust actually does)
GET /health
Implementation summary:
- Resolves the DB instance via useDatabase().
- If the adapter supports it, it will call db.isConnected() and then db.connect() if needed.
- Performs a DB liveness ping using QueryBuilder.ping(db).
- Returns:
  - 200 with { status: 'healthy', database: 'connected', ... } on success
  - 503 with { status: 'unhealthy', database: 'disconnected', error: ... } on failure
Response shape (success):
{
  "status": "healthy",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "uptime": 123.456,
  "database": "connected",
  "environment": "development"
}
Response shape (failure):
{
  "status": "unhealthy",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "database": "disconnected",
  "error": "Service unavailable"
}
Notes:
- environment is computed from Env.NODE_ENV ?? 'development'.
- uptime uses process.uptime() when available, otherwise 0 (useful in edge runtimes).
- On failure, the log line is emitted via Logger.error('Health check failed:', error).
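
The /health flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ZinTrust source: the HealthDb interface, the checkHealth function, and its parameters are hypothetical names. The ping argument stands in for QueryBuilder.ping(db), nodeEnv for Env.NODE_ENV, and uptimeSeconds for the process.uptime() value (or 0 where that is unavailable).

```typescript
// Hypothetical shapes for illustration only; the real ZinTrust types may differ.
interface HealthDb {
  isConnected?: () => boolean | Promise<boolean>;
  connect?: () => Promise<void>;
}

interface HealthResult {
  statusCode: 200 | 503;
  body: Record<string, unknown>;
}

async function checkHealth(
  db: HealthDb,
  ping: (db: HealthDb) => Promise<void>, // stand-in for QueryBuilder.ping(db)
  nodeEnv: string | undefined,           // stand-in for Env.NODE_ENV
  uptimeSeconds: number                  // process.uptime() when available, else 0
): Promise<HealthResult> {
  const environment = nodeEnv ?? "development";
  const timestamp = new Date().toISOString();
  try {
    // Connect only when the adapter exposes both methods and reports "not connected".
    if (db.isConnected && db.connect) {
      const connected = await db.isConnected();
      if (!connected) await db.connect();
    }
    await ping(db);
    return {
      statusCode: 200,
      body: { status: "healthy", timestamp, uptime: uptimeSeconds, database: "connected", environment },
    };
  } catch (error) {
    // /health treats both "production" and "prod" as production and hides the message.
    const isProd = environment === "production" || environment === "prod";
    const message = error instanceof Error ? error.message : String(error);
    return {
      statusCode: 503,
      body: {
        status: "unhealthy",
        timestamp,
        database: "disconnected",
        error: isProd ? "Service unavailable" : message,
      },
    };
  }
}
```

Passing the ping and environment in as arguments keeps the sketch self-contained; the actual route resolves them internally through useDatabase() and Env.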
GET /health/live
Implementation summary:
- Returns process liveness only.
- Never probes DB / cache.
- Always returns 200.
Response shape:
{
  "status": "alive",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "uptime": 123.456
}
GET /health/ready
Implementation summary:
- Uses appConfig.environment (not Env.NODE_ENV) for the environment field.
- Probes the DB via QueryBuilder.ping(db) (including the same connect-if-needed logic as /health).
- Optionally probes cache:
  - Calls RuntimeHealthProbes.pingKvCache(2000).
  - If it returns a number, cache is included under dependencies.
  - If it returns null, the cache dependency is omitted.
- Returns:
  - 200 + status: 'ready' on success
  - 503 + status: 'not_ready' on failure
Response shape (success):
{
  "status": "ready",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "environment": "development",
  "dependencies": {
    "database": { "status": "ready", "responseTime": 5 },
    "cache": { "status": "ready", "responseTime": 12 }
  }
}
Response shape (failure):
{
  "status": "not_ready",
  "timestamp": "2026-01-01T00:00:00.000Z",
  "environment": "production",
  "dependencies": {
    "database": { "status": "unavailable", "responseTime": 100 },
    "cache": { "status": "unavailable", "responseTime": 100 }
  },
  "error": "Service unavailable"
}
Notes:
- On failure, the route logs Logger.error('Readiness check failed:', error).
- Cache is only included on failure when RuntimeHealthProbes.getCacheDriverName() === 'kv'.
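
A minimal sketch of how the readiness response could be assembled from the rules above. The names checkReadiness, Dependencies, pingDb, and pingCache are hypothetical: pingDb stands in for QueryBuilder.ping(db), pingCache for RuntimeHealthProbes.pingKvCache(2000), and cacheDriverName for RuntimeHealthProbes.getCacheDriverName(). Reporting one elapsed time for every dependency on failure is an assumption based on the example payload, not confirmed behavior.

```typescript
interface DependencyStatus {
  status: "ready" | "unavailable";
  responseTime: number; // milliseconds
}

interface Dependencies {
  database: DependencyStatus;
  cache?: DependencyStatus;
}

interface ReadinessResult {
  statusCode: 200 | 503;
  body: {
    status: "ready" | "not_ready";
    timestamp: string;
    environment: string;
    dependencies: Dependencies;
    error?: string;
  };
}

async function checkReadiness(
  pingDb: () => Promise<void>,              // stand-in for QueryBuilder.ping(db)
  pingCache: () => Promise<number | null>,  // stand-in for pingKvCache(2000)
  cacheDriverName: string,                  // stand-in for getCacheDriverName()
  environment: string                       // stand-in for appConfig.environment
): Promise<ReadinessResult> {
  const timestamp = new Date().toISOString();
  const startedAt = Date.now();
  try {
    await pingDb();
    const dependencies: Dependencies = {
      database: { status: "ready", responseTime: Date.now() - startedAt },
    };
    // A number means the cache was probed; null means there is nothing to report.
    const cacheTime = await pingCache();
    if (cacheTime !== null) {
      dependencies.cache = { status: "ready", responseTime: cacheTime };
    }
    return { statusCode: 200, body: { status: "ready", timestamp, environment, dependencies } };
  } catch (error) {
    // Simplification: any failure (DB or cache) marks the reported dependencies unavailable.
    const elapsed = Date.now() - startedAt;
    const dependencies: Dependencies = {
      database: { status: "unavailable", responseTime: elapsed },
    };
    // On failure, cache is reported only when the driver is 'kv'.
    if (cacheDriverName === "kv") {
      dependencies.cache = { status: "unavailable", responseTime: elapsed };
    }
    const message = error instanceof Error ? error.message : String(error);
    // /health/ready treats only "production" as production.
    const publicError = environment === "production" ? "Service unavailable" : message;
    return {
      statusCode: 503,
      body: { status: "not_ready", timestamp, environment, dependencies, error: publicError },
    };
  }
}
```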
Production behavior (error redaction)
ZinTrust intentionally reduces error detail in production-like environments.
- /health: treats both production and prod as production.
- /health/ready: treats production as production.
In production, error becomes a generic "Service unavailable". In non-production, error is the thrown error message.
This reduces information disclosure to unauthenticated callers and keeps probes safe to expose.
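
The difference between the two endpoints can be made concrete with a small sketch. The function names isProductionLike and publicErrorMessage are hypothetical and only encode the rules listed above; they are not ZinTrust APIs.

```typescript
type Endpoint = "/health" | "/health/ready";

// Encodes the per-endpoint rule stated above (hypothetical helper).
function isProductionLike(endpoint: Endpoint, environment: string): boolean {
  if (endpoint === "/health") {
    return environment === "production" || environment === "prod";
  }
  return environment === "production";
}

// Returns the value the `error` field would carry in a failure response.
function publicErrorMessage(endpoint: Endpoint, environment: string, error: unknown): string {
  if (isProductionLike(endpoint, environment)) return "Service unavailable";
  return error instanceof Error ? error.message : String(error);
}
```

If the rules are exactly as listed, an environment named prod redacts errors on /health but not on /health/ready, so using the full name production keeps both endpoints redacted.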
Example probe commands
curl -sS http://localhost:3000/health | jq
curl -sS http://localhost:3000/health/live | jq
curl -sS http://localhost:3000/health/ready | jq
Kubernetes example
livenessProbe:
  httpGet:
    path: /health/live
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
Troubleshooting
If /health or /health/ready returns 503:
- Check DB connectivity and credentials (and whether your adapter supports connect()/isConnected()).
- Verify migrations have run (a DB can be reachable but unusable).
- If you expect cache probing, confirm your cache driver is set to kv so it will be probed/reported.
If probes are flapping:
- Increase readiness probe timeouts in your orchestrator.
- Consider warming connections at boot (see startup config validation and boot-time checks).
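
Before raising timeouts, it can help to measure how long the readiness endpoint actually takes. The sketch below is a standalone diagnostic, not part of ZinTrust; timedProbe, withTimeout, and summarize are hypothetical helpers, and the request itself is injected as getStatus so the timing logic does not depend on any particular HTTP client.

```typescript
interface ProbeSample {
  ok: boolean;           // true only for an HTTP 200 within the timeout
  status: number | null; // null when the request timed out or threw
  timedOut: boolean;
  elapsedMs: number;
}

// Resolves with null if `work` does not settle within `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T | null> {
  return new Promise<T | null>((resolve, reject) => {
    const timer = setTimeout(() => resolve(null), timeoutMs);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); }
    );
  });
}

// Runs one probe and records its status and duration.
async function timedProbe(
  getStatus: () => Promise<number>,
  timeoutMs: number
): Promise<ProbeSample> {
  const startedAt = Date.now();
  try {
    const status = await withTimeout(getStatus(), timeoutMs);
    const elapsedMs = Date.now() - startedAt;
    return { ok: status === 200, status, timedOut: status === null, elapsedMs };
  } catch {
    // Connection errors count as failures, not timeouts.
    return { ok: false, status: null, timedOut: false, elapsedMs: Date.now() - startedAt };
  }
}

// Aggregates samples so slow responses can be told apart from real failures.
function summarize(samples: ProbeSample[]): { ok: number; failed: number; timedOut: number; maxMs: number } {
  const ok = samples.filter((s) => s.ok).length;
  const timedOut = samples.filter((s) => s.timedOut).length;
  const maxMs = samples.reduce((max, s) => Math.max(max, s.elapsedMs), 0);
  return { ok, failed: samples.length - ok, timedOut, maxMs };
}
```

On a runtime with a global fetch (Node 18 or later, for example), getStatus could be () => fetch("http://localhost:3000/health/ready").then((r) => r.status). If most failures are timeouts rather than 503 responses, a longer probe timeout is more likely to help than dependency fixes.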