Heartbeat & Health Monitoring

The Heartbeat module provides health monitoring with threshold-based status transitions and proactive wake capabilities.

Overview

Monitor the health of your agents and services:

  • Health Checks - HTTP, TCP, and custom checks
  • Threshold-Based Status - Graceful degradation handling
  • Proactive Wake - Wake agents when work is pending
  • Event System - React to status changes

Installation

Import the module's exports from the heartbeat subpath of the package:

typescript
import {
  HeartbeatService,
  createHeartbeatService,
  HealthChecker,
  ProactiveWake,
} from '@aiversum/aiv-agents/heartbeat';

Quick Start

typescript
const heartbeat = createHeartbeatService({
  checkInterval: 30000, // 30 seconds
});

// Register a health check
heartbeat.registerCheck('api', {
  type: 'http',
  url: 'http://localhost:3000/health',
  timeout: 5000,
});

// Start monitoring
heartbeat.start();

// Listen for status changes
heartbeat.on('status:change', ({ component, status, previousStatus }) => {
  console.log(`${component}: ${previousStatus} → ${status}`);
});

Health Check Types

HTTP Check

typescript
heartbeat.registerCheck('api', {
  type: 'http',
  url: 'https://api.example.com/health',
  method: 'GET',
  timeout: 5000,
  expectedStatus: 200,
  headers: {
    'Authorization': 'Bearer token',
  },
});

TCP Check

typescript
heartbeat.registerCheck('database', {
  type: 'tcp',
  host: 'localhost',
  port: 5432,
  timeout: 3000,
});
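
To make concrete what a TCP check verifies, here is a minimal sketch of a connect-and-close probe built on Node's net module. It is an illustration under assumptions, not the module's actual implementation: tcpCheck is a hypothetical helper, and treating a timeout or any socket error as a failed check is a design choice made for this example.

```typescript
import * as net from 'node:net';

// Hypothetical helper: resolves to true if a TCP connection to host:port
// can be established within timeoutMs, and to false otherwise.
function tcpCheck(host: string, port: number, timeoutMs: number): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });

    // Close the socket and report the result. Resolving a promise more
    // than once is harmless, so late events are simply ignored.
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };

    socket.setTimeout(timeoutMs);               // fires 'timeout' if the socket stays idle
    socket.once('connect', () => finish(true)); // handshake completed
    socket.once('timeout', () => finish(false));
    socket.once('error', () => finish(false));  // e.g. ECONNREFUSED
  });
}
```

A probe like this only shows that the port accepts connections. It does not show that the service behind it can answer queries, which is where a custom check is the better fit.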

Custom Check

typescript
heartbeat.registerCheck('cache', {
  type: 'custom',
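  handler: async () => {
    // getRedisClient is an application-supplied helper, not part of this module
    const redis = await getRedisClient();
    const pong = await redis.ping();
    // Resolve to true when the check passes, false otherwise
    return pong === 'PONG';
  },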
});
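
A custom handler that never settles (for example, on a stalled connection) could hold up a check indefinitely. Whether the timeout option is applied to custom handlers is not stated here, so the sketch below bounds the handler explicitly. withTimeout is a hypothetical helper, and mapping both timeouts and thrown errors to false is an assumption made for this example.

```typescript
// Hypothetical helper: wraps a custom handler so that it resolves to false
// if it throws, rejects, or takes longer than timeoutMs.
function withTimeout(
  handler: () => Promise<boolean>,
  timeoutMs: number,
): () => Promise<boolean> {
  return () =>
    new Promise<boolean>((resolve) => {
      // Whichever settles first wins; later resolve() calls are no-ops.
      const timer = setTimeout(() => resolve(false), timeoutMs);
      Promise.resolve()
        .then(handler) // also captures synchronous throws
        .then(
          (ok) => resolve(ok),
          () => resolve(false),
        )
        .finally(() => clearTimeout(timer));
    });
}

// Usage with the registration shown above (sketch):
//   heartbeat.registerCheck('cache', {
//     type: 'custom',
//     handler: withTimeout(async () => (await redis.ping()) === 'PONG', 2000),
//   });
```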

Status Transitions

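Health status follows a threshold-based model: failures move a component from healthy to degraded (degradedThreshold) and then to unhealthy (failureThreshold), and successes (successThreshold) return it to healthy: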

          successThreshold
               reached
    ┌───────────────────────────┐
    │                           │
    ▼                           │
┌───────┐    failure    ┌──────────┐    failureThreshold    ┌───────────┐
│HEALTHY│──────────────▸│ DEGRADED │────────────────────────▸│ UNHEALTHY │
└───────┘               └──────────┘                         └───────────┘
    ▲                           │                                  │
    │                           │                                  │
    └───────────────────────────┴──────────────────────────────────┘
                            successThreshold reached

Configuration

typescript
heartbeat.registerCheck('api', {
  // ... check config ...

  // Thresholds
  failureThreshold: 3,   // Consecutive failures before the status becomes unhealthy
  successThreshold: 2,   // Consecutive successes before the status returns to healthy
  degradedThreshold: 1,  // Consecutive failures before the status becomes degraded
});
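
The three thresholds can be read as a small state machine over consecutive check results. The sketch below is one plausible interpretation, not the module's actual implementation: nextState, CheckState, and Thresholds are hypothetical names, and it assumes each counter resets to zero as soon as the opposite result occurs.

```typescript
type Status = 'healthy' | 'degraded' | 'unhealthy';

interface Thresholds {
  failureThreshold: number;
  successThreshold: number;
  degradedThreshold: number;
}

interface CheckState {
  status: Status;
  consecutiveFailures: number;
  consecutiveSuccesses: number;
}

// Computes the state after one more check result (ok = check passed).
function nextState(prev: CheckState, ok: boolean, t: Thresholds): CheckState {
  // Assumption: a success resets the failure streak and vice versa.
  const consecutiveFailures = ok ? 0 : prev.consecutiveFailures + 1;
  const consecutiveSuccesses = ok ? prev.consecutiveSuccesses + 1 : 0;

  let status = prev.status;
  if (ok) {
    if (consecutiveSuccesses >= t.successThreshold) status = 'healthy';
  } else if (consecutiveFailures >= t.failureThreshold) {
    status = 'unhealthy';
  } else if (consecutiveFailures >= t.degradedThreshold && status === 'healthy') {
    status = 'degraded';
  }
  return { status, consecutiveFailures, consecutiveSuccesses };
}

// Three failures followed by two successes, using the thresholds above.
const thresholds: Thresholds = { failureThreshold: 3, successThreshold: 2, degradedThreshold: 1 };
let state: CheckState = { status: 'healthy', consecutiveFailures: 0, consecutiveSuccesses: 0 };
const history: Status[] = [];
for (const ok of [false, false, false, true, true]) {
  state = nextState(state, ok, thresholds);
  history.push(state.status);
}
console.log(history.join(' -> '));
// prints "degraded -> degraded -> unhealthy -> unhealthy -> healthy"
```

Under this reading, a single success is not enough to leave the unhealthy state, which keeps a flapping service from being reported as recovered too early.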

Proactive Wake

Wake agents when there's pending work:

typescript
import { createProactiveWake } from '@aiversum/aiv-agents/heartbeat';

const wake = createProactiveWake({
  checkInterval: 60000, // Check every minute
});

// Register a work source (getUnreadEmailCount is an application-supplied function)
wake.registerSource('inbox', async () => {
  const count = await getUnreadEmailCount();
  return {
    hasPendingWork: count > 0,
    priority: count > 10 ? 'high' : 'normal',
    metadata: { count },
  };
});

// React to pending work
wake.on('work:detected', async ({ source, priority, metadata }) => {
  console.log(`Work detected from ${source}: ${metadata.count} items`);

  if (priority === 'high') {
    await startAgent('email-processor');
  }
});

wake.start();
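
Whether work:detected fires once per transition or on every poll while work remains pending is not specified here. If it can fire repeatedly, an asynchronous handler such as the one above might start the same agent more than once. The sketch below guards against overlapping starts. singleFlight is a hypothetical helper and not part of the module.

```typescript
// Hypothetical helper: while a call for a given key is still in flight,
// further calls with the same key share its promise instead of starting
// a second run.
function singleFlight<T>(
  fn: (key: string) => Promise<T>,
): (key: string) => Promise<T> {
  const inFlight = new Map<string, Promise<T>>();
  return (key) => {
    const existing = inFlight.get(key);
    if (existing) return existing;
    // Remove the entry once the run settles so later events can start a new one.
    const run = fn(key).finally(() => inFlight.delete(key));
    inFlight.set(key, run);
    return run;
  };
}

// Usage inside the handler shown above (sketch, startAgent is application code):
//   const startOnce = singleFlight(startAgent);
//   wake.on('work:detected', async ({ priority }) => {
//     if (priority === 'high') await startOnce('email-processor');
//   });
```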

Events

HeartbeatService Events

typescript
heartbeat.on('check:start', ({ component }) => {
  console.log(`Checking ${component}...`);
});

heartbeat.on('check:complete', ({ component, healthy, duration }) => {
  console.log(`${component}: ${healthy ? '✓' : '✗'} (${duration}ms)`);
});

heartbeat.on('status:change', ({ component, status, previousStatus }) => {
  if (status === 'unhealthy') {
    sendAlert(`${component} is unhealthy!`);
  }
});

heartbeat.on('error', ({ component, error }) => {
  console.error(`Error checking ${component}:`, error);
});

ProactiveWake Events

typescript
wake.on('work:detected', ({ source, priority, metadata }) => {
  // Handle pending work
});

wake.on('work:cleared', ({ source }) => {
  // Work queue is empty
});

wake.on('check:error', ({ source, error }) => {
  // Handle check failure
});

Full Configuration

typescript
interface HeartbeatConfig {
  // Timing
  checkInterval: number;     // ms between checks (default: 30000)
  startDelay?: number;       // Delay before first check

  // Logging
  logLevel?: 'debug' | 'info' | 'warn' | 'error';

  // Global thresholds (can be overridden per check)
  defaultFailureThreshold?: number;
  defaultSuccessThreshold?: number;
}

interface HealthCheckConfig {
  // Check type
  type: 'http' | 'tcp' | 'custom';

  // HTTP options
  url?: string;
  method?: string;
  headers?: Record<string, string>;
  expectedStatus?: number;
  expectedBody?: string | RegExp;

  // TCP options
  host?: string;
  port?: number;

  // Custom handler
  handler?: () => Promise<boolean>;

  // Common options
  timeout?: number;
  enabled?: boolean;

  // Thresholds
  failureThreshold?: number;
  successThreshold?: number;
  degradedThreshold?: number;
}
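
The expectedStatus and expectedBody options suggest that an HTTP response is judged on both its status code and its body. As a sketch of how such matching could work: responseMatches is a hypothetical helper, and the rules it applies (any 2xx passes when no expectedStatus is given, strings match as substrings, regular expressions match via test) are assumptions rather than documented behaviour.

```typescript
interface HttpExpectation {
  expectedStatus?: number;
  expectedBody?: string | RegExp;
}

// Hypothetical helper: decides whether a response satisfies the expectation.
function responseMatches(status: number, body: string, expect: HttpExpectation): boolean {
  // Assumption: without expectedStatus, any 2xx status counts as success.
  const statusOk =
    expect.expectedStatus !== undefined
      ? status === expect.expectedStatus
      : status >= 200 && status < 300;
  if (!statusOk) return false;

  const { expectedBody } = expect;
  if (expectedBody === undefined) return true;
  if (typeof expectedBody === 'string') return body.includes(expectedBody);

  // Reset lastIndex so a global or sticky RegExp behaves the same on every call.
  expectedBody.lastIndex = 0;
  return expectedBody.test(body);
}

console.log(responseMatches(200, '{"status":"ok"}', { expectedStatus: 200, expectedBody: /"status":"ok"/ }));
// prints "true"
console.log(responseMatches(200, '{"status":"down"}', { expectedBody: '"status":"ok"' }));
// prints "false"
```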

Getting Status

typescript
// Get single component status
const status = heartbeat.getStatus('api');
// {
//   component: 'api',
//   status: 'healthy' | 'degraded' | 'unhealthy',
//   lastCheck: Date,
//   lastSuccess: Date,
//   consecutiveFailures: number,
//   consecutiveSuccesses: number,
// }

// Get all statuses
const all = heartbeat.getAllStatuses();
// Map<string, ComponentStatus>

// Check if all healthy
const allHealthy = heartbeat.isHealthy();
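
The map returned by getAllStatuses can be folded into a single summary, for example to back a health endpoint of your own. The sketch below works on a plain Map with the status shape shown above. summarize and ComponentStatusLike are hypothetical names, and reporting the worst component status as the overall status (with 503 only for unhealthy) is a design choice for this example.

```typescript
type Status = 'healthy' | 'degraded' | 'unhealthy';

// Only the fields this sketch needs from the status object shown above.
interface ComponentStatusLike {
  component: string;
  status: Status;
}

const SEVERITY: Record<Status, number> = { healthy: 0, degraded: 1, unhealthy: 2 };

// Hypothetical helper: overall status is the worst individual status.
function summarize(statuses: Map<string, ComponentStatusLike>) {
  let overall: Status = 'healthy';
  const failing: string[] = [];
  for (const [name, s] of statuses) {
    if (s.status !== 'healthy') failing.push(name);
    if (SEVERITY[s.status] > SEVERITY[overall]) overall = s.status;
  }
  // Assumption: a degraded system still serves traffic, so it reports 200.
  return { overall, failing, httpStatus: overall === 'unhealthy' ? 503 : 200 };
}

const summary = summarize(
  new Map<string, ComponentStatusLike>([
    ['api', { component: 'api', status: 'healthy' }],
    ['database', { component: 'database', status: 'degraded' }],
  ]),
);
console.log(summary.overall, summary.httpStatus, summary.failing);
// prints "degraded 200 [ 'database' ]"
```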

Integration Example

Combining heartbeat with other modules:

typescript
import { createHeartbeatService } from '@aiversum/aiv-agents/heartbeat';
import { createConfigManager } from '@aiversum/aiv-agents/config';

const config = createConfigManager({ configPath: './config.json' });
await config.load();

const heartbeat = createHeartbeatService({
  checkInterval: config.getValue('heartbeat.interval') || 30000,
});

// Register checks based on config
const endpoints = config.getValue('heartbeat.endpoints') || [];
for (const endpoint of endpoints) {
  heartbeat.registerCheck(endpoint.name, {
    type: 'http',
    url: endpoint.url,
    timeout: endpoint.timeout,
  });
}

// React to config changes
config.on('config:change', ({ changes }) => {
  const intervalChange = changes.find(c => c.path === 'heartbeat.interval');
  if (intervalChange) {
    heartbeat.setInterval(intervalChange.newValue);
  }
});

heartbeat.start();
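
Endpoint entries read from a config file are untyped, so a malformed entry would otherwise reach registerCheck unchanged. The sketch below validates the raw value before the registration loop. parseEndpoints, isHttpUrl, and the 5000 ms fallback timeout are assumptions introduced for illustration, not part of either module.

```typescript
interface EndpointConfig {
  name: string;
  url: string;
  timeout: number;
}

// Assumption: fall back to the 5000 ms timeout used in the HTTP examples.
const DEFAULT_TIMEOUT_MS = 5000;

function isHttpUrl(value: string): boolean {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // not parseable as an absolute URL
  }
}

// Hypothetical helper: keeps only entries with a non-empty name and an
// http(s) URL, and fills in a timeout where none (or an invalid one) is set.
function parseEndpoints(raw: unknown): EndpointConfig[] {
  if (!Array.isArray(raw)) return [];
  const endpoints: EndpointConfig[] = [];
  for (const item of raw) {
    if (typeof item !== 'object' || item === null) continue;
    const { name, url, timeout } = item as Record<string, unknown>;
    if (typeof name !== 'string' || name === '') continue;
    if (typeof url !== 'string' || !isHttpUrl(url)) continue;
    endpoints.push({
      name,
      url,
      timeout: typeof timeout === 'number' && timeout > 0 ? timeout : DEFAULT_TIMEOUT_MS,
    });
  }
  return endpoints;
}

// Usage with the loop above (sketch):
//   const endpoints = parseEndpoints(config.getValue('heartbeat.endpoints'));
```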

API Reference

See the HeartbeatService API Reference for complete documentation.

Testing

bash
# Run heartbeat module tests
npm test -- --grep "heartbeat"

# Test count: 92 tests

Released under the MIT License.