Terminal.skills
Skills/gatus
>

gatus

Expert guidance for Gatus, the lightweight, self-hosted health check and status page tool written in Go. Helps developers set up endpoint monitoring with conditions, alerting, and a beautiful status page — all configured via a single YAML file with no database required.

#uptime#health-check#status-page#self-hosted#lightweight
terminal-skillsv1.0.0
Works with:claude-codeopenai-codexgemini-clicursor
Source

Usage

$
✓ Installed gatus v1.0.0

Getting Started

  1. Install the skill using the command above
  2. Open your AI coding agent (Claude Code, Codex, Gemini CLI, or Cursor)
  3. Reference the skill in your prompt
  4. The AI will use the skill's capabilities automatically

Example Prompts

  • "Deploy the latest build to the staging environment and run smoke tests"
  • "Check the CI pipeline status and summarize any recent failures"

Documentation

Overview

Gatus, the lightweight, self-hosted health check and status page tool written in Go. Helps developers set up endpoint monitoring with conditions, alerting, and a beautiful status page — all configured via a single YAML file with no database required.

Instructions

Configuration

yaml
# config.yaml — Complete Gatus configuration
# Single YAML file defines all endpoints, conditions, and alerts.

# Storage (optional — for persistence across restarts)
storage:
  type: sqlite
  path: /data/gatus.db

# Web UI settings
web:
  port: 8080

# Alert providers
alerting:
  slack:
    webhook-url: "${SLACK_WEBHOOK_URL}"
    default-alert:
      enabled: true
      failure-threshold: 3             # Alert after 3 consecutive failures
      success-threshold: 2             # Recover after 2 consecutive successes
      send-on-resolved: true
  
  pagerduty:
    integration-key: "${PAGERDUTY_KEY}"
    default-alert:
      enabled: true
      failure-threshold: 5
      send-on-resolved: true

  email:
    from: "gatus@example.com"
    host: "smtp.example.com"
    port: 587
    username: "${SMTP_USER}"
    password: "${SMTP_PASS}"
    default-alert:
      enabled: false                    # Only enable on critical endpoints

# Endpoints to monitor
endpoints:
  # --- API Health ---
  - name: API Gateway
    group: backend
    url: "https://api.example.com/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 2000"       # Under 2 seconds
      - "[BODY].status == healthy"     # JSON body check
    alerts:
      - type: slack
      - type: pagerduty

  - name: Auth Service
    group: backend
    url: "https://api.example.com/auth/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 1000"

  # --- Frontend ---
  - name: Website
    group: frontend
    url: "https://example.com"
    interval: 60s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 3000"
      - "[BODY] contains Welcome"      # Verify page renders
    alerts:
      - type: slack

  # --- Database ---
  - name: PostgreSQL
    group: infrastructure
    url: "tcp://db.example.com:5432"
    interval: 30s
    conditions:
      - "[CONNECTED] == true"
    alerts:
      - type: slack
      - type: pagerduty

  # --- Redis ---
  - name: Redis
    group: infrastructure
    url: "tcp://redis.example.com:6379"
    interval: 15s
    conditions:
      - "[CONNECTED] == true"

  # --- DNS ---
  - name: DNS Resolution
    group: infrastructure
    url: "dns://8.8.8.8"
    dns:
      query-name: "example.com"
      query-type: "A"
    conditions:
      - "[DNS_RCODE] == NOERROR"
      - "[RESPONSE_TIME] < 500"

  # --- SSL Certificate ---
  - name: SSL Certificate
    group: security
    url: "https://example.com"
    interval: 1h
    conditions:
      - "[CERTIFICATE_EXPIRATION] > 720h"    # Alert if < 30 days

  # --- External Dependencies ---
  - name: Stripe API
    group: external
    url: "https://api.stripe.com/v1"
    interval: 5m
    conditions:
      - "[STATUS] == 401"               # Unauthenticated is expected (API is up)
      - "[RESPONSE_TIME] < 3000"

  # --- GraphQL ---
  - name: GraphQL API
    group: backend
    url: "https://api.example.com/graphql"
    method: POST
    headers:
      Content-Type: application/json
    body: '{"query": "{ __typename }"}'
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[BODY].data.__typename == Query"

Deployment

yaml
# docker-compose.yml — Self-hosted Gatus
version: "3.8"
services:
  gatus:
    image: twinproduction/gatus:latest
    ports:
      - "8080:8080"
    volumes:
      - ./config.yaml:/config/config.yaml
      - gatus-data:/data
    environment:
      - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL}
      - PAGERDUTY_KEY=${PAGERDUTY_KEY}
    restart: unless-stopped

volumes:
  gatus-data:
yaml
# Kubernetes deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gatus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gatus
  template:
    metadata:
      labels:
        app: gatus
    spec:
      containers:
        - name: gatus
          image: twinproduction/gatus:latest
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: config
              mountPath: /config
          envFrom:
            - secretRef:
                name: gatus-secrets
      volumes:
        - name: config
          configMap:
            name: gatus-config

Installation

bash
# Docker
docker run -p 8080:8080 -v $(pwd)/config.yaml:/config/config.yaml twinproduction/gatus

# Binary
go install github.com/TwiN/gatus/v5@latest

# Helm
helm repo add gatus https://twin.github.io/helm-charts
helm install gatus gatus/gatus

Examples

Example 1: Setting up Gatus for a microservices project

User request:

I have a Node.js API and a React frontend running in Docker. Set up Gatus for monitoring/deployment.

The agent creates the necessary configuration files based on patterns like # config.yaml — Complete Gatus configuration, sets up the integration with the existing Docker setup, configures appropriate defaults for a Node.js + React stack, and provides verification commands to confirm everything is working.

Example 2: Troubleshooting deployment issues

User request:

Gatus is showing errors in our deployment. Here are the logs: [error output]

The agent analyzes the error output, identifies the root cause by cross-referencing with common Gatus issues, applies the fix (updating configuration, adjusting resource limits, or correcting syntax), and verifies the resolution with appropriate health checks.

Guidelines

  1. Single YAML — Keep all monitoring in one config file; version control it alongside your infrastructure
  2. Group endpoints — Use groups (backend, frontend, infrastructure) for organized status page display
  3. Multiple conditions — Check status code AND response time AND body content; a 200 with wrong content is still broken
  4. Failure thresholds — Set failure-threshold to 2-3 to avoid false alarms from transient network blips
  5. Check dependencies — Monitor external services (Stripe, AWS) separately; know when the issue is upstream vs yours
  6. SSL monitoring — Check certificate expiration weekly; alert at 30 days to give time for renewal
  7. Lightweight deployment — Gatus uses ~15MB RAM; run it on any machine, even a Raspberry Pi
  8. SQLite for history — Enable SQLite storage for uptime history across restarts; no external database needed

Information

Version
1.0.0
Author
terminal-skills
Category
DevOps
License
Apache-2.0