Set Up Uptime Monitoring with Slack and Email Alerts — AI Workflow

Deploy Uptime Kuma for self-hosted monitoring — HTTP, TCP, and DNS checks, Slack and email alerts, status pages for customers, and maintenance windows.

The Problem

A startup runs six services: a marketing site, a customer-facing API, a dashboard SPA, a webhook processing worker, a cron scheduler, and a PostgreSQL database. The team finds out about outages from customer complaints — a Slack message from a frustrated user saying "is the API down?" followed by 10 minutes of frantic investigation to figure out which service is actually broken.

Last month, the database went down at 2 AM due to disk space. Nobody noticed until 8 AM when the first developer logged in. Six hours of downtime, 340 failed webhook deliveries, and an angry enterprise customer who lost confidence in the platform. The CEO's email started with "how did we not know the database was down for six hours?"

The team tried a SaaS monitoring service, but $29/month for 50 monitors at 1-minute intervals adds up, and a hosted checker cannot reach internal services that aren't publicly accessible. They need something that monitors both public endpoints and internal services, sends alerts to Slack and email, and provides a customer-facing status page, all self-hosted on infrastructure they already pay for.

The Solution

Deploy Uptime Kuma — a self-hosted monitoring tool — using the uptime-kuma skill. It monitors HTTP endpoints, TCP ports, DNS records, and even Docker containers. Alerts go to Slack, email, and any webhook. A public status page shows customers real-time service health without exposing internal infrastructure details.

Step-by-Step Walkthrough

Step 1: Deploy Uptime Kuma

Deploy Uptime Kuma on our existing VPS for monitoring our 6 services. I need 
1-minute check intervals, Slack alerts for the engineering channel, email alerts 
for the on-call engineer, and a public status page for customers.

bash

docker run -d --name uptime-kuma --restart always \
  -p 3001:3001 \
  -v uptime-kuma-data:/app/data \
  louislam/uptime-kuma:1

Dashboard available at http://server-ip:3001. Set admin credentials on first visit.
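Before opening the dashboard, it can help to confirm that the container is actually answering. The following is a minimal sketch, not part of Uptime Kuma: `wait_until` and `kuma_responds` are hypothetical helper names, and the URL assumes the `-p 3001:3001` mapping above on the local host.

```shell
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_TRIES attempts have failed.
wait_until() {
  max_tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_tries" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Probe (assumption): succeeds once the dashboard port answers over HTTP.
# -f makes curl fail on HTTP 4xx/5xx; --max-time caps each attempt at 5 seconds.
kuma_responds() {
  curl -fsS -o /dev/null --max-time 5 "http://localhost:3001"
}
```

For example, `wait_until 30 2 kuma_responds && echo "Uptime Kuma is up"` polls every 2 seconds for up to 30 attempts.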

For production, put it behind Caddy or Nginx with HTTPS:

caddyfile

status.example.com {
    reverse_proxy localhost:3001
}

Step 2: Configure Monitors

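Add monitors from the dashboard using the Add New Monitor button (Uptime Kuma 1.x exposes no official REST API for creating monitors; scripted setup goes through its internal Socket.IO interface, typically via a community client):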

HTTP monitors — check that endpoints return 200:

Monitor	URL	Interval	Timeout
Marketing Site	`https://example.com`	60s	10s
API Health	`https://api.example.com/health`	60s	5s
Dashboard	`https://app.example.com`	60s	10s
Webhook Worker	`https://api.example.com/webhooks/health`	60s	5s

TCP monitors — check that ports are reachable:

Monitor	Host	Port	Interval
PostgreSQL	`db-server`	5432	60s
Redis	`redis-server`	6379	60s
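A TCP monitor only passes if the monitoring host can open a connection to the target, so it is worth testing reachability from that host first. This is a rough sketch: `tcp_reachable` is a hypothetical helper, and it assumes a netcat build that supports `-z` and `-w`.

```shell
# tcp_reachable HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection to HOST:PORT can be opened within the timeout.
# Assumes a netcat that supports -z (connect without sending data) and -w (timeout).
tcp_reachable() {
  nc -z -w "${3:-5}" "$1" "$2" >/dev/null 2>&1
}
```

For example, `tcp_reachable db-server 5432 && echo "PostgreSQL port is open"`. A successful connection shows that the port accepts connections, not that the database can answer queries.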

Keyword monitors — verify response contains expected content:

Monitor	URL	Expected keyword	Note
API Version	`https://api.example.com/health`	`"status":"ok"`	Catches 200 responses with error body

For the API health endpoint, a keyword check is more reliable than a simple HTTP check. A reverse proxy might return 200 even when the backend is down — the keyword check verifies the actual application is responding.
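The difference between a status-code check and a keyword check can be sketched in a few lines of shell. This illustrates the idea and is not Uptime Kuma's implementation; `body_has_keyword` and `check_health` are hypothetical helper names, and the match here is a literal substring match.

```shell
# body_has_keyword KEYWORD
# Reads a response body on stdin; succeeds only if KEYWORD appears verbatim.
body_has_keyword() {
  grep -qF -- "$1"
}

# check_health URL KEYWORD [TIMEOUT_SECONDS]
# Fails if the request errors out, times out, returns HTTP 4xx/5xx,
# or returns a body that does not contain KEYWORD.
check_health() {
  url=$1
  keyword=$2
  timeout=${3:-5}
  curl -fsS --max-time "$timeout" "$url" | body_has_keyword "$keyword"
}
```

For example, `check_health https://api.example.com/health '"status":"ok"' 5` fails when a proxy answers 200 with an error page, because the body lacks the keyword.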

Step 3: Set Up Notifications

Slack — alerts to #engineering-alerts channel:

Create a Slack Incoming Webhook at api.slack.com/apps
In Uptime Kuma: Settings → Notifications → Add → Slack
Paste the webhook URL
Test the notification
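If the test notification does not arrive, posting to the webhook directly helps separate a Slack problem from an Uptime Kuma problem. This is a sketch under assumptions: the helper names are hypothetical, and `SLACK_WEBHOOK_URL` is assumed to hold the Incoming Webhook URL created above.

```shell
# json_escape: escapes backslashes and double quotes (stdin -> stdout) so the
# text can sit inside a JSON string. Newlines and control characters are not handled.
json_escape() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# slack_payload TEXT: prints the JSON body an Incoming Webhook accepts.
slack_payload() {
  printf '{"text":"%s"}' "$(printf '%s' "$1" | json_escape)"
}

# send_slack_test TEXT: posts TEXT to the webhook in SLACK_WEBHOOK_URL (assumption).
send_slack_test() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "$1")" "$SLACK_WEBHOOK_URL"
}
```

For example, `send_slack_test "Test alert from the monitoring host"` should produce a message in #engineering-alerts.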

Email — for on-call engineer:

Settings → Notifications → Add → SMTP
Configure: smtp.gmail.com, port 587, STARTTLS (port 465 is the implicit-TLS alternative)
From: monitoring@example.com
To: oncall@example.com

Custom webhook — for PagerDuty or custom alerting:

Settings → Notifications → Add → Webhook
URL: https://events.pagerduty.com/v2/enqueue
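Custom body: a JSON template that maps {{ msg }} and {{ monitorJSON }} into the fields the receiver expects (PagerDuty's Events API v2 requires routing_key, event_action, and a payload with summary, source, and severity)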
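To see what the PagerDuty endpoint expects before writing the body template, a minimal Events API v2 trigger event can be built by hand. This is a sketch: `pagerduty_event` is a hypothetical helper, the severity is hard-coded to `critical`, and the routing key comes from your own PagerDuty integration.

```shell
# pagerduty_event ROUTING_KEY SUMMARY SOURCE
# Prints a minimal Events API v2 "trigger" event as JSON.
# Arguments are inserted as-is, so they must not contain double quotes or backslashes.
pagerduty_event() {
  printf '{"routing_key":"%s","event_action":"trigger","payload":{"summary":"%s","source":"%s","severity":"critical"}}' \
    "$1" "$2" "$3"
}
```

For example, with `PD_ROUTING_KEY` set (an assumed variable name): `pagerduty_event "$PD_ROUTING_KEY" "API Health is down" "uptime-kuma" | curl -fsS -X POST -H 'Content-Type: application/json' --data @- https://events.pagerduty.com/v2/enqueue`.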

Step 4: Create the Status Page

Uptime Kuma's built-in status page shows customers which services are operational without exposing internal details:

Go to Status Pages → Add
Title: "Platform Status"
Slug: status (accessible at status.example.com/status/status)
Add groups:
- Core Platform: API, Dashboard
- Website: Marketing site
- Integrations: Webhook processing

Each group shows a green/yellow/red indicator and uptime percentage. Customers see "API — Operational (99.97% uptime)" without knowing about your database server, Redis cache, or internal monitoring tools.
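One way to confirm that nothing internal leaks is to scan the data the public page serves for internal monitor names. This is a sketch: `mentions_any` is a hypothetical helper, and the usage example assumes that Uptime Kuma 1.x serves the page data as JSON at `/api/status-page/<slug>`, which you should verify against your version.

```shell
# mentions_any NAME...
# Reads text on stdin; succeeds if any NAME occurs in it as a literal substring.
mentions_any() {
  body=$(cat)
  for name in "$@"; do
    case $body in
      *"$name"*) return 0 ;;
    esac
  done
  return 1
}
```

For example, `curl -fsS https://status.example.com/api/status-page/status | mentions_any PostgreSQL Redis && echo "internal monitor exposed"` prints a warning only if one of the internal names shows up.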

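Custom domain: point status.example.com at the Uptime Kuma host, then add that hostname under Domain Names in the status page settings so the page is served at the root URL rather than only at /status/status.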

Step 5: Add Maintenance Windows

When deploying updates or running database migrations, set maintenance windows so monitoring doesn't fire false alerts:

Go to Maintenance → Add
Title: "Database maintenance"
Affected monitors: PostgreSQL, API Health
Schedule: one-time or recurring
Duration: 30 minutes

During maintenance, the status page shows "Scheduled Maintenance" instead of "Down", and no alerts fire. This prevents the 2 AM deploy from waking up the on-call engineer with false alarms.
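The suppression rule can be pictured as a half-open time interval: alerts are muted from the start time up to, but not including, start plus duration. The sketch below illustrates that logic only and is not Uptime Kuma's code; `in_maintenance` is a hypothetical helper that works on Unix epoch seconds.

```shell
# in_maintenance NOW START DURATION_MINUTES
# NOW and START are Unix epoch seconds. Succeeds if NOW falls inside
# the window [START, START + DURATION_MINUTES * 60).
in_maintenance() {
  now=$1
  start=$2
  end=$((start + $3 * 60))
  [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
}
```

For example, `in_maintenance "$(date +%s)" "$WINDOW_START" 30 || echo "alerting is active"`, where `WINDOW_START` is an assumed variable holding the window's start time.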

Step 6: Monitor Internal Services via Docker

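If Uptime Kuma runs on the same Docker network as your services, it can reach them by service name (for example `postgres` on port 5432), and with the Docker socket mounted it can also monitor containers directly: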

yaml

# docker-compose.yml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    restart: always
    ports:
      - "3001:3001"
    volumes:
      - uptime-kuma-data:/app/data
      - /var/run/docker.sock:/var/run/docker.sock:ro  # Docker monitoring
    networks:
      - internal

  api:
    image: myapp/api:latest
    networks:
      - internal

  postgres:
    image: postgres:16
    networks:
      - internal

networks:
  internal:

volumes:
  uptime-kuma-data:

With Docker socket access, Uptime Kuma can monitor container health status directly — not just port availability, but whether the container's health check passes.
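The health status a container reports can be inspected from the host with `docker inspect`, which is useful when a Docker monitor disagrees with what you expect. This is a sketch: `container_health` and `health_is_up` are hypothetical helpers, and a container only reports a health status if its image or Compose file defines a health check.

```shell
# container_health NAME
# Prints the container's health status (starting, healthy, or unhealthy),
# or "none" when the container defines no health check.
container_health() {
  docker inspect \
    --format '{{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' "$1"
}

# health_is_up STATUS
# Treats only "healthy" as up; starting, unhealthy, and none all count as not up.
health_is_up() {
  [ "$1" = "healthy" ]
}
```

For example, `health_is_up "$(container_health api)" || echo "api is not healthy"`, where `api` is the container name Docker assigned to the `api` service.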

Real-World Example

The ops lead deploys Uptime Kuma on a Friday afternoon: 15 minutes for Docker setup, 30 minutes to add the monitors and configure Slack and email notifications. The status page goes live at status.example.com with a link from the app's footer.

Monday at 3:17 AM, the Redis container runs out of memory and crashes. Within 60 seconds, Uptime Kuma detects that the TCP port is unreachable. The on-call engineer gets a Slack notification and an email simultaneously. They SSH in, check the Docker logs, increase the memory limit, and restart Redis. Total downtime: 8 minutes. During the incident the status page marked the affected services as degraded, without naming Redis, and returned to "Operational" once the checks passed again.

The following week, the team schedules a 20-minute maintenance window for a database migration at 2 AM. The status page shows "Scheduled Maintenance" starting at 1:55 AM. No alerts fire during the migration. Customers who check the status page see the maintenance notice instead of a scary red "Down" indicator.

After one month, the dashboard shows 99.97% uptime across all services. The single Redis incident is visible in the uptime graph, and the three maintenance windows are clearly marked. The CEO forwards the status page to the enterprise customer who complained about the 6-hour outage — "we've fixed this."

Related Skills

uptime-kuma -- Advanced Uptime Kuma configuration, API, and Docker monitoring
coolify -- Self-hosted deployment platform (can run Uptime Kuma)