[TERMINAL · SKILLS]
> mounting /skills...
> indexing 295 manifests...
> linking agents: claude · codex · gemini · cursor
> ready.
[░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 0%
Terminal.skills
Use Cases/Build a Tenant-Aware Job Scheduler

Build a Tenant-Aware Job Scheduler

Build a multi-tenant job scheduling system with cron expressions, per-tenant rate limits, execution isolation, retry policies, and a management dashboard — replacing scattered cron jobs with a centralized scheduler.

#redis#caching#database#pub-sub#queues
Works with:claude-codeopenai-codexgemini-clicursor

Skills stack · 5 skills

Avg quality 93/100·All SAFE
>

typescript

v

Not yet scored
View skill
>

redis

v1.0.0

Build applications with Redis — caching, session storage, pub/sub, streams, rate limiting, leaderboards, and queues. Use when tasks involve in-memory data storage, real-time messaging, distributed locking, or performance optimization with caching layers.

93/100 quality
1.81× impact
SAFE
View skill
>

postgresql

v1.0.0

Assists with designing schemas, writing performant queries, managing indexes, and operating PostgreSQL databases. Use when working with JSONB, full-text search, window functions, CTEs, row-level security, replication, or performance tuning. Trigger words: postgresql, postgres, sql, database, jsonb, rls, window functions, cte.

87/100 quality
1.53× impact
SAFE
View skill
>

hono

v1.0.0

You are an expert in Hono, the ultrafast web framework for the edge. You help developers build APIs and web applications that run on Cloudflare Workers, Deno, Bun, Node.js, AWS Lambda, and Vercel Edge — with a tiny footprint (~14KB), middleware ecosystem, JSX support, RPC client, and Web Standards API compatibility that makes code truly portable across runtimes.

93/100 quality
3.00× impact
SAFE
View skill
>

zod

v1.0.0

You are an expert in Zod, the TypeScript-first schema declaration and validation library. You help developers define schemas that validate data at runtime AND infer TypeScript types at compile time — eliminating the need to write types and validators separately. Used for API input validation, form validation, environment variables, config files, and any data boundary.

100/100 quality
1.21× impact
SAFE
View skill
$

The Problem

Felix leads platform at a 40-person SaaS. Customers need scheduled tasks — daily report emails, hourly data syncs, weekly cleanup jobs. Currently, developers add cron entries on the server for each customer request. There are 200+ cron jobs with no visibility, no retry logic, and no tenant isolation — one customer's heavy job delays everyone else's. When the server rebooted last month, 47 cron jobs didn't restart because they were added manually. They need a centralized scheduler with tenant fairness, execution history, and self-service management.

Step 1: Build the Scheduler Engine

typescript
// src/scheduler/engine.ts — Multi-tenant job scheduler with cron parsing
import { pool } from "../db";
import { Redis } from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

interface ScheduledJob {
  id: string;
  tenantId: string;
  name: string;
  schedule: string;          // cron expression: "0 9 * * 1" (Mondays 9AM)
  timezone: string;
  handler: string;           // "reports.daily", "sync.crm", "cleanup.old_files"
  payload: Record<string, any>;
  retryPolicy: { maxRetries: number; backoffMs: number };
  timeout: number;           // max execution time in seconds
  enabled: boolean;
  lastRunAt: string | null;
  nextRunAt: string;
  consecutiveFailures: number;
}

interface JobExecution {
  id: string;
  jobId: string;
  tenantId: string;
  status: "running" | "completed" | "failed" | "timed_out";
  startedAt: string;
  completedAt: string | null;
  durationMs: number | null;
  output: string | null;
  error: string | null;
  attempt: number;
}

// Parse cron expression and calculate next run time
function getNextRun(cronExpr: string, timezone: string, after: Date = new Date()): Date {
  const [minute, hour, dayOfMonth, month, dayOfWeek] = cronExpr.split(" ");
  
  // Simple cron parser (production would use a library like cron-parser)
  const next = new Date(after);
  next.setSeconds(0, 0);
  next.setMinutes(next.getMinutes() + 1);

  for (let i = 0; i < 527040; i++) { // max 1 year of minutes
    const m = next.getMinutes();
    const h = next.getHours();
    const dom = next.getDate();
    const mon = next.getMonth() + 1;
    const dow = next.getDay();

    if (matchesCronField(minute, m) && matchesCronField(hour, h) &&
        matchesCronField(dayOfMonth, dom) && matchesCronField(month, mon) &&
        matchesCronField(dayOfWeek, dow)) {
      return next;
    }
    next.setMinutes(next.getMinutes() + 1);
  }

  throw new Error(`Cannot find next run for cron: ${cronExpr}`);
}

function matchesCronField(field: string, value: number): boolean {
  if (field === "*") return true;
  if (field.includes(",")) return field.split(",").some((f) => matchesCronField(f, value));
  if (field.includes("-")) {
    const [start, end] = field.split("-").map(Number);
    return value >= start && value <= end;
  }
  if (field.includes("/")) {
    const [base, step] = field.split("/");
    const start = base === "*" ? 0 : Number(base);
    return (value - start) % Number(step) === 0 && value >= start;
  }
  return Number(field) === value;
}

// Tenant-aware rate limiting
const TENANT_CONCURRENT_LIMIT = 3; // max 3 concurrent jobs per tenant

async function canRunJob(tenantId: string): Promise<boolean> {
  const running = await redis.get(`scheduler:running:${tenantId}`);
  return parseInt(running || "0") < TENANT_CONCURRENT_LIMIT;
}

// Main scheduler loop
export async function runSchedulerTick(): Promise<string[]> {
  const now = new Date();
  const executed: string[] = [];

  // Find due jobs
  const { rows: dueJobs } = await pool.query(
    `SELECT * FROM scheduled_jobs WHERE enabled = true AND next_run_at <= $1
     ORDER BY next_run_at LIMIT 50`,
    [now]
  );

  for (const job of dueJobs) {
    // Tenant rate limiting
    if (!(await canRunJob(job.tenant_id))) {
      continue; // skip, will be picked up next tick
    }

    // Distributed lock to prevent double execution
    const lockKey = `scheduler:lock:${job.id}`;
    const locked = await redis.set(lockKey, "1", "NX", "EX", job.timeout || 300);
    if (!locked) continue;

    // Increment running counter
    await redis.incr(`scheduler:running:${job.tenant_id}`);

    // Execute async
    executeJob(job).finally(async () => {
      await redis.del(lockKey);
      await redis.decr(`scheduler:running:${job.tenant_id}`);
    });

    executed.push(job.id);

    // Schedule next run
    const nextRun = getNextRun(job.schedule, job.timezone, now);
    await pool.query(
      "UPDATE scheduled_jobs SET next_run_at = $2, last_run_at = $3 WHERE id = $1",
      [job.id, nextRun, now]
    );
  }

  return executed;
}

async function executeJob(job: any): Promise<void> {
  const executionId = `exec-${Date.now()}-${Math.random().toString(36).slice(2, 6)}`;
  const startTime = Date.now();

  await pool.query(
    `INSERT INTO job_executions (id, job_id, tenant_id, status, started_at, attempt)
     VALUES ($1, $2, $3, 'running', NOW(), 1)`,
    [executionId, job.id, job.tenant_id]
  );

  try {
    // Dynamic handler dispatch
    const handlers: Record<string, (payload: any, tenantId: string) => Promise<string>> = {
      "reports.daily": async (p, t) => { /* generate report */ return "Report sent"; },
      "sync.crm": async (p, t) => { /* sync CRM data */ return "Synced 150 records"; },
      "cleanup.old_files": async (p, t) => { /* cleanup */ return "Deleted 45 files"; },
    };

    const handler = handlers[job.handler];
    if (!handler) throw new Error(`Unknown handler: ${job.handler}`);

    // Timeout enforcement
    const result = await Promise.race([
      handler(job.payload, job.tenant_id),
      new Promise<never>((_, reject) =>
        setTimeout(() => reject(new Error("Job timed out")), job.timeout * 1000)
      ),
    ]);

    await pool.query(
      `UPDATE job_executions SET status = 'completed', completed_at = NOW(), duration_ms = $2, output = $3 WHERE id = $1`,
      [executionId, Date.now() - startTime, result]
    );

    await pool.query(
      "UPDATE scheduled_jobs SET consecutive_failures = 0 WHERE id = $1",
      [job.id]
    );
  } catch (err: any) {
    await pool.query(
      `UPDATE job_executions SET status = 'failed', completed_at = NOW(), duration_ms = $2, error = $3 WHERE id = $1`,
      [executionId, Date.now() - startTime, err.message]
    );

    await pool.query(
      "UPDATE scheduled_jobs SET consecutive_failures = consecutive_failures + 1 WHERE id = $1",
      [job.id]
    );

    // Auto-disable after too many failures
    if (job.consecutive_failures >= job.retry_policy?.maxRetries || 5) {
      await pool.query("UPDATE scheduled_jobs SET enabled = false WHERE id = $1", [job.id]);
    }
  }
}

Results

  • Zero lost jobs after reboots — all jobs are in the database, not in crontab; the scheduler picks up where it left off
  • Tenant fairness enforced — 3 concurrent job limit per tenant; one customer's 50 sync jobs can't starve everyone else
  • Self-service scheduling — customers create, edit, and monitor their own jobs through the dashboard; no developer intervention
  • Full execution history — every run is logged with duration, output, and errors; "did my report run?" is answered in the UI
  • Auto-disable on persistent failure — after 5 consecutive failures, the job is disabled and the customer is notified; no infinite retry loops