Terminal.skills
Skills/rate-limiter
>

rate-limiter

Designs and implements API rate limiting middleware and abuse protection. Use when you need to add request throttling, per-user quotas, IP-based blocking, sliding window or token bucket algorithms, Redis-backed distributed counters, or 429 response handling. Trigger words: rate limit, throttle, API abuse, too many requests, request quota, DDoS protection, brute force prevention, credential stuffing defense.

#rate-limiting#api-security#middleware
terminal-skillsv1.0.0
Works with:claude-codeopenai-codexgemini-clicursor
Source

Usage

$
✓ Installed rate-limiter v1.0.0

Getting Started

  1. Install the skill using the command above
  2. Open your AI coding agent (Claude Code, Codex, Gemini CLI, or Cursor)
  3. Reference the skill in your prompt
  4. The AI will use the skill's capabilities automatically

Example Prompts

  • "Deploy the latest build to the staging environment and run smoke tests"
  • "Check the CI pipeline status and summarize any recent failures"

Documentation

Overview

This skill enables AI agents to design, implement, and configure production-grade rate limiting for APIs. It covers algorithm selection, middleware generation, Redis-backed distributed counting, abuse pattern detection, and proper HTTP response headers.

Instructions

1. Assess the API Surface

Before writing any code, analyze the target application:

  • List all public endpoints and their HTTP methods
  • Classify endpoints by sensitivity: authentication (highest), write operations (high), read operations (medium), static/health (low)
  • Identify existing middleware stack and framework (Express, Fastify, Django, Gin, etc.)
  • Check if Redis or another shared store is available for distributed rate limiting

2. Choose the Right Algorithm

Select based on the use case:

AlgorithmBest ForTrade-off
Fixed WindowSimple per-minute capsBurst at window edges
Sliding Window LogPrecise per-user limitsHigher memory per key
Sliding Window CounterBalance of accuracy and memorySlight approximation
Token BucketAPIs with burst allowanceMore complex to tune
Leaky BucketSmooth output rateDelays rather than rejects

Default recommendation: Sliding Window Counter — it handles 95% of use cases with good accuracy and reasonable memory usage.

3. Implement Layered Limits

Always implement at least two layers:

Layer 1 — Global IP limit: Catches volumetric abuse before authentication. Typical: 100-300 req/min per IP.

Layer 2 — Endpoint-specific limits: Different limits per endpoint category. Auth endpoints get the strictest limits (3-10 req/min).

Layer 3 — Authenticated user quotas (if applicable): Daily or hourly caps per API key or user ID. Return quota status in response headers.

4. Response Headers

Always include these headers on EVERY response (not just 429s):

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1708012800
Retry-After: 30  (only on 429 responses)

5. Abuse Detection Patterns

Beyond simple counting, detect:

  • Credential stuffing: Many unique usernames from one IP on auth endpoints
  • Scraping: Sequential access patterns on listing endpoints
  • Header anomalies: Missing User-Agent, rotating headers on rapid requests
  • Distributed attacks: Same user-agent/fingerprint across many IPs

6. Graceful Degradation

Always implement a fallback when Redis is unavailable:

  • Fall back to in-memory rate limiting per process
  • Log the Redis failure prominently
  • Do NOT fail open (allowing unlimited requests) unless explicitly configured

7. Avoid Common Pitfalls

  • Trust proxy configuration: Always verify how X-Forwarded-For is parsed. Behind a load balancer, the client IP may not be req.ip
  • Shared NAT: Don't set per-IP limits too low — corporate networks may have thousands of users behind one IP. Minimum 100 req/min for non-auth endpoints
  • Key design: Use composite keys (ip:endpoint, userId:endpoint) not just IP alone
  • Time sync: Use Redis server time (TIME command) not local clock for distributed setups

Examples

Example 1: Express middleware with Redis sliding window

Prompt: "Add rate limiting to my Express API. Auth endpoints should allow 5 requests per minute per IP. All other endpoints 60 per minute per authenticated user."

Output: The agent creates src/middleware/rateLimiter.ts:

typescript
import { Redis } from 'ioredis';
import { Request, Response, NextFunction } from 'express';

const redis = new Redis(process.env.REDIS_URL);

interface RateLimitConfig {
  windowMs: number;
  max: number;
  keyGenerator: (req: Request) => string;
}

const configs: Record<string, RateLimitConfig> = {
  auth: {
    windowMs: 60_000,
    max: 5,
    keyGenerator: (req) => `rl:auth:${req.ip}`,
  },
  default: {
    windowMs: 60_000,
    max: 60,
    keyGenerator: (req) => `rl:api:${(req as any).userId ?? req.ip}`,
  },
};

export function rateLimit(category: keyof typeof configs) {
  const config = configs[category];
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = config.keyGenerator(req);
    const now = Date.now();
    const windowStart = now - config.windowMs;

    const multi = redis.multi();
    multi.zremrangebyscore(key, 0, windowStart);
    multi.zadd(key, now.toString(), `${now}:${Math.random()}`);
    multi.zcard(key);
    multi.expire(key, Math.ceil(config.windowMs / 1000));
    const results = await multi.exec();

    const count = results![2][1] as number;
    res.set('X-RateLimit-Limit', config.max.toString());
    res.set('X-RateLimit-Remaining', Math.max(0, config.max - count).toString());
    res.set('X-RateLimit-Reset', Math.ceil((now + config.windowMs) / 1000).toString());

    if (count > config.max) {
      res.set('Retry-After', Math.ceil(config.windowMs / 1000).toString());
      return res.status(429).json({ error: 'Too many requests' });
    }
    next();
  };
}

Example 2: Abuse detection for credential stuffing

Prompt: "Detect credential stuffing on my login endpoint. Block IPs that try more than 20 unique usernames in 5 minutes."

Output: The agent adds a detection middleware that tracks unique username attempts per IP using a Redis HyperLogLog, blocking IPs that exceed the threshold and logging the event with full request metadata for incident response.

Guidelines

  • Always test rate limiting with concurrent requests — race conditions in counter logic are common
  • Include integration tests that verify 429 responses and header values
  • Document the rate limits in your API documentation or OpenAPI spec
  • Consider offering rate limit increase for paid tiers via configuration, not code changes
  • Log all rate limit events in structured format for security monitoring

Information

Version
1.0.0
Author
terminal-skills
Category
DevOps
License
Apache-2.0