video-subtitles
Generate and burn subtitles into videos. Use when a user asks to add subtitles to a video, generate captions, transcribe audio to SRT or VTT, auto-transcribe with Whisper, burn in subtitles, create closed captions, or translate subtitles. Supports SRT, VTT, and ASS subtitle formats.
Usage
Getting Started
- Install the skill using the command above
- Open your AI coding agent (Claude Code, Codex, Gemini CLI, or Cursor)
- Reference the skill in your prompt
- The AI will use the skill's capabilities automatically
Example Prompts
- "Process all PDFs in the uploads folder and extract invoice data"
- "Set up a workflow that converts uploaded spreadsheets to formatted reports"
Documentation
Overview
Generate, edit, and burn subtitles into video files. Auto-transcribe audio to timed subtitle files using OpenAI Whisper, convert between SRT/VTT/ASS formats, style and position subtitles, and hardcode them into video. Supports batch processing for multiple videos.
Instructions
When a user asks for subtitle help, determine the task:
Task A: Auto-transcribe audio to subtitles with Whisper
- Install Whisper:
pip install openai-whisper
# or for faster inference with GPU
pip install faster-whisper
- Generate subtitles from a video:
# Basic transcription to SRT
whisper input.mp4 --model medium --output_format srt --output_dir ./subs/
# Specify language for better accuracy
whisper input.mp4 --model medium --language en --output_format srt
# Generate multiple formats at once
whisper input.mp4 --model medium --output_format all
# Use faster-whisper for speed (same accuracy)
# pip install faster-whisper
- Python API for more control:
import whisper
model = whisper.load_model("medium") # tiny, base, small, medium, large
result = model.transcribe("input.mp4", language="en")
# Write SRT file
def write_srt(segments: list, output_path: str):
with open(output_path, "w", encoding="utf-8") as f:
for i, seg in enumerate(segments, 1):
start = format_timestamp(seg["start"])
end = format_timestamp(seg["end"])
f.write(f"{i}\n{start} --> {end}\n{seg['text'].strip()}\n\n")
def format_timestamp(seconds: float) -> str:
h = int(seconds // 3600)
m = int((seconds % 3600) // 60)
s = int(seconds % 60)
ms = int((seconds % 1) * 1000)
return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
write_srt(result["segments"], "output.srt")
Task B: Burn subtitles into a video with ffmpeg
# Burn SRT subtitles (soft-styled)
ffmpeg -i input.mp4 -vf "subtitles=subs.srt" -c:a copy output.mp4
# Burn with custom styling
ffmpeg -i input.mp4 -vf "subtitles=subs.srt:force_style='\
FontSize=24,FontName=Arial,PrimaryColour=&H00FFFFFF,\
OutlineColour=&H00000000,Outline=2,Shadow=1,\
MarginV=30'" -c:a copy output.mp4
# Burn ASS subtitles (preserves advanced styling)
ffmpeg -i input.mp4 -vf "ass=subs.ass" -c:a copy output.mp4
# Burn subtitles with hardware acceleration
ffmpeg -i input.mp4 -vf "subtitles=subs.srt" \
-c:v libx264 -preset fast -crf 18 -c:a copy output.mp4
Task C: Create and edit SRT files manually
SRT format:
1
00:00:01,000 --> 00:00:04,000
Welcome to this tutorial.
2
00:00:04,500 --> 00:00:08,200
Today we will learn about
video subtitle creation.
3
00:00:09,000 --> 00:00:12,500
Let's get started!
VTT format:
WEBVTT
00:00:01.000 --> 00:00:04.000
Welcome to this tutorial.
00:00:04.500 --> 00:00:08.200
Today we will learn about
video subtitle creation.
Task D: Convert between subtitle formats
def srt_to_vtt(srt_path: str, vtt_path: str):
"""Convert SRT to WebVTT format."""
with open(srt_path, "r", encoding="utf-8") as f:
content = f.read()
# Replace comma with dot in timestamps
import re
content = re.sub(r"(\d{2}:\d{2}:\d{2}),(\d{3})", r"\1.\2", content)
# Remove sequence numbers
lines = content.strip().split("\n")
vtt_lines = ["WEBVTT", ""]
for line in lines:
if line.strip().isdigit():
continue
vtt_lines.append(line)
with open(vtt_path, "w", encoding="utf-8") as f:
f.write("\n".join(vtt_lines))
def vtt_to_srt(vtt_path: str, srt_path: str):
"""Convert WebVTT to SRT format."""
with open(vtt_path, "r", encoding="utf-8") as f:
content = f.read()
# Remove WEBVTT header
content = content.replace("WEBVTT", "").strip()
# Replace dot with comma in timestamps
import re
content = re.sub(r"(\d{2}:\d{2}:\d{2})\.(\d{3})", r"\1,\2", content)
# Add sequence numbers
blocks = content.strip().split("\n\n")
srt_blocks = []
for i, block in enumerate(blocks, 1):
if block.strip():
srt_blocks.append(f"{i}\n{block.strip()}")
with open(srt_path, "w", encoding="utf-8") as f:
f.write("\n\n".join(srt_blocks) + "\n")
Task E: Shift subtitle timing
# Delay all subtitles by 2.5 seconds using ffmpeg
ffmpeg -itsoffset 2.5 -i subs.srt -c copy subs_shifted.srt
# Python: shift all timestamps
def shift_srt(input_path: str, output_path: str, offset_sec: float):
"""Shift all SRT timestamps by offset_sec (positive = delay)."""
import re
with open(input_path, "r", encoding="utf-8") as f:
content = f.read()
def shift_ts(match):
parts = match.group().split(":")
total = int(parts[0]) * 3600 + int(parts[1]) * 60
sec_ms = parts[2].split(",")
total += int(sec_ms[0]) + int(sec_ms[1]) / 1000
total += offset_sec
total = max(0, total)
h = int(total // 3600)
m = int((total % 3600) // 60)
s = int(total % 60)
ms = int((total % 1) * 1000)
return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
result = re.sub(r"\d{2}:\d{2}:\d{2},\d{3}", shift_ts, content)
with open(output_path, "w", encoding="utf-8") as f:
f.write(result)
shift_srt("subs.srt", "subs_delayed.srt", offset_sec=2.5)
Examples
Example 1: Auto-subtitle a lecture recording
User request: "Transcribe my 1-hour lecture and burn subtitles into the video"
# Step 1: Transcribe with Whisper (medium model for good accuracy)
whisper lecture.mp4 --model medium --language en --output_format srt --output_dir ./
# Step 2: Review and fix the generated SRT file if needed
# Step 3: Burn subtitles into the video
ffmpeg -i lecture.mp4 -vf "subtitles=lecture.srt:force_style='\
FontSize=22,FontName=Arial,MarginV=25'" -c:a copy lecture_subtitled.mp4
Example 2: Generate subtitles in multiple languages
User request: "Create English and Spanish subtitles for my video"
# English transcription
whisper video.mp4 --model medium --language en --output_format srt \
--output_dir ./ && mv video.srt video_en.srt
# Spanish transcription (Whisper handles translation)
whisper video.mp4 --model medium --task translate --language es \
--output_format srt --output_dir ./ && mv video.srt video_es.srt
Example 3: Batch subtitle generation for a video series
User request: "Generate subtitles for all episodes in this folder"
for video in ./episodes/*.mp4; do
name=$(basename "${video%.*}")
echo "Transcribing: $name"
whisper "$video" --model medium --language en \
--output_format srt --output_dir ./subs/
echo " Done: ./subs/${name}.srt"
done
echo "All episodes transcribed"
Guidelines
- Use the
mediumWhisper model as a default; it balances accuracy and speed well. Uselargeonly for difficult audio. - Always review auto-generated subtitles before burning them in; Whisper can make errors with technical terms and proper nouns.
- Burn subtitles with
-c:a copyto avoid re-encoding audio unnecessarily. - For web delivery, use VTT format (browsers support it natively); for video files, use SRT.
- Keep subtitle lines under 42 characters and display no more than 2 lines at a time for readability.
- Each subtitle should be visible for at least 1 second and no more than 7 seconds.
- Use UTF-8 encoding for all subtitle files to support international characters.
- Test subtitle positioning on different screen sizes before final rendering.
Information
- Version
- 1.0.0
- Author
- terminal-skills
- Category
- Automation
- License
- Apache-2.0