The Problem
Your web app looks perfect in Chrome, but customers on Safari report broken layouts, and Firefox users see a flash of unstyled content on page load. Your QA process is a developer manually opening the app in three browsers before each release, clicking through key flows, and hoping they catch visual regressions. This takes 3-4 hours per release, misses subtle CSS differences, and is the first thing skipped when deadlines are tight. Last month, a Safari flexbox bug shipped to production and stayed there for two weeks because nobody tested on WebKit.
The Solution
Use the test-generator skill to create comprehensive cross-browser test suites from your existing pages, the coding-agent to configure Playwright for multi-browser execution, and the web-scraper to capture baseline screenshots for visual comparison.
Step-by-Step Walkthrough
Step 1: Analyze Your App for Cross-Browser Risk Areas
Our app is a Next.js dashboard at http://localhost:3000. Key user flows:
login, dashboard overview, create report, export to PDF. We have zero
cross-browser tests. Set up automated testing across Chrome, Firefox,
and Safari. Focus on the areas most likely to break.
Not every part of the app carries the same cross-browser risk. Standard layouts and form submissions generally behave the same everywhere. The danger lives in newer CSS features and JavaScript APIs where browser implementations diverge.
Step 2: Identify High-Risk Components
The codebase analysis flags specific CSS features and JavaScript APIs with known browser differences:
High risk (CSS features with inconsistent support):
| Component | CSS Feature | Browser Issue |
|---|---|---|
| Dashboard grid | CSS subgrid | Unsupported in Chromium before v117 and Safari before 16 |
| Date picker | :has() selector | Unsupported in Safari before 15.4 and Firefox before 121 |
| Export modal | backdrop-filter | Safari before 18 needs the -webkit- prefix; off by default in Firefox before 103 |
| PDF preview | Container queries | Inconsistent sizing across engines |
Medium risk (JavaScript behavior differences):
- File upload drag-and-drop: Safari handles the DataTransfer API differently
- Rich text editor: clipboard paste handling varies, especially for formatted content
- Infinite scroll: IntersectionObserver threshold behavior differs subtly
Low risk items (standard layouts, form submissions, navigation) get basic coverage but do not need the same scrutiny.
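The kind of analysis described above can be approximated with a simple pattern scan. The following is a minimal sketch, not the agent's actual method: the rule list, the `findRiskyFeatures` name, and the regular expressions are all illustrative assumptions, and a real audit would rely on a maintained compatibility dataset rather than hand-written patterns.

```typescript
// Hypothetical sketch: flag source files that use features from the risk lists above.
type RiskLevel = "high" | "medium";

interface RiskRule {
  feature: string;
  level: RiskLevel;
  pattern: RegExp; // illustrative pattern, not an exhaustive detector
}

const RISK_RULES: RiskRule[] = [
  { feature: "CSS subgrid", level: "high", pattern: /\bsubgrid\b/ },
  { feature: ":has() selector", level: "high", pattern: /:has\(/ },
  { feature: "backdrop-filter", level: "high", pattern: /backdrop-filter\s*:/ },
  { feature: "Container queries", level: "high", pattern: /@container\b/ },
  { feature: "DataTransfer API", level: "medium", pattern: /\bdataTransfer\b|\bDataTransfer\b/ },
  { feature: "IntersectionObserver", level: "medium", pattern: /\bIntersectionObserver\b/ },
];

interface RiskFinding {
  file: string;
  feature: string;
  level: RiskLevel;
}

// Takes a map of file name -> file contents and returns one finding per
// (file, feature) pair, with high-risk findings listed first.
function findRiskyFeatures(files: Record<string, string>): RiskFinding[] {
  const findings: RiskFinding[] = [];
  for (const file of Object.keys(files)) {
    for (const rule of RISK_RULES) {
      if (rule.pattern.test(files[file])) {
        findings.push({ file, feature: rule.feature, level: rule.level });
      }
    }
  }
  return findings.sort((a, b) =>
    a.level === b.level ? 0 : a.level === "high" ? -1 : 1
  );
}
```

Passing in the contents of the stylesheets and component files yields a list that can seed a risk table like the one above; the findings still need a human check against current browser support data.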
Step 3: Generate Playwright Test Suites
The test suites target five browser configurations, not just three — because mobile rendering introduces its own issues:
tests/
  cross-browser/
    login.spec.ts              # Auth flow across all browsers
    dashboard-layout.spec.ts   # Grid rendering, responsive breakpoints
    create-report.spec.ts      # Form interactions, date picker
    export-pdf.spec.ts         # Modal, file download behavior
    visual-regression.spec.ts  # Screenshot comparisons per browser
playwright.config.ts           # Multi-browser configuration
fixtures/
  test-data.ts                 # Shared test data
The Playwright config defines the browser matrix:
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // One project per browser/device configuration; every spec runs in each.
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
    { name: 'mobile-safari', use: { ...devices['iPhone 14'] } },
    { name: 'mobile-chrome', use: { ...devices['Pixel 7'] } },
  ],
});
Each test file focuses on one user flow but runs across all five configurations. The dashboard layout test, for example, verifies that the CSS subgrid renders correctly where it is supported, falls back gracefully where it is not, and does not break the rest of the layout in any engine.
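One way a layout test could check grid rendering without a screenshot is to compare the measured gaps between grid items. The sketch below is an assumption about how such a check might look, not the generated test itself: `horizontalGaps` and `gapsMatch` are hypothetical helpers, and the box shape mirrors what Playwright's `locator.boundingBox()` resolves to when the element is visible.

```typescript
// Same shape as the object Playwright's locator.boundingBox() resolves to.
interface ItemBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Groups boxes into rows by their top edge, then returns the horizontal gap
// (in px) between each pair of neighbouring boxes within a row.
function horizontalGaps(boxes: ItemBox[], rowTolerancePx = 1): number[] {
  const byTop = boxes.slice().sort((a, b) => a.y - b.y);
  const rows: ItemBox[][] = [];
  for (const box of byTop) {
    const currentRow = rows[rows.length - 1];
    if (currentRow && Math.abs(box.y - currentRow[0].y) <= rowTolerancePx) {
      currentRow.push(box);
    } else {
      rows.push([box]);
    }
  }
  const gaps: number[] = [];
  for (const row of rows) {
    row.sort((a, b) => a.x - b.x);
    for (let i = 0; i + 1 < row.length; i++) {
      gaps.push(row[i + 1].x - (row[i].x + row[i].width));
    }
  }
  return gaps;
}

// True when every measured gap is within tolerancePx of the expected gap.
function gapsMatch(boxes: ItemBox[], expectedGapPx: number, tolerancePx = 0.5): boolean {
  return horizontalGaps(boxes).every(
    (gap) => Math.abs(gap - expectedGapPx) <= tolerancePx
  );
}
```

In a spec, the boxes would come from calling `boundingBox()` on each grid item, and the expected gap would be the pixel value of the grid's `gap` property (16px for `1rem` at the default font size). A geometric check like this complements screenshots because it fails with a number rather than an image diff.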
Step 4: Capture Visual Baselines and Fix Existing Issues
Baseline screenshots get captured across all five browser configurations. This is where existing cross-browser bugs surface immediately — before writing any new code:
| Page | Browsers Match? | Issue Found |
|---|---|---|
| Login page | All 5 match | No issues |
| Dashboard overview | Firefox differs | Subgrid fallback renders a 2px gap between grid items |
| Create report modal | All 5 match | No issues |
| Export dialog | Safari differs | backdrop-filter blur effect not rendering |
Both issues get fixed with CSS fallbacks:
/* dashboard.module.css: fallback for browsers without subgrid support */
.dashboardGrid {
  display: grid;
  grid-template-columns: repeat(3, 1fr);
  gap: 1rem;
}
@supports (grid-template-rows: subgrid) {
  .dashboardGrid {
    grid-template-rows: subgrid;
  }
}

/* modal.module.css: -webkit- prefix for Safari backdrop-filter */
.modalOverlay {
  -webkit-backdrop-filter: blur(8px);
  backdrop-filter: blur(8px);
}
After fixes, new baselines are captured. These screenshots become the reference point — any future CSS change that causes a visual difference in any browser will be caught.
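To make the idea of comparing against a baseline concrete, here is a deliberately simplified pixel comparison. It is an illustration only: Playwright's own screenshot assertion uses its own comparator with options such as `maxDiffPixels` and `maxDiffPixelRatio`, and the `RgbaImage` type and `diffRatio` function below are hypothetical names that assume both screenshots are already decoded into raw RGBA bytes.

```typescript
// Raw decoded image: 4 bytes (R, G, B, A) per pixel, row by row.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8Array;
}

// Returns the fraction of pixels (0..1) that differ between baseline and
// actual. A pixel counts as changed if any channel differs by more than
// channelTolerance. A size mismatch is treated as a full mismatch.
function diffRatio(
  baseline: RgbaImage,
  actual: RgbaImage,
  channelTolerance = 0
): number {
  if (baseline.width !== actual.width || baseline.height !== actual.height) {
    return 1;
  }
  const pixelCount = baseline.width * baseline.height;
  if (pixelCount === 0) {
    return 0;
  }
  let changed = 0;
  for (let p = 0; p < pixelCount; p++) {
    const offset = p * 4;
    for (let c = 0; c < 4; c++) {
      const delta = Math.abs(baseline.data[offset + c] - actual.data[offset + c]);
      if (delta > channelTolerance) {
        changed++;
        break; // count each pixel at most once
      }
    }
  }
  return changed / pixelCount;
}
```

A test would fail when the ratio exceeds a chosen threshold. Keeping one baseline per browser matters here, because engines anti-alias text differently and a shared baseline would produce constant false positives.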
Step 5: Integrate Tests into CI
The CI workflow runs cross-browser tests on every pull request:
# .github/workflows/test.yml
name: Cross-browser tests
on: pull_request
jobs:
  cross-browser:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        browser: [chromium, firefox, webkit]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      # Install only the browser this matrix job needs
      - run: npx playwright install --with-deps ${{ matrix.browser }}
      - run: npx playwright test --project=${{ matrix.browser }}
      # Upload screenshots and diffs only when the job fails
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: diff-screenshots-${{ matrix.browser }}
          path: test-results/
Key details that keep CI fast and useful:
- Parallel execution: the matrix starts one job per browser (3 jobs running simultaneously), and Playwright can spread the 5 test files across parallel workers within each job
- Visual regression: compares every screenshot against the committed baseline
- Flake detection: retries failed tests once before reporting — cross-browser tests can be sensitive to timing
- Diff artifacts: on failure, the actual screenshot, expected screenshot, and pixel-diff image get uploaded for immediate debugging
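The retry and visual-comparison behavior in the list above is controlled from the Playwright config rather than the workflow file. The fragment below is a sketch of settings that could be merged into the existing playwright.config.ts; the tolerance value and the CI-only retry condition are assumptions to tune per project, not values taken from this setup.

```typescript
// playwright.config.ts (fragment): hypothetical additions, merge with the projects list
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Retry a failed test once on CI; locally, fail immediately so flakiness is visible.
  retries: process.env.CI ? 1 : 0,
  expect: {
    // Allow up to 1% of pixels to differ before a screenshot assertion fails.
    // Assumed starting point, not a recommended value.
    toHaveScreenshot: { maxDiffPixelRatio: 0.01 },
  },
  use: {
    // The app URL from Step 1.
    baseURL: 'http://localhost:3000',
    // Record a trace on the retried attempt to help debug timing-related failures.
    trace: 'on-first-retry',
  },
});
```

A test that passes only on retry is reported by Playwright as flaky rather than passed, which keeps timing problems visible even when the build stays green.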
Total CI time impact: +4 minutes, running in parallel with the existing test suite. Those 4 minutes cover what previously took a developer 3-4 hours of manual clicking, and they catch the subtle CSS differences that humans miss.
Real-World Example
Aisha is a frontend lead at a 20-person SaaS startup. Their product is a data dashboard used by enterprise clients, some of whom mandate Safari on corporate MacBooks. Every other release breaks something in Safari or Firefox, and the team only finds out from support tickets, usually 3-5 days after the release, by which point the fix is urgent.
She sets up cross-browser testing for the four critical user flows. During baseline capture, the agent immediately finds two rendering issues that have been live in production: a subgrid gap in Firefox and a missing blur effect in Safari. Both get fixed before a single new test is written, so the setup process itself catches existing bugs.
Over the next quarter, the cross-browser tests catch 6 regressions before they reach production. A CSS grid change that looked fine in Chrome broke the dashboard layout in Safari. A date picker library update changed keyboard behavior in Firefox. A new modal animation used a property that WebKit renders differently. All caught in CI, all fixed before any customer noticed.
Safari-related support tickets drop to zero. The 3-4 hours of manual cross-browser testing per release disappear entirely. And the confidence to ship on Fridays — something the team never had before — becomes routine.