EGCA Engineering Handbook
Internal reference · v1

04 — Testing

Rule: every project ships with tests. No test files in a PR = PR blocked. Prototypes included — one smoke test is enough there, but it exists.

We got burned before by “we’ll add tests later.” Later never came. The cost of writing a single happy-path test while you’re building is lower than at any other point in the project’s life.

The pyramid for this team

Top-down:

  • Thin E2E on the 2–5 critical flows (auth, checkout, the main CRUD loop). Not every page.
  • Heavy integration — API route + real DB (Docker Postgres in CI). This is where most bugs live and where tests have the best ROI.
  • Light unit tests on pure logic (parsers, calculators, reducers). Skip trivial units — no “test getter returns value.”

Mocked-unit-heavy pyramids look tidy and miss real bugs. Integration first.

Required test files per project type

Every new project must ship with at least these, on day one:

Next.js app

  • tests/smoke.spec.ts (Playwright) — loads / and one auth’d page. Confirms the app boots.
  • tests/integration/*.test.ts — one test per API route / Server Action that touches the DB.
  • tests/unit/*.test.ts — for any non-trivial pure function.

Backend service

  • tests/smoke.test.ts — hits /health, confirms DB connection.
  • tests/integration/*.test.ts — one per endpoint, real DB.
  • Contract tests (Zod/Pydantic schemas) — verify every boundary validates.

Data / ML / AI

  • tests/smoke.test.ts — pipeline runs end-to-end on a fixture input.
  • evals/ — golden-set JSON + runner. See playbooks/data-ml-ai.

Mobile

  • __tests__/smoke.test.ts — app renders the root screen without crashing.
  • One Maestro flow for sign-in + home screen.

CLI / script

  • tests/smoke.test.ts — invoking the CLI with --help exits 0. One test per command.

Stacks

| Language / runtime | Test runner | E2E | HTTP mocking | DB |
| --- | --- | --- | --- | --- |
| TypeScript / Node / Next.js | Vitest | Playwright | MSW | Docker Postgres in CI via testcontainers or compose |
| Python | pytest + pytest-asyncio | (see above — test via HTTP) | httpx + respx | testcontainers-python (preferred) or pytest-postgresql |
| React Native | Jest + React Native Testing Library | Maestro | MSW | n/a |

Don’t mix test runners in one project.

What to test (and what not to)

Test:

  • Anything that takes user input (form handlers, API routes) — validation + the happy path + one failure case.
  • Anything that writes to the DB — integration test with a real DB.
  • Pure functions with branching logic.
  • Critical flows end-to-end.
  • Bug fixes — a test that reproduces the bug before you fix it. Prevents regression.

Don’t test:

  • Framework behavior (you don’t need to test that Next.js returns a response).
  • Getters / setters / trivial mappers.
  • Third-party libraries.
  • Implementation details (internal function calls). Test behavior, not internals.

Coverage

  • No hard percentage target. Chasing a number produces useless tests.
  • Target: every critical path has a test. If a bug happens in prod, a test should have caught it — or adding one is part of the fix.

Flaky tests

  • A flaky suite does more team damage than a broken one. A green-ish suite trains people to ignore failures.
  • First flake: add a comment, re-run once.
  • Second flake on same test: tag @flaky, file a ticket, fix within the sprint, or delete the test.
  • Never merge a flaky test and forget it.

Running tests

  • Locally: pnpm test / pytest — should run in < 30s for unit + integration on a warm cache.
  • Watch mode during dev: pnpm test --watch.
  • Full suite (incl. E2E) in CI. E2E is slow; run it on PR and main only, not on every push.

Test data

  • Factories (e.g. @faker-js/faker + a tiny factory helper), not fixtures copy-pasted between tests.
  • Reset DB state between integration tests — truncate tables or use a transaction rollback per test.
  • Seed data in a prisma/seed.ts / drizzle/seed.ts / scripts/seed.py — runnable locally.

Reviewing tests in PRs

When reviewing, ask:

  1. If I delete this test and keep the code, would a real bug slip through?
  2. Does this test exercise behavior or just re-describe the implementation?
  3. Is the failure message going to tell a future you what broke?

If any answer is “no,” ask for changes.