04 — Testing
Rule: every project ships with tests. No test files in a PR = PR blocked. Prototypes are not exempt: one smoke test is enough there, but it must exist.
We got burned before by “we’ll add tests later.” Later never came. Writing a single happy-path test while you’re building is cheaper than at any other point in the project’s life.
The pyramid for this team
Top-down:
- Thin E2E on the 2–5 critical flows (auth, checkout, the main CRUD loop). Not every page.
- Heavy integration — API route + real DB (Docker Postgres in CI). This is where most bugs live and where tests have the best ROI.
- Light unit tests on pure logic (parsers, calculators, reducers). Skip trivial units — no “test getter returns value.”
Pyramids heavy on mocked unit tests look tidy but miss real bugs. Integration first.
Required test files per project type
Every new project must ship with at least these, on day one:
Next.js app
- `tests/smoke.spec.ts` (Playwright) — loads `/` and one auth’d page. Confirms the app boots.
- `tests/integration/*.test.ts` — one test per API route / Server Action that touches the DB.
- `tests/unit/*.test.ts` — for any non-trivial pure function.
Backend service
- `tests/smoke.test.ts` — hits `/health`, confirms DB connection.
- `tests/integration/*.test.ts` — one per endpoint, real DB.
- Contract tests (Zod/Pydantic schemas) — verify every boundary validates.
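The contract-test idea in one file: every boundary parser gets one valid case plus one failure case per rule. A dependency-free sketch (in the repo this would be a Zod schema plus a Vitest test; `parseCreateUser` and its rules are hypothetical):

```typescript
// Hypothetical boundary parser standing in for a Zod schema's .parse().
// Contract: a create-user payload needs a non-empty email and an age >= 0.
type CreateUser = { email: string; age: number };

function parseCreateUser(input: unknown): CreateUser {
  const o = input as Record<string, unknown>;
  if (typeof o?.email !== "string" || o.email.length === 0) {
    throw new Error("email: required non-empty string");
  }
  if (typeof o?.age !== "number" || o.age < 0) {
    throw new Error("age: required non-negative number");
  }
  return { email: o.email, age: o.age };
}

// The contract test: one valid case, one failure case per rule.
const ok = parseCreateUser({ email: "a@b.co", age: 30 });

function rejects(input: unknown): boolean {
  try { parseCreateUser(input); return false; } catch { return true; }
}

const badEmail = rejects({ email: "", age: 30 });
const badAge = rejects({ email: "a@b.co", age: -1 });
```

The same shape works in Python with Pydantic: construct the model with good input, assert `ValidationError` on each bad input.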
Data / ML / AI
- `tests/smoke.test.ts` — pipeline runs end-to-end on a fixture input.
- `evals/` — golden-set JSON + runner. See `playbooks/data-ml-ai`.
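The golden-set runner can be sketched in a few lines. The cases are inlined here instead of loaded from `evals/` JSON so the sketch is self-contained, and `runPipeline` is a hypothetical stand-in for the real pipeline entry point:

```typescript
// Golden-set eval runner sketch. Real cases would live in evals/golden.json.
type GoldenCase = { input: string; expected: string };

const golden: GoldenCase[] = [
  { input: "refund my order", expected: "refund" },
  { input: "where is my package", expected: "shipping" },
];

// Toy classifier standing in for the real pipeline.
function runPipeline(input: string): string {
  return input.includes("refund") ? "refund" : "shipping";
}

function runEvals(cases: GoldenCase[]): { passed: number; total: number } {
  let passed = 0;
  for (const c of cases) {
    if (runPipeline(c.input) === c.expected) passed++;
  }
  return { passed, total: cases.length };
}

const report = runEvals(golden);
```

A runner like this can fail CI when `passed / total` drops below a threshold, which is the eval equivalent of a red test.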
Mobile
- `__tests__/smoke.test.ts` — app renders the root screen without crashing.
- One Maestro flow for sign-in + home screen.
CLI / script
- `tests/smoke.test.ts` — invoking the CLI with `--help` exits 0. One test per command.
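A minimal version of that smoke test, using Node's `spawnSync`. Here `process.execPath` (the node binary itself) stands in for the project's CLI binary so the sketch runs anywhere:

```typescript
// CLI smoke test sketch: invoke the binary with --help, assert exit code 0.
// process.execPath is a stand-in; point spawnSync at your real entry point.
import { spawnSync } from "node:child_process";

const result = spawnSync(process.execPath, ["--help"], { encoding: "utf8" });
const exitCode = result.status;
```

One such invocation per subcommand catches broken imports and argument-parser regressions for almost no maintenance cost.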
Stacks
| Language / runtime | Test runner | E2E | HTTP mocking | DB |
|---|---|---|---|---|
| TypeScript / Node / Next.js | Vitest | Playwright | MSW | Docker Postgres in CI via testcontainers or compose |
| Python | pytest + pytest-asyncio | (see above — test via HTTP) | httpx + respx | testcontainers-python (preferred) or pytest-postgresql |
| React Native | Jest + React Native Testing Library | Maestro | MSW | n/a |
Don’t mix test runners in one project.
What to test (and what not to)
Test:
- Anything that takes user input (form handlers, API routes) — validation + the happy path + one failure case.
- Anything that writes to the DB — integration test with a real DB.
- Pure functions with branching logic.
- Critical flows end-to-end.
- Bug fixes — a test that reproduces the bug before you fix it. Prevents regression.
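For the bug-fix rule, the shape looks like this. The bug and `pageOffset` are hypothetical; the point is that the assertion below failed before the fix and now pins it:

```typescript
// Regression-test sketch. Hypothetical bug: pageOffset() used `page * size`
// for 1-indexed pages, so page 1 skipped the first `size` rows. The
// assertions below were written to reproduce the bug first; now they pin the fix.
function pageOffset(page: number, size: number): number {
  // Fixed: pages are 1-indexed, so page 1 starts at offset 0.
  return (page - 1) * size;
}

const firstPage = pageOffset(1, 20); // was 20 before the fix
const thirdPage = pageOffset(3, 20);
```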
Don’t test:
- Framework behavior (you don’t need to test that Next.js returns a response).
- Getters / setters / trivial mappers.
- Third-party libraries.
- Implementation details (internal function calls). Test behavior, not internals.
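“Behavior, not internals” in miniature (both functions are hypothetical):

```typescript
// Internal detail: a test should never assert that this helper was called.
function collapseDashes(s: string): string {
  return s.replace(/-+/g, "-");
}

function slugify(title: string): string {
  const rough = title.toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return collapseDashes(rough).replace(/^-|-$/g, "");
}

// Good: assert on what the caller observes.
const slug = slugify("  Hello, World!  ");
// Bad: spying that collapseDashes ran exactly once. That test breaks on any
// refactor of slugify's internals even when its output is unchanged.
```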
Coverage
- No hard percentage target. Chasing a number produces useless tests.
- Target: every critical path has a test. If a bug happens in prod, a test should have caught it — or adding one is part of the fix.
Flaky tests
- A flaky suite does more team damage than a broken one: a green-ish suite trains people to ignore failures.
- First flake: add a comment, re-run once.
- Second flake on the same test: tag `@flaky`, file a ticket, fix within the sprint, or delete the test.
- Never merge a flaky test and forget it.
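The most common flake source is real time (with real network and real randomness close behind). The durable fix is to inject the dependency so the test is deterministic; a sketch with a hypothetical `isExpired`:

```typescript
// Flaky version: isExpired(expiresAt) reading Date.now() directly passes or
// fails depending on when CI runs. Fix: inject the clock.
type Clock = () => number; // ms since epoch

function isExpired(expiresAt: number, now: Clock = Date.now): boolean {
  return now() >= expiresAt;
}

// Deterministic test: a fixed fake clock instead of the real one.
const fakeNow: Clock = () => 1_000_000;
const expired = isExpired(999_999, fakeNow);
const live = isExpired(1_000_001, fakeNow);
```

Production code keeps the `Date.now` default; only tests pass a fake clock.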
Running tests
- Locally: `pnpm test` / `pytest` — should run in < 30s for unit + integration on a warm cache.
- Watch mode during dev: `pnpm test --watch`.
- Full suite (incl. E2E) in CI. E2E is slow; run it on PR and main only, not on every push.
Test data
- Factories (e.g. `@faker-js/faker` + a tiny factory helper), not fixtures copy-pasted between tests.
- Reset DB state between integration tests — truncate tables or use a transaction rollback per test.
- Seed data in `prisma/seed.ts` / `drizzle/seed.ts` / `scripts/seed.py` — runnable locally.
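A factory helper is about ten lines. This sketch uses static defaults instead of `@faker-js/faker` so it stays dependency-free; the shape (sensible defaults plus per-test overrides) is the point:

```typescript
// Factory-helper sketch. In the repo the defaults would come from
// @faker-js/faker; they are static here to keep the sketch self-contained.
type User = { id: number; email: string; role: "admin" | "member" };

let nextId = 1;
function makeUser(overrides: Partial<User> = {}): User {
  const id = nextId++;
  return {
    id,
    email: `user${id}@example.com`,
    role: "member",
    ...overrides,
  };
}

// Each test states only what it cares about; everything else is defaulted.
const member = makeUser();
const admin = makeUser({ role: "admin" });
```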
Reviewing tests in PRs
When reviewing, ask:
- If I delete this test and keep the code, would a real bug slip through?
- Does this test exercise behavior or just re-describe the implementation?
- Is the failure message going to tell a future you what broke?
If any answer is “no,” ask for changes.