Context Engineering for AI Coding Agents

A practical guide to context engineering for AI coding agents: specs, repo maps, constraints, examples, tests, acceptance criteria, and review workflows.

The worst prompt you can give an AI coding agent is also the most tempting one:

```txt
Just fix this.
```

Sometimes it works. More often, the agent guesses the architecture, edits the wrong layer, invents missing requirements, changes unrelated files, and leaves you with a diff that is harder to review than the original bug.

The problem is not always the model. The problem is context.

Modern AI coding tools can read files, inspect repositories, edit code, run commands, and reason across multiple parts of a codebase. Claude Code, for example, is described by Anthropic as an agentic coding tool that reads your codebase, edits files, runs commands, and integrates with development tools. Codex can adapt to existing project structure and conventions. GitHub Copilot, Cursor, and similar tools also rely on persistent instructions, rules, and project context.

That means the skill is no longer only prompt writing. The real skill is context engineering: preparing the right information, constraints, examples, tests, and review criteria so an AI coding agent can make a small, correct, reviewable change.

This article is a practical guide to context engineering for AI coding agents. We will cover repo maps, task specs, AGENTS.md, CLAUDE.md, Copilot instructions, Cursor rules, examples, constraints, tests, acceptance criteria, and the anti-patterns that make agents produce messy code.

What Context Engineering Means

Context engineering is the discipline of deciding what the agent should know before and during a coding task.

It answers questions like:

  • What is the task?
  • Which files matter?
  • Which files should not be touched?
  • What architectural pattern should be followed?
  • What examples should the agent copy?
  • What tests prove the change works?
  • What commands should be run?
  • What would make the output unacceptable?

Prompt engineering is often about wording a request. Context engineering is about building the working environment around the request.

A good context package reduces guessing. It gives the agent enough information to act, but not so much irrelevant material that it gets distracted.

The Core Rule: Context Is an Engineering Artifact

Treat context like code. It should be versioned, reviewed, and improved over time.

If you repeatedly tell agents the same thing — project commands, folder structure, banned libraries, style rules, testing expectations — that information should not live only in a chat message.

It belongs in files such as:

  • AGENTS.md for coding agents that support it;
  • CLAUDE.md for Claude Code project memory;
  • .github/copilot-instructions.md for GitHub Copilot repository instructions;
  • .cursor/rules/*.mdc or Cursor rules for persistent project guidance;
  • task-specific specs in docs/specs/;
  • examples close to the code they demonstrate.

OpenAI’s Codex docs say Codex reads AGENTS.md files before doing work, so teams can layer global guidance with project-specific overrides. GitHub’s docs explain that Copilot repository custom instructions live at .github/copilot-instructions.md. Cursor’s official docs describe persistent Project, Team, and User Rules, plus AGENTS.md. Anthropic’s Claude Code docs describe project memory through files such as CLAUDE.md.

The exact file name depends on the tool. The principle is the same: do not make every agent session start from zero.

The Context Stack

A useful way to organize context is as a stack.

  1. Global rules
    Team-wide preferences: language, style, security posture, review standards, dependency policy.

  2. Project rules
    Repository-specific setup: commands, architecture, folder structure, testing tools, conventions.

  3. Domain rules
    Business logic: pricing rules, permission rules, data model assumptions, compliance constraints.

  4. Task context
    The specific issue, expected behavior, relevant files, examples, and acceptance criteria.

  5. Runtime feedback
    Test failures, build errors, review comments, logs, and user corrections.

Bad workflows mix all of this into one long prompt. Better workflows store stable context in versioned files and keep task prompts short and specific.
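One way to make the layering concrete is to treat it as data: later layers override earlier ones, the same way project-specific guidance can override global guidance. A minimal TypeScript sketch; the layer names and merge rule here are illustrative, not any tool's actual behavior:

```typescript
// Each layer maps a rule key to its value; later layers win on conflict.
type ContextLayer = Record<string, string>;

// Merge layers in order: global → project → domain → task → runtime.
// Object.assign applies later entries over earlier ones, which gives
// exactly the override behavior described above.
function mergeContext(...layers: ContextLayer[]): ContextLayer {
  return Object.assign({}, ...layers);
}

const globalRules = { style: "strict TypeScript", deps: "ask before adding" };
const projectRules = { testCmd: "npm test", style: "strict TypeScript, no any" };
const taskRules = { scope: "billing tab URL state only" };

const effective = mergeContext(globalRules, projectRules, taskRules);
// The project's more specific style rule overrides the global one.
console.log(effective.style); // "strict TypeScript, no any"
```

The same precedence should hold in your instruction files: a task spec may tighten a project rule, but it should never silently contradict it.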

Create a Repo Map

A repo map is a short guide that tells an agent where things live. It prevents the agent from wandering through the repository and editing the wrong layer.

For a Next.js / TypeScript project, a repo map might look like this:

```md
# Repo Map

## App structure

- `app/` — Next.js routes and page-level layouts.
- `components/` — reusable React components.
- `components/ui/` — low-level UI primitives. Do not add business logic here.
- `lib/` — framework-agnostic utilities.
- `server/` — server-only data access and integrations.
- `db/` — database schema, migrations, and query helpers.
- `tests/` — test utilities and integration tests.

## Important patterns

- Page components should fetch data and pass typed props to components.
- Business logic belongs in `lib/` or `server/`, not inside UI primitives.
- Use Zod schemas for external input validation.
- Keep database writes behind server actions or API handlers.

## Files requiring extra review

- `db/migrations/**`
- `server/auth/**`
- `server/billing/**`
- `.github/workflows/**`
- `.env*`
```

This map can live inside AGENTS.md, CLAUDE.md, or a separate docs/repo-map.md linked from the agent instructions.

The repo map does not need to be perfect. It only needs to be accurate enough to reduce wrong guesses.
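You do not need tooling to write a repo map, but a small script can bootstrap the skeleton so a human only has to fill in descriptions. A sketch, assuming a Node environment; the `notes` map is hypothetical and hand-maintained:

```typescript
// Turn a list of top-level directories into a repo-map skeleton.
// Descriptions come from a hand-maintained notes map; directories
// without a note get a TODO so the gap is visible in review.
function formatRepoMap(dirs: string[], notes: Record<string, string>): string {
  const lines = [...dirs]
    .sort()
    .map((d) => `- \`${d}/\` — ${notes[d] ?? "TODO: describe this directory."}`);
  return ["# Repo Map", "", "## App structure", ...lines].join("\n");
}

const notes = {
  components: "reusable React components.",
  lib: "framework-agnostic utilities.",
};
console.log(formatRepoMap(["lib", "components", "scripts"], notes));
```

Feed it the output of `fs.readdirSync(".")` (filtered to directories) and commit the result once the TODOs are filled in.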

Write a Practical AGENTS.md

AGENTS.md works like a README for coding agents. It should tell the agent how to work in the repository.

A practical version:

```md
# AGENTS.md

## Setup commands

- Install dependencies: `npm install`
- Start dev server: `npm run dev`
- Type check: `npm run typecheck`
- Lint: `npm run lint`
- Unit tests: `npm test`
- Production build: `npm run build`

## Project conventions

- TypeScript is strict. Do not use `any` to hide type errors.
- Prefer small functions and explicit return types for exported utilities.
- Use existing UI components from `components/ui` before creating new primitives.
- Do not add new dependencies without explaining the reason.
- Keep changes scoped to the task.

## Architecture rules

- UI components should not call the database directly.
- Server-only logic belongs in `server/` or server actions.
- External input must be validated with Zod.
- Do not put business logic inside route components if it can be tested separately.

## Testing rules

- If business logic changes, add or update tests.
- If a bug is fixed, add a regression test when practical.
- Run the narrowest relevant test first, then broader checks.

## Safety rules

- Do not edit `.env*`, auth, billing, migrations, CI, or deployment files unless explicitly requested.
- Do not print secrets.
- Do not run destructive shell commands.
- Ask before deleting files or doing broad refactors.

## Final response

Before finishing, report:

1. Files changed.
2. Tests run.
3. Known risks.
4. Manual checks still needed.
```

This is not a prompt for one task. It is the operating agreement for the repository.

Add Tool-Specific Instructions Without Duplicating Everything

Different tools support different instruction files.

For GitHub Copilot, repository instructions usually go here:

```txt
.github/copilot-instructions.md
```

Example:

```md
# GitHub Copilot Instructions

Follow the project conventions in `AGENTS.md`.

When generating code:

- Prefer TypeScript-safe solutions.
- Use existing components and utilities.
- Avoid new dependencies unless asked.
- Add tests for changed business logic.
- Do not suggest changes to auth, billing, migrations, or CI unless the issue is about those areas.
```

For Claude Code, a repository-level CLAUDE.md can point to the same source of truth:

```md
# CLAUDE.md

Read `AGENTS.md` before editing.

Use `docs/repo-map.md` for repository structure.
Use `docs/testing.md` for test commands and expectations.

Before editing, propose a plan.
After editing, run the relevant checks and summarize the diff.
```

For Cursor, project rules can reference the same files and add IDE-specific behavior. A rule might say:

```md
---
description: Project architecture and review rules
globs:
  - "**/*.{ts,tsx}"
alwaysApply: true
---

Follow `AGENTS.md` and `docs/repo-map.md`.
Keep changes scoped.
Prefer existing patterns in nearby files.
Do not loosen TypeScript types to make errors disappear.
```

The important point: do not maintain five contradictory instruction sets. Keep one main source of truth, then have tool-specific files refer to it or summarize it.
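If you worry about tool-specific files drifting away from the source of truth, a tiny check can enforce that each one still points back to it. A sketch; the file paths are examples, and how you load the contents is up to you:

```typescript
// Given the contents of each tool-specific instruction file, return the
// paths of files that never mention the canonical source of truth.
function filesMissingReference(
  files: Record<string, string>,
  canonical: string = "AGENTS.md",
): string[] {
  return Object.entries(files)
    .filter(([, content]) => !content.includes(canonical))
    .map(([path]) => path);
}

const result = filesMissingReference({
  ".github/copilot-instructions.md": "Follow the conventions in `AGENTS.md`.",
  "CLAUDE.md": "Read `AGENTS.md` before editing.",
  ".cursor/rules/project.mdc": "Keep changes scoped.",
});
console.log(result); // [".cursor/rules/project.mdc"]
```

Run it in CI and the drift becomes a failing check instead of a silent contradiction.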

Task Specs Beat Vague Prompts

A coding agent should receive a task spec, not a wish.

Bad prompt:

```txt
Just fix the billing page.
```

Better task spec:

```md
# Task: Preserve billing tab in URL

## Problem

On `/settings/billing`, switching between `overview`, `invoices`, and `payment-methods` works visually, but the selected tab is not reflected in the URL. Reloading the page resets the tab to `overview`.

## Expected behavior

- Selecting a tab updates `?tab=` in the URL.
- Reloading keeps the selected tab.
- Invalid tab values fall back to `overview`.
- Existing billing data fetching should not change.

## Relevant files

- `app/settings/billing/page.tsx`
- `components/billing/BillingTabs.tsx`
- `lib/url-state.ts`

## Constraints

- Keep the change client-side unless existing architecture requires otherwise.
- Do not edit billing provider integration code.
- Do not change payment logic.
- Do not add a dependency.

## Acceptance criteria

- Add or update tests for tab parsing and URL updates.
- Run `npm run typecheck`.
- Run the relevant billing tab tests.
- Summarize all changed files.
```

This gives the agent a target, boundaries, and a definition of done.

Give Examples the Agent Can Copy

AI coding agents are pattern matchers as much as reasoners. If you want consistent code, show the agent a nearby example that already follows your standard.

Example prompt:

```txt
Before editing `BillingTabs.tsx`, inspect `components/dashboard/StatusFilter.tsx`.
Use the same URL state pattern and accessibility style.
Do not invent a new pattern unless the existing one cannot work.
```

You can also create a small examples file:

````md
# docs/examples/url-state.md

Use this pattern when a small UI state should persist in the URL.

```tsx
const params = new URLSearchParams(searchParams.toString());

if (nextValue === defaultValue) {
  params.delete(paramName);
} else {
  params.set(paramName, nextValue);
}

router.replace(params.toString() ? `${pathname}?${params}` : pathname, {
  scroll: false,
});
```

Rules:

- Validate query params before using them.
- Remove params for default values.
- Use `replace`, not `push`, for filter changes unless the UX requires browser history.
````

Examples are better than abstract style advice. “Follow the pattern in this file” is usually more effective than “write clean code.”
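The URL-state pattern above can also be captured as a framework-free helper, which makes it trivial to unit test without mocking a router. A sketch; the function name is ours, and it mirrors the `URLSearchParams` logic in the example file:

```typescript
// Build the next URL for a filter change. Default values are removed
// from the query string instead of being written out explicitly.
function nextUrl(
  pathname: string,
  search: string,
  paramName: string,
  nextValue: string,
  defaultValue: string,
): string {
  const params = new URLSearchParams(search);
  if (nextValue === defaultValue) {
    params.delete(paramName);
  } else {
    params.set(paramName, nextValue);
  }
  const qs = params.toString();
  return qs ? `${pathname}?${qs}` : pathname;
}

console.log(nextUrl("/settings/billing", "", "tab", "invoices", "overview"));
// → /settings/billing?tab=invoices
console.log(nextUrl("/settings/billing", "tab=invoices", "tab", "overview", "overview"));
// → /settings/billing
```

A component then only needs to pass the helper's result to `router.replace`, and the interesting logic stays testable in isolation.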

Constraints Prevent Expensive Mistakes

Constraints are the boundaries that protect the codebase.

Useful constraints include:

  • do not change public API shape;
  • do not add dependencies;
  • do not touch database migrations;
  • do not modify authentication logic;
  • do not change CSS outside this component;
  • keep existing accessibility behavior;
  • preserve backwards compatibility;
  • do not rename exported symbols;
  • keep changes under a specific folder.

A good constraint is specific enough that you can verify it in the diff.
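"Verifiable in the diff" can be taken literally: given the list of changed files (for example from `git diff --name-only`), a few lines can flag violations automatically. A sketch with hypothetical forbidden paths:

```typescript
// Return the changed files that fall under paths the task forbids.
// Prefix matching keeps it simple; swap in globs if you need them.
function constraintViolations(
  changedFiles: string[],
  forbiddenPrefixes: string[],
): string[] {
  return changedFiles.filter((file) =>
    forbiddenPrefixes.some((prefix) => file.startsWith(prefix)),
  );
}

const violations = constraintViolations(
  ["components/billing/BillingTabs.tsx", "db/migrations/0042_add_tab.sql"],
  ["server/billing/", "db/migrations/"],
);
console.log(violations); // ["db/migrations/0042_add_tab.sql"]
```

An empty result does not prove the change is correct, but a non-empty one proves a constraint was broken before any human reads the diff.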

Weak constraint:

```txt
Be careful.
```

Better constraint:

```txt
Do not edit files under `server/billing/` or `db/migrations/`.
This task is only about the UI tab state.
```

Tests Are Context Too

Tests are not just validation after the edit. They are context before the edit.

Before asking the agent to implement, tell it where the tests are and what behavior they should protect.

Example:

```txt
Before editing, inspect existing tests for `StatusFilter` and `BillingTabs`.
If no direct test exists, inspect nearby component tests for mocking `next/navigation`.

Add the smallest useful test for:
1. reading the tab from the URL;
2. updating the URL when the user selects a tab;
3. falling back on invalid tab values.
```

Useful test commands should be stored in project instructions:

```bash
npm run typecheck
npm run lint
npm test -- BillingTabs
npm run build
```

If the project has no tests, say that explicitly and require manual verification steps. Do not let the agent pretend a change is safe just because no tests failed.

Acceptance Criteria Make Review Easier

Acceptance criteria turn a vague task into a review checklist.

Good criteria are observable:

  • selected status persists after reload;
  • invalid query params fall back to default;
  • no new dependency was added;
  • typecheck passes;
  • relevant tests pass;
  • only dashboard filter files changed.

Bad criteria are subjective:

  • make it cleaner;
  • improve UX;
  • make it production ready;
  • optimize the code.

The agent can help implement, but the human reviewer still needs a clear way to decide whether the task is done.

A Complete Context Package

For a real coding agent task, your prompt can be short if the context files are good.

Example:

```txt
Task: implement `docs/specs/billing-tab-url-state.md`.

Before editing:
1. Read `AGENTS.md`.
2. Read `docs/repo-map.md`.
3. Inspect the relevant files listed in the spec.
4. Propose a small plan and wait for approval.

After approval:
- Make the minimal change.
- Add or update tests.
- Run the relevant checks.
- Summarize the diff and risks.
```

The actual spec might live in version control:

```md
# docs/specs/billing-tab-url-state.md

## Goal

Persist the selected billing tab in the URL query string.

## User behavior

When a user selects a billing tab, the URL updates with `?tab=`. Reloading the page keeps the selected tab. Invalid values fall back to `overview`.

## Relevant files

- `app/settings/billing/page.tsx`
- `components/billing/BillingTabs.tsx`
- `components/dashboard/StatusFilter.tsx` as an example pattern

## Constraints

- Do not edit billing provider integrations.
- Do not edit database code.
- Do not add dependencies.
- Keep the visual design unchanged.

## Acceptance criteria

- Tests cover URL parsing and updates.
- `npm run typecheck` passes.
- Relevant component tests pass.
- Diff is limited to UI state and tests.
```

This approach is slower than typing “just fix this,” but it usually saves time because the diff is cleaner and easier to review.

Context Rot: When Too Much Context Makes the Agent Worse

More context can make the agent worse when it is stale, irrelevant, or contradictory.

Common context rot examples:

  • old architecture docs that no longer match the code;
  • pasted logs from a previous bug;
  • screenshots without the underlying files;
  • a giant list of rules no human maintains;
  • tool-specific instructions that contradict each other;
  • examples from deprecated folders;
  • broad prompts that include every possible future requirement.

The fix is not to hide context. The fix is to curate it.

Ask the agent to use targeted retrieval:

```txt
Search for the existing URL-state pattern first.
Do not read unrelated dashboard files unless the first search fails.
If you find multiple patterns, summarize them and ask which one to follow.
```

Good context engineering is selective. It gives the agent enough to make the next decision, not the entire history of the product.

Anti-Patterns to Avoid

Avoid these patterns when working with AI coding agents.

  1. “Just fix this”
    No problem statement, no expected behavior, no constraints, no test plan.

  2. “Refactor this whole area”
    Too broad. Break it into scoped steps.

  3. “Make it production-ready”
    Undefined. Production-ready for what risk level?

  4. “Use your best judgment”
    Sometimes necessary, but dangerous when architecture or product rules matter.

  5. Pasting a whole file tree without guidance
    Volume is not the same as relevance.

  6. Letting the agent change tests to match broken code
    Tests should protect expected behavior, not become decoration.

  7. Mixing unrelated tasks in one session
    Debugging auth and redesigning a dashboard should not share the same context.

  8. Skipping diff review
    The final summary is not proof. The diff is proof.

Review the Context Files Themselves

Because context files influence code, they deserve review.

Review AGENTS.md, CLAUDE.md, Copilot instructions, and Cursor rules for:

  • outdated commands;
  • wrong folder descriptions;
  • banned libraries that are no longer banned;
  • missing safety rules;
  • too many vague instructions;
  • conflicts between tools;
  • rules nobody follows.

A bad context file can quietly damage every future agent session. Keep it short, accurate, and maintained.

Related Workflows

Context engineering connects to the rest of an AI coding practice:

  • Claude Code workflows: running issue → plan → edit → test → review.
  • Codex skills: packaging repeatable coding procedures.
  • AI code review: reviewing code written by agents.
  • Custom agent skills: turning recurring tasks into reusable workflows.

The common thread: successful AI coding is not about one perfect prompt. It is a system of context, constraints, tests, review, and repeatable workflows.

Conclusion: Better Context Creates Better Diffs

AI coding agents are getting better, but they still need the right working environment.

If the agent gets a vague task, stale docs, no tests, no examples, and no boundaries, it will guess. Sometimes the guess will look confident. That is the dangerous part.

If the agent gets a clear spec, repo map, project instructions, examples, constraints, tests, and acceptance criteria, it can produce a diff that is smaller, safer, and easier to review.

That is context engineering.

Do not ask the agent to read your mind. Give it the engineering artifacts a good developer would ask for: what to build, where to look, what to avoid, how to test, and what done means.