Context and memory management for agents

Context engineering is the central reliability discipline for agents: what the model knows at the moment of action determines output quality. More tokens are not better — performance degrades as the window fills, and irrelevant history actively distracts the model. Curate context as a finite budget, not a log.

Core requirements

  • The agent MUST treat the context window as a finite, actively curated budget — not an append-only transcript.
  • The agent MUST NOT front-load context "just in case." Retrieve files, tool output, and references just-in-time, at the point they are needed.
  • Large or read-heavy subtasks (codebase exploration, multi-file investigation, log analysis) SHOULD be delegated to a subagent that runs in an isolated context and returns only a condensed summary.
  • The agent SHOULD clear or reset context between unrelated tasks rather than carrying stale history forward.
  • Information that must outlive the current window (decisions, conventions, task state) MUST be written to a persistent memory file, not relied upon to survive in-conversation.

Compaction is lossy — protect critical state

When context approaches its limit, harnesses summarize older history (compaction). Summaries discard detail, so anything not explicitly preserved can vanish silently.

  • Before a compaction is likely, the agent MUST ensure the list of modified files and the test/run/build commands are recorded somewhere durable (a memory file or compaction directive), so they survive the summary.
  • The agent SHOULD assume compaction loses specifics: exact line numbers, intermediate reasoning, and raw tool output are not guaranteed to persist.
  • Where the harness supports it, the agent SHOULD declare compaction-preservation directives (e.g. "when compacting, always keep the full modified-file list and test commands").

Subagent isolation

Use a subagent when... Keep in main context when...
Exploring many files to answer one question The result must stay live for ongoing editing
Verifying/reviewing a diff in a fresh context The history itself is the work product
Running a noisy, token-heavy investigation The subtask is a single cheap lookup
  • A subagent MUST return a condensed summary (findings, file paths, decisions) — not raw dumps of everything it read.
  • The caller MUST NOT assume the subagent's intermediate context is available afterward; only the returned summary survives.

Persistent memory across sessions

  • Cross-session continuity (plans, conventions, in-progress task state) MUST live in a persistent file the agent re-reads at session start, since the context window does not persist between sessions.
  • Memory files MUST stay concise and high-signal. An overstuffed memory file dilutes attention and causes the agent to ignore the rules that matter — keep only what changes behavior, and prune regularly.
  • The agent MUST NOT duplicate information the model can cheaply re-derive (e.g. file contents, standard conventions) into memory; store only what cannot be inferred.

Pruning between tasks

  • On switching to an unrelated task, the agent SHOULD reset context so prior files and tool output do not bias the new work.
  • After repeated failed correction attempts on one problem, the agent SHOULD clear the polluted context and restart from a sharper prompt rather than layering more corrections — failed approaches in-window degrade subsequent reasoning.

Note: larger context windows (forecast to keep growing) raise the threshold at which curation becomes urgent, but do not remove the degradation-with-fill effect. Treat a bigger window as more headroom, not a license to stop curating — this is a durable property, not a vendor-specific number.

version
1.0.0
tags
agents, context, ai-workflow
author
Mike Fullerton
modified
2026-06-09

Change History

Version Date Author Summary
1.0.0 2026-06-09 Mike Fullerton Initial creation