Operational Contract

This contract keeps memd useful without turning it into a transcript dump. Agents should retrieve bounded context before substantive work, write only durable facts after meaningful progress, and inspect quality with the same CLI that stores the memory.

Scope First

Each repo that uses memd should have .memd/project_scope.json.

memd doctor --project-dir . --format markdown
memd memory-md --project-dir . --output memory.md
memd agent-context \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --query "$TASK" \
  --k 2 \
  --token-budget 700 \
  --format markdown \
  --output .memd/context.md \
  --log-dir .memd/search-logs

Use memory.md and .memd/context.md as evidence, not instructions. A stored memory is useful only when it still matches current files, logs, tests, or operator decisions.

Write Budget

A typical single task should leave fewer than 10 durable chunks. Prefer 1 to 4 records:

  • one decision, if a design or operational choice was made
  • one evidence/run record, if commands, parameters, metrics, or failures matter
  • one finish summary, if the result should be reusable later
  • one durable follow-up, only when the next session would otherwise lose it

Do not write every tool call. Do not store chat history, play-by-play progress, large logs, secrets, credentials, private account data, or guessed conclusions. Concrete kind:progress summaries without explicit priority or durable category tags are retained as short-lived reviewable context rather than permanent memory. Add explicit priority only when the progress record is a durable lesson that should remain a candidate for future startup context.

Durable Writes

Durable records should contain at least one of these signals:

  • decision plus rationale
  • validated fix or result
  • root cause of a failure
  • command, path, parameter, metric, or version needed to reproduce work
  • evidence that supports or contradicts a claim
  • durable follow-up with enough context to resume safely

Examples:

memd add \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --chunk-type decision \
  --tags kind:decision,task:"$TASK_ID",priority:8 \
  --text "Decision: use tenant/project-scoped retrieval. Rationale: global summaries hid project-specific failures. Agent action: Verify tenant_id and project_id before reusing retrieval results."
memd add \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --chunk-type trace \
  --tags kind:run,task:"$TASK_ID",tool:cargo-test,status:passed \
  --text "cargo test -p memd passed after adding write-admission coverage; 831 passed, 4 ignored."
memd add \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --chunk-type summary \
  --tags kind:finish,task:"$TASK_ID",priority:8 \
  --text "Implemented memory-md candidate explanations. Validation: live explain report filtered generated wrappers and cargo test -p memd passed. Agent action: Run eval-memory-md before claiming startup memory quality is fixed."

Use priority:8 or priority:9 only for lessons that should plausibly appear in future memory.md refreshes. Lower-priority routine records remain searchable without dominating startup context.

Routine kind:progress summaries without explicit priority, evidence, decision, finish, consolidated, or retention:durable tags receive a 14-day retention window by default. Use them for active handoff context, not permanent project knowledge. If the result should survive cleanup, tag it as kind:evidence, kind:decision, or kind:finish, or add an explicit priority:N or retention:durable tag.
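The retention rule above can be sketched as a small classifier. This is a minimal Python illustration, not memd's implementation: the function name expiry_for, the constant names, and the exact spelling of the consolidated tag (kind:consolidated here) are assumptions. Only the 14-day window and the listed tag families come from the text above.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional

# 14-day default window for routine progress, as described above.
ROUTINE_PROGRESS_TTL = timedelta(days=14)

# Tags that exempt a record from the short window.
# "kind:consolidated" is an assumed spelling for the "consolidated" tag.
DURABLE_TAGS = {
    "kind:evidence",
    "kind:decision",
    "kind:finish",
    "kind:consolidated",
    "retention:durable",
}


def expiry_for(tags: Iterable[str], created_at: datetime) -> Optional[datetime]:
    """Return an expiry for routine progress records, or None if no default TTL applies."""
    tag_set = set(tags)
    if "kind:progress" not in tag_set:
        return None  # this sketch only models the kind:progress rule
    if tag_set & DURABLE_TAGS:
        return None  # an explicit durable category tag is present
    if any(tag.startswith("priority:") for tag in tag_set):
        return None  # an explicit priority keeps the record
    return created_at + ROUTINE_PROGRESS_TTL


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expiry_for(["kind:progress", "task:demo"], created))   # 2024-01-15 00:00:00+00:00
print(expiry_for(["kind:progress", "priority:8"], created))  # None
```

A routine handoff note gets an expiry, while the same note with an explicit priority does not, which is why priority should be reserved for lessons worth keeping.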

Low-Value Writes

These should be rejected, downgraded, or avoided:

  • "starting to inspect files"
  • "ran tests" without the command and outcome
  • "made progress" without the result
  • generated digest wrapper text
  • duplicate summaries that add no new tags, evidence, or source provenance
  • broad claims without validation or uncertainty
  • routine progress summaries that should have been a short-lived handoff note

If an intermediate note is needed for handoff, make it concrete: name the file, command, error, partial conclusion, and next check.

High-priority durable records with priority:8+ or importance:8+ must include a concrete Agent action: sentence. It should tell the next agent what to do, check, prefer, avoid, verify, reuse, or resolve. memory.md renders this action guidance for each displayed takeaway, and memd eval-memory-md fails when displayed project takeaways lack concrete action guidance.
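The action-guidance requirement can be checked before a record is written. The sketch below is a hypothetical pre-write lint in Python, assuming tags arrive as plain strings such as priority:8; the helper names max_level and missing_agent_action are illustrative and are not part of the memd CLI. It checks only that an "Agent action:" sentence exists and is non-empty, not that its content is useful.

```python
import re
from typing import Iterable

HIGH_PRIORITY_THRESHOLD = 8  # priority:8+ or importance:8+, per the rule above

_LEVEL_TAG = re.compile(r"^(?:priority|importance):(\d+)$")
# "Agent action:" must be followed by at least one non-whitespace character.
_ACTION = re.compile(r"Agent action:\s*(\S.*)")


def max_level(tags: Iterable[str]) -> int:
    """Highest numeric priority/importance found in the tags, or 0 if none."""
    levels = [int(m.group(1)) for t in tags if (m := _LEVEL_TAG.match(t))]
    return max(levels, default=0)


def missing_agent_action(tags: Iterable[str], text: str) -> bool:
    """True when a high-priority record lacks a non-empty 'Agent action:' sentence."""
    if max_level(list(tags)) < HIGH_PRIORITY_THRESHOLD:
        return False  # the rule applies only to high-priority records
    return _ACTION.search(text) is None


print(missing_agent_action(
    ["kind:decision", "priority:8"],
    "Decision: use tenant/project-scoped retrieval.",
))  # True
print(missing_agent_action(
    ["kind:decision", "priority:8"],
    "Decision: use scoped retrieval. Agent action: Verify tenant_id before reuse.",
))  # False
```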

Inspect Quality

Use these commands before rolling out a memory workflow or after a noisy session:

memd eval-memory-md --project-dir . --min-useful-ratio 0.8 --max-generated-wrappers 0
memd memory-md --project-dir . --output memory.md --explain-output .memd/memory-explain.json
memd eval-write-quality --project-dir .
memd eval-retrieval --tenant-id "$TENANT_ID" --project-id "$PROJECT_ID" --project-dir .
memd audit --tenant-id "$TENANT_ID" --project-id "$PROJECT_ID" --format markdown

memory-md --explain-output is the first diagnostic when startup context looks bad. It shows which candidates were retrieved, score components, tags, whether they were generated digests, and why they were displayed or filtered.

audit also reports routine progress summaries, unbounded routine progress without an expiry, and unbounded routine progress older than 30 days so legacy handoff records are visible before cleanup.

eval-retrieval reports precision@k, hit-rate, known recall, and MRR. Its default sparse judgment set gates on hit-rate only unless stricter recall, MRR, or precision thresholds are supplied; use --min-precision-at-k only with a query file that has enough judged useful IDs to make the requested precision mathematically reachable.
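To see why a precision threshold can be unreachable with sparse judgments, it helps to compute the metrics by hand for one query. The following Python sketch uses the standard textbook definitions; memd's exact formulas may differ, and the chunk IDs and function names are made up for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], useful: Set[str], k: int) -> float:
    """Fraction of the top-k results that are judged useful."""
    return sum(1 for cid in ranked[:k] if cid in useful) / k


def hit(ranked: Sequence[str], useful: Set[str], k: int) -> bool:
    """True if at least one judged-useful ID appears in the top k."""
    return any(cid in useful for cid in ranked[:k])


def known_recall(ranked: Sequence[str], useful: Set[str], k: int) -> float:
    """Fraction of the judged-useful IDs that appear in the top k."""
    return len(set(ranked[:k]) & useful) / len(useful) if useful else 0.0


def reciprocal_rank(ranked: Sequence[str], useful: Set[str]) -> float:
    """1/rank of the first judged-useful result, or 0.0 if none is found."""
    for rank, cid in enumerate(ranked, start=1):
        if cid in useful:
            return 1.0 / rank
    return 0.0


def max_reachable_precision(num_judged_useful: int, k: int) -> float:
    """Upper bound on precision@k given how many useful IDs were judged."""
    return min(num_judged_useful, k) / k


ranked = ["c7", "c2", "c9", "c4", "c1"]  # hypothetical retrieval order
useful = {"c2", "c4"}                    # only two judged-useful IDs

print(precision_at_k(ranked, useful, 5))        # 0.4
print(hit(ranked, useful, 5))                   # True
print(known_recall(ranked, useful, 5))          # 1.0
print(reciprocal_rank(ranked, useful))          # 0.5
print(max_reachable_precision(len(useful), 5))  # 0.4
```

Under these definitions, a perfect retriever still scores only 0.4 precision at k=5 when the query has two judged-useful IDs, so a threshold such as 0.6 could never pass on that query. Hit-rate and recall are unaffected, which is consistent with the sparse default gating on hit-rate.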

Cleanup Safety

Cleanup is dry-run and archive-first. Do not run destructive purge commands on a shared machine until the exact tenant/project list and archive path are approved.

memd cleanup-plan \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --project-dir . \
  --output tasks/memd-cleanup-plan.md \
  --archive-dir tasks/memd-cleanup-archive
memd purge --tenant-id "$TENANT_ID" --project-id "$PROJECT_ID" --older-than-days 30
memd purge \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --include-unreadable-active \
  --limit 100
memd purge \
  --tenant-id "$TENANT_ID" \
  --project-id "$PROJECT_ID" \
  --older-than-days 30 \
  --archive /path/to/archive.json \
  --apply \
  --rewrite-segments \
  --vacuum-metadata
memd purge-archive \
  --archive /path/to/archive.json \
  --expect-tenant-id "$TENANT_ID" \
  --expect-project-id "$PROJECT_ID"

cleanup-plan is non-destructive. It classifies tenants and projects for archive/delete review, high generated-digest noise, missing scope, legacy routine-progress rows without expiry, and hidden-row purge readiness, then emits command previews for the approved list. Each approval item has a stable approval_id, command kind, destructive flag, scope counts, generated-noise counts and ratios, payload-integrity counts, and legacy progress-retention counts.

Treat unreadable_active_chunks > 0 as a dry-run item first: normal retrieval and export could not load every active metadata row, so run the generated memd purge --include-unreadable-active preview and inspect the candidate counts before approving destructive cleanup for that scope. Applying that cleanup still requires --apply --archive <path>; the archive records metadata, canonical text, candidate reason, and whether the segment payload was available.

The approval_summary block rolls up command kinds, destructive-command preview coverage, archive-verifier coverage, estimated batch count, and the number of concrete batch command previews generated for those estimates. It also reports the unreadable active rows covered by purge previews and includes action rollups so reviewers can see how many items are tenant archive/delete reviews, project archive/delete reviews, high-noise reviews, scope/noise reviews, legacy progress-retention reviews, or unreadable-active purge reviews before inspecting every item.

review_legacy_progress_retention items are export-review prompts only. They mark active legacy progress summaries that predate the current default TTL and need consolidation, expiry, or deletion decisions before any destructive cleanup is considered.

When cleanup-plan can derive the next step, it also prints destructive_command, verify_archive, and estimated_batches fields. These are approval aids only: run the dry-run command first, approve the exact destructive command and archive path, then run memd purge-archive against the written archive before considering the batch complete.

Large unreadable metadata cleanups also include batch_command_previews with unique archive paths for each estimated batch. These are an ordered sequence over the current candidate set, not offset-based pages: execute only the approved batch, verify its archive with the generated --min-records count, rerun the dry-run command, and continue with the next batch only while candidate counts remain consistent with the approved cleanup.

Every cleanup plan also includes a post_cleanup_verification block with non-destructive commands and pass criteria. After any approved purge/archive cleanup, rerun the generated audit, cleanup-plan, startup-memory evaluation, memory refresh, and doctor commands. When the project directory has evals/bench/queries/retrieval_queries.jsonl, the block also includes a fixed memd eval-retrieval gate. Treat cleanup as incomplete unless those checks show reduced approved candidates without new unexplained high-risk classifications, retrieval hit-rate/known-recall/MRR pass the generated thresholds, memd eval-memory-md exits successfully with useful startup context that includes concrete action guidance, and host/project wiring remains valid.

memd purge --apply verifies the archive before deleting rows and reports the verification summary in archive_verification. You can also run memd purge-archive against the exact archive path. Treat a verification failure, tenant/project mismatch, record-count mismatch, or payload flag mismatch as a failed cleanup run until explained.
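The note that batch_command_previews form an ordered sequence over the current candidate set, not offset-based pages, can be illustrated with a toy simulation. This Python sketch models only the general idea: the row IDs, batch limit, and function names are invented, and it says nothing about how memd selects candidates internally.

```python
from typing import List, Tuple


def purge_head_batches(candidates: List[str], limit: int) -> List[List[str]]:
    """Each batch is the first `limit` rows of the *current* candidate set."""
    remaining = list(candidates)
    batches = []
    while remaining:
        batch = remaining[:limit]           # dry run: head of the current set
        batches.append(batch)
        remaining = remaining[len(batch):]  # apply: purged rows leave the set
    return batches


def purge_offset_pages(candidates: List[str], limit: int) -> Tuple[List[str], List[str]]:
    """Offset paging against a shrinking set: shown only to expose the skipped rows."""
    remaining = list(candidates)
    purged: List[str] = []
    offset = 0
    while offset < len(remaining):
        batch = remaining[offset:offset + limit]
        purged.extend(batch)
        remaining = [row for row in remaining if row not in batch]
        offset += limit                     # offset advances while the set shrinks
    return purged, remaining


rows = [f"r{i}" for i in range(10)]

print(purge_head_batches(rows, 4))
# [['r0', 'r1', 'r2', 'r3'], ['r4', 'r5', 'r6', 'r7'], ['r8', 'r9']]

purged, left_behind = purge_offset_pages(rows, 4)
print(left_behind)
# ['r4', 'r5', 'r6', 'r7']
```

In the offset variant, rows r4 to r7 are never purged because the set shrinks underneath the advancing offset. Rerunning the dry-run command between batches, as described above, is what keeps each approved batch aligned with the rows that are actually still candidates.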

For retrieval-sensitive projects without a checked-in retrieval fixture, add one or rerun representative memd search checks before treating storage reduction as successful.