Compaction bounds the size of the event buffer when events accumulate faster than extraction removes them. It reads the buffer entries, summarizes them into fewer, denser entries via an LLM, and atomically replaces the buffer contents. The result is a smaller buffer that preserves the important signal from what came before. Compaction is asynchronous and does not block event ingestion. It runs when the buffer watcher fires a compaction trigger.
When it fires
The buffer watcher tracks accumulated bytes per project. When the buffer exceeds the compaction size threshold (default: 1 MiB), the watcher fires a compaction trigger. Compaction triggers independently of extraction. A single append can cross the extraction threshold (256 KiB) and the compaction threshold (1 MiB) at the same moment. Both fire in parallel.

Three thresholds bound buffer size:

| Threshold | Default | Behavior |
|---|---|---|
| Extraction size | 256 KiB | Fires extraction. Buffer is cleared on success. |
| Compaction size | 1 MiB | Fires compaction. Buffer is summarized in-place and replaced. |
| Hard ceiling | 4 MiB | Appends are refused. Events are still persisted to the database. |
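The dispatch implied by the table can be sketched as a single check run after each append. This is a minimal illustration; the constants mirror the documented defaults, but the function and field names are invented for the example:

```python
# Illustrative threshold dispatch; constants mirror the documented defaults.
EXTRACTION_BYTES = 256 * 1024       # 256 KiB: fire extraction
COMPACTION_BYTES = 1024 * 1024      # 1 MiB: fire compaction
HARD_CEILING = 4 * 1024 * 1024      # 4 MiB: refuse appends

def triggers_for(buffer_bytes: int) -> dict:
    """Return which triggers fire for a buffer of the given size.

    Extraction and compaction are independent: a single append can
    cross both thresholds, in which case both fire in parallel.
    """
    return {
        "extraction": buffer_bytes >= EXTRACTION_BYTES,
        "compaction": buffer_bytes >= COMPACTION_BYTES,
        "refuse_append": buffer_bytes >= HARD_CEILING,
    }
```

Note that crossing the compaction threshold implies the extraction threshold was also crossed, which is why both can fire at the same moment.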
How it works
1. Snapshot with byte offset
The worker reads the buffer entries and the current file size (S0) in a single atomic read. S0 marks the boundary between entries the worker will summarize and entries that may be appended during the model call. Capturing both values from the same read prevents appends from slipping between them.
2. LLM summarization
The worker serializes the buffer entries as XML — the same <tool_observation> format used by extraction — and sends them to the kiro-learn-compactor agent via ACP. The prompt instructs the model to:
- Summarize the entries into fewer, denser entries
- Preserve decisions, errors, patterns, and discoveries
- Merge related entries
- Drop redundant or low-value entries
- Emit each summary as a <compacted_entry> block
The worker parses each <compacted_entry> block from the response. Each becomes a new buffer entry with a synthetic ID (compact_ + ULID) and kind: session_summary. Namespace and surface are inherited from the original entries. The timestamp is set to the latest timestamp across the batch so ordering is preserved.
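The parsing step above can be sketched as follows. This is illustrative only: the field names, the `parse_compacted` function, and the `fake_ulid` stand-in (a real implementation would generate a proper ULID) are all assumptions, not the project's actual code:

```python
import re
import time
from typing import Dict, List

def fake_ulid(n: int) -> str:
    # Placeholder: a real implementation would generate a monotonic ULID.
    return f"{int(time.time() * 1000):013d}{n:013d}"

def parse_compacted(response: str, batch: List[Dict]) -> List[Dict]:
    """Turn <compacted_entry> blocks from the model response into buffer entries.

    Synthetic IDs get a compact_ prefix; kind is session_summary.
    Namespace and surface are inherited from the batch, and every summary
    gets the latest timestamp in the batch so ordering is preserved.
    """
    bodies = re.findall(r"<compacted_entry>(.*?)</compacted_entry>", response, re.DOTALL)
    latest_ts = max(e["timestamp"] for e in batch)
    return [
        {
            "id": f"compact_{fake_ulid(i)}",
            "kind": "session_summary",
            "namespace": batch[0]["namespace"],
            "surface": batch[0]["surface"],
            "timestamp": latest_ts,
            "body": body.strip(),
        }
        for i, body in enumerate(bodies)
    ]
```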
3. Atomic replace with catch-up
Between snapshot time (S0) and now, new events may have been appended. The buffer store's replace operation handles this concurrency:
The exclusive file lock (POSIX flock) serializes the replace against concurrent appends. The lock is held only for the read-write-rename window. POSIX guarantees rename is atomic on the same filesystem, so a concurrent reader sees either the old file or the new file — never a partial state.
Events appended during compaction are preserved: they are read from the catch-up window during replace and appended to the new buffer file after the compacted entries.
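The replace-with-catch-up scheme can be sketched on POSIX primitives. This is a simplified illustration of the documented behavior (flock, catch-up read from S0, atomic rename), not the project's actual buffer store:

```python
import fcntl
import os

def replace_with_catchup(path: str, compacted_lines: list, s0: int) -> None:
    """Atomically replace the buffer, preserving appends made after S0.

    A short-lived exclusive flock serializes against concurrent appenders;
    rename is atomic on the same filesystem, so a concurrent reader sees
    either the old file or the new file, never a partial state.
    """
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_EX)          # block concurrent appenders
        try:
            f.seek(s0)
            catchup = f.read()                 # events appended during the model call
            tmp = path + ".tmp"
            with open(tmp, "wb") as out:
                out.write("".join(compacted_lines).encode())
                out.write(catchup)             # catch-up entries follow the summaries
                out.flush()
                os.fsync(out.fileno())
            os.rename(tmp, path)               # atomic on the same filesystem
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The lock is held only for the read-write-rename window, so ingestion is blocked briefly at most, never for the duration of the model call.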
Feedback into the buffer
Compaction writes its output back to the same buffer file it read from. The result is not a separate artifact — it is a smaller buffer that extraction will process on its next trigger. After compaction, the buffer contains:
- session_summary entries produced by the compactor model.
- Catch-up entries — events appended during the compaction run, preserved intact.
Relationship to extraction
Extraction and compaction address different problems:

| Property | Extraction | Compaction |
|---|---|---|
| Fires when | 256 KiB accumulated or 5s idle | 1 MiB accumulated |
| Input | Raw buffered events | Buffer entries (possibly including prior summaries) |
| Output | Memory records in the database | Summary entries back in the buffer |
| Clears buffer | Yes, on success | No — replaces in-place |
| Purpose | Normal memory formation | Bound buffer size when extraction is behind |
Reliability
Reentrance guard
Only one compaction runs at a time across all projects. A compaction trigger that fires while another is in-flight is rejected immediately. This bounds resource usage — each compaction spawns an LLM child process and holds a file lock. The guard is released in a finally block. A crashed or failed compaction does not leave the worker blocked.
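The guard described above amounts to a non-blocking single-flight lock released in a finally block. A minimal sketch, with invented names (`ReentranceGuard`, `run_compaction`):

```python
import threading

class ReentranceGuard:
    """Global single-flight guard: one compaction at a time across projects.

    try_acquire() rejects immediately instead of queueing; release() runs
    in a finally block so a failed run never wedges the worker.
    """
    def __init__(self) -> None:
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        return self._lock.acquire(blocking=False)

    def release(self) -> None:
        self._lock.release()

guard = ReentranceGuard()

def run_compaction(work) -> bool:
    if not guard.try_acquire():
        return False  # another compaction is in flight: reject immediately
    try:
        work()
        return True
    finally:
        guard.release()  # released even if work() raises
```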
Retries and deterministic fallback
LLM responses are not reliable. The worker has three layers of defense:
Retry. If the model returns no <compacted_entry> blocks, the worker creates a fresh ACP session and tries again, up to 2 attempts by default.
Deterministic eviction. If all retries fail, the worker sorts entries by timestamp descending and keeps the most recent half (rounded up). No model call is involved. Entries are kept by reference — nothing is fabricated.
Circuit breaker. Consecutive model failures are tracked per project. After 3 consecutive failures, the worker skips the model call and uses deterministic eviction directly. A single success resets the counter.
Deterministic eviction produces a worse result than LLM summarization — the older half of the buffer is dropped without condensation — but it is bounded, free, and always succeeds. The buffer size always decreases.
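The deterministic fallback is simple enough to state in a few lines. A sketch of the "keep the most recent half, rounded up" rule (function and field names are illustrative):

```python
import math

def evict_oldest_half(entries: list) -> list:
    """Deterministic fallback: keep the most recent half, rounded up.

    No model call, no fabrication; survivors are kept by reference,
    and for more than one entry the result is strictly smaller.
    """
    keep = math.ceil(len(entries) / 2)
    newest_first = sorted(entries, key=lambda e: e["timestamp"], reverse=True)
    return newest_first[:keep]
```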
Timeouts
Every model call has a 2-minute timeout by default. If the compactor does not complete within that window, the child process is killed and the call is recorded as a failure. Timeout failures feed into the retry and circuit breaker logic.
Disabled by default
Compaction is off by default and must be enabled via configuration. The compactor model is expensive to run, and not every deployment needs LLM-driven summarization. When disabled, the watcher still tracks buffer size and the hard ceiling still prevents unbounded growth; the compaction trigger is ignored.
What gets preserved
Compaction is lossy at the individual-event level but retains the important signal. The prompt instructs the model to preserve:
- Decisions — architectural choices, tradeoffs, selected approaches
- Errors — failures, misconfigurations, incorrect paths
- Patterns — repeated behaviors, recurring themes
- Discoveries — new facts learned about the codebase
And to drop:
- Redundant entries — near-identical tool calls
- Low-value entries — routine reads that did not change understanding
Key design decisions
In-place replacement. Compacted output is written back to the same buffer file. Extraction has no knowledge of compaction — it reads a smaller buffer on the next trigger. The two workers remain decoupled.
Catch-up window instead of a global lock. Holding a lock across the entire model call would block ingestion for up to 2 minutes. Instead, the byte offset S0 is captured at snapshot time, appends continue during the model call, and the replace operation reads the catch-up window under a short-duration lock. Ingestion is never blocked by compaction.
Deterministic fallback. When the model fails, dropping the oldest half of the buffer is a worse outcome than a good summary, but it is better than leaving the buffer oversized. Compaction must make progress because the alternative is the hard ceiling, which refuses new events.
Shared infrastructure with extraction. Compaction uses the same ACP client, XML framing, and child-process lifecycle as extraction. Only the agent name (kiro-learn-compactor vs kiro-learn-compressor) and the prompt differ.
No event loss on failure. If compaction fails — reentrance rejection, model timeout, replace error — the buffer is left untouched. Events remain in the buffer and in the database. The next trigger retries.
Related pages
- Summarization — How turn summaries are produced and stored
- Extraction — The normal consumer of buffer entries
- Retrieval — How surviving memory records are searched
- Database — Where memory records persist after extraction
- Collector — The daemon hosting the buffer watcher and compaction trigger
- Event buffer — The per-project staging area compaction operates on