KV Cache

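Definition

A key-value cache (KV Cache) speeds up autoregressive inference in large language models. In the Transformer architecture, the KV Cache stores the Key and Value tensors already computed for earlier tokens, so each decoding step only computes the Key and Value of the new token instead of recomputing them for the whole sequence.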
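To make the definition concrete, here is a minimal sketch of a single attention head decoding with and without a cache. All names (KVCache, projectK, projectV, decodeStep) are hypothetical, and the projections are simple stand-ins for learned weight matrices; it only illustrates the bookkeeping, not a real model.

```javascript
// Toy single-head attention with a KV cache. Illustrative only.
const counter = { projections: 0 }; // counts Key/Value projections performed

// Stand-ins for the learned projections W_K and W_V (assumed, not real weights).
function projectK(x) { counter.projections++; return x.map((v) => v * 2); }
function projectV(x) { counter.projections++; return x.map((v) => v + 1); }

function dot(a, b) { return a.reduce((s, x, i) => s + x * b[i], 0); }

function softmax(xs) {
  const max = Math.max(...xs);
  const exps = xs.map((x) => Math.exp(x - max));
  const sum = exps.reduce((s, x) => s + x, 0);
  return exps.map((x) => x / sum);
}

// Scaled dot-product attention of one query over all keys and values.
function attend(q, keys, values) {
  const scale = Math.sqrt(q.length);
  const weights = softmax(keys.map((k) => dot(q, k) / scale));
  return values[0].map((_, d) =>
    weights.reduce((s, w, i) => s + w * values[i][d], 0));
}

class KVCache {
  constructor() { this.keys = []; this.values = []; }
  append(k, v) { this.keys.push(k); this.values.push(v); }
}

// With a cache: project only the new token, reuse everything stored so far.
function decodeStep(cache, x) {
  cache.append(projectK(x), projectV(x));
  return attend(x, cache.keys, cache.values); // query = x for simplicity
}

function decodeWithCache(tokens) {
  const cache = new KVCache();
  let out;
  for (const x of tokens) out = decodeStep(cache, x);
  return out;
}

// Without a cache: every step re-projects the entire prefix.
function decodeWithoutCache(tokens) {
  let out;
  for (let t = 0; t < tokens.length; t++) {
    const prefix = tokens.slice(0, t + 1);
    const keys = prefix.map((x) => projectK(x));
    const values = prefix.map((x) => projectV(x));
    out = attend(tokens[t], keys, values);
  }
  return out;
}

const demoTokens = [[1, 0], [0, 1], [1, 1]];
decodeWithCache(demoTokens);
console.log("projections with cache:", counter.projections);    // 6
counter.projections = 0;
decodeWithoutCache(demoTokens);
console.log("projections without cache:", counter.projections); // 12
```

Both paths produce the same attention output; the cached path performs 2n projections for n tokens, while the uncached path performs n(n+1).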

Use in Claude Code

In Claude Code, the cache-chunking module (splitSysPromptPrefix() in constants/systemPromptSections.ts) splits the final System Prompt array into cache-friendly blocks.

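Explicit cache scopes

Marking each block with a scope tells Claude explicitly which parts form the prefix, and can therefore reuse the KV Cache, and which parts should not be cached at all.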

Structure after packing

javascript

[
  { text: "x-anthropic-billing-header: ...",  cacheScope: null },     // attribution header (never cached)
  { text: "You are Claude Code...",           cacheScope: 'org' },    // prefix
  { text: "static content (before boundary)", cacheScope: 'global' }, // globally cached
  { text: "dynamic content (after boundary)", cacheScope: null },     // not cached
]
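A function that produces a structure like the one above could be sketched as follows. This is not the actual splitSysPromptPrefix() implementation: the name splitPromptSketch, the BOUNDARY marker string, and the rules for recognizing each section are all assumptions made for illustration.

```javascript
// Hypothetical sketch only; the real splitSysPromptPrefix() may work differently.
const BOUNDARY = "__DYNAMIC_BOUNDARY__"; // assumed marker between static and dynamic sections

function splitPromptSketch(sections) {
  const blocks = [];
  const rest = [...sections];

  // Assumption: an attribution header, if present, comes first and is never cached.
  if (rest.length > 0 && rest[0].startsWith("x-anthropic-billing-header:")) {
    blocks.push({ text: rest.shift(), cacheScope: null });
  }

  // Assumption: the next section is the fixed identity prefix.
  if (rest.length > 0 && rest[0] !== BOUNDARY) {
    blocks.push({ text: rest.shift(), cacheScope: "org" });
  }

  // Sections before the boundary are static; sections after it are dynamic.
  // With no boundary, conservatively treat everything left as dynamic.
  const boundaryIndex = rest.indexOf(BOUNDARY);
  const staticPart = boundaryIndex === -1 ? [] : rest.slice(0, boundaryIndex);
  const dynamicPart = boundaryIndex === -1 ? rest : rest.slice(boundaryIndex + 1);

  if (staticPart.length > 0) {
    blocks.push({ text: staticPart.join("\n\n"), cacheScope: "global" });
  }
  if (dynamicPart.length > 0) {
    blocks.push({ text: dynamicPart.join("\n\n"), cacheScope: null });
  }
  return blocks;
}

const demoBlocks = splitPromptSketch([
  "x-anthropic-billing-header: ...",
  "You are Claude Code...",
  "tool descriptions",
  "style guidelines",
  BOUNDARY,
  "current working directory: /tmp/project",
]);
console.log(demoBlocks.map((b) => b.cacheScope)); // [ null, 'org', 'global', null ]
```

The key design point is ordering: everything that rarely changes is grouped ahead of the boundary, so per-session details never sit in front of content that could otherwise be reused.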

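Benefits

A stable, explicitly marked prefix makes cache hits more likely, which reduces repeated computation and speeds up inference.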
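The effect of block ordering on hit rate can be illustrated with a toy simulation. PrefixCacheSim is a hypothetical class that matches whole text blocks from the start of the prompt; real caches operate on tokens and differ in detail, so treat this as a sketch of the principle only.

```javascript
// Toy prefix cache keyed by the exact sequence of blocks seen so far. Illustrative only.
class PrefixCacheSim {
  constructor() { this.seen = new Set(); }

  // Returns how many leading blocks matched a previously seen prefix,
  // then records every prefix of this request for future lookups.
  request(blocks) {
    let hits = 0;
    let prefix = "";
    let stillMatching = true;
    for (const block of blocks) {
      prefix += "\u0000" + block; // separator keeps block boundaries unambiguous
      if (stillMatching && this.seen.has(prefix)) {
        hits++;
      } else {
        stillMatching = false; // a miss invalidates everything after it
      }
      this.seen.add(prefix);
    }
    return hits;
  }
}

// Static content first: the second request reuses the static block.
const staticFirst = new PrefixCacheSim();
staticFirst.request(["static instructions", "session A details"]);
console.log(staticFirst.request(["static instructions", "session B details"])); // 1

// Dynamic content first: nothing can be reused, even though the static block is identical.
const dynamicFirst = new PrefixCacheSim();
dynamicFirst.request(["session A details", "static instructions"]);
console.log(dynamicFirst.request(["session B details", "static instructions"])); // 0
```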

Source

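An In-Depth Analysis of the Design and Practice of Prompt, Context, and Harness in Claude Code (original title: 深度解析Claude Code在Prompt_Context_Harness的设计与实践)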