Pattern · Context

Caching · thinking · compute

Anthropic

CACHE

Distinct from semantic caching. Yan's Caching pattern covers GPTCache-style semantic caching, which matches requests by embedding similarity. Generally avoid it in agent loops: the risk of a silent false match (a cached answer served for a query that is only superficially similar) is high. Safe applications are narrow: pre-computed summaries keyed by item ID, or constrained input combinations that a human can verify.
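As a rough illustration of the narrow safe case, the sketch below caches pre-computed summaries by exact item ID rather than by embedding similarity, so a lookup either matches the key exactly or misses. The `SummaryCache` class, the `summarize` callable, and the `prompt_version` key component are hypothetical names introduced for this example, not part of any library.

```python
from typing import Callable, Dict, Tuple


class SummaryCache:
    """Exact-match cache keyed by (item_id, prompt_version).

    No similarity search is involved, so a lookup can never return
    a summary that belongs to a different item.
    """

    def __init__(self, summarize: Callable[[str], str]) -> None:
        # `summarize` is a hypothetical callable wrapping the model call.
        self._summarize = summarize
        self._store: Dict[Tuple[str, str], str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, item_id: str, prompt_version: str = "v1") -> str:
        # Including the prompt version in the key means a changed prompt
        # does not silently reuse summaries produced by the old one.
        key = (item_id, prompt_version)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        summary = self._summarize(item_id)
        self._store[key] = summary
        return summary
```

Because the key is a finite set of IDs, a human can spot-check every cached entry, which is not possible when the key is free text.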

Auditing prompt-cache utilization is one of the highest-ROI quick wins: many teams spend 3–5× more than they need to. Recommend semantic caching only against item IDs, never against free-text queries inside an agent loop.
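One possible starting point for such an audit is to measure what share of input tokens was actually served from the prompt cache. The sketch below assumes each request's usage has been logged as a dict with the `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens` fields reported in the Anthropic Messages API `usage` object; the function name `cache_read_ratio` and the logging format are assumptions made for illustration.

```python
from typing import Iterable, Mapping, Optional


def cache_read_ratio(usages: Iterable[Mapping[str, Optional[int]]]) -> float:
    """Return the fraction of all input tokens that were read from cache.

    Assumes total input per request is the sum of uncached tokens,
    tokens written to the cache, and tokens read from the cache.
    """
    read = written = uncached = 0
    for usage in usages:
        # Missing or None fields are treated as zero.
        read += usage.get("cache_read_input_tokens") or 0
        written += usage.get("cache_creation_input_tokens") or 0
        uncached += usage.get("input_tokens") or 0
    total = read + written + uncached
    return read / total if total else 0.0
```

A ratio near zero on a workload with a long, stable prompt prefix suggests the cache is not being hit and is worth investigating; what counts as a good ratio depends on the workload.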