Semantic Search as Cached Compute

TL;DR

Semantic search can be viewed as cached compute: it pays an indexing cost up front so that a runtime agent can find context with fewer tokens, less time, and lower cost.

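Rogut interprets embeddings and semantic search as cached compute and argues that upfront indexing reduces runtime tokens, time, and money [2].
The original Cursor source reports that providing semantic search improved question-answering accuracy on Cursor Context Bench by 12.5% on average [4].
The original Cursor source describes a design that uses a Merkle tree and a chunk-content embedding cache so that the embedding cost for unchanged chunks is not paid again [5].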
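
The chunk-content cache idea can be illustrated with a minimal sketch. It is not Cursor's implementation: the names `EmbeddingCache`, `chunk_key`, and `toy_embed` are hypothetical, the embedding function is a cheap stand-in for a real model call, and the Merkle tree layer (which would let whole unchanged subtrees be skipped before any per-chunk lookup) is left out. The only assumption is that a chunk's embedding depends solely on its content, so a content hash can serve as the cache key.

```python
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


def chunk_key(text: str) -> str:
    """Cache key derived only from chunk content (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class EmbeddingCache:
    """Illustrative content-addressed cache; not Cursor's actual design."""

    def __init__(self, embed_fn: Callable[[str], Vector]) -> None:
        self._embed_fn = embed_fn
        self._store: Dict[str, Vector] = {}
        self.misses = 0  # number of times the embedding function was called

    def embed(self, chunk: str) -> Vector:
        key = chunk_key(chunk)
        if key not in self._store:
            # Only new or changed content pays the embedding cost.
            self.misses += 1
            self._store[key] = self._embed_fn(chunk)
        return self._store[key]

    def index(self, chunks: List[str]) -> List[Vector]:
        return [self.embed(c) for c in chunks]


def toy_embed(text: str) -> Vector:
    """Stand-in for a real embedding model call (assumption)."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]
```

Indexing `["def a(): pass", "def b(): pass"]` and then re-indexing after only the second chunk changes to `"def b(): return 1"` calls `toy_embed` three times in total rather than four: the unchanged first chunk is served from the cache. That saved call is the "cached compute" in miniature.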