life-echo/api/.agents/skills/redis-development/rules/semantic-cache-langcache-usage.md at 8f66f94a1a89b868aa1c9cd31700113f7cdcd5a1

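Sully 53e0065e3e refactor(api): TOML config SSOT, unified error contract, Auth/transaction hardening, and observability (#33 )

Config SSOT (TOML + .env)
Unified error contract
Auth and transaction boundaries
Redis / Celery reliability: business Redis (DB/0) and the Celery broker/backend (DB/1) are explicitly split; connection pool, sync client
Observability (OpenTelemetry + LGTM)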

2026-05-22 13:44:50 +08:00

2.3 KiB

title	impact	impactDescription	tags	description	alwaysApply
Use LangCache for LLM Response Caching	HIGH	Reduces LLM API costs by 50-90% for similar queries	langcache, llm, semantic-cache, embeddings, ai	Use LangCache for LLM Response Caching	true

Use LangCache for LLM Response Caching

Note: LangCache is currently in preview on Redis Cloud. Features and behavior may change.

LangCache is a fully managed semantic caching service on Redis Cloud that reduces LLM costs and latency by reusing responses to semantically similar prompts.

How it works:

1. Your app sends a prompt to LangCache via POST /v1/caches/{cacheId}/entries/search.
2. LangCache generates an embedding for the prompt and searches for similar cached responses.
3. If a match is found (cache hit), LangCache returns the cached response without calling the LLM.
4. If no match is found (cache miss), your app calls the LLM and stores the response in the cache.
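
The steps above amount to a cache-aside pattern. The sketch below illustrates only that control flow, using a toy in-memory cache. `ToyCache`, `answer`, and the `call_llm` callback are hypothetical names, and `difflib` string similarity stands in for the embedding search that LangCache performs on the server. It is not the LangCache SDK.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple


class ToyCache:
    """Hypothetical in-memory stand-in for a semantic cache (not LangCache)."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, str]] = []  # (prompt, response) pairs

    def search(self, prompt: str, similarity_threshold: float) -> Optional[str]:
        # Assumption: character-level similarity replaces real embeddings.
        best_score, best_response = 0.0, None
        for cached_prompt, cached_response in self._entries:
            score = SequenceMatcher(
                None, prompt.lower(), cached_prompt.lower()
            ).ratio()
            if score > best_score:
                best_score, best_response = score, cached_response
        # Only report a hit when the best match clears the threshold.
        return best_response if best_score >= similarity_threshold else None

    def set(self, prompt: str, response: str) -> None:
        self._entries.append((prompt, response))


def answer(
    prompt: str,
    cache: ToyCache,
    call_llm: Callable[[str], str],
    threshold: float = 0.9,
) -> Tuple[str, bool]:
    """Return (response, was_cache_hit) following the search-then-store flow."""
    cached = cache.search(prompt, threshold)
    if cached is not None:
        return cached, True          # cache hit: skip the LLM call
    response = call_llm(prompt)      # cache miss: call the LLM
    cache.set(prompt, response)      # store for future queries
    return response, False
```

Calling `answer` twice with the same prompt should invoke `call_llm` only once; the second call is served from the cache. A higher threshold trades fewer hits for a lower risk of returning an answer to a different question.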

Correct: Use the LangCache Python SDK.

from langcache import LangCache
import os

lang_cache = LangCache(
    server_url=f"https://{os.getenv('HOST')}",
    cache_id=os.getenv("CACHE_ID"),
    api_key=os.getenv("API_KEY")
)

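# Search for a cached response to a semantically similar prompt
result = lang_cache.search(
    prompt="What is Redis?",
    similarity_threshold=0.9
)

# Assumption: search() returns a response object whose `data` list holds
# the matching entries; verify this against your installed SDK version.
if result.data:
    response = result.data[0].response
else:
    # `llm` is a placeholder for your own LLM client
    response = llm.generate("What is Redis?")
    # Store for future queries
    lang_cache.set(
        prompt="What is Redis?",
        response=response
    )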

LangCache REST API:

# Search cache
curl -X POST "https://$HOST/v1/caches/$CACHE_ID/entries/search" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Redis?"}'

# Store a response
curl -X POST "https://$HOST/v1/caches/$CACHE_ID/entries" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Redis?", "response": "Redis is an in-memory database..."}'
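
For services that call the REST API from Python without the SDK, the two curl calls above can be mirrored with the standard library. This is a minimal sketch: the helper names `build_search_request` and `build_store_request` are hypothetical, and the functions only build the requests (they do not send them), using the endpoints, headers, and body fields shown in the curl examples.

```python
import json
import urllib.request


def _post(url: str, api_key: str, body: dict) -> urllib.request.Request:
    # Shared builder: JSON body plus the Bearer token header from the curl examples.
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_search_request(
    host: str, cache_id: str, api_key: str, prompt: str
) -> urllib.request.Request:
    """Build the cache search request (POST .../entries/search)."""
    url = f"https://{host}/v1/caches/{cache_id}/entries/search"
    return _post(url, api_key, {"prompt": prompt})


def build_store_request(
    host: str, cache_id: str, api_key: str, prompt: str, response: str
) -> urllib.request.Request:
    """Build the request that stores a prompt/response pair (POST .../entries)."""
    url = f"https://{host}/v1/caches/{cache_id}/entries"
    return _post(url, api_key, {"prompt": prompt, "response": response})
```

To send a request, pass it to `urllib.request.urlopen(req)` and decode the JSON body of the reply. Keeping construction separate from sending makes the URL and payload easy to unit test.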

With custom attributes for filtering:

# Store with attributes
lang_cache.set(
    prompt="What is Redis?",
    response="Redis is an in-memory database...",
    attributes={"category": "database", "version": "v1"}
)

# Search with attribute filter
result = lang_cache.search(
    prompt="Tell me about Redis",
    attributes={"category": "database"},
    similarity_threshold=0.9
)
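
To make the effect of an attribute filter concrete, here is a toy sketch of the narrowing step. It assumes that every attribute supplied in a search must match the stored entry exactly; that is an assumption about the service's behavior, so confirm it in the LangCache documentation. `matches_attributes` and `filter_entries` are hypothetical helpers, not SDK functions.

```python
from typing import Dict, List, Optional


def matches_attributes(
    entry_attrs: Dict[str, str], wanted: Optional[Dict[str, str]]
) -> bool:
    """True when every wanted key/value pair is present on the entry."""
    if not wanted:
        return True  # no filter supplied: every entry is a candidate
    # Assumption: exact-match semantics, all supplied pairs required.
    return all(entry_attrs.get(key) == value for key, value in wanted.items())


def filter_entries(
    entries: List[Dict[str, object]], wanted: Optional[Dict[str, str]]
) -> List[Dict[str, object]]:
    """Keep only entries whose attributes satisfy the filter."""
    return [e for e in entries if matches_attributes(e.get("attributes", {}), wanted)]
```

Under this assumption, a search filtered by `{"category": "database"}` would consider an entry stored with `{"category": "database", "version": "v1"}` but skip one stored under a different category, even if its prompt is similar.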

Reference: LangCache Documentation
