| title | impact | impactDescription | tags | description | alwaysApply |
|---|---|---|---|---|---|
| Use LangCache for LLM Response Caching | HIGH | Reduces LLM API costs by 50-90% for similar queries | langcache, llm, semantic-cache, embeddings, ai | Use LangCache for LLM Response Caching | true |
Use LangCache for LLM Response Caching
Note: LangCache is currently in preview on Redis Cloud. Features and behavior may change.
LangCache is a fully managed semantic caching service on Redis Cloud. It reduces LLM costs and latency by returning cached responses for prompts that are semantically similar to earlier ones.
How it works:
- Your app sends a prompt to LangCache via `POST /v1/caches/{cacheId}/entries/search`
- LangCache generates an embedding and searches for similar cached responses
- If found (cache hit), returns the cached response instantly
- If not found (cache miss), your app calls the LLM and stores the response
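The lookup step above can be sketched with a toy, in-memory stand-in. This is not how LangCache is implemented: `ToySemanticCache`, `embed`, and `cosine` are hypothetical names, and the bag-of-words "embedding" is an assumption standing in for the learned embedding model a real service would use. It only illustrates how a similarity threshold separates a hit from a miss.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption, not a real model)."""
    return Counter(text.lower().replace("?", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToySemanticCache:
    """Hypothetical in-memory cache that matches prompts by similarity."""

    def __init__(self):
        self.entries = []  # list of (embedding, response) pairs

    def set(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))

    def search(self, prompt: str, similarity_threshold: float = 0.9):
        query = embed(prompt)
        # Pick the most similar stored prompt, if any exist.
        best = max(self.entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is not None and cosine(query, best[0]) >= similarity_threshold:
            return best[1]  # cache hit
        return None  # cache miss


cache = ToySemanticCache()
cache.set("What is Redis?", "Redis is an in-memory database...")
hit = cache.search("what is redis")          # same words, so it clears the threshold
miss = cache.search("How do I bake bread?")  # no shared words, so it is a miss
```

Raising `similarity_threshold` makes hits rarer but safer; lowering it trades accuracy for a higher hit rate.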
Correct: Use the LangCache Python SDK.
from langcache import LangCache
import os
lang_cache = LangCache(
    server_url=f"https://{os.getenv('HOST')}",
    cache_id=os.getenv("CACHE_ID"),
    api_key=os.getenv("API_KEY")
)

# Search for a cached response
result = lang_cache.search(
    prompt="What is Redis?",
    similarity_threshold=0.9
)

if result:
    # Cache hit: reuse the stored response
    response = result[0]["response"]
else:
    # Cache miss: call the LLM (`llm` stands in for your own client)
    response = llm.generate("What is Redis?")
    # Store for future queries
    lang_cache.set(
        prompt="What is Redis?",
        response=response
    )
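The search-then-store pattern above can be wrapped in a small helper so call sites do not repeat it. The following is a sketch under assumptions: `cached_completion`, `FakeCache`, and `fake_llm` are hypothetical names, the helper accepts any object exposing `search(prompt=..., similarity_threshold=...)` and `set(prompt=..., response=...)`, and it assumes the same result shape as the example above (a list of entries with a `"response"` key). `FakeCache` matches prompts exactly and exists only to exercise the helper without a network call.

```python
from typing import Callable


def cached_completion(cache, generate: Callable[[str], str], prompt: str,
                      similarity_threshold: float = 0.9) -> str:
    """Return a cached response if one is similar enough, else generate and store."""
    result = cache.search(prompt=prompt, similarity_threshold=similarity_threshold)
    if result:
        return result[0]["response"]  # result shape assumed from the example above
    response = generate(prompt)
    cache.set(prompt=prompt, response=response)
    return response


class FakeCache:
    """Hypothetical exact-match stand-in for a cache client (testing only)."""

    def __init__(self):
        self.store = {}

    def search(self, prompt, similarity_threshold=0.9):
        return [{"response": self.store[prompt]}] if prompt in self.store else []

    def set(self, prompt, response):
        self.store[prompt] = response


calls = []


def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM call; records each invocation."""
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = FakeCache()
first = cached_completion(cache, fake_llm, "What is Redis?")   # miss: calls fake_llm
second = cached_completion(cache, fake_llm, "What is Redis?")  # hit: served from cache
```

Passing the cache and the generator as arguments keeps the helper easy to test and independent of any particular LLM client.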
LangCache REST API:
# Search cache
curl -X POST "https://$HOST/v1/caches/$CACHE_ID/entries/search" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Redis?"}'

# Store a response
curl -X POST "https://$HOST/v1/caches/$CACHE_ID/entries" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Redis?", "response": "Redis is an in-memory database..."}'
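For callers who prefer not to shell out to `curl`, the search request above could be built with Python's standard library. This is a minimal sketch: `build_search_request` is a hypothetical helper, the host, cache ID, and key values are placeholders, and only the URL, headers, and `prompt` field shown in the `curl` example are used. The request is constructed but not sent.

```python
import json
import urllib.request


def build_search_request(host: str, cache_id: str, api_key: str,
                         prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    url = f"https://{host}/v1/caches/{cache_id}/entries/search"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; substitute your own host, cache ID, and API key.
req = build_search_request("example.invalid", "my-cache", "secret", "What is Redis?")
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Separating request construction from sending makes the URL and payload easy to inspect or unit test before any network traffic occurs.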
With custom attributes for filtering:
# Store with attributes
lang_cache.set(
    prompt="What is Redis?",
    response="Redis is an in-memory database...",
    attributes={"category": "database", "version": "v1"}
)

# Search with attribute filter
result = lang_cache.search(
    prompt="Tell me about Redis",
    attributes={"category": "database"},
    similarity_threshold=0.9
)
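One way to reason about the attribute filter is as a pre-selection step that narrows the candidate entries before similarity is considered. The sketch below is an assumption about the semantics, not documented LangCache behavior: it treats a filter as matching when every requested key/value pair is present on the stored entry. `attributes_match` and the sample `entries` are hypothetical.

```python
def attributes_match(stored: dict, wanted: dict) -> bool:
    """True when every requested key/value pair appears on the stored entry (assumed semantics)."""
    return all(stored.get(key) == value for key, value in wanted.items())


# Hypothetical stored entries, mirroring the attributes used above.
entries = [
    {"prompt": "What is Redis?",
     "attributes": {"category": "database", "version": "v1"}},
    {"prompt": "What is Kafka?",
     "attributes": {"category": "streaming", "version": "v1"}},
]

# Only entries passing the filter would go on to the similarity comparison.
candidates = [
    entry["prompt"]
    for entry in entries
    if attributes_match(entry["attributes"], {"category": "database"})
]
```

Under this reading, attributes such as `category` or `version` keep responses from one context from being served in another, even when the prompts are similar.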
Reference: LangCache Documentation