---
title: Implement RAG Pattern Correctly
impact: HIGH
impactDescription: Proper RAG implementation improves LLM response quality
tags: vector, rag, llm, embeddings, retrieval
description: Implement RAG Pattern Correctly
alwaysApply: true
---

## Implement RAG Pattern Correctly

Store documents together with their embeddings, retrieve the most relevant ones for each question, and pass them to the LLM as context.

**Correct:** Full RAG pipeline with RedisVL.

```python
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery

# Assumes `index` is an existing SearchIndex, `embed_model.encode()` returns a
# NumPy array, `llm.generate()` takes a prompt string, and `documents` is a
# list of {"content", "source"} dicts.

# 1. Store documents with embeddings
# Embeddings are stored as lists of floats, which suits JSON storage.
records = []
for doc in documents:
    records.append({
        "content": doc["content"],
        "embedding": embed_model.encode(doc["content"]).tolist(),
        "source": doc["source"],
    })

# One batched call instead of one write per document
index.load(records)

# 2. Query with vector similarity
query_embedding = embed_model.encode(user_question).tolist()
results = index.query(VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "source"],
    num_results=5,
))

# 3. Pass context to LLM
context = "\n".join(r["content"] for r in results)
response = llm.generate(f"Context: {context}\n\nQuestion: {user_question}")
```
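
The pipeline above relies on an `index` that already exists. As a rough sketch, the schema behind such an index could be described with a dictionary like the one below. The helper `build_schema`, the index name `docs`, the key prefix `doc`, and the 384-dimension default are all hypothetical, and the HNSW values are illustrative starting points rather than recommendations. The `dims` value has to match the output size of whatever embedding model is used.

```python
def build_schema(dims: int = 384, m: int = 16, ef_construction: int = 200) -> dict:
    """Return a RedisVL-style schema dict (hypothetical names, illustrative values)."""
    return {
        # JSON storage matches records whose embeddings are lists of floats
        "index": {"name": "docs", "prefix": "doc", "storage_type": "json"},
        "fields": [
            {"name": "content", "type": "text"},
            # A tag field allows exact-match filtering on the source
            {"name": "source", "type": "tag"},
            {
                "name": "embedding",
                "type": "vector",
                "attrs": {
                    "dims": dims,  # must equal the embedding model's output size
                    "algorithm": "hnsw",
                    "distance_metric": "cosine",
                    "datatype": "float32",
                    "m": m,  # graph connectivity per node
                    "ef_construction": ef_construction,  # build-time candidate list size
                },
            },
        ],
    }


schema = build_schema()
```

A dictionary of this shape is the kind of input `SearchIndex.from_dict()` is meant to accept before the index is created; the exact attribute names are worth checking against the RedisVL version in use.
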

**Best practices:**

- Match your distance metric to your embedding model; many modern text embeddings already work well with COSINE
- Batch inserts using `index.load()` with lists
- Set appropriate `M` and `EF_CONSTRUCTION` for HNSW based on dataset size
- Use filters to reduce the search space before vector comparison
- Consider chunking long documents for better retrieval
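
The chunking advice can be sketched with a simple word-window splitter. This is one possible approach under stated assumptions, not a prescribed method: the helpers `chunk_text` and `to_records`, the `chunk_id` field, and the default window and overlap sizes are all hypothetical. Each resulting record would still need an embedding before being passed to `index.load()`.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


def to_records(doc: dict, **chunk_kwargs) -> list[dict]:
    """Turn one {"content", "source"} document into per-chunk records (no embeddings yet)."""
    return [
        {"content": chunk, "source": doc["source"], "chunk_id": i}
        for i, chunk in enumerate(chunk_text(doc["content"], **chunk_kwargs))
    ]
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some words twice.
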

Reference: [Redis RAG Quickstart](https://redis.io/docs/latest/develop/get-started/rag/)