api/.agents/skills/redis-development/rules/vector-rag-pattern.md

---
title: Implement RAG Pattern Correctly
impact: HIGH
impactDescription: Proper RAG implementation improves LLM response quality
tags: vector, rag, llm, embeddings, retrieval
description: Implement RAG Pattern Correctly
alwaysApply: true
---

## Implement RAG Pattern Correctly

Store documents alongside their embeddings, retrieve the most relevant ones for each question, and pass them to the LLM as context.

**Correct:** Full RAG pipeline with RedisVL.

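```python
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery

# Assumed to exist already: `index` (a SearchIndex whose schema defines the
# "content", "source", and "embedding" fields), plus `embed_model`, `llm`,
# `documents`, and `user_question`.

# 1. Store documents with embeddings
# Lists of floats suit JSON storage; hash storage expects vectors as bytes.
records = []
for doc in documents:
    records.append({
        "content": doc["content"],
        "embedding": embed_model.encode(doc["content"]).tolist(),
        "source": doc["source"]
    })

index.load(records)  # one batched write for all records

# 2. Query with vector similarity
# Convert the query embedding to a plain list, matching the stored vectors.
query_embedding = embed_model.encode(user_question).tolist()
results = index.query(VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "source"],
    num_results=5
))

# 3. Pass context to LLM
context = "\n".join(r["content"] for r in results)
response = llm.generate(f"Context: {context}\n\nQuestion: {user_question}")
```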
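
The pipeline above uses `index` without showing how it is created. The following is a minimal sketch of one possible schema, not a definitive setup: the index name `rag_docs`, the prefix `doc`, the 384-dimension vector size, the HNSW values, and the `build_index` helper are all assumptions for illustration and should be adjusted to your embedding model and dataset. Passing `redis_url` to `SearchIndex.from_dict` is assumed to be supported by your RedisVL version.

```python
# Hypothetical schema for the RAG index; every name and number here is an
# assumption, not something prescribed by the rule above.
schema = {
    "index": {
        "name": "rag_docs",       # assumed index name
        "prefix": "doc",          # assumed key prefix
        "storage_type": "json",   # JSON storage accepts vectors as float lists
    },
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "source", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": 384,                  # must equal the model's output size
                "algorithm": "hnsw",
                "distance_metric": "cosine",
                "datatype": "float32",
                "m": 16,                      # graph connectivity
                "ef_construction": 200,       # build-time candidate list size
            },
        },
    ],
}


def build_index(redis_url="redis://localhost:6379"):
    """Create the index in Redis (needs redisvl and a reachable server)."""
    # Imported lazily so the schema can be inspected without redisvl installed.
    from redisvl.index import SearchIndex

    index = SearchIndex.from_dict(schema, redis_url=redis_url)
    index.create(overwrite=False)  # keep an existing index and its data
    return index
```

Calling `index = build_index()` once at startup would provide the `index` object the pipeline expects.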

**Best practices:**
- Match your distance metric to your embedding model; many modern text embeddings already work well with COSINE
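- Batch inserts by passing a list of records to `index.load()` instead of writing documents one at a time
- Tune M and EF_CONSTRUCTION for HNSW to your dataset size; higher values generally improve recall at the cost of memory and index build time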
- Use filters to reduce the search space before vector comparison
- Consider chunking long documents for better retrieval
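
The chunking advice above can be sketched as a simple overlapping word-window splitter. This is one possible approach under stated assumptions, not a recommended default: `chunk_text`, `to_records`, the `chunk_id` field, and the window sizes are hypothetical, and real pipelines often split on sentences or tokens instead of words.

```python
def chunk_text(text, max_words=200, overlap=40):
    """Split text into word windows that overlap by `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def to_records(doc, max_words=200, overlap=40):
    """Turn one document into one record per chunk (embedding added later)."""
    return [
        {"content": chunk, "source": doc["source"], "chunk_id": i}
        for i, chunk in enumerate(chunk_text(doc["content"], max_words, overlap))
    ]
```

Each record would then receive its own `embedding` before loading, so a query retrieves the specific passage that matches rather than a whole document.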

Reference: [Redis RAG Quickstart](https://redis.io/docs/latest/develop/get-started/rag/)