Agentic AI Glossary

RAG & Vector Retrieval Terms

Terms and explanations for RAG and vector retrieval from the Agentic AI Glossary.

66 terms in this chapter

Agentic RAG

Definition

A retrieval workflow where an agent plans searches, uses tools, evaluates evidence, and iterates before answering.

Answer Relevance

Definition

How well an answer addresses the user's question or task.

Approximate Nearest Neighbor

Definition

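A search method that trades a small amount of accuracy for speed, returning vectors that are close to the query, though not guaranteed to be the exact nearest, efficiently at scale.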

BM25

Definition

A classic keyword ranking algorithm used in sparse retrieval and hybrid search.
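
As a rough illustration of how BM25 ranks documents, the sketch below scores tokenized documents against a tokenized query. The function name bm25_scores, the sample documents, and the parameter defaults k1=1.5 and b=0.75 are assumptions chosen for the example, and the IDF term is one common smoothed variant rather than the only one in use.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed, non-negative IDF (an assumed variant for this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "vector search finds similar embeddings".split(),
    "keyword search ranks exact term matches".split(),
    "chunking splits documents into passages".split(),
]
scores = bm25_scores("keyword search".split(), docs)
```

The second document matches both query terms and scores highest, the first matches only the more common term "search", and the third matches nothing and scores zero.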

Chunk Overlap

Definition

Repeated text shared between adjacent chunks to preserve context across boundaries.

Chunk Size

Definition

The amount of text placed into each retrieval unit during document processing.

Chunking

Definition

Splitting documents into smaller passages so retrieval can return useful, focused context.
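
The relationship between chunking, chunk size, and chunk overlap can be sketched with a fixed-size character splitter. This is a minimal example under stated assumptions: the function name chunk_text is hypothetical, sizes are measured in characters, and real systems often split on tokens, sentences, or document structure instead.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
# chunks == ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, so content that straddles a boundary still appears whole in at least one chunk.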

Citation Generation

Definition

Producing references or source links that show where an answer was grounded.

Context Compression

Definition

Reducing retrieved content to the most relevant facts before sending it to the model.

Context Precision

Definition

The share of retrieved context that is actually useful for answering the question.

Context Recall

Definition

How much of the needed evidence the retrieval system successfully found.

Context Relevance

Definition

How relevant retrieved context is to the query and final answer.

Contextual Retrieval

Definition

Retrieval that uses surrounding context, metadata, or rewritten queries to improve relevance.

Corrective RAG

Definition

A RAG pattern that detects weak retrieval or unsupported answers and corrects the retrieval or response path.

Cosine Similarity

Definition

A measure of vector similarity based on the cosine of the angle between two vectors, independent of their magnitudes.
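
For illustration, cosine similarity can be computed as the dot product of two vectors divided by the product of their lengths. The helper below is a plain-Python sketch; the function name is an assumption, and production code would typically use a numeric library.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1, 2], [2, 4])   # close to 1.0
orthogonal = cosine_similarity([1, 0], [0, 1])       # 0.0
opposite = cosine_similarity([1, 0], [-1, 0])        # -1.0
```

Scaling a vector does not change the result, which is why [1, 2] and [2, 4] count as maximally similar.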

Cross-Encoder Reranker

Definition

A reranker that jointly reads the query and candidate passage to produce a relevance score.

Document Ingestion

Definition

The process of loading, parsing, cleaning, and preparing source documents for retrieval.

Document Parsing

Definition

Extracting text, tables, metadata, and structure from source files or webpages.

Dot Product Similarity

Definition

A similarity score calculated by multiplying the corresponding components of two vectors and summing the products.
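
A short worked example makes the arithmetic concrete. The function name dot_product is an assumption for this sketch.

```python
def dot_product(a, b):
    """Multiply matching components of a and b, then sum the products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

print(dot_product([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
print(dot_product([2, 4, 6], [4, 5, 6]))  # 64: doubling a vector doubles the score
```

Unlike an angle-based measure, the score grows with vector magnitude, so longer vectors can rank higher even when they point in the same direction.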

Embedding

Definition

A vector representation of data used for semantic search and similarity matching.

Faithfulness

Definition

The degree to which an answer is supported by the provided context.

GraphRAG

Definition

A RAG approach that uses graph relationships among entities, documents, and facts to improve reasoning.

Grounded Response

Definition

An answer supported by retrieved evidence, source-of-truth data, or citations.

Hallucination Detection

Definition

Methods for identifying unsupported or fabricated model claims.

Hybrid Search

Definition

Combining keyword and vector retrieval to improve recall and relevance.
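
One common way to combine the two result lists is reciprocal rank fusion (RRF), which is an assumption here since the glossary does not name a fusion method. In the sketch below, each document earns 1 / (k + rank) from every list it appears in; the function name, the document IDs, and the constant k=60 are illustrative choices.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs; each list is ordered best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["d1", "d2", "d3"]  # hypothetical keyword search results
vector_hits = ["d3", "d1", "d4"]   # hypothetical vector search results
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
# fused == ['d1', 'd3', 'd2', 'd4']
```

Documents found by both retrievers (d1 and d3) rise above those found by only one, and no score calibration between the two systems is needed because only ranks are used.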

Keyword Search

Definition

Search based on exact words, phrases, or lexical matching.

Long-Context RAG

Definition

A retrieval approach that combines large context windows with selected external evidence.

Metadata Filtering

Definition

Restricting retrieval by document attributes such as source, date, owner, product, or permission.
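
A minimal sketch of the idea, assuming each chunk is a dictionary with a "metadata" field (a hypothetical layout, not a specific product's schema): candidates are narrowed by attribute before any similarity scoring takes place.

```python
def filter_by_metadata(chunks, **required):
    """Keep chunks whose metadata matches every required key/value pair."""
    return [
        c for c in chunks
        if all(c["metadata"].get(key) == value for key, value in required.items())
    ]

# Hypothetical indexed chunks.
chunks = [
    {"text": "Refund policy", "metadata": {"source": "handbook", "product": "billing"}},
    {"text": "API limits", "metadata": {"source": "docs", "product": "platform"}},
    {"text": "Invoice format", "metadata": {"source": "docs", "product": "billing"}},
]
billing_docs = filter_by_metadata(chunks, source="docs", product="billing")
```

Only the "Invoice format" chunk satisfies both conditions, so it is the only candidate passed on to ranking.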

Multi-Query Retrieval

Definition

Retrieving with several query variations to improve coverage of relevant context.

Query Expansion

Definition

Adding related terms or variants to a query to improve retrieval coverage.

Query Rewriting

Definition

Transforming a user question into a better search query or multiple focused queries.

RAG Pipeline

Definition

The end-to-end flow for ingestion, chunking, embedding, retrieval, reranking, and answer generation.

Reranking

Definition

Reordering retrieved results using a stronger relevance model after initial search.

Retrieval-Augmented Generation

Definition

A pattern that retrieves relevant external knowledge before generation so responses can be grounded in sources.

Retrieval Evaluation

Definition

Measuring whether the retrieval system returns useful and complete context.

Semantic Search

Definition

Search based on meaning rather than exact keyword matching.

Similarity Search

Definition

Finding items closest to a query representation in a vector or feature space.

Vector

Definition

An ordered list of numbers that represents meaning, features, or model state.

Vector Database

Definition

A database optimized for storing embeddings and retrieving semantically similar content.

ANN Index

Definition

An approximate-nearest-neighbor index that finds similar vectors quickly without comparing every vector exactly.

Chroma

Definition

An open-source vector database commonly used for local RAG prototypes and lightweight embedding search.

Dense Retrieval

Definition

Retrieval that encodes queries and documents as dense embedding vectors and ranks documents by vector similarity rather than exact term overlap.

Elasticsearch

Definition

A distributed search engine used for keyword search, logs, analytics, and hybrid retrieval systems.

Embedding Drift

Definition

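A change over time in the embeddings a system produces or stores, for example after the embedding model is updated or the source data shifts, which can make new vectors inconsistent with those already indexed and degrade retrieval quality.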

Embedding Space

Definition

The vector space in which embeddings live, where the distance or angle between points reflects how similar the underlying items are.

FAISS

Definition

A library from Meta for efficient vector similarity search and clustering.

HNSW

Definition

Hierarchical Navigable Small World, a graph-based index commonly used for fast vector search.

Hybrid Retrieval

Definition

Retrieval that combines sparse keyword matching with dense vector search and merges the two result sets into a single ranking.

IVF

Definition

Inverted file indexing, a vector search method that partitions embeddings into clusters for faster lookup.
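
The partition-then-probe idea can be sketched with a toy index. Everything named here is an assumption for illustration: the helper names, the two hand-picked centroids (normally learned, for example with k-means), and the nprobe parameter that sets how many clusters are scanned per query.

```python
import math

def nearest(point, candidates):
    """Return candidate indices sorted by Euclidean distance to point."""
    return sorted(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))

def build_ivf(vectors, centroids):
    """Assign each vector index to the inverted list of its closest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for idx, vec in enumerate(vectors):
        lists[nearest(vec, centroids)[0]].append(idx)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe=1, top_k=1):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    candidates = [idx for c in nearest(query, centroids)[:nprobe] for idx in lists[c]]
    candidates.sort(key=lambda idx: math.dist(query, vectors[idx]))
    return candidates[:top_k]

vectors = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
centroids = [(0.0, 0.0), (5.0, 5.0)]  # hypothetical, hand-picked for the example
lists = build_ivf(vectors, centroids)
result = ivf_search((4.9, 5.0), vectors, centroids, lists, nprobe=1, top_k=1)
# result == [2]
```

With nprobe=1 only two of the four vectors are compared against the query. Raising nprobe scans more clusters, which costs time but reduces the chance of missing a true neighbor stored in an adjacent cluster.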

Metadata Store

Definition

A storage system for document metadata, such as source, date, owner, or permissions, that an AI application can save, query, and use to filter retrieval.

Milvus

Definition

An open-source vector database for storing, indexing, and searching large-scale embedding collections.

MMR

Definition

Maximal Marginal Relevance, a retrieval method that balances relevance with diversity in selected results.
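
The relevance and diversity trade-off can be sketched as a greedy loop: at each step, pick the candidate with the best weighted difference between its similarity to the query and its similarity to anything already selected. The function names, the toy vectors, and the weight lambda_ are assumptions for this example.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mmr(query_vec, doc_vecs, k, lambda_=0.5):
    """Greedily pick k document indices, trading relevance against redundancy."""
    selected = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

query = (1.0, 0.0)
docs = [(1.0, 0.1), (1.0, 0.12), (0.6, 0.8)]  # docs 0 and 1 are near-duplicates
diverse = mmr(query, docs, k=2, lambda_=0.3)        # [0, 2]
relevance_only = mmr(query, docs, k=2, lambda_=1.0) # [0, 1]
```

With lambda_=1.0 the method reduces to plain relevance ranking and returns both near-duplicates; a lower lambda_ penalizes the duplicate and picks the less similar third document instead.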

OpenSearch

Definition

An open-source search and analytics engine used for logs, keyword search, and hybrid retrieval.

pgvector

Definition

A PostgreSQL extension that stores embeddings and supports vector similarity search inside Postgres.

Pinecone

Definition

A managed vector database service used to store and search embeddings at scale.

PQ

Definition

Product quantization, a compression method that reduces vector storage size for faster large-scale search.

Precision@k

Definition

The fraction of the top k retrieved results that are relevant to the query.
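
Under the usual definition, precision@k is the number of relevant items among the top k results divided by k. A minimal sketch, with a hypothetical function name and made-up document IDs:

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k retrieved IDs that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k

retrieved = ["a", "b", "c", "d"]  # ranked results, best first
relevant = {"a", "c", "e"}        # ground-truth relevant IDs
p_at_3 = precision_at_k(retrieved, relevant, k=3)  # 2 of the top 3 are relevant
```

The metric says nothing about relevant items that were never retrieved ("e" in this example), which is why it is usually reported alongside a recall measure.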

Qdrant

Definition

A vector database focused on similarity search, filtering, and production-ready embedding retrieval.

Recall@k

Definition

The fraction of all relevant items for a query that appear in the top k retrieved results.
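
Under the usual definition, recall@k divides the number of relevant items found in the top k results by the total number of relevant items. The function name and sample IDs below are assumptions for illustration.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant IDs that appear in the top k retrieved IDs."""
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant items")
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

retrieved = ["a", "b", "c", "d"]  # ranked results, best first
relevant = {"a", "c", "e"}        # ground-truth relevant IDs
r_at_2 = recall_at_k(retrieved, relevant, k=2)  # only "a" found: 1 of 3
r_at_4 = recall_at_k(retrieved, relevant, k=4)  # "a" and "c" found: 2 of 3
```

Recall can only stay flat or rise as k grows, so it is typically read together with a precision measure at the same k.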

Semantic Cache

Definition

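A cache that stores earlier queries and their results keyed by embedding, so a new request that is semantically similar to a cached one can reuse the stored result. It reduces repeated work, lowers cost, and improves speed when similar requests appear again.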
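
A minimal sketch of such a cache, assuming the caller supplies query embeddings as numeric tuples and that cosine similarity with a fixed cutoff decides what counts as "similar". The class name, the linear scan, and the 0.95 cutoff are illustrative choices, not a reference design.

```python
import math

class SemanticCache:
    """Reuse a stored result when a new query embedding is close to a cached one."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, result) pairs

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    def get(self, embedding):
        # Linear scan for the most similar cached embedding.
        best = max(self.entries, key=lambda e: self._cosine(embedding, e[0]), default=None)
        if best is not None and self._cosine(embedding, best[0]) >= self.threshold:
            return best[1]
        return None  # cache miss: the caller computes the result and calls put()

    def put(self, embedding, result):
        self.entries.append((embedding, result))

cache = SemanticCache(threshold=0.95)
cache.put((1.0, 0.0), "answer about refunds")
hit = cache.get((0.99, 0.05))  # near-duplicate query embedding
miss = cache.get((0.0, 1.0))   # unrelated query embedding
```

The first lookup returns the stored answer because its embedding is nearly parallel to the cached one; the second returns None. A cutoff set too low risks serving an answer to a different question, so the threshold is the main tuning knob.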

Similarity Threshold

Definition

A minimum similarity score, or maximum distance, that a result must meet to be returned or treated as a match.
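
In practice a threshold is often applied as a simple filter over scored results. The sketch below assumes higher scores mean greater similarity; the function name, document IDs, and the cutoff of 0.7 are made up for the example.

```python
def apply_threshold(scored_results, threshold):
    """Keep (item, score) pairs whose similarity score is at least threshold."""
    return [(item, score) for item, score in scored_results if score >= threshold]

results = [("doc-a", 0.91), ("doc-b", 0.74), ("doc-c", 0.42)]
kept = apply_threshold(results, threshold=0.7)
# kept == [('doc-a', 0.91), ('doc-b', 0.74)]
```

A suitable cutoff depends on the embedding model and the similarity measure, so it is usually chosen by inspecting scores on known good and bad matches.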

Sparse Retrieval

Definition

Retrieval that represents queries and documents as sparse term-weight vectors and ranks by lexical overlap, as in TF-IDF or BM25.

Top-k Retrieval

Definition

Retrieval that returns the k highest-scoring results for a query, where k is a configurable limit.
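
Selecting the k best-scoring items does not require a full sort. A small sketch using the standard library's heapq module, with a hypothetical function name and made-up scores:

```python
import heapq

def top_k(scores, k):
    """Return the k (doc_id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])

scores = {"doc-a": 0.42, "doc-b": 0.91, "doc-c": 0.74, "doc-d": 0.15}
best = top_k(scores, k=2)
# best == [('doc-b', 0.91), ('doc-c', 0.74)]
```

A larger k tends to improve recall but sends more, and often less relevant, context to later stages.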

Vector Index

Definition

An index structure for vectors that speeds up lookup, retrieval, or similarity matching.

Vector Store

Definition

A storage system for vectors that an AI application can write to and query during execution.

Weaviate

Definition

A vector database that combines semantic search, metadata filtering, and knowledge-oriented data modeling.
