AI Failure Dictionary

RAG & LLM Generative AI Failures

Terms and explanations for RAG and LLM generative AI failures, from the AI Failure Dictionary.

81 terms in this chapter
01

Tokenization Error

Definition

Text is split into tokens incorrectly, damaging meaning.

Solution

Use the right tokenizer and test domain-specific terms, names, code, and multilingual text.
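One way to test this is a quick fragmentation check. The sketch below uses a minimal greedy longest-match subword tokenizer; the vocabulary and domain terms are illustrative assumptions, not taken from any real model. A term missing from the vocabulary shatters into single characters, which is exactly the damage to watch for.

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization with a character fallback."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest match first
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to one char
            i += 1
    return tokens

# Illustrative vocabulary: covers "immunotherapy" but not "nephrology".
vocab = {"immuno", "therapy", "ology"}
print(tokenize("immunotherapy", vocab))  # ['immuno', 'therapy']
print(tokenize("nephrology", vocab))     # ['n', 'e', 'p', 'h', 'r', 'ology']
```

Running domain-specific terms through the real tokenizer and counting how badly they fragment is a cheap regression test.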

02

Out-of-Vocabulary Failure

Definition

The model struggles with rare words, names, slang, or new terms.

Solution

Use subword tokenization, domain data, updated embeddings, or retrieval.

03

Named Entity Recognition Failure

Definition

The model misses or mislabels names, dates, places, organizations, or products.

Solution

Add better labeled examples and evaluate entity-level precision and recall.

04

Entity Linking Failure

Definition

The model identifies an entity but links it to the wrong real-world concept.

Solution

Use better knowledge bases, disambiguation logic, and context-aware linking.

05

Coreference Failure

Definition

The model misunderstands references such as "he," "she," "they," or "it."

Solution

Use better context modeling and coreference-specific training or evaluation data.

06

Negation Failure

Definition

The model misses words like "not," "never," or "without."

Solution

Add negation-heavy examples and rule-based checks for high-risk tasks.

07

Sarcasm Failure

Definition

The model interprets sarcastic text literally.

Solution

Use domain-specific examples and human review for sensitive decisions.

08

Ambiguity Failure

Definition

The model chooses the wrong meaning for a word or phrase.

Solution

Use more context or ask a clarifying question when the meaning is unclear.

09

Context Misunderstanding

Definition

The model ignores important surrounding text.

Solution

Use better prompts, retrieval, context windows, and targeted evaluation.

10

Sentiment Misclassification

Definition

The model assigns the wrong sentiment.

Solution

Use balanced labeled data, domain-specific evaluation, and error analysis.

11

Intent Classification Failure

Definition

The model misunderstands what the user wants.

Solution

Clarify intent labels and train on real user examples.

12

Slot Filling Failure

Definition

The system extracts the wrong values from user input.

Solution

Improve annotation rules, examples, schema validation, and extraction checks.

13

Translation Error

Definition

The model mistranslates text or loses meaning.

Solution

Use domain-tuned translation and human review for high-risk content.

14

Summarization Hallucination

Definition

A summary includes details not found in the source text.

Solution

Use grounding checks, citation requirements, and source-faithfulness evaluation.

15

Faithfulness Failure

Definition

Generated text does not accurately reflect the input.

Solution

Use extractive checks, citations, factual consistency metrics, and verification prompts.

16

Toxicity Detection Failure

Definition

Harmful content is missed or safe content is incorrectly flagged.

Solution

Use balanced safety datasets, threshold tuning, and human review for edge cases.

17

Language Drift

Definition

The model performs worse on dialects, multilingual text, slang, or new terms.

Solution

Continuously evaluate language coverage and update domain data.

18

Prompt Sensitivity

Definition

Small wording changes cause large output differences.

Solution

Test prompts across many examples and use robust templates.

19

Context Window Overflow

Definition

The text is too long for the model to process fully.

Solution

Use chunking, summarization, retrieval, or hierarchical processing.

20

Long-Context Failure

Definition

The model has access to long text but fails to use the right part.

Solution

Use retrieval, section ranking, summaries, and long-context evaluation.

21

Keyword Overmatching

Definition

The system focuses on matching words instead of meaning.

Solution

Use semantic search, embeddings, hybrid retrieval, and relevance evaluation.
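Hybrid retrieval typically merges a keyword ranking (e.g. BM25) with an embedding ranking. One common way to combine them is reciprocal rank fusion (RRF); the sketch below is a minimal version, with illustrative document IDs and the conventional smoothing constant of 60.

```python
def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranked list contributes 1/(k + rank + 1)
    per document; documents ranked highly in several lists rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["a", "b"]        # illustrative lexical ranking
semantic_hits = ["b", "c"]       # illustrative embedding ranking
rerun_hits = ["b", "a"]          # e.g. a rewritten-query ranking
print(rrf([keyword_hits, semantic_hits, rerun_hits]))  # ['b', 'a', 'c']
```

Because RRF uses only ranks, it needs no score normalization across the two very different retrievers.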

22

Semantic Misclassification

Definition

Text is categorized incorrectly because the model misses the real meaning.

Solution

Improve labels, add representative examples, and evaluate semantic edge cases.

23

Entity Boundary Error

Definition

The model selects too much or too little text as an entity.

Solution

Improve annotation guidelines and use entity-span evaluation.

24

Topic Drift

Definition

Generated text slowly moves away from the original subject.

Solution

Use stronger prompts, section constraints, and validation checks.

25

Retrieval Failure

Definition

The system fails to retrieve useful documents.

Solution

Improve indexing, embeddings, query rewriting, search strategy, and retrieval evaluation.

26

Low Recall

Definition

The retriever misses important relevant documents.

Solution

Increase top-k, improve chunking, use hybrid search, and expand queries carefully.

27

Low Precision

Definition

The retriever returns too many irrelevant documents.

Solution

Use reranking, metadata filters, better embeddings, and relevance thresholds.

28

Bad Chunking

Definition

Documents are split in a way that loses meaning.

Solution

Use semantic chunking, overlap, and document-aware splitting.
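A minimal sketch of overlapping chunking, using word windows for simplicity (production splitters usually work on tokens and respect sentence or heading boundaries). The window and overlap sizes here are illustrative.

```python
def chunk_text(text, max_words=120, overlap=30):
    """Split text into overlapping word windows so content cut at one chunk's
    edge still appears intact near the start of the next chunk."""
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

doc = " ".join(f"w{i}" for i in range(10))
for c in chunk_text(doc, max_words=4, overlap=2):
    print(c)  # consecutive chunks share their boundary words
```

The overlap trades some index size for a lower chance that an answer straddles a chunk boundary.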

29

Chunk Boundary Failure

Definition

The needed answer is split across chunks and not retrieved together.

Solution

Use chunk overlap, parent-child retrieval, larger chunks, or hierarchical retrieval.

30

Context Pollution

Definition

Irrelevant retrieved text distracts the model.

Solution

Use stricter retrieval filters, reranking, and context pruning.

31

Context Stuffing

Definition

Too much retrieved content overwhelms the model.

Solution

Select only the most relevant evidence and summarize where appropriate.

32

Corpus Gap

Definition

The answer is not present in the knowledge base.

Solution

Update the corpus or return a clear "not found" response.
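The "not found" path can be as simple as a score threshold on the retrieved hits; the threshold value below is an illustrative assumption that would need tuning against labeled queries.

```python
def answer_or_abstain(scored_hits, min_score=0.75):
    """Return the best-scoring document only if retrieval confidence clears
    a threshold; otherwise return None so the caller can say 'not found'
    instead of letting the model guess."""
    if not scored_hits:
        return None
    doc, score = max(scored_hits, key=lambda hit: hit[1])
    return doc if score >= min_score else None

print(answer_or_abstain([("refund-policy.md", 0.91), ("faq.md", 0.40)]))  # refund-policy.md
print(answer_or_abstain([("faq.md", 0.40)]))                              # None
```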

33

Index Staleness

Definition

The vector or search index is outdated.

Solution

Schedule reindexing and monitor document freshness.

34

Document Drift

Definition

Indexed documents become outdated compared with the real world.

Solution

Track document versions, owners, expiration dates, and refresh cycles.

35

Wrong Document Version

Definition

The system retrieves an old or incorrect version of a document.

Solution

Use version-aware metadata filters and deprecate outdated content.

36

Duplicate Retrieval

Definition

The retriever returns repeated or near-identical chunks.

Solution

Apply deduplication and diversity-aware retrieval.
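Near-duplicate chunks can be filtered before they reach the prompt. The sketch below uses word-level Jaccard similarity as a cheap proxy; the 0.9 threshold is an illustrative assumption (embedding-based similarity is a common alternative).

```python
def jaccard(a, b):
    """Word-overlap similarity between two text chunks, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def dedupe(chunks, threshold=0.9):
    """Keep a chunk only if it is not near-identical to one already kept."""
    kept = []
    for chunk in chunks:
        if all(jaccard(chunk, k) < threshold for k in kept):
            kept.append(chunk)
    return kept

hits = ["the cat sat", "the cat sat", "dogs bark loudly"]
print(dedupe(hits))  # ['the cat sat', 'dogs bark loudly']
```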

37

Embedding Mismatch

Definition

The query and documents are embedded in a way that fails to capture meaning.

Solution

Use stronger or domain-specific embedding models and evaluate retrieval quality.

38

Semantic Drift

Definition

Retrieved results are semantically related but not actually useful.

Solution

Use reranking, relevance labels, and task-specific retrieval evaluation.

39

Query Understanding Failure

Definition

The retriever misunderstands the user's search intent.

Solution

Use intent detection, query rewriting, and clarification for ambiguous queries.

40

Query Rewrite Failure

Definition

The system rewrites the user query incorrectly.

Solution

Evaluate rewrite quality and keep original query signals available.

41

Query Expansion Failure

Definition

Expanded terms move retrieval away from the true intent.

Solution

Limit expansion and validate expanded queries against relevance metrics.

42

Metadata Filter Failure

Definition

Incorrect metadata filters exclude relevant documents.

Solution

Validate metadata quality and test filter logic.

43

Missing Metadata

Definition

Important metadata such as date, author, version, or product is unavailable.

Solution

Enrich documents during ingestion and enforce metadata requirements.

44

Top-K Failure

Definition

The correct document exists but is not included in the selected top results.

Solution

Tune top-k, retrieval scoring, reranking, and hybrid search.

45

Reranking Failure

Definition

The reranker fails to move the best evidence to the top.

Solution

Train or select stronger rerankers and evaluate with labeled queries.

46

Grounding Gap

Definition

Relevant documents are retrieved, but the model does not use them correctly.

Solution

Use answer-evidence prompts and verification checks.

47

Grounding Failure

Definition

The final answer is not supported by retrieved context.

Solution

Require citations, refuse unsupported claims, and validate claim-source alignment.

48

Citation Hallucination

Definition

The model cites a source that does not support the answer.

Solution

Check each claim against its cited source before final output.
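A crude but useful first gate is lexical: does the cited passage actually contain most of the claim's words? The sketch below uses word overlap with an assumed 0.6 threshold; real systems typically follow up with an entailment or NLI model, since lexical overlap alone misses paraphrase.

```python
def supported(claim, source, min_overlap=0.6):
    """Flag a claim as unsupported if too few of its words appear in the
    cited source. A cheap lexical proxy, not a full entailment check."""
    claim_words = set(claim.lower().split())
    source_words = set(source.lower().split())
    if not claim_words:
        return False
    return len(claim_words & source_words) / len(claim_words) >= min_overlap

print(supported("the sky is blue", "reports say the sky is blue today"))   # True
print(supported("revenue grew forty percent", "the team hired new staff")) # False
```

Claims that fail the gate can be dropped, rewritten, or routed to a stronger verifier before the answer is shown.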

49

Source Attribution Failure

Definition

The model gives an answer without showing where it came from.

Solution

Require traceable source references for factual claims.

50

Attribution Failure

Definition

The answer cannot be traced back to reliable evidence.

Solution

Add source linking, evidence snippets, and audit logs.

51

Answer Synthesis Failure

Definition

The model retrieves the right evidence but combines it incorrectly.

Solution

Use structured synthesis prompts, chain verification, and contradiction checks.

52

Multi-Hop Retrieval Failure

Definition

The system fails when the answer requires multiple documents or reasoning steps.

Solution

Use iterative retrieval, graph retrieval, or query decomposition.

53

Retrieval Latency

Definition

Search or retrieval takes too long.

Solution

Optimize indexes, cache results, reduce candidate sets, and tune infrastructure.
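Caching repeated queries is often the cheapest win. A minimal sketch using the standard library's `functools.lru_cache`; `expensive_search` is a stand-in stub for a slow vector-store lookup.

```python
from functools import lru_cache

CALLS = {"count": 0}

def expensive_search(query):
    """Stub for a slow vector-store lookup (placeholder for illustration)."""
    CALLS["count"] += 1
    return [f"doc-for-{query}"]

@lru_cache(maxsize=1024)
def retrieve(query: str):
    # Identical queries are served from the cache, skipping the slow search.
    # Results are returned as tuples so the cached value is immutable.
    return tuple(expensive_search(query))

retrieve("refund policy")
retrieve("refund policy")  # cache hit: expensive_search runs only once
print(CALLS["count"])      # 1
```

Cache keys must be hashable and normalized (case, whitespace), or trivially different spellings of the same query will miss.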

54

Vector Search Failure

Definition

Similarity search fails to find the most useful content.

Solution

Use hybrid search, better embeddings, metadata filters, and reranking.

55

Hallucination

Definition

The model generates false or unsupported information.

Solution

Use grounding, retrieval, citations, uncertainty handling, and factuality evaluation.

56

Fabrication

Definition

The model invents facts, numbers, sources, citations, or events.

Solution

Require evidence and allow the model to say "I do not know" when information is missing.

57

False Confidence

Definition

The model sounds certain even when it is wrong.

Solution

Calibrate responses and require source-backed claims for factual answers.

58

Contradiction

Definition

The model gives answers that conflict with itself or known facts.

Solution

Use consistency checks, better context management, and verification prompts.

59

Ambiguous Output

Definition

The model gives an unclear answer that can be interpreted multiple ways.

Solution

Ask clarifying questions or enforce structured output.

60

Instruction Misfollowing

Definition

The model does not follow the user or system instruction.

Solution

Use clearer prompts, examples, schemas, and output validators.

61

Instruction Conflict

Definition

The model receives competing instructions and follows the wrong one.

Solution

Define instruction priority and remove contradictions.

62

Instruction Misinterpretation

Definition

The model misunderstands what the user actually asked.

Solution

Add task clarification, examples, and intent checks.

63

Context Loss

Definition

The model forgets or ignores important information from earlier context.

Solution

Use conversation summaries, memory, retrieval, and better context selection.

64

Answer Drift

Definition

The response slowly moves away from the original question or task.

Solution

Use tighter prompts, checkpoints, and validation against the user request.

65

Repetition Loop

Definition

The model repeats the same phrase, idea, or pattern.

Solution

Use decoding controls, repetition penalties, and response validation.
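One standard decoding control is a repetition penalty applied to the logits of tokens that have already been generated (in the style popularized by CTRL); the penalty value below is an illustrative assumption.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Discourage already-generated tokens: divide positive logits by the
    penalty and multiply negative logits by it, lowering both."""
    out = list(logits)
    for token_id in set(generated_ids):
        if out[token_id] > 0:
            out[token_id] /= penalty
        else:
            out[token_id] *= penalty
    return out

logits = [2.0, 1.0, 0.5]
print(apply_repetition_penalty(logits, generated_ids=[0]))  # token 0 is damped
```

Penalties above ~1.3 tend to degrade fluency, so the value is usually tuned per task.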

66

Mode Collapse

Definition

The model gives repetitive, generic, or overly similar answers.

Solution

Improve prompting, sampling settings, examples, and output diversity checks.

67

Verbosity Failure

Definition

The model gives too much detail when a concise answer is needed.

Solution

Specify length, audience, and format constraints.

68

Under-Answering

Definition

The model gives an incomplete answer.

Solution

Use coverage rubrics, checklists, and completeness validation.

69

Over-Answering

Definition

The model adds unnecessary or unsupported information.

Solution

Limit scope and require evidence for added claims.

70

Role Confusion

Definition

The model misunderstands which role or persona it should follow.

Solution

Use clear role instructions and periodic role reminders.

71

Format Failure

Definition

The output does not follow the requested format.

Solution

Use schemas, examples, structured output, and validators.
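A lightweight validator can gate the model's output before anything downstream consumes it. The sketch below checks JSON output against a tiny hand-rolled schema (the `answer`/`sources` fields are illustrative assumptions); returning `None` lets the caller retry or repair rather than crash.

```python
import json

REQUIRED = {"answer": str, "sources": list}  # illustrative schema

def validate_output(raw: str):
    """Parse model output and check required fields and types; return the
    parsed dict on success, or None so the caller can retry."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    for key, expected_type in REQUIRED.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data

print(validate_output('{"answer": "42", "sources": []}'))  # parsed dict
print(validate_output("not json at all"))                  # None
```

For richer schemas, a dedicated validator library is the usual choice; the point is that format checking happens in code, not in the prompt alone.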

72

Refusal Failure

Definition

The model refuses a safe request or answers an unsafe request.

Solution

Improve safety policy interpretation and refusal evaluation.

73

Safety Failure

Definition

The model produces harmful, unsafe, or policy-violating content.

Solution

Use safety filters, policy checks, red-team testing, and human review for high-risk cases.

74

Toxic Output

Definition

The model generates offensive, abusive, hateful, or unsafe language.

Solution

Use toxicity filtering, safer training data, and moderation policies.

75

Speculative Answering

Definition

The model guesses instead of saying it does not know.

Solution

Instruct the model to express uncertainty and ask for missing information.

76

Capability Overclaiming

Definition

The model claims it can do something it cannot actually do.

Solution

Clearly define available tools, limits, and system capabilities.

77

Unfaithful Reasoning

Definition

The explanation does not match the real basis for the answer.

Solution

Use evidence-based explanations and separate hidden reasoning from user-facing justification.

78

Non-Determinism

Definition

The model produces different answers for the same input.

Solution

Lower temperature, use deterministic settings where possible, and validate outputs.
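The mechanics can be sketched at the decoding step: temperature 0 means greedy argmax (fully deterministic), while higher temperatures flatten the distribution before sampling. A fixed seed makes the sampled path repeatable; the logits here are illustrative.

```python
import math
import random

def sample_token(logits, temperature=1.0, seed=None):
    """Greedy argmax when temperature is 0; otherwise softmax sampling over
    temperature-scaled logits, reproducible when a seed is given."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = random.Random(seed)
    scaled = [l / temperature for l in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]  # shift for numerical stability
    total = sum(exps)
    r = rng.random()
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return len(logits) - 1

logits = [1.0, 5.0, 2.0]
print(sample_token(logits, temperature=0))  # 1 (argmax, same every run)
```

Even at temperature 0, real deployments can still vary across hardware or batching, so output validation remains worthwhile.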

79

Tool-Use Hallucination

Definition

The model claims it used a tool or source when it did not.

Solution

Separate tool execution from response generation and log tool calls.

80

Emergent Misbehavior

Definition

Unexpected harmful or incorrect behavior appears at scale.

Solution

Use staged rollout, monitoring, red teaming, and incident response.

81

Alignment Failure

Definition

The AI system behaves in a way that does not match human goals, rules, or expectations.

Solution

Improve instruction tuning, policy design, evaluation, and guardrails.
