Prompt Injection Attack
Definition
Malicious input tries to override system behavior.
Solution
Use instruction isolation, content filtering, permission controls, and output validation.
Security, Privacy & Governance Failures terms and explanations from the AI Failure Dictionary.
Definition
Malicious input tries to override system behavior.
Solution
Use instruction isolation, content filtering, permission controls, and output validation.
Definition
A prompt bypasses model safety restrictions.
Solution
Use stronger safety policies, red-team testing, output moderation, and layered guardrails.
Definition
Malicious or bad data is inserted into the system.
Solution
Use trusted sources, anomaly detection, review workflows, and secure ingestion.
Definition
Harmful examples are inserted during training.
Solution
Validate datasets, verify sources, and audit suspicious samples.
Definition
Malicious content is added to the RAG knowledge base and later retrieved.
Solution
Use document trust scoring, ingestion approval, and source validation.
Definition
Attackers try to copy the model through repeated queries.
Solution
Use rate limits, monitoring, access control, and abuse detection.
Definition
Attackers try to recover private training data from outputs.
Solution
Use privacy-preserving training, regularization, access limits, and output filtering.
Definition
Attackers try to determine whether a record was in training data.
Solution
Use differential privacy, regularization, and restricted access.
Definition
Sensitive data is extracted through the model or connected tools.
Solution
Use data access controls, output inspection, and least-privilege tool permissions.
Definition
Personal information is exposed in outputs, logs, prompts, or datasets.
Solution
Use redaction, masking, privacy filters, and safe logging practices.
Definition
API keys, credentials, tokens, or internal details are exposed.
Solution
Use secret scanning, vaults, key rotation, and strict logging rules.
Definition
The AI system has more access than needed.
Solution
Apply least-privilege access and scope permissions tightly.
Definition
An agent tool has broader access than necessary.
Solution
Narrow tool permissions and require approval for risky actions.
Definition
The AI performs an action without proper approval.
Solution
Use confirmation steps, authorization checks, and audit logging.
Definition
An agent can call tools that modify systems without guardrails.
Solution
Use sandboxing, approvals, permission scopes, and audit logs.
Definition
Models, datasets, packages, or tools introduce security vulnerabilities.
Solution
Use dependency scanning, trusted sources, model provenance, and signed artifacts.
Definition
Users or systems access data or functions they should not.
Solution
Use authentication, authorization, policy checks, and access reviews.
Definition
The organization cannot explain what the model did and why.
Solution
Keep logs, traces, model versions, data versions, prompts, and decision records.
Definition
The AI system violates legal, regulatory, or internal policy requirements.
Solution
Run governance reviews, compliance testing, and documentation checks.
Definition
Data is stored longer than allowed.
Solution
Use retention policies, automated deletion, and storage audits.
Definition
User data is used beyond the permission originally granted.
Solution
Track consent and enforce purpose limitation.
Definition
Governance rules exist on paper but are not enforced in the system.
Solution
Implement policy checks directly in pipelines, tools, and release gates.
Definition
Filters, policies, or validators fail to block risky output.
Solution
Use layered guardrails, adversarial testing, and monitoring.
Definition
Model output is trusted directly without validation.
Solution
Validate outputs before using them in tools, code, databases, or user-facing actions.
Explore more chapters or test your knowledge with quizzes.