M

Production · Days 78-85

Safety, Security, and Guardrails

AI security is not a checkbox. Learn prompt injection, jailbreak resistance, tool-call exfiltration, moderation, guardrail frameworks, PII redaction, abuse prevention, schema validation, and audit logging.

Advanced 9 subtopics 8 daily blocks

Outcome

Threat-model AI apps against prompt injection, exfiltration, jailbreaks, unsafe outputs, abuse, and compliance blind spots.

Practice builds

Prompt injection labAI gateway guardrail middlewarePII-safe logging wrapper

What to learn

Prompt injection: direct and indirect
Jailbreak resistance
Data exfiltration through tool calls and rendered links
Content moderation with Llama Guard, OpenAI moderation, Azure-style systems
Guardrail frameworks: NeMo Guardrails and Guardrails AI
PII redaction in inputs and outputs
Rate limiting and abuse prevention
Output validation against schemas
Audit logging for compliance

Daily study plan

Day 78: Threat-model one AI feature and list trust boundaries.
Day 79: Test direct and indirect prompt injection attempts.
Day 80: Add allowlists for tools, domains, and sensitive actions.
Day 81: Add schema validation and safe failure responses.
Day 82: Add PII detection/redaction for logs and model inputs.
Day 83: Add rate limits, quotas, and abuse prevention controls.
Day 84: Add audit logging for user, prompt, tool, and output events.
Day 85: Run a jailbreak and exfiltration red-team checklist.

Resources