GenAI & Agent Governance

Time	Topic
0:00–0:35	Risks and failure modes
0:35–1:10	Evaluation and verification
1:10–1:35	Governance and regulation
1:35–1:50	Break
1:50–2:25	Hands-on: Red-teaming workshop
2:25–2:55	Hands-on: Evaluation pipeline
2:55–3:00	Wrap-up + Assignment 5

Generative AI for Business — Week 6

The regulatory landscape

    ┌─────────────────────────────────────────────────────┐
    │                                                      │
    │  EU AI ACT (2024-2026 rollout)                       │
    │  ┌─────────────────────────────────────────────┐    │
    │  │  UNACCEPTABLE    Social scoring, real-time   │    │
    │  │  RISK            biometric surveillance      │    │
    │  │  → Banned                                    │    │
    │  ├─────────────────────────────────────────────┤    │
    │  │  HIGH RISK       Hiring, lending, medical,   │    │
    │  │                  law enforcement              │    │
    │  │  → Strict requirements (audit, transparency) │    │
    │  ├─────────────────────────────────────────────┤    │
    │  │  LIMITED RISK    Chatbots, content generation │    │
    │  │  → Transparency obligations (disclose AI)    │    │
    │  ├─────────────────────────────────────────────┤    │
    │  │  MINIMAL RISK    Spam filters, games         │    │
    │  │  → No requirements                           │    │
    │  └─────────────────────────────────────────────┘    │
    │                                                      │
    │  NIST AI RMF          US EXECUTIVE ORDERS            │
    │  ┌────────────────┐  ┌────────────────────────┐     │
    │  │ Govern │ Map   │  │ Safety testing for     │     │
    │  │ Measure│Manage │  │ dual-use foundation    │     │
    │  │                │  │ models                  │     │
    │  │ Voluntary      │  │ Federal procurement    │     │
    │  │ framework      │  │ requirements            │     │
    │  └────────────────┘  └────────────────────────┘     │
    └─────────────────────────────────────────────────────┘

GenAI & Agent Governance

Risk, Evaluation, Verification & Regulation

Today's agenda

The risk landscape

Hallucination: types and causes

Hallucination mitigation

Bias and fairness

Security: prompt injection

Privacy risks

Evaluation & Verification

Why evaluation is hard

Evaluation dimensions

Model-as-judge

Red-teaming

The regulatory landscape

Building an AI governance program

Break

15 minutes

Hands-on

Red-Teaming + Evaluation Pipeline

Exercise 1: Red-teaming workshop

Exercise 2: Evaluation pipeline

Assignment 5 (due next week)

Next week

Questions?