Generative AI for Business — Week 2

Generative AI in Action (I)

Foundation Models & the GenAI Ecosystem

Week 2

JHU Carey Business School | 2026

Today's agenda

Time Topic
0:00–0:35 Foundation model landscape
0:35–1:10 The GenAI ecosystem
1:10–1:35 Choosing models for business
1:35–1:50 Break
1:50–2:25 Hands-on: Model comparison
2:25–2:55 Hands-on: Build a GenAI app
2:55–3:00 Wrap-up + Assignment 1

What is a "foundation model"?

                    ┌──────────────────────────┐
                    │    Foundation Model      │
                    │   (pre-trained on vast   │
                    │    general-purpose data) │
                    └────────┬─────────────────┘
                             │
              ┌──────────────┼──────────────┐
              ▼              ▼              ▼
        ┌──────────┐  ┌──────────┐  ┌──────────┐
        │ Chatbot  │  │ Code Gen │  │ Analysis │
        └──────────┘  └──────────┘  └──────────┘
              ▼              ▼              ▼
        ┌──────────┐  ┌──────────┐  ┌──────────┐
        │ Med Q&A  │  │ Legal    │  │ Finance  │
        └──────────┘  └──────────┘  └──────────┘

One model, many downstream applications — that's the "foundation" idea


The players

         CLOSED SOURCE                    OPEN SOURCE / WEIGHTS
    ┌──────────────────────┐        ┌──────────────────────┐
    │                      │        │                      │
    │  OpenAI    (GPT-4o)  │        │  Meta     (Llama 3)  │
    │  Anthropic (Claude)  │        │  Mistral  (Mixtral)  │
    │  Google    (Gemini)  │        │  DeepSeek (R1, V3)   │
    │                      │        │  Alibaba  (Qwen)     │
    │  API access only     │        │  Downloadable        │
    │  Pay per token       │        │  Self-hostable       │
    └──────────────────────┘        └──────────────────────┘

                    ▲                         ▲
              Most capable              Most flexible
              Easiest to start          Most control

Open vs. closed: the trade-offs

          Closed Source                        Open Weights
    ┌───────────────────┐                ┌───────────────────┐
    │ ✓ Best capability │                │ ✓ Full control    │
    │ ✓ Managed infra   │                │ ✓ Data privacy    │
    │ ✓ Quick to start  │                │ ✓ No vendor lock  │
    │                   │                │ ✓ Can fine-tune   │
    │ ✗ Vendor lock-in  │                │                   │
    │ ✗ Data leaves org │                │ ✗ Need infra      │
    │ ✗ Cost at scale   │                │ ✗ More expertise  │
    │ ✗ Less control    │                │ ✗ Slower updates  │
    └───────────────────┘                └───────────────────┘
                         ▲
                   Most orgs start here
                   and expand to both

Model size tiers

    ┌────────────────────────────────────────────────────────────┐
    │  FRONTIER  (400B+ params)                                  │
    │  GPT-4o, Claude Opus, Gemini Ultra                         │
    │  Complex reasoning, analysis, creative writing              │
    │  $$$                                                        │
    ├────────────────────────────────────────────────────────────┤
    │  MID-TIER  (30-100B params)                                │
    │  Claude Sonnet, Gemini Pro, Llama 70B                      │
    │  Most business tasks, good cost/quality balance             │
    │  $$                                                         │
    ├────────────────────────────────────────────────────────────┤
    │  SMALL  (1-10B params)                                     │
    │  Claude Haiku, Gemini Flash, Llama 8B, Phi-3               │
    │  Classification, extraction, routing, high-volume           │
    │  $                                                          │
    └────────────────────────────────────────────────────────────┘

Not every task needs the biggest model


The multimodal convergence

                         ┌───────────────┐
                         │               │
                         │   Multimodal  │
                         │    Model      │
                         │               │
                         └───────┬───────┘
                    ┌────────────┼────────────┐
                    │            │            │
              ┌─────▼────┐ ┌────▼─────┐ ┌────▼─────┐
              │  Text    │ │  Vision  │ │  Audio   │
              │  in/out  │ │  in/out  │ │  in/out  │
              └──────────┘ └──────────┘ └──────────┘
                    │            │            │
              ┌─────▼────┐ ┌────▼─────┐ ┌────▼─────┐
              │  Code    │ │  Video   │ │  Music   │
              │  in/out  │ │  in/out  │ │  in/out  │
              └──────────┘ └──────────┘ └──────────┘

Models are converging toward unified multimodal architectures


The GenAI ecosystem

┌─────────────────────────────────────────────────────────────┐
│  APPLICATION LAYER                                          │
│  Chatbots, copilots, agents, content tools, search          │
├─────────────────────────────────────────────────────────────┤
│  ORCHESTRATION LAYER                                        │
│  LangChain, LlamaIndex, Claude Code, prompt management      │
├─────────────────────────────────────────────────────────────┤
│  MODEL LAYER                                                │
│  GPT-4o, Claude, Gemini, Llama, Mistral, domain-specific    │
├─────────────────────────────────────────────────────────────┤
│  INFRASTRUCTURE LAYER                                       │
│  NVIDIA GPUs, cloud providers (AWS, Azure, GCP), MLOps      │
└─────────────────────────────────────────────────────────────┘

         Value capture shifts UP over time →

Infrastructure layer deep dive

    TRAINING                              INFERENCE
    ┌────────────────────┐          ┌────────────────────┐
    │  GPU clusters      │          │  API endpoints     │
    │  (NVIDIA H100/B200)│          │  (pay per token)   │
    │                    │          │                    │
    │  Months of compute │          │  Milliseconds per  │
    │  $10M-$100M+       │          │  request           │
    │  Done once         │          │  Ongoing cost      │
    └────────────────────┘          └────────────────────┘
            │                               │
            ▼                               ▼
    Only big labs and               Everyone uses this
    well-funded startups            (this is how you
    do this                         access models)

Token economics

    INPUT                                OUTPUT
    (your prompt)                        (model's response)

    "Summarize this 10-page      →     "The report shows
     quarterly report..."               three key trends..."

    ~4,000 tokens                       ~500 tokens
    @ $3/M tokens                       @ $15/M tokens
    = $0.012                             = $0.0075

                    Total: ~$0.02 per query
    At 10,000 queries/day:    $200/day    →    $73K/year
    At 100,000 queries/day:   $2,000/day  →    $730K/year

The math matters. Model selection directly impacts margins.


The model layer

    PRE-TRAINED               FINE-TUNED                DISTILLED
    ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
    │ General      │     │ Specialized  │     │ Small but    │
    │ knowledge    │     │ for your     │     │ mimics the   │
    │              │ ──→ │ domain       │ ──→ │ big model    │
    │ Use via API  │     │              │     │              │
    │ + prompting  │     │ Need data +  │     │ Cheap +      │
    │              │     │ compute      │     │ fast         │
    └──────────────┘     └──────────────┘     └──────────────┘

    Most common             When prompting          When you need
    starting point          isn't enough            scale + speed

Application layer patterns

    ┌──────────────────────────────────────────────┐
    │             APPLICATION PATTERNS             │
    ├──────────────┬──────────────┬────────────────┤
    │              │              │                │
    │   CHATBOT    │   COPILOT    │    AGENT       │
    │              │              │                │
    │  User asks   │  User works  │  System acts   │
    │  AI answers  │  AI assists  │  autonomously  │
    │              │              │                │
    │  Customer    │  Code editor │  Email triage  │
    │  support     │  Writing aid │  Data pipeline │
    │  FAQ bot     │  Analysis    │  Monitoring    │
    │              │              │                │
    │  LOW         │  MEDIUM      │  HIGH          │
    │  autonomy    │  autonomy    │  autonomy      │
    └──────────────┴──────────────┴────────────────┘

LLMOps: running GenAI in production

    ┌──────────────────────────────────────────────────┐
    │                  PRODUCTION LOOP                  │
    │                                                   │
    │   Prompts ──→ Gateway ──→ Model ──→ Output       │
    │      │          │                     │           │
    │      ▼          ▼                     ▼           │
    │   Version    Rate limit           Evaluate        │
    │   control    & cache              & monitor       │
    │   & test     & route              & log           │
    │                                                   │
    │         ◄──── Feedback loop ────►                 │
    │                                                   │
    │   Costs ◄── Dashboard ──► Alerts                  │
    └──────────────────────────────────────────────────┘

Choosing models: the decision framework

    ┌─────────────────────────────────────────────────┐
    │                                                  │
    │                  TASK COMPLEXITY                  │
    │                                                  │
    │  HIGH │  Frontier model│  Fine-tune open model   │
    │       │  (GPT-4o, Opus)│  (self-hosted)          │
    │       │                │                          │
    │  ─────┼────────────────┼──────────────────────── │
    │       │                │                          │
    │  LOW  │  Small model   │  Mid-tier model         │
    │       │  (Haiku, Phi)  │  (Sonnet, Gemini Pro)   │
    │       │                │                          │
    │       └────────────────┴──────────────────────── │
    │         LOW                HIGH                   │
    │              DATA SENSITIVITY                     │
    │                                                  │
    └─────────────────────────────────────────────────┘

Build vs. buy vs. configure

    BUY (SaaS)             CONFIGURE (API)       BUILD (Custom)
    ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
    │ Off-the-shelf │     │ API + custom  │     │ Train / fine- │
    │ product       │     │ prompts +     │     │ tune your     │
    │               │     │ integration   │     │ own model     │
    │ Fastest       │     │               │     │               │
    │ Least control │     │ Sweet spot    │     │ Most control  │
    │ $/seat/month  │     │ for most      │     │ Most effort   │
    │               │     │               │     │ Best fit      │
    └───────┬───────┘     └───────┬───────┘     └───────┬───────┘
            │                     │                     │
            ▼                     ▼                     ▼
      Days to deploy      Weeks to deploy       Months to deploy

Most companies should start in the middle


Case discussion: model selection

Your company has three GenAI use cases. Which model strategy for each?

┌────────────────────────────┬──────────┬───────────┬──────────┐
│ Use Case                   │ Volume   │ Accuracy  │ Data     │
│                            │          │ Needed    │ Sensitiv.│
├────────────────────────────┼──────────┼───────────┼──────────┤
│ Customer email auto-reply  │ 50K/day  │ Medium    │ Medium   │
│ Legal contract review      │ 50/day   │ Very High │ Very High│
│ Internal meeting summaries │ 500/day  │ Medium    │ Low      │
└────────────────────────────┴──────────┴───────────┴──────────┘

Discuss: What model tier, open vs. closed, and deployment approach for each?


Vendor lock-in: the real risk

    START                                           LOCK-IN
    ┌──────┐                                    ┌──────────┐
    │ Pick │                                    │ Prompts  │
    │ one  │──→ Build prompts ──→ Build app ──→ │ tuned to │
    │ API  │                                    │ one model│
    └──────┘                                    └──────────┘

    MITIGATION STRATEGIES:
    ┌──────────────────────────────────────────────────────┐
    │  1. Abstract the model layer (swap models easily)    │
    │  2. Test across multiple models regularly             │
    │  3. Avoid model-specific features unless critical     │
    │  4. Keep prompts model-agnostic where possible        │
    │  5. Use open standards (MCP, OpenAPI) for tooling     │
    └──────────────────────────────────────────────────────┘

Break

15 minutes


Hands-on

Model Comparison + Building a GenAI App


Exercise 1: Model comparison

cd scripts/week2
claude "Read model_compare.py and explain what it does"

Your task:

  1. Run the script with the default business task
  2. Add a third model to the comparison
  3. Add an evaluation criterion (e.g., tone, specificity)
  4. Generate a comparison report
    Prompt ──→ Model A ──→ Response A ──┐
           ──→ Model B ──→ Response B ──┼──→ Comparison
           ──→ Model C ──→ Response C ──┘      Report

Exercise 2: Build a GenAI app

claude "Read simple_app.py, explain it, then help me extend it"

MBA track: Build a meeting summarizer or email drafter
MS track: Add structured output parsing, streaming, error handling

    ┌──────────────┐     ┌───────────────┐     ┌──────────────┐
    │  User Input  │ ──→ │ System Prompt │ ──→ │   Output     │
    │  (meeting    │     │ + Context     │     │  (summary,   │
    │   notes,     │     │ + History     │     │   action     │
    │   email)     │     │               │     │   items)     │
    └──────────────┘     └───────────────┘     └──────────────┘

Assignment 1 (due next week)

MS section:

  • Use Claude Code to build and extend model_compare.py
  • Compare 3 models on a business task of your choice
  • Submit: script + 1-page comparison write-up

MBA section:

  • Read "What Leaders Need to Know" + "Who Profits Most from GenAI?"
  • Write a 1-page memo: recommend a GenAI strategy for a company of your choice
  • Use the ecosystem framework from today's lecture

Graded on completion (not rubric)


Next week preview

Week 3: Generative AI in Action (II)

  • Reasoning models (o1/o3, DeepSeek-R1, extended thinking)
  • Context engineering as a discipline
  • Prompt patterns and anti-patterns

Reading:

  • Wei et al., "Chain-of-Thought Prompting" (2022)
  • Anthropic prompt engineering guide

Questions?


Welcome back. This week we zoom out from architecture to the landscape: who's building what, how the pieces fit together, and how you make model selection decisions for real business problems.

The term "foundation model" was coined by the Stanford HAI group in 2021. The key insight: instead of training a separate model for each task, you train one massive model on broad data and adapt it. This is a paradigm shift from traditional ML where you'd have a different model for sentiment analysis, translation, classification, etc.

The landscape is bifurcated. Closed-source models from OpenAI, Anthropic, Google are typically the most capable but you're dependent on their API. Open-weight models from Meta, Mistral, DeepSeek give you more control — you can host them yourself, fine-tune them, inspect them. The gap in capability is narrowing. For many business tasks, open models are now competitive.

Most companies start with closed APIs because they're easiest to get going. As they scale and have specific needs — data residency, fine-tuning, cost optimization — they often adopt open models for some workloads. The mature approach is a portfolio: use the best model for each task. Some tasks need frontier capability, others just need a fast cheap model.

This is a critical business insight. Many companies default to the biggest model because they want "the best." But for high-volume tasks like classification, extraction, or routing, a small model at 1/50th the cost can be just as good. The right approach: start with the biggest model to establish a quality baseline, then try to match that quality with a smaller model. Use frontier models for the hard stuff.

The trend is clear: models are becoming multimodal. GPT-4o can see, hear, and speak. Claude can read images and PDFs. Gemini processes video. This matters for business because it means a single API can handle text, image, and audio tasks. The modality walls are coming down. Think about what this means for document processing, customer service, content creation.

This is the GenAI value chain from the Wu & Higgins reading. Infrastructure is where the money was made first — NVIDIA's stock price. Models sit on top, then orchestration tools, then applications. The key insight for business strategy: value capture is shifting upward. The infrastructure is commoditizing, model capabilities are converging, and the real differentiation is in the application layer — how you use these models to solve specific problems. This is where most of you will operate.

Training and inference are fundamentally different businesses. Training requires massive upfront capital — hundreds of millions for frontier models. Only a handful of organizations do this. Inference is the ongoing cost of using models. This is what you pay as a business when you make API calls. The training vs inference distinction matters for understanding the economics: training costs are fixed and declining per unit of capability, inference costs are variable and directly tied to usage.

Let's make this concrete. A typical business query might cost 1-2 cents with a frontier model. That sounds trivial, but it compounds fast. At enterprise scale — 100K queries per day — you're looking at $730K per year just for model API costs. This is why model selection matters. If a mid-tier model can do the job at 1/5th the cost, that's a $600K annual savings. Always prototype with the best model, then optimize.
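The arithmetic above fits in a few lines. A minimal sketch using the slide's illustrative prices (real per-token rates vary by model and vendor, and change often):

```python
def query_cost(input_tokens, output_tokens,
               input_price_per_m=3.00, output_price_per_m=15.00):
    """Dollar cost of one query, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

per_query = query_cost(4_000, 500)   # the slide's example query: $0.0195
annual = per_query * 100_000 * 365   # 100K queries/day, for one year
print(f"${per_query:.4f} per query, ~${annual:,.0f} per year")
```

Comparing two models' annual cost is then just two calls with different price parameters. (The slide rounds $0.0195 up to $0.02 per query, which is how it gets $730K/year; the exact figure is about $712K.)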

Three ways to use the model layer. Most businesses start with pre-trained models via API, customizing behavior through prompting. When prompting hits its limits — domain-specific terminology, particular output formats, specialized reasoning — you fine-tune. Distillation is the next optimization: use a big model to generate training data, then train a small model to mimic it. This gives you frontier-quality outputs at small-model costs.
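The teacher-to-student data flow in distillation can be sketched in a few lines; `call_teacher_model` here is a stub standing in for a real frontier-model API call:

```python
def call_teacher_model(prompt: str) -> str:
    """Stub for a frontier-model API call (illustrative only)."""
    return f"[teacher output for: {prompt}]"

def build_distillation_set(raw_inputs):
    """Label raw inputs with the teacher's outputs. The resulting
    (input, output) pairs become fine-tuning data for a small model."""
    return [{"input": x, "output": call_teacher_model(x)} for x in raw_inputs]

pairs = build_distillation_set(["Q3 revenue memo", "supplier complaint email"])
```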

Three dominant application patterns, arranged by autonomy level. Chatbots are reactive — user asks, AI answers. Copilots are collaborative — user works, AI assists alongside. Agents are proactive — they act autonomously with minimal human input. Most enterprise deployments today are chatbots and copilots. Agents are the frontier — higher value but harder to get right. We'll spend Weeks 4-5 on agents.

LLMOps is the operational layer that most people forget about. You need prompt versioning (your prompts are code), a gateway to handle rate limits and caching, evaluation and monitoring of outputs, cost dashboards, and feedback loops. This is analogous to MLOps but with new challenges: non-deterministic outputs, no clear accuracy metric, fast-changing model landscape. Companies that skip this step end up with fragile, expensive, unmonitored AI deployments.
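The gateway idea can be sketched concretely: cache identical prompts (a cache hit costs nothing) and log every call for monitoring. `fake_model` is a stub standing in for a real API call; a production gateway would add rate limiting, routing, and retries:

```python
import hashlib
import time

def fake_model(prompt: str) -> str:
    """Stand-in for a real model API call."""
    return f"response to: {prompt}"

class Gateway:
    """Minimal LLM-gateway sketch: cache identical prompts, log every call."""
    def __init__(self, model=fake_model):
        self.model, self.cache, self.log = model, {}, []

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key in self.cache:                      # cache hit: no model cost
            self.log.append({"prompt": prompt, "cached": True})
            return self.cache[key]
        start = time.time()
        out = self.model(prompt)
        self.log.append({"prompt": prompt, "cached": False,
                         "latency_s": time.time() - start})
        self.cache[key] = out
        return out

gw = Gateway()
gw.complete("summarize meeting")
gw.complete("summarize meeting")   # second call served from cache
```

The log is what feeds the cost dashboards and alerts on the slide.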

Here's a practical 2x2 for model selection. On one axis: how complex is the task? Simple extraction vs. complex reasoning. On the other: how sensitive is the data? Public info vs. PII or trade secrets. Low complexity + low sensitivity = small cloud model, cheapest option. High complexity + high sensitivity = either frontier model with enterprise agreements, or fine-tune an open model you host yourself. Most business tasks fall in the middle, which is why mid-tier models like Sonnet are so popular.
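The 2x2 is simple enough to encode directly. A sketch of the framework; the tier names in each quadrant are illustrative, not product recommendations:

```python
def pick_model_strategy(complexity: str, sensitivity: str) -> str:
    """Map the 2x2 (task complexity x data sensitivity) to a model strategy."""
    table = {
        ("low",  "low"):  "small model via cloud API (cheapest)",
        ("low",  "high"): "mid-tier model under enterprise data terms",
        ("high", "low"):  "frontier model via API",
        ("high", "high"): "frontier w/ enterprise agreement, or "
                          "self-hosted fine-tuned open model",
    }
    return table[(complexity.lower(), sensitivity.lower())]

print(pick_model_strategy("high", "high"))
```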

The build-buy spectrum for GenAI. Buying a SaaS product (like Jasper for marketing copy) is fastest but least differentiated. Building your own fine-tuned model is most powerful but takes months. The sweet spot for most companies is the middle: use foundation model APIs with custom prompts and orchestration. This is what we'll teach you to do in this course. You can build sophisticated AI applications without training a single model.

Let's work through this together. Email auto-reply: high volume, medium accuracy — mid-tier or small model, maybe fine-tuned on your email style. Cost matters here. Legal contract review: low volume but very high stakes — frontier model, possibly with human-in-the-loop, enterprise API agreement for data protection. Meeting summaries: moderate volume, low data sensitivity — could use almost anything, probably mid-tier model via API, maybe even a small model. The point: one company, three different model strategies.

Vendor lock-in is real in GenAI. Your prompts get tuned to a specific model's behavior. Your evaluation benchmarks are calibrated to its outputs. Your users get used to its personality. Switching costs are higher than they appear. Mitigation: build an abstraction layer so you can swap models. Test against multiple models periodically. The MCP standard we'll cover in Week 5 is specifically designed to address tooling lock-in.
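Mitigation #1 (abstract the model layer) can be sketched as a thin wrapper: the app codes against one `complete()` method, so swapping vendors is a one-line change instead of a rewrite. The two backends here are stubs standing in for real SDK calls:

```python
class ModelClient:
    """Minimal abstraction layer over interchangeable model backends."""
    def __init__(self):
        self.backends = {}
        self.active = None

    def register(self, name, fn):
        self.backends[name] = fn
        self.active = self.active or name   # first backend is the default

    def use(self, name):
        self.active = name                  # one-line vendor swap

    def complete(self, prompt: str) -> str:
        return self.backends[self.active](prompt)

client = ModelClient()
client.register("vendor_a", lambda p: f"A says: {p}")
client.register("vendor_b", lambda p: f"B says: {p}")
client.complete("draft a reply")   # served by vendor_a
client.use("vendor_b")             # swap vendors; app code unchanged
client.complete("draft a reply")   # now served by vendor_b
```

This is also the hook for mitigation #2: with all backends behind one interface, running your evaluation suite against every vendor is a loop.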

This exercise makes the model selection discussion concrete. You'll send the same prompt to multiple models and compare the outputs on quality, cost, and latency. Use Claude Code to modify the script — add models, change the task, add evaluation criteria. The goal is to develop intuition for how models differ in practice, not just on benchmarks.
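The core of such a comparison harness is small. A sketch (not the actual `model_compare.py`, which you should read first) using stub callables in place of real API clients:

```python
import time

def compare_models(prompt, models):
    """Run one prompt against several model callables; record output
    and rough latency. `models` maps a name to a callable."""
    results = []
    for name, call in models.items():
        start = time.time()
        out = call(prompt)
        results.append({"model": name, "output": out,
                        "latency_s": round(time.time() - start, 3)})
    return results

stubs = {
    "model_a": lambda p: f"[A] {p.upper()}",
    "model_b": lambda p: f"[B] {p.lower()}",
}
for row in compare_models("Summarize the Q3 report", stubs):
    print(row["model"], "->", row["output"])
```

Adding a third model or an evaluation criterion (steps 2 and 3 of the exercise) means adding an entry to the dict or a field to each result row.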

This is your first real build. Start with the skeleton in simple_app.py, then use Claude Code to extend it. The MBA folks should focus on making something useful — a tool you'd actually use at work. The MS folks should dig into the engineering: how do you parse structured outputs? How do you handle API errors? How does streaming work? Everyone should experiment with system prompts to shape the output.
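For the MS track, structured output parsing is the first hurdle: models often wrap JSON in surrounding prose, so naive `json.loads` on the raw response fails. A defensive sketch (`fake_model` is a stub standing in for a real API call):

```python
import json

def fake_model(prompt: str) -> str:
    """Stub for an API call; real models sometimes wrap JSON in prose."""
    return ('Here is the summary: '
            '{"summary": "Q3 up 8%", "action_items": ["follow up"]}')

def parse_structured(raw: str) -> dict:
    """Extract the first JSON object from a model response by slicing
    from the first '{' to the last '}' before parsing."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in model output")
    return json.loads(raw[start:end + 1])

result = parse_structured(fake_model("Summarize this meeting as JSON"))
```

In a real app you would catch `json.JSONDecodeError` as well and retry the request with a stricter instruction, which is a good pattern to implement in the exercise.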