claude 3bab54d135 fix(modules-1,15,17): onboarding step, make M15 gate actually catch the plant, M17 .env override

- M1: add a no-git "Get the course materials" step (download+unzip; clone noted
  as Module 8) so Part A's paths resolve without assuming git. URL flagged
  Verify-before-publish (swap to public host before publishing).
- M15: security gate was failing OPEN on python3-only systems (bare `python`)
  and missing the UNTRACKED config.py, so the planted secret passed green. Now
  guards python3, fails CLOSED on any non-clean exit, and stages files so the
  planted SYNC_API_KEY + typosquat dep are actually caught.
- M15: correct the false "Bandit flags the API key" claim (B105-107 need
  password-named ids); add an honest MD5 (B324) flaw so the SAST demo fires.
  Planted secret/deps preserved.
- M17: require the .env loader to use setdefault so Part D's override demo works;
  explain precedence. Hardcoded "before" anti-pattern left intact.

Closes #6
Closes #17
Closes #18
Closes #19
Closes #29

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT

2026-06-22 15:48:27 -04:00

README.md

Module 17 — Secrets, Config, and Environments

Ask an AI to "connect to the API" and it will cheerfully paste your secret key straight into a source file — the one place it must never go. This module gives you the standard, boring, correct place to put secrets and per-environment config instead, and a reflex for catching the AI when it does the wrong thing.

Prerequisites

Module 2 — Version Control as a Safety Net. You need .gitignore and the habit of reading git diff before you commit. Both are load-bearing here.
Module 12 — Revert, Reset, and Recovery. You learned that Git history is forever and that secrets don't belong in it — this module is the practical follow-through on that promise.
Module 15 — Security Scanning for AI-Generated Code. Secret scanning is the automated gate that catches a hardcoded key after the fact. This module is the prevention that means the gate rarely has to fire.
Module 16 — Containers and Reproducible Environments. A container is a sealed box; config and secrets are how you pass the outside world into it at run time. That handoff is environment variables, which is exactly what this module is about.

You can attempt the lab with only Modules 1–2, but the why leans on 12, 15, and 16.

Learning objectives

By the end of this module you can:

Explain why a secret in source code is a different and worse problem than a bug — and why Git makes it permanent.
Move a secret out of code and into the environment (an environment variable or a gitignored .env file), and have the app read it back at run time.
Keep config you can commit (a committed template) separate from secrets you can't (the real .env), so a teammate or a fresh AI session knows exactly what to supply.
Apply the 12-factor rule — config lives in the environment, not the build — to run one codebase unchanged across dev, staging, and prod.
Describe what a secrets manager buys you over .env files, in vendor-neutral terms, and know when you've outgrown a file on disk.

Key concepts

A secret in source is not a bug — it's a leak

A bug is a wrong behavior you can fix and move on from. A hardcoded secret is different: the moment it's written to a file in a repo, you've started a countdown. Commit it and it's in your history forever — Module 12 was blunt about this: git revert writes a new commit undoing the change, but the old commit, with the key in plain text, is still right there in the log for anyone who clones the repo. Push it (Module 8) and it's now on a server, in every teammate's clone, and in every backup. "Delete the line and commit again" does nothing; the secret is in the snapshot, not the current file.

So the only real fix after a leak is rotation: revoke the exposed key at the provider and issue a new one, treating the old one as compromised. That's expensive and easy to forget, which is why the entire discipline is built around never writing the secret to a tracked file in the first place. Prevention is the whole game.

What counts as a secret: API keys and tokens, database passwords and connection strings, private keys and certificates, signing/encryption keys, OAuth client secrets, webhook signing secrets. The test is simple — if this string leaked, would someone have to scramble? If yes, it's a secret and it does not go in code.

Config vs. secrets vs. code

Three things often get jumbled into source files. Pulling them apart is the whole mental model:

| Kind | Example | Where it lives | Goes in Git? |
| --- | --- | --- | --- |
| Code | The logic of your app | Source files | Yes, that's the point |
| Config | Which backend URL, log level, feature flags, timeouts | The environment (often a `.env` template you commit + real values you don't) | The template: yes. The values: it depends |
| Secrets | API keys, passwords, tokens | The environment, sourced from a secret store in real deployments | Never |

The dividing line that matters: config and secrets vary with where the app runs, not with what the app does. Your dev laptop, the staging server, and production all run the same code; they differ only in config (different URLs) and secrets (different keys). That observation is the entire 12-factor idea below.

The environment: where config and secrets actually go

An environment variable is a named value the operating system hands to a process when it starts. Every OS has them; your shell is full of them right now (PATH, HOME). They're the universal, language-agnostic channel for passing config into a program without putting it in the program.

Set one for a single command:

```
# macOS / Linux
TASKS_API_KEY="sk-live-..." python sync.py

# Windows PowerShell
$env:TASKS_API_KEY="sk-live-..."; python sync.py
```

Read it back in code — and fail loudly if it's missing, because a silent empty string is worse than a crash:

```
import os

api_key = os.environ.get("TASKS_API_KEY")
if not api_key:
    raise SystemExit("TASKS_API_KEY is not set. Copy .env.example to .env and fill it in.")
```

That's the whole pattern. The secret never appears in the file; the file only asks the environment for it. Anyone reading the source learns that a key is needed but not what the key is — which is exactly the property you want.
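When an app needs more than one required variable, the same check can be wrapped in a small helper. The following is a minimal sketch, not part of the lab starter: `require_env` is a hypothetical name, and it accepts the environment mapping as a parameter so it can be tried without touching the real environment.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the value of `name`, or exit with a clear message if unset or blank.

    `require_env` is a hypothetical helper name used for illustration.
    """
    value = env.get(name, "").strip()
    if not value:
        raise SystemExit(f"{name} is not set. Copy .env.example to .env and fill it in.")
    return value


# Example with an explicit mapping instead of the real environment:
fake_env = {"TASKS_API_KEY": "replace-me"}
api_key = require_env("TASKS_API_KEY", fake_env)
```

Treating a blank value the same as a missing one is a design choice in this sketch; it avoids sending an empty `Authorization` header when a variable is exported but never filled in.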

`.env` files: the developer-friendly middle ground

Typing TASKS_API_KEY=... before every command gets old, and exported shell variables vanish when you close the terminal. The conventional fix is a .env file — a flat list of KEY=value lines, sitting in your project, that gets loaded into the environment when the app starts:

```
APP_ENV=dev
TASKS_API_KEY=sk-live-9f8a7b6c5d4e3f2a1b0c9d8e7f6a5b4c
```

Two non-negotiable rules come with it:

The real .env is gitignored. Always. Add .env to your .gitignore (Module 2) before you create the file, so there's never a window where it could be committed. This is the single most important line in this module:
```
# secrets and local config — never commit
.env
.env.*
!.env.example
```
Together, these three rules say: ignore .env and any .env.something, but keep tracking .env.example (the ! un-ignores it). More on that next.
Commit a template, not the secrets. A .env.example (or .env.template) lists every variable the app needs with placeholder values and no real secrets. This file you commit. It's the documentation that tells a teammate — or the next AI session reading the repo as memory (Module 2) — exactly what to supply:
```
# .env.example  (committed)
APP_ENV=dev
TASKS_API_KEY=replace-me
```

Loading a .env is usually one line via a small library (every major language has one). You can also load it with a few lines of your own code and zero dependencies — the lab shows the dependency-free version so it runs anywhere with just the language installed.

Naming, not values, is the contract. Standardize the variable names across the team and commit them in the template. The values are local and secret; the names are shared and public. When the AI writes os.environ["TASKS_API_KEY"], it should match what's in .env.example exactly — a mismatch is the most common "works on my machine" failure in this whole area.
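One way to make the naming contract checkable is to compare the names declared in the template against what the environment actually supplies. This is a rough sketch under stated assumptions: `template_names` and `missing_names` are hypothetical helpers, and the template is assumed to contain only plain `KEY=value` lines and `#` comments.

```python
from typing import Mapping


def template_names(template_text: str) -> list[str]:
    """Collect variable names from KEY=value lines, skipping blanks and comments."""
    names = []
    for line in template_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        names.append(line.partition("=")[0].strip())
    return names


def missing_names(template_text: str, env: Mapping[str, str]) -> list[str]:
    """Names the template declares that the given environment does not supply."""
    return [name for name in template_names(template_text) if not env.get(name)]


template = "# .env.example  (committed)\nAPP_ENV=dev\nTASKS_API_KEY=replace-me\n"
print(missing_names(template, {"APP_ENV": "dev"}))  # prints ['TASKS_API_KEY']
```

In a real project the template text would come from reading `.env.example` and the mapping would be `os.environ`; the explicit arguments here just keep the example self-contained.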

12-factor: config in the environment, one build everywhere

The principle behind all of this comes from the 12-factor app guidelines, and factor III states it plainly: store config in the environment. The payoff for this audience:

You build the artifact once and run the same artifact in every environment. Nothing about dev, staging, or prod is baked into the code or the container image — the differences are injected at run time as environment variables.

This is why it pairs so tightly with containers (Module 16). A container image is your immutable, built-once artifact. You don't build a "staging image" and a "prod image" — you build one image and start it with different environment variables:

```
docker run -e APP_ENV=staging -e TASKS_API_KEY="$STAGING_KEY" tasks-app
docker run -e APP_ENV=prod    -e TASKS_API_KEY="$PROD_KEY"    tasks-app
```

Same image, different environment. That's the whole idea, and it's what makes the delivery pipeline in Module 18 sane: promote one artifact through environments instead of rebuilding per stage.

Per-environment config: dev, staging, prod

"Environments" here means the distinct places your code runs, each with its own config and its own secrets. The standard three:

dev — your machine. A dev backend, a dev key with low privileges, verbose logging.
staging — a production-like rehearsal. Separate backend, separate key, real-ish data.
prod — the real thing. Real users, the powerful key, conservative settings.

The rule that catches people: each environment gets its own secrets, and they never mix. A dev key must not be able to touch prod data, and a prod key must never sit in a developer's .env. The clean pattern is one variable that names the environment (APP_ENV), which the code uses to pick the right URLs and behavior, plus per-environment secret values supplied separately:

```
import os

ENVIRONMENTS = {
    "dev":     "https://api.dev.example-tasks.com/v1",
    "staging": "https://api.staging.example-tasks.com/v1",
    "prod":    "https://api.example-tasks.com/v1",
}

app_env = os.environ.get("APP_ENV", "dev")
backend_url = ENVIRONMENTS[app_env]   # config selected by environment, not hardcoded
```

The non-secret per-environment config (which URL goes with which env) is fine to keep in code like this — it's not sensitive and it's the same everywhere the code runs. Only the secret values and the choice of which environment this process is come from outside.
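A plain `ENVIRONMENTS[app_env]` lookup raises a bare `KeyError` if `APP_ENV` is mistyped (for example `stagin`). In the same fail-loudly spirit as the missing-key check, the lookup can be guarded. The sketch below is one possible shape; `select_backend` is a hypothetical function name, and the environment is passed in as a mapping so the behavior is easy to try.

```python
from typing import Mapping

ENVIRONMENTS = {
    "dev":     "https://api.dev.example-tasks.com/v1",
    "staging": "https://api.staging.example-tasks.com/v1",
    "prod":    "https://api.example-tasks.com/v1",
}


def select_backend(env: Mapping[str, str]) -> str:
    """Pick the backend URL for APP_ENV, exiting with a clear message on a typo."""
    app_env = env.get("APP_ENV", "dev")  # same default as the snippet above
    if app_env not in ENVIRONMENTS:
        valid = ", ".join(sorted(ENVIRONMENTS))
        raise SystemExit(f"Unknown APP_ENV {app_env!r}. Expected one of: {valid}.")
    return ENVIRONMENTS[app_env]


print(select_backend({"APP_ENV": "staging"}))
# prints https://api.staging.example-tasks.com/v1
```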

Secret stores: when a file on disk isn't enough

A gitignored .env is the right tool on your laptop. It does not scale to a running fleet, for reasons that show up fast in real operations:

A plaintext file on a server is readable by anything that compromises that box.
You can't rotate a key across fifty machines by editing fifty files.
You get no audit trail — no record of who read which secret when.
There's no access control — "this service can read the DB password but not the signing key."

A secret manager (also called a secrets store or vault) solves these. It's a dedicated service that stores secrets encrypted at rest, hands them out only to authenticated callers, logs every access, and supports rotation and fine-grained access policies. At run time your app, or the platform it runs on, fetches the secret from the manager into memory instead of reading a file. The categories you'll encounter:

Cloud-provider managers — every major cloud has one, tightly integrated with that cloud's identity system.
Standalone / self-hostable vaults — dedicated secret-management products you run yourself, a good fit for the on-prem and air-gapped scenarios this audience often lives in (the same self-host instinct from Module 8).
Platform-native secrets — your container orchestrator and your CI/CD system both have a built-in concept of "secrets" you can inject as environment variables, which is how secrets reach a pipeline (Module 14) or a deployment (Module 18) without ever touching the repo.

You don't need a manager for the lab or for a solo project. You need it the moment a secret has to be available to more than one machine you don't personally babysit. The mental upgrade is the same either way: the app reads its secret from the environment; what populates the environment grows up from a file to a service. Your code doesn't change — that's the point of reading from the environment all along.
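The "code doesn't change" claim can be illustrated with a deliberately vendor-neutral sketch. Everything here is an assumption for illustration: `SecretSource`, `from_mapping`, and `get_api_key` are hypothetical names, and the "manager" is only a dictionary standing in for whatever service would populate the value in a real deployment.

```python
import os
from typing import Callable, Mapping, Optional

# A "secret source" in this sketch is any function from a name to a value (or None).
SecretSource = Callable[[str], Optional[str]]


def from_mapping(values: Mapping[str, str]) -> SecretSource:
    """Build a source backed by any mapping, e.g. os.environ on a laptop."""
    return lambda name: values.get(name)


def get_api_key(source: SecretSource) -> str:
    """Application code: ask the source for the key and fail loudly if absent."""
    key = source("TASKS_API_KEY")
    if not key:
        raise SystemExit("TASKS_API_KEY is not available from the configured secret source.")
    return key


# On a laptop the source is the process environment (populated from .env).
laptop_source = from_mapping(os.environ)

# Stand-in for a manager-backed source. A real deployment would have the
# platform or a client library supply this value; the dict only imitates that.
fake_manager_source = from_mapping({"TASKS_API_KEY": "value-fetched-at-run-time"})

print(get_api_key(fake_manager_source))  # prints value-fetched-at-run-time
```

The application function is identical in both cases; only what sits behind the source differs, which is the property the paragraph above describes.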

The AI angle

This module exists because of one specific, relentless AI failure mode: AI loves to hardcode secrets. Ask any coding assistant to "add authentication," "connect to the database," or "call the API," and a large fraction of the time it will write the key, token, or password directly into the source file — often with a cheerful comment like # your API key here. It does this because its training data is full of tutorials and quick examples that do exactly that, and because a literal value is the path of least resistance to working code. The code runs, the demo works, and a leak is now one git commit away.

This is the textbook case of the recurring course theme: AI output that looks right and runs is not the same as output that's safe. A human who knows better still has to catch it, because the model will keep offering it. Concretely:

Make "where did the secret go?" a review reflex. Every time the AI touches auth, config, or a network call, read the git diff (Module 2) and grep the change for anything that looks like a key before you commit. The diff is where you catch it cheaply — before it's in history.
Tell the AI the pattern up front. Put the rule in your committed instructions file (Module 5): "Never hardcode secrets. Read all keys and config from environment variables; add new ones to .env.example." A model given that house rule will usually write the os.environ version on the first try. This is the prevention-by-config payoff Module 5 promised.
Let the AI do the refactor — it's good at it. The same model that hardcodes a key on the way in is genuinely good at pulling it back out when you ask: "move every hardcoded secret and environment-specific value into environment variables, fail loudly if they're missing, and update .env.example." That's exactly the lab.
Secret scanning is the backstop, not the plan (Module 15). A scanner in CI catches the key you missed — but by then it may already be in a commit. Treat a scanner hit as a rotation event, not a code-review comment. The goal of this module is that the scanner stays quiet because the secret never reached the repo.
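The "grep the change" reflex can also be scripted. The sketch below is illustrative only and is not a substitute for the Module 15 scanner: the two patterns are assumptions chosen to match this module's `sk-live-` key shape and obvious `name = "literal"` assignments, and `suspicious_added_lines` is a hypothetical helper name.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
SUSPICIOUS = [
    re.compile(r"sk-live-[A-Za-z0-9-]+"),  # the key shape used in this module's lab
    re.compile(r"(?i)(api[_-]?key|token|password|secret)\s*=\s*[\"'][^\"']{8,}[\"']"),
]


def suspicious_added_lines(diff_text: str) -> list[str]:
    """Return added lines ('+' prefix in a unified diff) that match a pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Skip removed/context lines and the '+++ b/file' header line.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if any(pattern.search(line) for pattern in SUSPICIOUS):
            hits.append(line)
    return hits


sample_diff = '''\
+++ b/sync.py
+API_KEY = "sk-live-9f8a7b6c5d4e3f2a1b0c9d8e7f6a5b4c"
+api_key = os.environ.get("TASKS_API_KEY")
-BACKEND_URL = "https://api.example-tasks.com/v1"
'''
for hit in suspicious_added_lines(sample_diff):
    print(hit)  # only the hardcoded API_KEY line is reported
```

In practice the input would be the text of `git diff --cached`; a clean result from a sketch like this does not prove the diff is safe, so reading the diff yourself still applies.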

Hands-on lab

Lab language: Python + shell, on a new sync feature for the tasks-app from Module 1.

You'll take a file that hardcodes a secret — the exact thing an AI hands you — and refactor it so the secret lives in the environment and the real values never enter Git. Then you'll make it select config per environment.

You'll need:

The tasks-app folder from Modules 1–2 (a Git repo with a .gitignore).
Python 3.10+ and a terminal.
The starter files in this module's lab/starter/: sync.py (the before) and .env.example.
Your AI assistant (browser or editor-integrated — by now, your choice).

Part A — See the smell

Copy lab/starter/sync.py and lab/starter/.env.example into your tasks-app folder, then run the before-picture:
```
cd ~/workflow-course/tasks-app
python sync.py
```
It prints a simulated request — including Authorization: Bearer sk-live-.... Open sync.py and find the two hardcoded lines: API_KEY and BACKEND_URL. This is the AI default. Picture this getting committed and pushed: the key is now in history forever (Module 12) and a secret scanner (Module 15) would light up — if you were lucky enough to have one.

Part B — Gitignore the secret first

Before any real secret exists, close the door. Add these lines to your .gitignore:
```
# secrets and local config — never commit
.env
.env.*
!.env.example
```
Confirm Git will ignore a real .env but still track the template:
```
printf 'APP_ENV=dev\nTASKS_API_KEY=sk-live-test-0000\n' > .env
git status        # .env must NOT appear; .env.example and your .gitignore change SHOULD
```
If .env shows up in git status, stop and fix the ignore rule before going further. This is the step that prevents the leak.

Part C — Refactor the secret into the environment

Now move the secret and the environment-specific URL out of the code. Ask your AI:

"Refactor sync.py so it reads TASKS_API_KEY and APP_ENV from environment variables instead of hardcoding them. Pick the backend URL from APP_ENV (dev/staging/prod). Fail loudly with a clear message if TASKS_API_KEY is missing. Don't add any third-party dependency — load the .env file with a few lines of plain Python, and make sure the loader does not overwrite a variable that's already set in the environment, so a value passed on the command line still wins."

You're looking for a result shaped like this (read the diff before you accept it):
```
import os
from pathlib import Path

def load_dotenv(path: Path) -> None:
    """Minimal .env loader — no dependency. Real projects use a library for this."""
    if not path.exists():
        return
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

load_dotenv(Path(__file__).parent / ".env")

ENVIRONMENTS = {
    "dev":     "https://api.dev.example-tasks.com/v1",
    "staging": "https://api.staging.example-tasks.com/v1",
    "prod":    "https://api.example-tasks.com/v1",
}

app_env = os.environ.get("APP_ENV", "dev")
api_key = os.environ.get("TASKS_API_KEY")
if not api_key:
    raise SystemExit("TASKS_API_KEY is not set. Copy .env.example to .env and fill it in.")
backend_url = ENVIRONMENTS[app_env]
```
Confirm there is no literal key left anywhere in sync.py:
```
grep -n "sk-live" sync.py     # should print nothing
```
Why setdefault and not plain assignment? The loader uses os.environ.setdefault(key, value), which sets a variable only if it isn't already set. That precedence is load-bearing: a value the environment already supplies — like an APP_ENV you pass on the command line — wins over the .env file. A loader that writes os.environ[key] = value instead clobbers anything already there, so the file silently overrides your command line and Part D's override demo does nothing. This matches the real-world dotenv default (override=False): the file fills in gaps, it doesn't stomp on what's already in the environment. If the AI hands you plain assignment, that's the correction to make.
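The precedence difference is easy to see with plain dictionaries standing in for `os.environ` (which supports the same `setdefault` method). This is a small demonstration, not lab code; the two `load_with_*` function names are hypothetical.

```python
def load_with_setdefault(env: dict[str, str], file_values: dict[str, str]) -> None:
    """File fills gaps only: values already present in env are kept."""
    for key, value in file_values.items():
        env.setdefault(key, value)


def load_with_assignment(env: dict[str, str], file_values: dict[str, str]) -> None:
    """File always wins: values already present in env are overwritten."""
    for key, value in file_values.items():
        env[key] = value


# Pretend these came from .env, and that APP_ENV=staging was passed on the command line.
file_values = {"APP_ENV": "dev", "TASKS_API_KEY": "sk-live-test-0000"}

env_a = {"APP_ENV": "staging"}
load_with_setdefault(env_a, file_values)
print(env_a["APP_ENV"])  # prints staging (the command line wins)

env_b = {"APP_ENV": "staging"}
load_with_assignment(env_b, file_values)
print(env_b["APP_ENV"])  # prints dev (the file clobbered the override)
```

In both cases `TASKS_API_KEY` ends up set from the file, because nothing had supplied it beforehand.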

Part D — Run it from the environment

Run it reading from your .env:

```
python sync.py                # loads .env -> dev URL, key from the file
```

Now prove the 12-factor point: same code, different environment, no edit. Override at the command line to act like staging, then prod:
```
# macOS / Linux
APP_ENV=staging python sync.py
APP_ENV=prod    TASKS_API_KEY="sk-live-prod-key" python sync.py
```
```
# Windows PowerShell
$env:APP_ENV="staging"; python sync.py
```
Watch the backend URL change with APP_ENV while the source never does. That's config in the environment. If the URL doesn't change, your loader is clobbering variables that were already set — it's using os.environ[key] = value where it needs os.environ.setdefault(...) (see Part C). Fix the loader so the command line wins, and the override takes effect.

Part E — Commit, and verify the secret didn't tag along

Stage and read the diff before committing — the review reflex from the AI angle:

```
git add -A
git diff --cached            # the refactored sync.py + .gitignore + .env.example
```

Confirm the diff contains the template and the code that reads the environment, and not the real key or your .env. Then:

```
git commit -m "Read secrets and per-env config from the environment, not source"
git status                   # clean; .env is ignored, so it does not appear
```

You've now done the exact refactor that turns the AI's default mistake into the correct pattern — and left behind a .env.example so the next person (or agent) knows what to supply.

Where it breaks

.env is not encryption. A .env file is plaintext on disk. Gitignoring it keeps it out of Git, not out of reach of anything with access to your machine. It's the right tool for local dev and the wrong tool for a shared server — that's where a secret manager earns its place.
Environment variables leak in their own ways. They can show up in process listings, crash dumps, log lines that print the whole environment, and child processes that inherit them. Reading from the environment is far better than hardcoding, but it's not a force field — don't log the environment, and scrub secrets from error reports.
A committed template can still leak by accident. The whole scheme depends on .env.example staying free of real values. It's easy to "just fill it in to test" and commit it. Keep the placeholder discipline, and lean on the Module 15 scanner as the backstop for the day you slip.
The damage may already be done. If a secret was ever committed — even in a commit you later reverted — assume it's compromised and rotate it. Removing it from current files does not remove it from history. Scrubbing history is possible but disruptive (and Module 12 warned you about rewriting shared history); rotation is the reliable fix.
Managed secrets aren't automatically safe. A secret manager with over-broad access policies, or one whose secrets you copy into a .env "just for now," gives back everything it was supposed to protect. The tool only helps if least-privilege access and rotation are actually configured.
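For the "don't log the environment" point above, one defensive habit is to mask values before anything environment-shaped reaches a log or error report. The following is a minimal sketch: `redacted` is a hypothetical helper, and the name markers are an assumption, so a secret stored under an unusual variable name would slip through.

```python
from typing import Mapping

# Assumed naming convention: sensitive variables contain one of these words.
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def redacted(env: Mapping[str, str]) -> dict[str, str]:
    """Copy of env with values masked when the variable name looks sensitive."""
    return {
        name: "***" if any(marker in name.upper() for marker in SENSITIVE_MARKERS) else value
        for name, value in env.items()
    }


print(redacted({"APP_ENV": "dev", "TASKS_API_KEY": "sk-live-test-0000"}))
# prints {'APP_ENV': 'dev', 'TASKS_API_KEY': '***'}
```

Masking by name is a heuristic; it reduces accidental exposure in logs but does not address process listings, crash dumps, or inherited child environments.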

Check for understanding

You're done when:

sync.py runs entirely from the environment, and grep "sk-live" sync.py prints nothing.
A real .env exists, contains your secret, and does not appear in git status — while .env.example is tracked.
APP_ENV=staging python sync.py and the default run hit different backend URLs with zero source edits between them.
You can state, in one sentence, why deleting a committed secret and re-committing does not fix the leak — and what the actual fix is (rotation).
You've added a "never hardcode secrets; read from the environment" rule to your committed instructions file (Module 5), so the AI stops reintroducing the problem.

When the AI hands you a hardcoded key and your first instinct is "that goes in the environment, and the diff has to prove it didn't reach Git," the reflex is installed. Module 18 takes this artifact — built once, configured per environment — and ships it.

Verify-before-publish

This is an expansion-zone module; the durable concepts (env vars, .env, 12-factor, the config/secret/code split) are stable, but anything naming a specific product drifts. Before publishing:

Keep secret-manager references categorical. The text deliberately names categories (cloud-provider managers, standalone/self-hostable vaults, platform-native secrets), not products. If you add specific product names, re-verify each still exists, is current, and isn't pinned as the answer (vendor-neutral rule, AGENTS.md).
Re-check the 12-factor reference. Confirm the 12factor.net link resolves and that "factor III — config" is still phrased as "store config in the environment."
Re-verify .gitignore negation behavior. Confirm !.env.example still un-ignores the template under the .env.* rule with a current Git, and that git status behaves as the lab claims.
Re-verify the Windows PowerShell syntax ($env:VAR="...") and the inline VAR=value command syntax for macOS/Linux against current shells.
Confirm dependency-free .env loading still reads correctly under the current Python version, so the lab runs with no pip install.
Confirm cross-references to Modules 2, 5, 8, 12, 14, 15, 16, and 18 still match those modules' final numbering and titles.
