fix(modules-1,15,17): onboarding step, make M15 gate actually catch the plant, M17 .env override

- M1: add a no-git "Get the course materials" step (download+unzip; clone noted
  as Module 8) so Part A's paths resolve without assuming git. URL flagged
  Verify-before-publish (swap to public host before publishing).
- M15: security gate was failing OPEN on python3-only systems (bare `python`)
  and missing the UNTRACKED config.py, so the planted secret passed green. Now
  guards python3, fails CLOSED on any non-clean exit, and stages files so the
  planted SYNC_API_KEY + typosquat dep are actually caught.
- M15: correct the false "Bandit flags the API key" claim (B105-107 need
  password-named ids); add an honest MD5 (B324) flaw so the SAST demo fires.
  Planted secret/deps preserved.
- M17: require the .env loader to use setdefault so Part D's override demo works;
  explain precedence. Hardcoded "before" anti-pattern left intact.

Closes #6
Closes #17
Closes #18
Closes #19
Closes #29

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TfzV5QvtPDz8LJS3Pu5VLT
2026-06-22 15:48:27 -04:00
parent 06b9f8f308
commit 3bab54d135
5 changed files with 112 additions and 20 deletions
@@ -137,6 +137,26 @@ purpose** so you recognize it later.
- Python 3.10 or newer (`python --version` or `python3 --version` to check).
- Your usual AI chat assistant, open in a browser tab.
### Get the course materials
Everything you'll run in this course lives in one repo. Grab it once, up front — no tools required
beyond a web browser:
1. Open the course's home page — **`https://git.jpaul.io/justin/the-workflow-course`** — and use its
**Download ZIP** (archive) link.
2. Unzip it under your home directory so the course's `modules/` folder lands at
`~/workflow-course/modules/`. (Rename the unzipped folder to `workflow-course` if your download
named it something else.)
You now have every module's files locally, including this one's under
`modules/01-the-copy-paste-problem/`.
> *A cleaner, **updatable** way to get the repo — `git clone` — arrives in **Module 8**, once you've
> learned Git (Module 2). A one-time ZIP is all you need today; don't reach for `clone` yet.*
> *Verify-before-publish: confirm this download URL points at the published course host before
> shipping.*
### Part A — Stand up the project
1. Make a working directory and copy in the starter app from this module's `lab/starter/` folder:
@@ -287,9 +287,14 @@ that key had been real and ever pushed, removing it now is not enough; you'd hav
because it's in history. (Proper secret management is Module 17; this is just the catch.)
> **Stretch — Gate 3 (SAST):** install a static analyzer for your language (for Python,
> `pip install bandit`, then `bandit -r .`) and see it flag insecure patterns — including, often, the
> very hardcoded secret from Part C, from a different angle. Note how much noisier it is than the
> first two gates. That noise is why it's the one you tune.
> `pip install bandit`, then `bandit -r .`) and watch it flag insecure *code you wrote* — here, the
> MD5-based request signing in `config.py` (weak crypto, CWE-327). Now note what it does **not**
> flag: the hardcoded `SYNC_API_KEY`. Bandit's hardcoded-credential checks (B105-B107) key on
> *password-named* identifiers — `password`, `secret`, `token` — so a key named `SYNC_API_KEY` slips
> right past them. Catching that string is a secret scanner's job (Gate 2), not SAST's. Same file,
> two distinct flaws, caught by two different gates with two different blind spots — which is exactly
> why you run all three rather than trusting one. And note how much noisier SAST is than the first
> two gates: that noise is why it's the one you tune.
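
The name-keyed blind spot described above can be pictured with a toy check. This is a simplified illustration under stated assumptions, not Bandit's implementation: the `SUSPICIOUS_NAME` pattern and the `flag_by_name` helper are hypothetical, and the sample values are fake.

```python
import ast
import re

# Hypothetical pattern for illustration only; not copied from any real tool.
SUSPICIOUS_NAME = re.compile(r"pass|pwd|secret|token", re.IGNORECASE)


def flag_by_name(source: str) -> list[str]:
    """Return variables assigned a string literal whose NAME looks credential-like."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Only plain string literals on the right-hand side are considered.
        if not (isinstance(node.value, ast.Constant) and isinstance(node.value.value, str)):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and SUSPICIOUS_NAME.search(target.id):
                flagged.append(target.id)
    return flagged


sample = '''
SYNC_API_KEY = "sk_live_not_a_real_key"
DB_PASSWORD = "hunter2"
auth_token = "abc123"
'''

print(flag_by_name(sample))  # ['DB_PASSWORD', 'auth_token']
```

The check looks only at the identifier, never at the value, so `SYNC_API_KEY` goes unflagged no matter how key-like its string is. A secret scanner works the other way around and inspects the value itself.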
### Part D — Wire the gates into CI
@@ -298,13 +303,32 @@ runs on every push and blocks the merge.
1. Copy `lab/security-scan.sh` into your project. It runs the SCA and secret-scan gates and **exits
non-zero on any finding** — which is what makes CI go red. Make it executable
(`chmod +x security-scan.sh`) and run it locally first:
(`chmod +x security-scan.sh`).
Before you run it, **stage the starter files** so the secret gate can see them:
```bash
git add config.py requirements.txt
```
This is not a footnote. `detect-secrets scan` with no path argument scans the files Git
*tracks* — an *untracked* `config.py` is invisible to it, so the gate would report "no secrets"
on a file that's full of them (a silent false pass, the worst kind). Staging puts the file in
front of the scanner. It's the same reason the explicit `detect-secrets scan config.py` in
Part C worked, and the same reason "secrets live in history": the moment Git knows about a file,
so does the gate.
To watch the gate catch both planted problems at once, restore the original booby-trapped files
first (you fixed them in Parts B and C) — re-copy `config.py` and `requirements.txt` from this
module's starter, re-stage, then run:
```bash
./security-scan.sh
```
With the bad starter files in place it should fail. With your Part B/C fixes applied, it should
It should **fail on both gates** — the SCA gate on the unresolvable/vulnerable dependencies and
the secret gate on the hardcoded key — and you should be able to point at which finding caused
each non-zero exit. Re-apply your Part B/C fixes (and re-stage), run it once more, and it should
pass.
2. Add a security step to your pipeline that calls it. `lab/ci-security.yml` is a provider-neutral
@@ -1,15 +1,18 @@
"""Cloud-sync config for tasks-app — a realistic snapshot of what an AI hands you.
Asked to "sync tasks to a cloud service," a model will cheerfully produce something like this: it
works, it reads naturally, it passes lint and tests... and it has a live credential baked straight
into the source. That is the *exact* failure mode Module 15's secret-scanning gate exists to catch.
works, it reads naturally, it passes lint and tests... and it carries two planted flaws — a live
credential baked straight into the source (caught by Gate 2, secret scanning) and a weak-crypto
"signature" using MD5 (caught by Gate 3, SAST). Two different gates, two different blind spots.
DO NOT copy this pattern. The point of this file is to be caught by a scanner, not imitated.
DO NOT copy these patterns. The point of this file is to be caught by a scanner, not imitated.
The fix (read from the environment) is shown at the bottom, commented out, so you can see the
difference once Part C of the lab is done.
"""
# --- The problem the scanner should flag -------------------------------------------------------
import hashlib
# --- The problem the SECRET scanner should flag (Gate 2) ---------------------------------------
# A hardcoded API key. Looks like a normal string literal; lint and tests will never complain.
SYNC_API_KEY = "sk_live_9c3f2a7b41d84e0fa6b2c5d8e1f09a73bdac46"
SYNC_ENDPOINT = "https://api.example-task-cloud.com/v1/sync"
@@ -19,6 +22,14 @@ def sync_headers() -> dict:
    return {"Authorization": f"Bearer {SYNC_API_KEY}"}
# --- The problem the SAST scanner should flag (Gate 3) -----------------------------------------
# AI-classic: "sign" the request body with a quick hash. MD5 is broken for anything
# security-relevant — a textbook weak-crypto idiom. A secret scanner won't catch this (it's not a
# secret); a SAST tool like bandit will (it's insecure code you wrote). DO NOT imitate.
def sign_payload(body: str) -> str:
    return hashlib.md5(body.encode()).hexdigest()
# --- The fix (Part C) --------------------------------------------------------------------------
# Read the secret from the environment instead of committing it. Proper secret management — env
# files, secret stores, per-environment config — is Module 17. This is just enough to make the
@@ -14,6 +14,13 @@
set -u # treat unset vars as errors; we manage exit codes explicitly below.
# A security gate must fail CLOSED. If the interpreter the secret gate needs isn't here, abort with a
# non-zero exit rather than sailing past the check and reporting a false "passed".
command -v python3 >/dev/null 2>&1 || {
  echo ">> python3 is required for the secret gate but was not found. Aborting." >&2
  exit 2
}
status=0
echo "=== Gate 1: SCA / dependency scan (pip-audit) ==="
@@ -28,16 +35,33 @@ fi
echo
echo "=== Gate 2: secret scan (detect-secrets) ==="
# detect-secrets prints a JSON report of any secrets it finds. We treat a non-empty results set as a
# failure. `python -c` keeps this portable (no jq dependency).
# detect-secrets prints a JSON report of any secrets it finds. NOTE: with no path it scans the files
# git TRACKS, so stage the starter files (`git add`) before running this, or an untracked file is
# invisible to the gate. We parse the JSON with `python3` (no jq dependency) and fail CLOSED: the
# parser returns 0=secrets found, 1=clean, anything else=couldn't tell — and "couldn't tell" must
# count as a failure, never a silent pass.
report="$(detect-secrets scan)"
if printf '%s' "$report" | python -c 'import sys, json; sys.exit(0 if json.load(sys.stdin).get("results") else 1)'; then
echo "$report"
echo ">> SECRET gate FAILED: a credential was detected in the tree. See report above." >&2
status=1
else
echo "no secrets detected."
fi
printf '%s' "$report" | python3 -c 'import sys, json
try:
    found = bool(json.load(sys.stdin).get("results"))
except Exception:
    sys.exit(2)
sys.exit(0 if found else 1)'
secret_rc=$?
case "$secret_rc" in
  0)
    echo "$report"
    echo ">> SECRET gate FAILED: a credential was detected in the tree. See report above." >&2
    status=1
    ;;
  1)
    echo "no secrets detected."
    ;;
  *)
    echo ">> SECRET gate ERROR: could not parse the scan output (exit $secret_rc). Failing closed." >&2
    status=1
    ;;
esac
echo
if [ "$status" -ne 0 ]; then
@@ -332,7 +332,9 @@ config per environment.
> *"Refactor `sync.py` so it reads `TASKS_API_KEY` and `APP_ENV` from environment variables
> instead of hardcoding them. Pick the backend URL from `APP_ENV` (dev/staging/prod). Fail loudly
> with a clear message if `TASKS_API_KEY` is missing. Don't add any third-party dependency — load
> the `.env` file with a few lines of plain Python."*
> the `.env` file with a few lines of plain Python, and make sure the loader does **not**
> overwrite a variable that's already set in the environment, so a value passed on the command
> line still wins."*
You're looking for a result shaped like this (read the diff before you accept it):
@@ -372,6 +374,15 @@ config per environment.
grep -n "sk-live" sync.py # should print nothing
```
**Why `setdefault` and not plain assignment?** The loader uses `os.environ.setdefault(key, value)`,
which sets a variable *only if it isn't already set*. That precedence is load-bearing: a value the
environment already supplies — like an `APP_ENV` you pass on the command line — wins over the
`.env` file. A loader that writes `os.environ[key] = value` instead **clobbers** anything already
there, so the file silently overrides your command line and Part D's override demo does nothing.
This matches the default of common dotenv libraries (python-dotenv's `load_dotenv` uses
`override=False`): the file fills in gaps, it doesn't stomp on what's already in the environment.
If the AI hands you plain assignment, that's the correction to make.
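
A minimal sketch of a loader with that precedence, assuming a simple `KEY=VALUE` file format. The `load_env_file` name and the parsing rules are hypothetical, not the course's actual `sync.py` code, and the demo sets `APP_ENV` in-process to stand in for a value passed on the command line.

```python
import os
import tempfile


def load_env_file(path: str) -> None:
    """Load KEY=VALUE lines from path without overwriting variables already set."""
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault writes only when the key is absent, so the environment wins.
            os.environ.setdefault(key.strip(), value.strip())


with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as fh:
        fh.write("APP_ENV=dev\nTASKS_API_KEY=placeholder-from-file\n")

    os.environ["APP_ENV"] = "staging"      # stands in for `APP_ENV=staging python sync.py`
    os.environ.pop("TASKS_API_KEY", None)  # not set anywhere else

    load_env_file(env_path)
    print(os.environ["APP_ENV"])        # staging (the pre-set value wins)
    print(os.environ["TASKS_API_KEY"])  # placeholder-from-file (the file fills the gap)
```

Swapping the `setdefault` line for `os.environ[key.strip()] = value.strip()` makes the first print show `dev`, which is the clobbering behavior described above.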
### Part D — Run it from the environment
5. Run it reading from your `.env`:
@@ -395,7 +406,9 @@ config per environment.
```
Watch the backend URL change with `APP_ENV` while the source never does. That's config in the
environment.
environment. **If the URL *doesn't* change, your loader is clobbering variables that were already
set** — it's using `os.environ[key] = value` where it needs `os.environ.setdefault(...)` (see
Part C). Fix the loader so the command line wins, and the override takes effect.
### Part E — Commit, and verify the secret didn't tag along