Onboarding + make M15 gate catch the plant + M17 override (#6,#17,#18,#19,#29) #58
@@ -137,6 +137,26 @@ purpose** so you recognize it later.
 - Python 3.10 or newer (`python --version` or `python3 --version` to check).
 - Your usual AI chat assistant, open in a browser tab.
 
+### Get the course materials
+
+Everything you'll run in this course lives in one repo. Grab it once, up front — no tools required
+beyond a web browser:
+
+1. Open the course's home page — **`https://git.jpaul.io/justin/the-workflow-course`** — and use its
+**Download ZIP** (archive) link.
+2. Unzip it under your home directory so the course's `modules/` folder lands at
+`~/workflow-course/modules/`. (Rename the unzipped folder to `workflow-course` if your download
+named it something else.)
+
+You now have every module's files locally, including this one's under
+`modules/01-the-copy-paste-problem/`.
+
+> *A cleaner, **updatable** way to get the repo — `git clone` — arrives in **Module 8**, once you've
+> learned Git (Module 2). A one-time ZIP is all you need today; don't reach for `clone` yet.*
+
+> *Verify-before-publish: confirm this download URL points at the published course host before
+> shipping.*
+
 ### Part A — Stand up the project
 
 1. Make a working directory and copy in the starter app from this module's `lab/starter/` folder:
@@ -287,9 +287,14 @@ that key had been real and ever pushed, removing it now is not enough; you'd hav
 because it's in history. (Proper secret management is Module 17; this is just the catch.)
 
 > **Stretch — Gate 3 (SAST):** install a static analyzer for your language (for Python,
-> `pip install bandit`, then `bandit -r .`) and see it flag insecure patterns — including, often, the
-> very hardcoded secret from Part C, from a different angle. Note how much noisier it is than the
-> first two gates. That noise is why it's the one you tune.
+> `pip install bandit`, then `bandit -r .`) and watch it flag insecure *code you wrote* — here, the
+> MD5-based request signing in `config.py` (weak crypto, CWE-327). Now note what it does **not**
+> flag: the hardcoded `SYNC_API_KEY`. Bandit's hardcoded-credential checks (B105–107) key on
+> *password-named* identifiers — `password`, `secret`, `token` — so a key named `SYNC_API_KEY` slips
+> right past them. Catching that string is a secret scanner's job (Gate 2), not SAST's. Same file,
+> two distinct flaws, caught by two different gates with two different blind spots — which is exactly
+> why you run all three rather than trusting one. And note how much noisier SAST is than the first
+> two gates: that noise is why it's the one you tune.
 
 ### Part D — Wire the gates into CI
 
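Reviewer note: the Stretch text contrasts checks that key on an identifier's *name* with checks that key on a string's *value*. The toy sketch below illustrates that difference only. It is not Bandit's or detect-secrets' actual logic; both regexes and both function names are hypothetical, chosen to mirror the words the note lists (`password`, `secret`, `token`) and the shape of the planted key.

```python
import re

# Hypothetical name heuristic: flag identifiers that *sound* like credentials.
# Mirrors the words named in the lab text; not Bandit's real rule set.
NAME_HINT = re.compile(r"password|secret|token", re.IGNORECASE)

# Hypothetical value heuristic: flag strings that *look* like a live key.
# The `sk_live_` + hex shape is taken from the planted key in config.py.
VALUE_HINT = re.compile(r"sk_live_[0-9a-f]{16,}")


def flagged_by_name(identifier: str) -> bool:
    """True when the variable name alone looks credential-like."""
    return NAME_HINT.search(identifier) is not None


def flagged_by_value(literal: str) -> bool:
    """True when the string contents alone look like a key."""
    return VALUE_HINT.search(literal) is not None


assignments = {
    "SYNC_API_KEY": "sk_live_9c3f2a7b41d84e0fa6b2c5d8e1f09a73bdac46",
    "DB_PASSWORD": "hunter2",  # made-up example value
}

for name, value in assignments.items():
    print(name, "name-check:", flagged_by_name(name), "value-check:", flagged_by_value(value))
```

In this sketch each heuristic misses what the other catches, which is the blind-spot argument the note makes for running every gate.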
@@ -298,13 +303,32 @@ runs on every push and blocks the merge.
 
 1. Copy `lab/security-scan.sh` into your project. It runs the SCA and secret-scan gates and **exits
 non-zero on any finding** — which is what makes CI go red. Make it executable
-(`chmod +x security-scan.sh`) and run it locally first:
+(`chmod +x security-scan.sh`).
 
+Before you run it, **stage the starter files** so the secret gate can see them:
+
+```bash
+git add config.py requirements.txt
+```
+
+This is not a footnote. `detect-secrets scan` with no path argument scans the files Git
+*tracks* — an *untracked* `config.py` is invisible to it, so the gate would report "no secrets"
+on a file that's full of them (a silent false pass, the worst kind). Staging puts the file in
+front of the scanner. It's the same reason the explicit `detect-secrets scan config.py` in
+Part C worked, and the same reason "secrets live in history": the moment Git knows about a file,
+so does the gate.
+
+To watch the gate catch both planted problems at once, restore the original booby-trapped files
+first (you fixed them in Parts B and C) — re-copy `config.py` and `requirements.txt` from this
+module's starter, re-stage, then run:
+
 ```bash
 ./security-scan.sh
 ```
 
-With the bad starter files in place it should fail. With your Part B/C fixes applied, it should
+It should **fail on both gates** — the SCA gate on the unresolvable/vulnerable dependencies and
+the secret gate on the hardcoded key — and you should be able to point at which finding caused
+each non-zero exit. Re-apply your Part B/C fixes (and re-stage), run it once more, and it should
 pass.
 
 2. Add a security step to your pipeline that calls it. `lab/ci-security.yml` is a provider-neutral
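Reviewer note: the "stage first" step guards against a silent false pass, and a small pre-check can make that failure loud. The sketch below is one possible shape, not part of the lab: `tracked_files` and `invisible_to_scan` are hypothetical helper names, and the only external call is `git ls-files`, which lists the paths in Git's index (staged files included).

```python
import subprocess
from pathlib import Path


def tracked_files(repo: Path) -> set[str]:
    """Paths Git tracks in `repo` (staged files count), via `git ls-files`."""
    out = subprocess.run(
        ["git", "ls-files"], cwd=repo, check=True, capture_output=True, text=True
    ).stdout
    return set(out.splitlines())


def invisible_to_scan(candidates: list[str], tracked: set[str]) -> list[str]:
    """Files we expect the gate to inspect that Git does not track yet."""
    return [path for path in candidates if path not in tracked]


# Pure-logic demonstration with a made-up tracked set (no Git needed here):
print(invisible_to_scan(["config.py", "requirements.txt"], {"requirements.txt"}))
```

Inside a real repository the same check would read `invisible_to_scan(["config.py", "requirements.txt"], tracked_files(Path(".")))`, and a non-empty result would mean those files still need `git add` before the scan can see them.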
@@ -1,15 +1,18 @@
 """Cloud-sync config for tasks-app — a realistic snapshot of what an AI hands you.
 
 Asked to "sync tasks to a cloud service," a model will cheerfully produce something like this: it
-works, it reads naturally, it passes lint and tests... and it has a live credential baked straight
-into the source. That is the *exact* failure mode Module 15's secret-scanning gate exists to catch.
+works, it reads naturally, it passes lint and tests... and it carries two planted flaws — a live
+credential baked straight into the source (caught by Gate 2, secret scanning) and a weak-crypto
+"signature" using MD5 (caught by Gate 3, SAST). Two different gates, two different blind spots.
 
-DO NOT copy this pattern. The point of this file is to be caught by a scanner, not imitated.
+DO NOT copy these patterns. The point of this file is to be caught by a scanner, not imitated.
 The fix (read from the environment) is shown at the bottom, commented out, so you can see the
 difference once Part C of the lab is done.
 """
 
-# --- The problem the scanner should flag -------------------------------------------------------
+import hashlib
+
+# --- The problem the SECRET scanner should flag (Gate 2) ---------------------------------------
 # A hardcoded API key. Looks like a normal string literal; lint and tests will never complain.
 SYNC_API_KEY = "sk_live_9c3f2a7b41d84e0fa6b2c5d8e1f09a73bdac46"
 SYNC_ENDPOINT = "https://api.example-task-cloud.com/v1/sync"
@@ -19,6 +22,14 @@ def sync_headers() -> dict:
     return {"Authorization": f"Bearer {SYNC_API_KEY}"}
 
 
+# --- The problem the SAST scanner should flag (Gate 3) -----------------------------------------
+# AI-classic: "sign" the request body with a quick hash. MD5 is broken for anything
+# security-relevant — a textbook weak-crypto idiom. A secret scanner won't catch this (it's not a
+# secret); a SAST tool like bandit will (it's insecure code you wrote). DO NOT imitate.
+def sign_payload(body: str) -> str:
+    return hashlib.md5(body.encode()).hexdigest()
+
+
 # --- The fix (Part C) --------------------------------------------------------------------------
 # Read the secret from the environment instead of committing it. Proper secret management — env
 # files, secret stores, per-environment config — is Module 17. This is just enough to make the
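Reviewer note: the diff plants the MD5 "signature" but does not show what a learner might replace it with. One common option, offered here as an assumption rather than the course's intended fix, is a keyed HMAC-SHA256 from the standard library. The `key` parameter, `verify_payload`, and `demo_key` are hypothetical additions for illustration.

```python
import hashlib
import hmac


def sign_payload(body: str, key: bytes) -> str:
    """Keyed HMAC-SHA256 signature of the body, hex-encoded."""
    return hmac.new(key, body.encode(), hashlib.sha256).hexdigest()


def verify_payload(body: str, key: bytes, signature: str) -> bool:
    """Compare in constant time so a mismatch does not leak timing information."""
    return hmac.compare_digest(sign_payload(body, key), signature)


# Hypothetical demo key; a real key would be read from the environment, not the source.
demo_key = b"demo-key-not-a-real-secret"
sig = sign_payload('{"task": "buy milk"}', demo_key)

print(len(sig))                                              # hex digest length
print(verify_payload('{"task": "buy milk"}', demo_key, sig))   # untouched body
print(verify_payload('{"task": "buy bread"}', demo_key, sig))  # tampered body
```

Unlike a bare hash, a keyed MAC cannot be recomputed by someone who alters the body without knowing the key, which is the property a request signature needs.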
@@ -14,6 +14,13 @@
 
 set -u  # treat unset vars as errors; we manage exit codes explicitly below.
 
+# A security gate must fail CLOSED. If the interpreter the secret gate needs isn't here, abort with a
+# non-zero exit rather than sailing past the check and reporting a false "passed".
+command -v python3 >/dev/null 2>&1 || {
+  echo ">> python3 is required for the secret gate but was not found. Aborting." >&2
+  exit 2
+}
+
 status=0
 
 echo "=== Gate 1: SCA / dependency scan (pip-audit) ==="
@@ -28,16 +35,33 @@ fi
 
 echo
 echo "=== Gate 2: secret scan (detect-secrets) ==="
-# detect-secrets prints a JSON report of any secrets it finds. We treat a non-empty results set as a
-# failure. `python -c` keeps this portable (no jq dependency).
+# detect-secrets prints a JSON report of any secrets it finds. NOTE: with no path it scans the files
+# git TRACKS, so stage the starter files (`git add`) before running this, or an untracked file is
+# invisible to the gate. We parse the JSON with `python3` (no jq dependency) and fail CLOSED: the
+# parser returns 0=secrets found, 1=clean, anything else=couldn't tell — and "couldn't tell" must
+# count as a failure, never a silent pass.
 report="$(detect-secrets scan)"
-if printf '%s' "$report" | python -c 'import sys, json; sys.exit(0 if json.load(sys.stdin).get("results") else 1)'; then
-  echo "$report"
-  echo ">> SECRET gate FAILED: a credential was detected in the tree. See report above." >&2
-  status=1
-else
-  echo "no secrets detected."
-fi
+printf '%s' "$report" | python3 -c 'import sys, json
+try:
+    found = bool(json.load(sys.stdin).get("results"))
+except Exception:
+    sys.exit(2)
+sys.exit(0 if found else 1)'
+secret_rc=$?
+case "$secret_rc" in
+  0)
+    echo "$report"
+    echo ">> SECRET gate FAILED: a credential was detected in the tree. See report above." >&2
+    status=1
+    ;;
+  1)
+    echo "no secrets detected."
+    ;;
+  *)
+    echo ">> SECRET gate ERROR: could not parse the scan output (exit $secret_rc). Failing closed." >&2
+    status=1
+    ;;
+esac
 
 echo
 if [ "$status" -ne 0 ]; then
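Reviewer note: the three-way exit code (0 = secrets found, 1 = clean, anything else = could not tell) is easier to reason about when pulled out of the shell quoting. The sketch below restates the same logic as plain Python. The function names `classify_report` and `gate_failed` are hypothetical, and the sample reports are simplified stand-ins, not real detect-secrets output.

```python
import json


def classify_report(raw: str) -> int:
    """Return 0 if secrets were found, 1 if clean, 2 if the report is unreadable."""
    try:
        found = bool(json.loads(raw).get("results"))
    except Exception:
        # Empty output, a traceback, or JSON of the wrong shape all land here.
        return 2
    return 0 if found else 1


def gate_failed(code: int) -> bool:
    """Fail closed: only an explicit 'clean' (1) lets the gate pass."""
    return code != 1


samples = {
    "secrets": '{"results": {"config.py": [{"type": "Secret Keyword"}]}}',
    "clean": '{"results": {}}',
    "garbage": "Traceback (most recent call last): ...",
    "empty": "",
}

for label, raw in samples.items():
    code = classify_report(raw)
    print(label, code, "FAIL" if gate_failed(code) else "pass")
```

The old one-liner treated every non-zero parser exit as "clean", so a crashed or empty scan passed the gate. Separating "clean" from "could not tell" is what closes that hole.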
@@ -332,7 +332,9 @@ config per environment.
 > *"Refactor `sync.py` so it reads `TASKS_API_KEY` and `APP_ENV` from environment variables
 > instead of hardcoding them. Pick the backend URL from `APP_ENV` (dev/staging/prod). Fail loudly
 > with a clear message if `TASKS_API_KEY` is missing. Don't add any third-party dependency — load
-> the `.env` file with a few lines of plain Python."*
+> the `.env` file with a few lines of plain Python, and make sure the loader does **not**
+> overwrite a variable that's already set in the environment, so a value passed on the command
+> line still wins."*
 
 You're looking for a result shaped like this (read the diff before you accept it):
 
@@ -372,6 +374,15 @@ config per environment.
 grep -n "sk-live" sync.py   # should print nothing
 ```
 
+**Why `setdefault` and not plain assignment?** The loader uses `os.environ.setdefault(key, value)`,
+which sets a variable *only if it isn't already set*. That precedence is load-bearing: a value the
+environment already supplies — like an `APP_ENV` you pass on the command line — wins over the
+`.env` file. A loader that writes `os.environ[key] = value` instead **clobbers** anything already
+there, so the file silently overrides your command line and Part D's override demo does nothing.
+This matches the real-world dotenv default (`override=False`): the file fills in gaps, it doesn't
+stomp on what's already in the environment. If the AI hands you plain assignment, that's the
+correction to make.
+
 ### Part D — Run it from the environment
 
 5. Run it reading from your `.env`:
@@ -395,7 +406,9 @@ config per environment.
 ```
 
 Watch the backend URL change with `APP_ENV` while the source never does. That's config in the
-environment.
+environment. **If the URL *doesn't* change, your loader is clobbering variables that were already
+set** — it's using `os.environ[key] = value` where it needs `os.environ.setdefault(...)` (see
+Part C). Fix the loader so the command line wins, and the override takes effect.
 
 ### Part E — Commit, and verify the secret didn't tag along
 
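Reviewer note: the `setdefault` precedence described in this file's diff can be checked in a few lines. The sketch below is a minimal plain-Python loader under stated assumptions: `load_env_file` is a hypothetical name, the parser handles only bare `KEY=VALUE` lines (no quoting or `export` prefixes), and the temporary `.env` contents are made up for the demonstration.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: Path) -> None:
    """Load KEY=VALUE lines into os.environ without overriding existing values."""
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # setdefault writes only when the key is absent, so the shell wins.
        os.environ.setdefault(key.strip(), value.strip())


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("APP_ENV=dev\nTASKS_API_KEY=demo-value\n")

    os.environ["APP_ENV"] = "staging"      # simulates `APP_ENV=staging python sync.py`
    os.environ.pop("TASKS_API_KEY", None)  # simulates a variable the shell did not set

    load_env_file(env_file)
    print(os.environ["APP_ENV"])        # value from the "command line" survives
    print(os.environ["TASKS_API_KEY"])  # gap filled from the file
```

Swapping the `setdefault` call for `os.environ[key] = value` would make the first print show `dev`, which is the clobbering behaviour the Part D troubleshooting note describes.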