claude 863435915c De-slop the syllabus and the blog (em-dashes + banned words) (#96 )

Co-authored-by: claude <claude@jpaul.io>
Co-committed-by: claude <claude@jpaul.io>

2026-06-23 07:28:55 -04:00

Version Control Isn't Just for Code: Start With Your Words

I want to start with a file I'm genuinely embarrassed about. Somewhere on an old shared drive, there is a document called runbook-final-v2-ACTUAL-use-this.docx. There's a runbook-final.docx next to it. And a runbook-final-FIXED.docx. And (this is the one that hurts) a runbook-final-v2-ACTUAL-use-this-JP-edits.docx.

That little graveyard of filenames is what "version control" looked like for me for years. Not for code; I'd long since made peace with Git for code. For words. The runbooks, the design docs, the "why did we decide this" notes. All of it lived in Word, on a drive, and every time two of us touched the same file we'd email it back and forth and pray.

Here's the thing I wish someone had told me sooner: writing is the safest possible place to learn Git, and learning it there fixes that graveyard for good. That's what this post is about, and it's the first lesson in The Workflow that you can genuinely use on Monday with zero new tools.

A quick callback for anyone just landing here: in the last post we installed the safety net, Git as undo for the AI, a checkpoint you can always get back to. This post takes that same net and points it at something where a mistake costs you absolutely nothing: a markdown document.

Why words are the perfect practice ground

Think about it from a risk angle. When you're learning a new tool, you want a sandbox where a wrong move is free. Practicing Git on your live application means a fat-fingered command can nuke working code. Practicing it on an ADR (a short document explaining one decision) means the worst case is you mangle a paragraph nobody's read yet.

But low stakes would be a weak pitch on its own. The real reason this works is that documents have every problem Git was built to solve, and most teams feel those problems worse on their docs than on their code:

More than one document. A runbook references a design doc that references a spec. Change the decision and three documents are quietly out of sync, and there's no record of which one changed, or when.
More than one day. "Why did we store state as JSON instead of SQLite?" The answer lived in a meeting, or a Slack thread, or someone's head. Six months later it's just gone.
No undo. Someone edits the runbook during an incident, gets a step wrong, and there's no clean way back to the version that was correct an hour ago.

That last one is runbook-final-v2-ACTUAL-use-this.docx. That filename is what "no undo" looks like when it's been left to metastasize. Git fixes all three the same way it fixes them for code, if the document is in a format Git can actually work with. That "if" is the entire argument.
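Here is a minimal sketch of what "undo" looks like once the runbook is a tracked markdown file. Everything in it is a stand-in: the throwaway repository, the author identity, and the wording of the bad incident edit are all assumptions made up for the illustration.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"          # throwaway repo, nothing real is touched
git init -q
git config user.name "Example Author"    # placeholder identity for the sketch
git config user.email "author@example.com"

# The version that was correct an hour ago.
printf 'Restart the worker with systemctl restart tasks-worker.\n' > runbook.md
git add runbook.md
git commit -q -m "Runbook: known-good restart step"

# The hurried edit made during the incident (hypothetical wording).
printf 'Restart the worker by rebooting the host.\n' > runbook.md
git commit -q -am "Runbook: incident edit"

git log --oneline -- runbook.md          # both versions are on record

# Bring the file back as it was one commit ago, then record that as a new commit.
git restore --source=HEAD~1 -- runbook.md
git commit -q -am "Runbook: restore known-good restart step"

restored=$(cat runbook.md)
echo "$restored"
```

The bad edit stays in the history, so there is still a record of what happened during the incident. Nothing is erased; the good version is simply put back on top.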

The argument, in one diff

Git's superpower is the line-based diff. It compares two snapshots and tells you exactly which lines changed. Everything good about Git (readable history, reviewable changes, automatic merges) is built on that one trick. So a format versions well in exact proportion to how much it looks like lines of text.

Markdown is just text. Change one sentence in a markdown runbook and git diff shows you precisely that:

-Restart the worker with `systemctl restart tasks-worker`.
+Restart the worker with `systemctl restart tasks-worker`, then tail the log for 30s to confirm.

That is a perfect change record. A reviewer reads it in two seconds. Two people can edit different sections and Git merges them automatically, because their changes touch different lines.
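If you want to watch both claims happen, the following is a rough sketch in a throwaway repository. The branch names, the author identity, the write_runbook helper, and the escalation wording are hypothetical; only the restart sentence comes from the diff above.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # call the first branch main
git config user.name "Example Author"     # placeholder identity
git config user.email "author@example.com"

write_runbook() {   # hypothetical helper: $1 = restart step, $2 = escalation step
  printf '# Restart\n\n%s\n\n# Escalation\n\n%s\n' "$1" "$2" > runbook.md
}

old_restart='Restart the worker with `systemctl restart tasks-worker`.'
new_restart='Restart the worker with `systemctl restart tasks-worker`, then tail the log for 30s to confirm.'
old_page='Page the on-call engineer.'
new_page='Page the on-call engineer after two failed restarts.'

write_runbook "$old_restart" "$old_page"
git add runbook.md
git commit -q -m "Add runbook"

# First author edits only the Restart section.
git switch -q -c restart-step
write_runbook "$new_restart" "$old_page"
git commit -q -am "Confirm the restart in the log"

# Second author, starting from main, edits only the Escalation section.
git switch -q main
git switch -q -c escalation
write_runbook "$old_restart" "$new_page"
git commit -q -am "Clarify when to page"

# The change record for the first edit: one line out, one line in.
git diff main restart-step -- runbook.md

# Both branches merge cleanly because they touch different lines.
git switch -q main
git merge -q --no-edit restart-step
git merge -q --no-edit escalation
cat runbook.md
```

The final runbook.md carries both edits, and nobody had to email a file to anybody.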

Now do the same edit in a .docx. A Word document isn't text; it's a zipped bundle of XML, styles, and metadata. Git will happily track it, but it can't diff it meaningfully. Ask for the diff and you get this:

Binary files a/runbook.docx and b/runbook.docx differ

That's it. That's the whole change record: something changed. You can't see what. You can't review it. And if two people edited it, Git makes you pick one entire file and throw the other one away. The history technically exists and is completely useless. (PowerPoint is even worse, because a slide deck is more structure and less text.)
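You can reproduce that message without owning a copy of Word. The sketch below fakes a .docx with a few bytes that include NUL characters, which is enough for Git to classify the file as binary. The file contents and the author identity are placeholders, not a real Word document.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"     # placeholder identity
git config user.email "author@example.com"

# Stand-in for a Word file: zip-style header bytes plus NULs, then some text.
printf 'PK\003\004\000\000first draft' > runbook.docx
git add runbook.docx
git commit -q -m "Add runbook.docx"

# "Edit" the document.
printf 'PK\003\004\000\000second draft' > runbook.docx

diff_out=$(git diff -- runbook.docx)
echo "$diff_out"
```

The last line of the output is the same "Binary files ... differ" message, with no hint that "first" became "second".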

[insert a screenshot referencing a side-by-side of a clean markdown git diff versus the "Binary files differ" message for a .docx here]

So here's the line I'll actually defend to a skeptical colleague, and it's an engineering argument, not a style preference:

Runbooks, ADRs, specs, and changelogs belong in markdown in the repo, not in Word on a shared drive. The moment a document needs history, review, or more than one author, a binary format is actively costing you the thing version control exists to provide.

The aha: your wiki was a Git repo the whole time

This is the part that rewired how I see documentation. Most Git hosts (GitHub, GitLab, Gitea) ship a wiki alongside every repo. It looks like a web app: click "New Page," type in a box, hit save. It feels like a totally different kind of thing from your code.

It isn't. On basically every one of these hosts, the wiki is itself a Git repository, usually addressable as something like your-project.wiki.git, full of markdown files. Every page is a .md. Every "save" in that web editor is a git commit. The fancy textbox is just a convenience layer over the exact same machinery you're learning here.

Which means the documentation you've been editing in a browser has had full version history (diffs, blame, the works) the entire time. It's not a CMS. It's a repo wearing a web UI. Once you see that, you can't unsee it.
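To make that concrete without needing network access, here is a small sketch that uses a local bare repository as a stand-in for the host's wiki remote. The directory names, page name, and author identity are assumptions; on a real host you would clone the wiki's own clone URL instead of the local path.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

# Local stand-in for the wiki remote (hypothetical name).
git init -q --bare your-project.wiki.git

# On a real host this step would clone the wiki's clone URL.
git clone -q your-project.wiki.git wiki
cd wiki
git config user.name "Example Author"     # placeholder identity
git config user.email "author@example.com"

# A wiki page is a markdown file; saving it is a commit.
printf '# Runbook\n\nRestart the worker, then tail the log.\n' > Runbook.md
git add Runbook.md
git commit -q -m "Add Runbook page"
git push -q origin HEAD

# The remote now carries the page's history like any other repo.
pages=$(git -C ../your-project.wiki.git rev-list --all --count)
git log --oneline -- Runbook.md
```

From here the usual tools apply to wiki pages too: git log, git diff, and git blame work on Runbook.md exactly as they would on source code.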

The AI angle: this is the one you can adopt tomorrow

Here's why this matters more in the AI era, not less.

LLMs are native markdown writers. Markdown is arguably the single most fluent output format these models have; they were trained on oceans of it and reach for it by default. Ask an AI to "write an ADR for this decision" or "turn these rough notes into a runbook" and you're playing directly to its strengths. The output is good, and it's in exactly the right format, with zero conversion.

That makes a four-step workflow available to you right now: draft it, branch it, diff it, merge it. No new model, no editor integration, no plugins. Branch the repo, paste the AI's draft into a .md file, read the diff, merge. It works today with the browser chat tab you already have open. Most of this course gives you capability you have to build up to. This one you can use on your next document.

And reading that diff is the skill. The AI will write an ADR that sounds completely authoritative and confidently states a rationale it just made up. Reading the diff is how you catch "wait, that's not actually why we did this." The format makes the review possible; your judgment makes it correct. It's the same muscle you'll use later to review AI code, except here a mistake costs nothing.

What it actually looks like

On the tasks-app we've been building, the whole loop is seven commands. Branch off, let the AI draft an ADR for why the app stores its state in a plain tasks.json file, review it, and fold it back into main:

git switch -c docs/adr-storage     # a private copy to draft on; main is untouched
# ...paste the AI's ADR draft into docs/adr/0001-task-storage-format.md...
git add docs/adr/0001-task-storage-format.md
git diff --staged                  # READ IT: every line, before it lands
git commit -m "Add ADR 0001: store tasks as JSON"
git switch main
git merge docs/adr-storage         # fast-forward, no conflict
git branch -d docs/adr-storage     # work's in main now; tidy up

Two small gotchas worth flagging, because they trip everyone up the first time:

git diff shows nothing for a brand-new file. New files are "untracked," and plain git diff only reports changes to files Git already tracks. That's why the loop does git add then git diff --staged: staging tells Git "track this," and --staged shows you exactly what is staged for the next commit. For a new file the diff is all green additions, which is fine. You're still reading every line.
git switch -c is just the newer, clearer spelling of git checkout -b. Older docs and muscle memory use checkout; either works.
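The first gotcha is easy to see for yourself. This is a quick sketch in a scratch repository; the ADR heading and the author identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"     # placeholder identity
git config user.email "author@example.com"

mkdir -p docs/adr
printf '# ADR 0001: Store tasks as JSON\n' > docs/adr/0001-task-storage-format.md

unstaged=$(git diff)                      # empty: the file is still untracked
git add docs/adr/0001-task-storage-format.md
staged=$(git diff --staged)               # the whole file, shown as additions

echo "plain git diff printed: '${unstaged}'"
echo "$staged"
```

If you prefer to see what is untracked before staging anything, git status lists the new file even though git diff stays silent.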

Because nothing else touched main while you worked, that merge is trivial; Git just slides main up to your branch. No conflict. That clean case is the whole reason we practice on a lonely document first. (What happens when two branches edit the same lines, an actual merge conflict, is a real skill, and it gets its own treatment later, on code, where the stakes make the depth worth it.)
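If you would rather have Git confirm that a merge really is the trivial kind, one option is the --ff-only flag, which makes git merge refuse to do anything except slide main forward. The sketch below replays the loop in a throwaway repository; the README, the ADR body, and the author identity are placeholders.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # call the first branch main
git config user.name "Example Author"     # placeholder identity
git config user.email "author@example.com"

printf '# tasks-app\n' > README.md
git add README.md
git commit -q -m "Initial commit"

git switch -q -c docs/adr-storage
mkdir -p docs/adr
printf '# ADR 0001: Store tasks as JSON\n\nPlaceholder body for this sketch.\n' \
  > docs/adr/0001-task-storage-format.md
git add docs/adr/0001-task-storage-format.md
git commit -q -m "Add ADR 0001: store tasks as JSON"

git switch -q main
git merge -q --ff-only docs/adr-storage   # fails instead of creating a merge commit

merges=$(git rev-list --merges --count main)   # number of merge commits on main
main_tip=$(git rev-parse main)
branch_tip=$(git rev-parse docs/adr-storage)

git branch -q -d docs/adr-storage
git log --oneline
```

After the merge, main and the branch point at the same commit and the history contains no merge commit, which is all "fast-forward" means.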

[insert a screenshot referencing git diff --staged output showing a freshly drafted ADR as all-green additions here]

Where it breaks (because I'd rather you trust me)

A few honest caveats, because "markdown for everything" would be overselling it:

Line diffs punish reflowed paragraphs. Git diffs lines. If the AI rewraps a paragraph so every line shifts, the diff shows the whole block as changed even if only three words changed. The fix the technical-writing world uses is semantic line breaks: one sentence (or clause) per line, so edits stay local. The AI won't do this by default; you have to ask.
Plain text isn't free of binaries. A markdown doc with screenshots still drags .png files along, and Git diffs those as "binary files differ" too. It stores them fine; it just can't show you what changed inside them.
Word and PowerPoint still exist for good reasons. A pixel-precise client deliverable, a heavily-laid-out deck, a doc a non-technical stakeholder must edit in a tool they know: those are real constraints. The argument was never "markdown for everything." It's "anything that needs history, review, or multiple authors is paying a steep tax in a binary format." Aim at the targets where that tax actually bites: runbooks, ADRs, specs, changelogs.
The AI writes confident fiction. It'll produce a fluent ADR with a rationale that reads exactly like a senior engineer wrote it, and is sometimes simply invented. The format makes the document reviewable; it does not make it true. Reading the diff is necessary, not sufficient. You still have to know whether the reasoning is right.
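The reflow caveat is the easiest of these to measure. The sketch below makes the same small wording change to two copies of a paragraph: one hard-wrapped and then re-wrapped after the edit, one written with a sentence per line. The file names, the paragraph text, and the author identity are invented for the illustration.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Author"     # placeholder identity
git config user.email "author@example.com"

# Same paragraph, two layouts.
cat > wrapped.md <<'EOF'
Restart the worker with systemctl. Then
tail the log for 30s to confirm. Page the
on-call engineer if errors persist.
EOF
cat > one-per-line.md <<'EOF'
Restart the worker with systemctl.
Then tail the log for 30s to confirm.
Page the on-call engineer if errors persist.
EOF
git add wrapped.md one-per-line.md
git commit -q -m "Add two layouts of the same paragraph"

# The same edit to the first sentence; the wrapped copy gets re-wrapped.
cat > wrapped.md <<'EOF'
Restart the tasks-worker service with
systemctl. Then tail the log for 30s to
confirm. Page the on-call engineer if
errors persist.
EOF
cat > one-per-line.md <<'EOF'
Restart the tasks-worker service with systemctl.
Then tail the log for 30s to confirm.
Page the on-call engineer if errors persist.
EOF

git diff --numstat                        # columns: lines added, lines removed, file
wrapped_removed=$(git diff --numstat -- wrapped.md | cut -f2)
sembr_removed=$(git diff --numstat -- one-per-line.md | cut -f2)
```

The wrapped copy reports every original line as removed, while the sentence-per-line copy reports a single changed line, which is the difference a reviewer feels when reading the diff.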

You're done when

You can take an ADR or a runbook from "the AI drafts it" to "reviewed, branched, merged into main" without thinking about the commands. You can explain to a skeptical colleague (using the line-based-diff argument, not just "markdown is nicer") why the team's runbooks shouldn't be .docx files on a shared drive. And you know that your Git host's wiki is itself a repo, and what that quietly implies.

Once that loop (the AI drafts, I review the diff, I decide) is reflexive on documents where a mistake is free, you'll apply it without thinking when the AI starts editing actual code. Which is exactly the next step: the AI finally comes out of the browser tab and starts editing your files directly, a move that's only safe because you can now branch, diff, and revert exactly what it does.

If you've got your own runbook-final-v2-ACTUAL-use-this.docx story (and I know some of you do), tell me in the comments. I read them. And if you try the draft-branch-diff-merge loop on a real doc this week, let me know how it goes. It's the gentlest on-ramp to Git I know of, and the only one where the worst case is a slightly worse paragraph.
