Course → Module 2: AI as Infrastructure, Not Magic
Session 4 of 5

A quality gate is a point in your pipeline where production stops until a human reviews and approves the output. Not "AI checks AI." Not "it's probably fine." A human with domain knowledge looks at the work and decides whether it passes or fails. If it fails, it goes back to the previous stage. If it passes, it moves forward.

Quality gates are expensive in time. They are non-negotiable in quality. The question is not whether to have them. The question is where to place them and what criteria to apply.

The Minimum Viable Gate Structure

Every content pipeline needs at least three human gates. Fewer than three means you are publishing content that has not been adequately reviewed. More than three is fine, but three is the floor.

```mermaid
graph TD
    A["Research + Specification"] --> G1["GATE 1: Spec Review<br/>Is the plan right?"]
    G1 -->|"Pass"| B["AI Generation"]
    G1 -->|"Fail"| A
    B --> G2["GATE 2: Output Review<br/>Does output meet spec?"]
    G2 -->|"Pass"| C["Editing + Formatting"]
    G2 -->|"Fail"| B
    C --> G3["GATE 3: Pre-Publish Review<br/>Ready for audience?"]
    G3 -->|"Pass"| D["Publish"]
    G3 -->|"Fail"| C
    style G1 fill:#2a2a28,stroke:#c8a882,color:#ede9e3
    style G2 fill:#2a2a28,stroke:#c8a882,color:#ede9e3
    style G3 fill:#2a2a28,stroke:#c8a882,color:#ede9e3
```

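The three-gate flow above can be sketched as a simple loop: work moves forward only when a gate passes, and a failed gate sends it back for rework. This is a minimal illustration, not an implementation; the `GateResult` type and the callable-reviewer interface are assumptions standing in for a human reviewer's decision.

```python
from dataclasses import dataclass

@dataclass
class GateResult:
    passed: bool
    notes: str = ""

def run_pipeline(spec_review, output_review, publish_review):
    """Each argument is a callable returning a GateResult.
    A failed gate sends the work back to the stage it guards."""
    stages = [
        ("Research + Specification", spec_review),   # Gate 1: is the plan right?
        ("AI Generation", output_review),            # Gate 2: does output meet spec?
        ("Editing + Formatting", publish_review),    # Gate 3: ready for audience?
    ]
    for stage, gate in stages:
        while True:
            result = gate()
            if result.passed:
                break
            # Fail: rework this stage, then face the same gate again.
            print(f"{stage}: gate failed ({result.notes}); reworking")
    return "Published"
```

The point of the sketch is the control flow: there is no path to "Published" that skips a gate, and no gate is answered by anything other than a review call.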

Gate 1: Specification Review

Before any AI generation begins, a human reviews the specification. This gate prevents the most expensive failure: generating content from a bad plan.

| Check | Question | Fail condition |
| --- | --- | --- |
| Audience | Is the target audience clearly defined? | Vague or missing audience definition |
| Purpose | Why does this content exist? | "Because we need content" is not a purpose |
| Sources | Are research inputs sufficient? | No primary sources, no expert input |
| Structure | Is the outline specific and logical? | Generic structure, no clear argument flow |
| Voice | Are voice constraints documented? | No voice specification |
| Constraints | Are forbidden patterns listed? | No negative constraints |

Gate 1 catches problems that would be expensive to fix later. A specification missing voice constraints produces output that does not sound like you. A specification with no research inputs produces output that contains no original information. Catching these at Gate 1 costs minutes. Catching them after generation costs hours.
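The Gate 1 checklist lends itself to binary predicates. Here is a hedged sketch: the specification field names (`audience`, `sources`, `outline`, and so on) and the exact thresholds are assumptions for illustration, not a standard schema.

```python
# Each check is a binary, verifiable predicate over a specification dict.
# Field names and thresholds are illustrative assumptions.
GATE1_CHECKS = {
    "audience": lambda spec: bool(spec.get("audience", "").strip()),
    "purpose": lambda spec: bool(spec.get("purpose"))
        and spec["purpose"].strip().lower() != "because we need content",
    "sources": lambda spec: len(spec.get("sources", [])) > 0,
    "structure": lambda spec: len(spec.get("outline", [])) >= 3,
    "voice": lambda spec: "voice" in spec,
    "constraints": lambda spec: len(spec.get("forbidden_patterns", [])) > 0,
}

def review_spec(spec: dict) -> tuple[bool, list[str]]:
    """Return (passed, failed_check_names). Fails closed: a missing field fails."""
    failures = [name for name, check in GATE1_CHECKS.items() if not check(spec)]
    return (not failures, failures)
```

Note that the code only enforces presence and shape; whether the audience definition is *right* is still the human reviewer's call.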

Gate 2: Output Review

After AI generates content, a human reviews the output against the specification. This is not a "does this look okay?" check. It is a systematic comparison of output against defined criteria.

| Check | Question | Fail condition |
| --- | --- | --- |
| Format compliance | Does the output match the specified structure? | Wrong number of sections, missing elements |
| Content coverage | Are all specified topics covered? | Missing subtopics, added unrequested content |
| Factual accuracy | Are claims verifiable? | Unsourced claims, hallucinated data |
| Voice compliance | Does the voice match the specification? | AI voice markers present (hedging, filler, false enthusiasm) |
| Forbidden patterns | Are forbidden patterns absent? | Any forbidden pattern present |
| Originality | Does the content contain the specified original elements? | Generic content with no unique perspective |

Gate 2 is where the 15 forensic markers from Module 1 become operational tools. Scan the output for hedging, filler, false enthusiasm, hollow metaphors, and the other markers. If the marker density exceeds your threshold (a reasonable starting point is 5 markers per 1,000 words), the output fails and goes back to generation with adjusted prompts.
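The density threshold is straightforward arithmetic once the markers have been counted (by a human scan, per the gate's design). A minimal sketch, using the 5-per-1,000-words starting point from the text:

```python
def marker_density(marker_count: int, word_count: int) -> float:
    """Forensic markers per 1,000 words."""
    return marker_count / word_count * 1000

def passes_gate2_density(marker_count: int, word_count: int,
                         threshold: float = 5.0) -> bool:
    """True when density is at or below the threshold (5/1,000 words
    is the suggested starting point; tune it to your own standard)."""
    return marker_density(marker_count, word_count) <= threshold
```

So a 1,200-word draft with 6 markers sits exactly at the threshold and passes, while 11 markers in 2,000 words (5.5 per 1,000) fails and goes back to generation.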

Gate 3: Pre-Publish Review

The final gate asks one question: would you attach your name to this? Not "is it good enough." Not "will it rank." Would you show this to your most respected colleague and feel confident about it?

Gate 3 checks what the other gates do not: overall impression, coherence, and whether the piece achieves its purpose as a whole. Individual sections might pass Gate 2 while the overall piece lacks flow or coherence. Gate 3 is the human reading the complete work as a reader would experience it.

Designing Gate Criteria

Gate criteria must be specific enough that someone other than you could apply them. "Is it good?" is not a gate criterion. "Does every factual claim include a citation?" is a gate criterion. The test: could you hand the criteria to a competent colleague and get the same pass/fail result?

```mermaid
graph LR
    A["Write criteria"] --> B{"Could another person<br/>apply these criteria<br/>consistently?"}
    B -->|"Yes"| C["Criteria are<br/>specific enough"]
    B -->|"No"| D["Revise: make<br/>criteria binary<br/>and verifiable"]
    D --> A
```

Good gate criteria are binary (pass or fail, no "kind of"), specific (checking a defined attribute), and verifiable (another person can confirm the result). Building these criteria takes time upfront. It saves exponentially more time downstream because every piece of content goes through the same consistent review process.
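To make the contrast concrete, here is what the citation criterion from above might look like as code. The claim-detection heuristic (sentences containing a percentage or "according to") and the `[n]` citation format are illustrative assumptions; the point is that the check is binary and two reviewers running it get the same answer.

```python
import re

def every_claim_cited(text: str) -> bool:
    """Binary, verifiable criterion: every sentence flagged as a factual
    claim must contain a [n]-style citation. The flagging heuristic and
    citation format are assumptions for illustration."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    claims = [s for s in sentences
              if "%" in s or "according to" in s.lower()]
    return all(re.search(r"\[\d+\]", s) for s in claims)
```

"Is it good?" cannot be written this way; that is exactly what disqualifies it as a gate criterion.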


Assignment

  1. Define 3 quality gates for your content pipeline.
  2. For each gate, specify:
    • Where in the pipeline it occurs
    • What gets checked (list specific criteria)
    • What determines pass vs. fail (binary, verifiable conditions)
    • What happens when something fails (regenerate? revise? discard?)
  3. Write the criteria as if you are training someone else to run your gates. Could a competent colleague apply your criteria and reach the same pass/fail decisions you would?
  4. Test your Gate 2 criteria on a piece of AI-generated content. Does it pass or fail? Do the criteria catch the right problems?