Course → Module 5: Prompt Engineering
Session 8 of 10

A Prompt Is a Specification

A prompt is not a wish. It is a specification. Specifications require testing. You write a prompt, run it, evaluate the output, identify the gap between output and desired result, modify the prompt, and run it again. This is an engineering feedback loop.

Most people write a prompt once and accept whatever comes back. If it is roughly acceptable, they publish it. If it is bad, they rewrite the prompt from scratch. Neither approach is efficient. The first produces inconsistent quality. The second wastes the information contained in the failed attempt.

A prompt is not done when it works once. A prompt is done when it works every time. Consistency is the standard. If your prompt produces good output 3 out of 5 runs, it has a 40% failure rate. In production, that means 40% of your batch needs regeneration. That is not a working prompt. That is a draft prompt.
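The pass-rate arithmetic is simple enough to automate once you score each run pass/fail against your quality criteria. A minimal sketch (the scoring itself is up to you):

```python
def pass_rate(results: list[bool]) -> float:
    """Fraction of runs that passed quality review."""
    return sum(results) / len(results)

# 3 passes out of 5 runs: a 40% failure rate, i.e. a draft prompt
runs = [True, True, False, True, False]
rate = pass_rate(runs)
print(f"pass rate: {rate:.0%}, failure rate: {1 - rate:.0%}")
# → pass rate: 60%, failure rate: 40%
```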

The Iteration Loop

Each iteration follows the same steps. The discipline is in doing all five steps every time, not skipping the evaluation to save time.

```mermaid
graph TD
    A["1. Run prompt<br/>(5 times, same parameters)"] --> B["2. Evaluate each output<br/>against quality criteria"]
    B --> C["3. Identify the most<br/>common failure mode"]
    C --> D["4. Modify prompt to<br/>address that failure"]
    D --> E["5. Run modified prompt<br/>(5 more times)"]
    E --> F{"4/5 or better<br/>pass rate?"}
    F -->|Yes| G["Prompt is production-ready"]
    F -->|No| A
    style A fill:#222221,stroke:#c8a882,color:#ede9e3
    style C fill:#222221,stroke:#c47a5a,color:#ede9e3
    style G fill:#222221,stroke:#6b8f71,color:#ede9e3
```

Running the prompt five times is not optional. A single successful run tells you nothing about consistency. Five runs reveal patterns: does the output always start with a generic opening? Does paragraph three always contain hedging? Does the AI ignore a specific constraint in 2 out of 5 runs? These patterns guide your modifications.
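Steps 1 and 2 of the loop can be sketched in code. Here `run_prompt` and `passes_criteria` are placeholders for your own model call and quality check; they are assumptions for illustration, not part of any real API:

```python
def measure_pass_rate(prompt, run_prompt, passes_criteria, runs=5):
    """Steps 1-2: run the prompt several times and score each output.

    Returns the pass count plus the failed outputs, which feed step 3
    (identifying the most common failure mode by hand).
    """
    outputs = [run_prompt(prompt) for _ in range(runs)]
    results = [passes_criteria(out) for out in outputs]
    failures = [out for out, ok in zip(outputs, results) if not ok]
    return sum(results), failures

# Usage with stub functions standing in for a real model call:
run_prompt = lambda p: "Output that follows the structure."
passes_criteria = lambda out: "structure" in out
passes, failures = measure_pass_rate("my prompt", run_prompt, passes_criteria)
print(f"{passes}/5 passed")  # → 5/5 passed
```

Steps 3 and 4 stay manual: read the failed outputs, find the shared pattern, and modify the prompt before re-running.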

Common Failure Modes and Fixes

| Failure mode | Symptom | Prompt fix |
| --- | --- | --- |
| Structural drift | Output ignores the requested structure | Add "Follow this structure exactly. Do not add, remove, or reorder sections." |
| Constraint violation | Output uses forbidden phrases | Move constraints to both the beginning and end of the prompt |
| Tone inconsistency | Output shifts tone between sections | Add a few-shot example demonstrating consistent tone throughout |
| Length violation | Output is too long or too short | Specify word count per section, not just total |
| Hedging | Output qualifies every statement | Add "State claims directly. Do not hedge with 'arguably,' 'perhaps,' or 'it could be said.'" |
| Generic opening | Output starts with "In today's..." or similar | Add "Begin with a specific fact, example, or direct statement. Never begin with a generic framing sentence." |
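Several of these symptoms are mechanically detectable, so evaluation need not be entirely manual. A sketch of simple checks; the phrase lists and word-count bounds are illustrative assumptions, not a canonical set:

```python
import re

HEDGES = ("arguably", "perhaps", "it could be said")
GENERIC_OPENINGS = ("in today's", "in the modern", "in an era")

def detect_failures(output: str, min_words: int = 300, max_words: int = 800) -> list[str]:
    """Flag the failure modes that can be checked without human review."""
    found = []
    lowered = output.lower()
    if any(lowered.lstrip().startswith(p) for p in GENERIC_OPENINGS):
        found.append("generic opening")
    if any(h in lowered for h in HEDGES):
        found.append("hedging")
    words = len(re.findall(r"\w+", output))
    if not min_words <= words <= max_words:
        found.append("length violation")
    return found

print(detect_failures("In today's fast-paced world, it could be said..."))
# → ['generic opening', 'hedging', 'length violation']
```

Structural drift and tone inconsistency still need a human (or a second model acting as a grader), but automating the cheap checks makes the five-run evaluation faster.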

The Iteration Journal

Track your iterations. For each prompt, maintain a log with:

- Version number
- Change made in this version
- Pass rate (out of 5 runs)
- Remaining failure modes

This journal serves two purposes. First, it prevents you from repeating failed modifications. Second, it builds a knowledge base of what works for your content type. After iterating on ten prompts, you will notice patterns: certain constraints always need to be repeated at the end, certain failure modes require few-shot examples rather than instructions, certain structures need explicit section-by-section word counts.
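The journal can be as simple as a list of records. A minimal sketch using a dataclass; the field names follow the log described above and are not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Iteration:
    version: int
    change: str                    # what was modified in the prompt
    pass_rate: str                 # e.g. "3/5"
    remaining_failures: list[str] = field(default_factory=list)

journal = [
    Iteration(1, "initial prompt", "2/5", ["structural drift", "hedging"]),
    Iteration(2, "added 'Follow this structure exactly'", "4/5", ["hedging"]),
]
```

Once the journal is structured data rather than loose notes, spotting cross-prompt patterns becomes a matter of filtering the records.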

When to Stop Iterating

The target is a 4/5 or better pass rate across five runs. This means the prompt produces acceptable output at least 80% of the time. Pushing for 5/5 is often not worth the effort. The marginal improvement from 4/5 to 5/5 usually requires disproportionate prompt complexity. Accept 4/5 and handle the occasional failure through your human review process.
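The stopping rule can be made explicit. A sketch, assuming pass counts out of five runs and the thresholds described in this section:

```python
def next_action(passes: int, iterations: int) -> str:
    """Decide whether to ship, keep iterating, or rethink the approach."""
    if passes >= 4:
        return "ship: handle the rare failure in human review"
    if passes < 3 and iterations >= 5:
        return "rethink: the task may need prompt chaining or a multi-agent workflow"
    return "iterate: fix the most common failure mode and re-run"

print(next_action(passes=4, iterations=2))
# → ship: handle the rare failure in human review
```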

If you cannot achieve 3/5 after five iterations, the problem may not be the prompt. The task may be too complex for a single prompt, requiring a chain-of-thought approach or a multi-agent workflow (covered in Module 9).

Further Reading

Assignment

Pick a prompt you have been using. Run it five times and evaluate each output against your quality criteria. Identify the most common failure mode. Modify the prompt to address that failure. Run five more times. Did the failure rate decrease? Document each iteration in a log with: version number, change made, pass rate, and remaining failures. Repeat until you achieve 4/5 or better.