Course → Module 6: Voice Capture & Preservation
Session 2 of 9

From "My Writing Is Conversational" to Measurable Specifications

Voice extraction means systematically analyzing your existing writing to identify its unique characteristics. Not "my writing is conversational." That is a label, not a specification. Labels are too vague for AI to act on. "Conversational" means one thing to a marketing writer and something entirely different to a technical writer.

Instead, you need measurements. "My average sentence length is 14 words. I use fragments for emphasis, typically one per paragraph. I open paragraphs with statements, not questions. I never use the word 'leverage.' My metaphors come from manufacturing and cooking, never from sports." These are specifications an AI can follow.
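
One way to picture the difference between a label and a specification is to write the measurements down as structured data. The sketch below is illustrative only: the `VoiceSpec` class and its field names are hypothetical, and the values are the example measurements from the paragraph above.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceSpec:
    """Hypothetical container for measured voice patterns (names are illustrative)."""
    avg_sentence_words: int
    fragments_per_paragraph: int
    paragraph_opener: str
    forbidden_words: list[str] = field(default_factory=list)
    metaphor_sources: list[str] = field(default_factory=list)
    metaphor_exclusions: list[str] = field(default_factory=list)

    def to_instructions(self) -> str:
        """Render the measurements as plain-language rules for a prompt."""
        lines = [
            f"Average sentence length: {self.avg_sentence_words} words.",
            f"Use about {self.fragments_per_paragraph} fragment(s) per paragraph for emphasis.",
            f"Open paragraphs with {self.paragraph_opener}.",
        ]
        if self.forbidden_words:
            lines.append("Never use: " + ", ".join(self.forbidden_words) + ".")
        if self.metaphor_sources:
            lines.append("Draw metaphors from: " + ", ".join(self.metaphor_sources) + ".")
        if self.metaphor_exclusions:
            lines.append("Never draw metaphors from: " + ", ".join(self.metaphor_exclusions) + ".")
        return "\n".join(lines)


# Example values taken from the sample specification in the text.
spec = VoiceSpec(
    avg_sentence_words=14,
    fragments_per_paragraph=1,
    paragraph_opener="statements, not questions",
    forbidden_words=["leverage"],
    metaphor_sources=["manufacturing", "cooking"],
    metaphor_exclusions=["sports"],
)
print(spec.to_instructions())
```

Every field holds something you can count or list. "Conversational" has no field to go in.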

Voice extraction is reverse engineering. You are not creating a voice. You are dissecting the one you already have. The goal is to take something intuitive (how you naturally write) and make it explicit (a list of measurable patterns). The explicit version is what you feed to the AI.

The Seven Dimensions of Voice

A complete voice extraction analyzes seven dimensions. Each dimension produces specific, measurable data points.

| Dimension | What to Measure | How to Measure |
|---|---|---|
| Sentence length | Average words per sentence, range (min-max) | Count words in 50 random sentences from your writing |
| Transitions | Most-used transition words, paragraph opening patterns | List the first word of 30 paragraphs |
| Vocabulary | Preferred words, forbidden words, domain jargon | Frequency analysis of your writing vs AI default |
| Paragraph structure | Average sentences per paragraph, build pattern | Analyze 20 paragraphs: do they build up, break down, or pivot? |
| Opening habits | How you start pieces and sections | Collect the first sentences of 10 pieces |
| Metaphor sources | Where your comparisons come from | List every metaphor in 5 pieces and categorize by domain |
| Punctuation | Semicolons, parenthetical asides, fragment use | Scan for patterns: do you use semicolons? How often? |
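
The sentence-length measurement (50 random sentences, average and range) can be sketched in a few lines. This is a rough sketch, not a definitive method: the function names are hypothetical, and the sentence splitter is deliberately naive (it breaks after `.`, `!` or `?` followed by whitespace), so abbreviations such as "e.g." will be miscounted.

```python
import random
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_length_stats(text: str, sample_size: int = 50, seed: int = 0) -> dict:
    """Average, minimum and maximum words per sentence over a random sample."""
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("no sentences found in text")
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    sample = rng.sample(sentences, min(sample_size, len(sentences)))
    lengths = [len(s.split()) for s in sample]
    return {
        "sampled": len(lengths),
        "average": round(sum(lengths) / len(lengths), 1),
        "min": min(lengths),
        "max": max(lengths),
    }


text = "Voice extraction is reverse engineering. You are not creating a voice. Short."
stats = sentence_length_stats(text)
print(stats)  # {'sampled': 3, 'average': 4.0, 'min': 1, 'max': 6}
```

Fragments count as sentences here, which is what you want: a one-word fragment pulls the minimum down and shows up in the range.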

The Extraction Process

You need a corpus: at least 10,000 words of your own writing. Blog posts, emails, reports, anything that represents your real voice. Not your most polished work. Your typical work. The goal is to capture your actual patterns, not your aspirational ones.
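
A quick check that the corpus clears the 10,000-word threshold might look like the sketch below. It assumes you have already loaded each piece as a plain string; the function name and the two sample strings are placeholders.

```python
MIN_WORDS = 10_000  # threshold from the text


def corpus_word_count(documents: list[str]) -> int:
    """Total whitespace-separated words across all documents."""
    return sum(len(doc.split()) for doc in documents)


# Placeholder documents; in practice these would be your posts, emails, reports.
documents = ["First blog post text here.", "An email you sent last week."]
total = corpus_word_count(documents)
shortfall = max(0, MIN_WORDS - total)
print(f"{total} words collected; {shortfall} more needed")
```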

graph TD
    A["Collect 10,000+ words<br/>of your own writing"] --> B["Analyze Dimension 1:<br/>Sentence length"]
    B --> C["Analyze Dimension 2:<br/>Transitions"]
    C --> D["Analyze Dimension 3:<br/>Vocabulary"]
    D --> E["Analyze Dimension 4:<br/>Paragraph structure"]
    E --> F["Analyze Dimension 5:<br/>Opening habits"]
    F --> G["Analyze Dimension 6:<br/>Metaphor sources"]
    G --> H["Analyze Dimension 7:<br/>Punctuation"]
    H --> I["Compile into<br/>Voice Analysis Document"]
    style A fill:#222221,stroke:#c8a882,color:#ede9e3
    style I fill:#222221,stroke:#6b8f71,color:#ede9e3

For each dimension, you perform the measurement manually. Yes, manually. AI can help with word counts and frequency analysis, but the interpretation must be yours. The AI does not know that you use cooking metaphors because you grew up in a kitchen. It only sees the pattern. You understand the reason, and that understanding informs which patterns to preserve and which are accidental.
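
The counting half of the vocabulary dimension does not need an AI at all. As a minimal sketch, assuming a plain-text corpus, a word-frequency tally could look like this (the function name and the `min_length` cutoff are illustrative choices, not part of the course method):

```python
import re
from collections import Counter


def top_words(text: str, n: int = 10, min_length: int = 4) -> list[tuple[str, int]]:
    """Most frequent words of at least min_length letters, case-insensitive."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(n)


sample = (
    "The kitchen is where patterns simmer. "
    "Patterns repeat in the kitchen, and the kitchen keeps them warm."
)
print(top_words(sample, n=2))  # [('kitchen', 3), ('patterns', 2)]
```

The tally shows that "kitchen" keeps appearing. Whether that is a signature or an accident is the part only you can decide.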

Using AI to Assist the Analysis

While interpretation is human work, data collection can be AI-assisted. Paste your writing corpus into an AI with this prompt, followed by the seven dimensions from this session: "Analyze this writing sample. For each dimension listed below, provide specific measurements and patterns. Do not interpret or evaluate. Only measure."
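
If you want to assemble that prompt programmatically, a sketch could look like the following. The instruction text is the prompt quoted above and the dimension descriptions come from this session's table; the function name and the layout of the final prompt are assumptions, and no particular AI service or API is implied.

```python
DIMENSIONS = [
    "Sentence length: average words per sentence, range (min-max)",
    "Transitions: most-used transition words, paragraph opening patterns",
    "Vocabulary: preferred words, forbidden words, domain jargon",
    "Paragraph structure: average sentences per paragraph, build pattern",
    "Opening habits: how pieces and sections start",
    "Metaphor sources: where comparisons come from",
    "Punctuation: semicolons, parenthetical asides, fragment use",
]

INSTRUCTION = (
    "Analyze this writing sample. For each dimension listed below, "
    "provide specific measurements and patterns. "
    "Do not interpret or evaluate. Only measure."
)


def build_measurement_prompt(corpus: str) -> str:
    """Combine the instruction, the numbered dimensions, and the corpus."""
    numbered = "\n".join(f"{i}. {d}" for i, d in enumerate(DIMENSIONS, start=1))
    return f"{INSTRUCTION}\n\nDimensions:\n{numbered}\n\nWriting sample:\n{corpus}"


prompt = build_measurement_prompt("Paste your 10,000+ words here.")
print(prompt)
```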

The AI will return quantitative data: average sentence length, most frequent opening words, vocabulary frequency lists. Use this data as a starting point, then verify against your own reading. The AI will catch patterns you miss (like unconsciously starting 40% of paragraphs with "The"). You will catch patterns the AI misinterprets (like your deliberate use of fragments, which the AI might flag as grammatical errors).
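
Figures like "40% of paragraphs start with 'The'" are easy to verify yourself before you trust them. The sketch below assumes paragraphs are separated by blank lines; the function name is hypothetical.

```python
import re
from collections import Counter


def paragraph_opener_shares(text: str) -> dict[str, float]:
    """Share of paragraphs that begin with each word (split on blank lines)."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    openers = []
    for paragraph in paragraphs:
        match = re.search(r"[A-Za-z']+", paragraph)  # first word, skipping quotes etc.
        if match:
            openers.append(match.group(0).lower())
    counts = Counter(openers)
    return {word: count / len(openers) for word, count in counts.most_common()}


sample = (
    "The goal is clear.\n\n"
    "You need a corpus.\n\n"
    "The AI sees patterns.\n\n"
    "These are facts.\n\n"
    "The data helps."
)
shares = paragraph_opener_shares(sample)
print(shares["the"])  # 0.6
```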

Common Discoveries

Voice extraction consistently reveals surprises. Writers who describe themselves as "concise" discover their average sentence length is 22 words. Writers who believe they are "formal" discover they use contractions in 70% of sentences. Writers who claim they "never use jargon" discover their writing is dense with industry-specific terms they no longer recognize as jargon.
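
The contraction figure is another one you can check directly. This is a rough sketch under stated assumptions: sentences are split naively on end punctuation, and the pattern deliberately ignores `'s` because a simple regex cannot tell "it's" from a possessive such as "the writer's", so the result is a lower bound.

```python
import re

# Matches n't, 're, 've, 'll, 'd, 'm with straight or curly apostrophes.
# 's is excluded on purpose: it is ambiguous with possessives.
CONTRACTION = re.compile(r"\b\w+(?:n['\u2019]t|['\u2019](?:re|ve|ll|d|m))\b", re.IGNORECASE)


def contraction_sentence_rate(text: str) -> float:
    """Fraction of sentences containing at least one matched contraction."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if CONTRACTION.search(s))
    return hits / len(sentences)


sample = "I don't like jargon. It is everywhere. We're used to it. Nobody notices."
rate = contraction_sentence_rate(sample)
print(rate)  # 0.5
```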

These surprises are valuable. They close the gap between how you think you write and how you actually write. The voice fingerprint (next session) is built on actuals, not self-perception.

Assignment

Collect 10,000+ words of your own writing. Using the seven-dimension framework from this session, document: (1) average sentence length from 50 sentences, (2) most-used transition words from 30 paragraph openings, (3) five words you use often and five you never use, (4) typical paragraph length in sentences, (5) how you open pieces, (6) where your metaphors come from, (7) your punctuation habits. Compile this into a single document.
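
If you keep your findings in a script, compiling them into one document can be as simple as the sketch below. The function name and output layout are assumptions, and the findings shown are placeholders borrowed from the example specification earlier in this session, not real measurements.

```python
def compile_voice_analysis(findings: dict[str, str]) -> str:
    """Format one finding per dimension as a numbered plain-text document."""
    lines = ["Voice Analysis Document", ""]
    for number, (dimension, finding) in enumerate(findings.items(), start=1):
        lines.append(f"{number}. {dimension}: {finding}")
    return "\n".join(lines)


# Placeholder findings; replace each value with your own measurements.
findings = {
    "Sentence length": "average 14 words",
    "Transitions": "paragraphs open with statements, not questions",
    "Vocabulary": "never uses 'leverage'",
    "Paragraph structure": "(your measurement)",
    "Opening habits": "(your measurement)",
    "Metaphor sources": "manufacturing and cooking, never sports",
    "Punctuation": "fragments for emphasis, about one per paragraph",
}
document = compile_voice_analysis(findings)
print(document)
```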