Verified Spec-Driven Development (VSDD) is a unified software engineering methodology that fuses three proven paradigms into a single AI-orchestrated pipeline:
VSDD treats these not as competing philosophies but as sequential gates in a single pipeline. Specs define what. Tests enforce how. Adversarial verification ensures nothing was missed. AI models orchestrate every phase, with the human developer serving as the strategic decision-maker and final authority.
| Role | Entity | Function |
|---|---|---|
| The Architect | Human Developer | Strategic vision, domain expertise, acceptance authority. Signs off on specs, arbitrates disputes between Builder and Adversary. |
| The Builder | Claude (or similar) | Spec authorship, test generation, code implementation, and refactoring. Operates under strict TDD constraints. |
| The Tracker | Chainlink | Hierarchical issue decomposition — Epics → Issues → Sub-issues ("beads"). Every spec, test, and implementation maps to a bead. |
| The Adversary | Sarcasmotron (Gemini Gem or equivalent) | Hyper-critical reviewer with zero patience. Reviews specs, tests, and implementation. Fresh context on every pass. |
Nothing gets built until the contract is airtight — and the architecture is verification-ready by design.
The human developer describes the feature intent to the Builder. The Builder then produces a formal specification document for each unit of work. Critically, this phase doesn't just define what the software does — it defines what must be provable about it and structures the architecture accordingly.
Step 1a: Behavioral Specification
The Builder produces the functional contract:
Step 1b: Verification Architecture
Before any implementation design is finalized, the Builder produces a Verification Strategy that answers: "What properties of this system must be mathematically provable, and what architectural constraints does that impose?"
This includes:
Why this must happen in Phase 1: If the system is designed with side effects woven through the core logic, no amount of Phase 5 heroics will make it verifiable. A function that reads from a database, performs a calculation, and writes to a log in one block cannot be formally verified without mocking infrastructure that the verifier may not support. But a function that takes data in, returns a result, and lets the caller handle persistence — that's a function a model checker can reason about. This boundary must be drawn at the spec level because it fundamentally shapes the module decomposition, the dependency graph, and the testing strategy that follows.
Step 1c: Spec Review Gate
The complete spec — behavioral contracts and verification architecture — is reviewed by both the human and the Adversary before any tests are written. Sarcasmotron tears into the spec looking for:
The spec is iterated until the Adversary can't find legitimate holes in either the behavioral contract or the verification strategy.
Chainlink Integration: Each spec maps to a Chainlink Issue. Sub-issues are generated for each behavioral contract item, edge case, non-functional requirement, and each formally provable property. The provable properties get their own bead chain so their status is tracked independently from test coverage.
Red → Green → Refactor, enforced by AI.
With an airtight spec in hand, the Builder now writes tests — and only tests. No implementation code yet.
Step 2a: Test Suite Generation
The Builder translates the spec directly into executable tests:
The Red Gate: All tests must fail before any implementation begins. If a test passes without implementation, the test is suspect — it's either testing the wrong thing or the spec was wrong. The Builder flags this for human review.
Step 2b: Minimal Implementation
The Builder writes the minimum code necessary to make each test pass, one at a time. This is classic TDD discipline:
Step 2c: Refactor
After all tests are green, the Builder refactors for clarity, performance, and adherence to the non-functional requirements in the spec. The test suite acts as the safety net — if refactoring breaks something, the tests catch it immediately.
Human Checkpoint: The developer reviews the test suite and implementation for alignment with the "spirit" of the spec. AI can miss intent even when it nails the letter of the contract.
The code survived testing. Now it faces the gauntlet.
The verified, test-passing codebase — along with the spec and test suite — is presented to Sarcasmotron in a fresh context window.
What the Adversary reviews:
Negative Prompting: Sarcasmotron is prompted for zero tolerance. No "overall this looks good, but..." preamble. Every piece of feedback is a concrete flaw with a specific location and a proposed fix or question.
Context Reset: Fresh context window on every adversarial pass. No relationship drift. No accumulated goodwill.
The Adversary's critique feeds back through the entire pipeline:
This loop continues until convergence (see Phase 6).
The verification architecture designed in Phase 1b is now executed against the battle-tested implementation. Because the codebase was architected from the start with a pure core and clear purity boundaries, formal verification tools can operate on it without heroic refactoring.
All formal verification and fuzzing results feed back into Phase 4 if issues are found.
VSDD inherits VDD's hallucination-based termination, extended across all three dimensions:
| Dimension | Convergence Signal |
|---|---|
| Spec | The Adversary's spec critiques are nitpicks about wording, not about missing behavior, ambiguity, or verification gaps. |
| Tests | The Adversary can't identify a meaningful untested scenario. Mutation testing confirms high kill rate. |
| Implementation | The Adversary is forced to invent problems that don't exist in the code. |
| Verification | All properties from the Phase 1b catalog pass formal proof. Fuzzers find nothing. Purity boundaries are intact. |
Maximum Viable Refinement is reached when all four dimensions have converged. The software is considered Zero-Slop — every line of code traces to a spec requirement, is covered by a test, has survived adversarial scrutiny, and the critical path is formally proven.
One of VSDD's defining properties is full traceability. Every artifact links back:
Spec Requirement → Verification Property → Chainlink Bead → Test Case → Implementation → Adversarial Review → Formal Proof
At any point, you can ask: "Why does this line of code exist?" and trace it all the way back to a specific spec requirement, through the verification property it satisfies, the test that demanded it, the adversarial review that hardened it, and the formal proof that guarantees it. Equally, you can ask "Why is this module structured as a pure function?" and trace that decision back to the Purity Boundary Map in Phase 1b.
Spec Supremacy: The spec is the highest authority below the human developer. Tests serve the spec. Code serves the tests. Nothing exists without a reason traced to the spec.
Verification-First Architecture: The need for formal provability shapes the design, not the other way around. Pure core, effectful shell. If you can't verify it, you architected it wrong — and you find that out in Phase 1, not Phase 5.
Red Before Green: No implementation code is written until a failing test demands it. AI models are explicitly constrained to follow TDD discipline — no "let me just write the whole thing and add tests after."
Anti-Slop Bias: The first "correct" version is assumed to contain hidden debt. Trust is earned through adversarial survival, not initial appearance.
Forced Negativity: Adversarial pressure bypasses the politeness filters of standard LLM interactions. The Adversary doesn't care about your feelings — it cares about your invariants.
Linear Accountability: Chainlink beads ensure every spec item, test, and line of code has a corresponding tracked unit of work. Nothing slips through the cracks.
Entropy Resistance: Context resets on every adversarial pass prevent the natural degradation of long-running AI conversations.
Four-Dimensional Convergence: The system isn't done until specs, tests, implementation, and formal proofs have all independently survived adversarial review.
VSDD is explicitly designed for multi-model AI workflows:
Prompt Engineering for TDD Discipline: The Builder must be explicitly instructed: "You are operating under strict TDD. Write tests FIRST. Do NOT write implementation code until I confirm all tests fail. When implementing, write the MINIMUM code to pass each test." Without this constraint, AI models will naturally try to write implementation and tests simultaneously.
VSDD is high-ceremony by design. It's worth the overhead when:
For rapid prototyping or throwaway scripts, use the parts that make sense — TDD discipline and a quick adversarial pass can still catch a lot of slop even without the full ceremony.
"VSDD doesn't just generate code — it generates code that can prove why it exists, demonstrate that it works, and survive an adversary that wants it dead."