Definition of Done for protocols that don’t get exploited

27. January 2026.

Security, Smart Contracts

Most protocols don’t fail because they’re careless, but because they don’t have processes.

Early stage teams are juggling product, design, governance, fundraising, and hiring, while also shipping software that holds money. In that chaos, security becomes a single event on the calendar:

We’ll do an audit before launch.

Don’t get me wrong. An audit is valuable, but it’s not a security process.

This post is my attempt to define what “done” means for a secure, long-lasting protocol, from writing specs to post-deployment operations. Not as a checklist that you paste into Notion and forget about, but as a cultural shift and a practical standard that you can adopt.

Why “audit at the end” feels rational

To be fair, when you’re building something new, everything changes weekly, from specs to interfaces and user flows. You can’t afford to be slowed down by a heavy process. You need speed.

So you do what seems rational:

  • You build quickly
  • You write tests just to reach 100% coverage
  • You rely on engineers’ intuition to iterate well
  • You pay for an audit when the code “settles”

In the moment, it feels like an efficient path. You were fast, and you now have something that seems to work well. But does it really?

Why this fails in practice

The problem with this approach is that audits can’t compensate for shaky foundations. Auditors aren’t magicians: they can’t validate intent that was never written down, and the same goes for invariants. Nor can they cover the long list of edge cases if the threat model doesn’t exist.

And the most common vulnerabilities are rather simple in hindsight: wrong assumptions, unvalidated user inputs, fragile external integrations, operational hiccups, etc. The kind of issues that could have been caught in the early stages.

You might think to yourself, “our code should be the documentation”. Sure, but are you really going to leave millions in TVL up to chance or “shoulds”?

Core idea: Definition of Done across the protocol lifecycle

A “done” protocol needs to be able to answer all of the following questions at any point in time:

  • What are the known attack vectors?
  • What properties of the system must always hold true?
  • What assumptions are you making?
  • What happens if these assumptions break?
  • How do you know the code matches the intent?
  • How do you respond if something goes wrong?

To make this a reality, you need a Definition of Done that spans your product’s lifecycle. Below is a practical model that you can adopt right away, split into seven stages. Skip one, and others could suffer. Skip none, and you should be golden.

Stage 1: Requirements

What are you trying to achieve? Are you introducing unnecessary complexity? All new features add complexity, but you must pick what brings the most value to your users. Once you’re sure that the feature is valuable, its Product Requirement Document (PRD) should state the following:

  • Summary: What is the outcome of this feature? What is the simplest way to achieve it?
  • Problems: What issues are users facing right now?
  • Solution: What does the interface look like? How should the functions behave? What must they do, and what must they not do?
  • Trust models: Where are the trust boundaries? What roles exist?
  • Threat models: What assets are at risk? Who are the adversaries? How do you handle external dependencies? What are the known risks?
  • Invariants: What properties must always hold true? Are they enforced at all times?
  • Assumptions: What happens if there’s low liquidity? What if oracles are stale? Are user inputs properly validated? What happens if the sequencer goes down?

It might seem cumbersome, but just imagine how happy your engineers will be now that they have a clear set of expectations and guidelines upon which they can build.

That’s the whole point: the code is just a materialization of the requirements. Nothing’s unexpected. It’s boring even.
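
To make this concrete, here is a minimal sketch of how two PRD entries can materialize directly in code. The vault, its invariant, and its assumption are all made up for illustration:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @notice Hypothetical ETH vault used only to show how PRD entries become code.
///         The invariant and the assumption below are illustrative examples.
contract VaultSketch {
    uint256 public totalShares;
    mapping(address => uint256) public sharesOf;

    /// @dev PRD invariant: the vault always holds at least `totalShares` wei,
    ///      so every share stays redeemable 1:1.
    function deposit() external payable returns (uint256 shares) {
        // PRD assumption: "user inputs are validated at the boundary".
        require(msg.value > 0, "zero deposit");

        shares = msg.value;
        sharesOf[msg.sender] += shares;
        totalShares += shares;

        // The invariant is enforced, not just documented.
        assert(address(this).balance >= totalShares);
    }

    function withdraw(uint256 shares) external {
        require(shares > 0 && shares <= sharesOf[msg.sender], "bad amount");

        // Effects before interactions, per the usual checks-effects-interactions rule.
        sharesOf[msg.sender] -= shares;
        totalShares -= shares;
        (bool ok, ) = msg.sender.call{value: shares}("");
        require(ok, "transfer failed");

        assert(address(this).balance >= totalShares);
    }
}
```

Nothing clever is happening here, and that’s the point: when a reviewer reads deposit(), the intent sits right next to the check that enforces it.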

Stage 2: Development

Engineers often don’t prioritize legibility, for some reason. But the truth is that you can make your code both elegant and readable while still having it achieve what it needs. Otherwise, the code might look clean on the surface, yet be harder to reason about and follow.

At the end of development, you want two properties:

  • Traceability: every change maps back to a requirement in the PRD; otherwise you end up shipping behavior that only exists in somebody’s head. If something can’t be explained in plain words, it’s not good enough to be shipped.
  • Reviewability: every change is dead obvious, even to a newcomer. A security-oriented engineer treats boring as a feature. The goal is to include sophisticated code only where it’s unavoidable and keep everything else explicit.

This is where the security model becomes practical. Even if something fails, it was expected to fail under those conditions. Invariants are sound. Every line is there for a reason, and you can verify it against the PRD.

Also, you know the drill: a minimal set of lines to achieve the goal, good comments, NatSpec, clean abstractions, gas optimizations, linters, static analyzers, and generally respecting good coding practices. Your future self and fellow teammates are going to be thankful for it.
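
As a small, hypothetical example of what that traceability can look like (the interface, the PRD-012 identifier, and the InsufficientBalance error are invented), the NatSpec can point a reviewer straight back to the PRD:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @notice Hypothetical interface excerpt: each function's NatSpec names the
///         (made-up) PRD requirement it implements, so reviewers can check the
///         code against stated intent instead of guessing at it.
interface IVaultSketch {
    /// @notice Withdraws `assets` of the underlying token for the caller.
    /// @dev Implements PRD-012: "users can always exit, even while new
    ///      deposits are paused". Must never touch other users' balances,
    ///      and must revert with `InsufficientBalance` instead of silently
    ///      clamping the amount.
    /// @param assets Amount of underlying tokens to withdraw.
    /// @return shares Amount of shares burned in exchange.
    function withdraw(uint256 assets) external returns (uint256 shares);
}
```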

Stage 3: Testing

Reproducibility isn’t optional for serious protocols. Requirements evolve and assumptions change. You need a way to make sure that, if a module changes, other modules remain intact, and that if flows within a module change, the outcome is provable and expected. That’s where tests come in.

A secure baseline includes:

  • Unit tests: these are great for isolated flows, validating state changes and event emissions. Other internal components are mocked; local soundness is the only thing you care about here.
  • Integration tests: moving into more serious territory, this is where you get into more complex user flows and cross-component interactions. Only external components are mocked, so you can be sure internal flows are good to go.
  • Fork tests: the state of a real network gets pulled in locally, and you get to run all sorts of scenarios on live data and against live external components (e.g., Morpho vaults, Uniswap routers, etc.). You can be reasonably sure that your external calls will go through and that your code handles them as expected.
  • Fuzz tests: this is where the magic starts to happen. Function inputs are no longer controlled, but completely randomized across thousands of runs. Many pesky edge cases surface here.
  • Invariant tests: also known as stateful fuzz tests, invariant tests persist state between fuzz runs in order to find errors in multi-step user flows whose order of operations varies (deposit → redeem → withdraw → mint). Deep edge cases you never thought about surface here, because you never know what a user might do. Invariants are tested over and over. Truly the testing endgame.

Of course you should aim for 100% coverage, but there’s much more to it: the relationships between lines, functions, and internal and external components can introduce cases that you never thought about. The only thing you can do is try your best to figure them out, or at least fuzz your way to them.
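
To make the last two layers concrete, here is a rough Foundry-style sketch that assumes the hypothetical VaultSketch from Stage 1 and a standard forge-std setup (paths and names are illustrative):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";
import {VaultSketch} from "../src/VaultSketch.sol"; // hypothetical path

contract VaultSketchTest is Test {
    VaultSketch vault;

    function setUp() public {
        vault = new VaultSketch();
        // Let the invariant fuzzer hammer the vault with random call sequences.
        targetContract(address(vault));
    }

    /// Fuzz test: any non-zero deposit mints shares 1:1 and is fully withdrawable.
    function testFuzz_depositWithdraw(uint256 amount) public {
        amount = bound(amount, 1, 1_000_000 ether);
        vm.deal(address(this), amount);

        uint256 shares = vault.deposit{value: amount}();
        assertEq(shares, amount);

        uint256 balanceBefore = address(this).balance;
        vault.withdraw(shares);
        assertEq(address(this).balance, balanceBefore + amount);
    }

    /// Invariant: after any randomized sequence of deposits and withdrawals,
    /// outstanding shares stay fully backed by the vault's balance.
    function invariant_sharesBacked() public {
        assertGe(address(vault).balance, vault.totalShares());
    }

    // Needed so withdraw() can send ETH back to this test contract.
    receive() external payable {}
}
```

In a real suite you would usually route the fuzzer through a handler contract to shape call sequences and senders; this naive version only shows the shape of the idea.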

Stage 4: Pull Requests

Redundancy is security. A key reason why pull requests exist is to have another pair of eyes thoroughly review your code. Not check it out, not glance over it, but really think about it from a different perspective.

The PR should make the “why” obvious with a good description, and you can enforce that by adding a PULL_REQUEST_TEMPLATE.md that reminds the author to provide the following:

  • Summary of the fix/new feature being introduced
  • Link to the PRD that this PR builds upon
  • Links to the docs of the external dependencies being used
  • Pointers to the critical parts of the feature/fix that should be double checked
  • Summary of how testing was carried out
  • Completed checklist of best practices, to “swear” that they’ve followed through on things like the language’s best practices and style guidelines, good NatSpec, etc.
  • Any additional context for the reviewers (internal/external docs, previous audits, etc.)

The point of all this is to prevent the most systemic failure of all: relying on that one gigachad reviewer and assuming they’ll catch every issue every time. You want anybody to be able to pitch in, with nothing tying their hands, and contribute to the protocol’s overall soundness.

Stage 5: Audits

Audits are where teams often try to buy certainty they didn’t build. However, auditors are strongest when they are given well-defined intent, and weakest when they have to infer it. If auditors have to guess what the protocol is supposed to guarantee, they’ll spend more time reconstructing the mental model you should’ve provided, and less time exploring how to break it.

At the risk of sounding like a broken record: handing auditors everything a developer used is key to steering the audit in the right direction. You reduce the noise by shortening the “what is this supposed to do?” phase and expanding the “can this be broken?” phase.

Another thing to mention is that a single external audit is rarely enough. A flow I’d recommend is as follows:

  • Internal audit: a more thorough version of a PR review, aimed at helping another engineer enter a security-oriented mindset and try to clear out low-hanging fruit, reducing the noise for external auditors. There should be a document that the engineer fills out, streamlining their process of noting down any issues they’ve found or questions that they might have.
  • External audit(s): having a strong firm like Spearbit audit your codebase goes a long way toward making sure everything’s alright. But most of the time, you can squeeze out a few more bugs by doing a re-audit with a different firm. Once again, redundancy is security. A fresh set of eyes might just find something that no one else did.
  • Public contest: the ultimate test of a codebase’s soundness is having not a few people review it, but dozens of them. There are many platforms to choose from: Code4rena, Cantina, Sherlock, all offering a similar experience while achieving the same goal of having dozens of (hopefully high-quality) independent security researchers review your code.

The point is not to buy credibility by paying for an audit, but to stack multiple reviews, where each one surfaces fewer issues than the previous one, a strong sign that the codebase is converging toward a secure state.

Also, don’t forget to add new tests for the bug fixes you’ve implemented after an audit. These lines are as vulnerable as any other, regardless of who proposed them.
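
For example, a fix could be pinned down with a regression test like the one below, added to the hypothetical VaultSketchTest from Stage 3 (the finding ID and scenario are invented):

```solidity
/// Regression test for (hypothetical) audit finding H-01: a zero-value
/// deposit must revert instead of minting unbacked shares.
function test_auditFix_H01_zeroDepositReverts() public {
    vm.expectRevert(bytes("zero deposit"));
    vault.deposit{value: 0}();
}
```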

Stage 6: Bug bounties

Security research is continuous, and yours should be too.

Once they launch, mature protocols need to set up a bug bounty program, either through platforms like Immunefi or otherwise, and make it easy for bug hunters to report issues by providing all the details of how the program works:

  • what’s in scope
  • what’s out of scope
  • how severity is judged
  • how disclosure should work
  • what the rewards are, etc.

On the other hand, there needs to be an internal runbook as well:

  • who owns triage
  • who can make decisions
  • how reproductions are verified
  • how patches are shipped
  • how communication is handled when time is short

This is one of those stages where it seems very unlikely that something bad will happen, but if it does, everybody benefits from the processes you’ve set in place.

Stage 7: Monitoring and incident response

A protocol can be “formally correct” and still lose users or assets due to an operational failure: compromised keys, bad deployments, broken front-ends, misconfigured parameters, silent oracle degradations, slow human response, etc.

You need to have monitoring and alerting set up before meaningful value reaches your contracts. Your flow could look like this:

  • Set up bots that monitor transactions and important contract addresses, looking for anomalies such as repeated deposits, large withdrawals, etc.
  • Predefine an Incident Response Plan that covers who calls the shots, who executes mitigations, whether external parties (like SEAL 911) will be involved, and what actions are authorized (pausing contracts, emergency upgrades, etc.).
  • If something bad happens, immediately alert the devs (via Slack/Discord/PagerDuty), as well as external parties (if applicable). Ideally, emergency actions like contract pauses are already wired in (see the sketch after this list) so value loss stays minimal.
  • Set up a public channel for disclosing information live. Even if you’re doing something that’s best for the users, if it’s not properly communicated, it might get misinterpreted and degrade your relationship with them.
  • Once the incident is over, conduct a blameless post-mortem that documents root causes, missteps, and fixes that were applied. You’ll learn a lot from this write-up, so be thorough.
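
As for pre-authorized emergency actions, here is a minimal hand-rolled sketch of what such a brake could look like (the guardian role and the contract are made up; in practice you would likely reach for OpenZeppelin’s Pausable and AccessControl):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @notice Hypothetical emergency brake. In production you would likely use
///         OpenZeppelin's Pausable/AccessControl; this only shows the shape.
contract PausableVaultSketch {
    /// e.g. a dedicated incident-response multisig, not the deployer EOA.
    address public immutable guardian;
    bool public paused;

    event Paused(address indexed by);
    event Unpaused(address indexed by);

    error NotGuardian();
    error DepositsPaused();

    constructor(address guardian_) {
        guardian = guardian_;
    }

    modifier onlyGuardian() {
        if (msg.sender != guardian) revert NotGuardian();
        _;
    }

    /// Pre-authorized mitigation: monitoring flags an anomaly, the guardian
    /// halts inflows immediately, no governance vote required.
    function pause() external onlyGuardian {
        paused = true;
        emit Paused(msg.sender);
    }

    function unpause() external onlyGuardian {
        paused = false;
        emit Unpaused(msg.sender);
    }

    function deposit() external payable {
        if (paused) revert DepositsPaused();
        // ... normal deposit logic ...
    }

    /// Withdrawals are intentionally never pausable so users can always exit.
    function withdraw(uint256 amount) external {
        // ... normal withdrawal logic ...
    }
}
```

The design choice worth copying is the asymmetry: inflows can be halted instantly by the guardian, while withdrawals stay open so users are never trapped by your own emergency controls.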

The uncomfortable truth is that live operational security is one of the most important stages, because users are trusting you with their hard-earned funds, and the blockchain is a Dark Forest. Preparation is your best friend.

How to make this non-optional

All of the above only works if it’s culturally adopted, rather than left to sit in a guidelines document.

The simplest way to do that is to treat each stage as a release gate, in the same way you already treat “tests passing” as a release gate. Either the bar is met, or the protocol does not move forward.

This is as much a cultural shift as a technical one. Teams that internalize this stop asking “can we ship already?” and start asking “did we cover all the gates?” Security stops being an act of caution and becomes a shared operating model.

At that point, security isn’t something the team has to remember to care about. It’s something the system and the culture quietly enforce.

Final takeaway

Speed is a choice, and so is the kind of risk you carry. Most hacks come from user flows that weren’t thoroughly modeled. You need to turn security into a pipeline: specify properties that must hold, test them rigorously, verify them externally, and monitor them continuously.

A protocol matures when safety is repeatable and deterministic, not stochastic.