Early access · Phase 1 · invite-only

An IDE for building agents that don't blow up in production.

Most of what makes an agent useful isn't the model. It's the harness around it — the tools it's allowed to call, the gates on the dangerous ones, the cockpit you can watch it work from. That's what I'm building.

Running on prod today · Hosted, isolated per agent · Versioned, rollback-able

Why I'm building this

I started building agents for other businesses. Then I tried to run one for myself.

The first version of deflation.ai was a service business — you describe a workflow, I build, host and run the AI agent. Worked, in a sense. But every new customer surfaced the same problem at a different layer.

Spinning up an agent is easy. Running one in production is the actual hard problem. The privacy boundary. The deployment story. The thing where the agent does something you didn't ask for because it had a bad afternoon.

"Any agent will eventually hallucinate itself into doing something you didn't ask for. The bigger the toolbox you hand it, the worse the damage. The whole game is shrinking the toolbox and gating what's left."

— what I learned the hard way

So most of what I'm actually building isn't the agent itself. It's the harness around it — the thing that makes a hosted agent something you can trust to keep running without you watching every move.

What it looks like

Spec to production in seven steps.

The pipeline is the product. Each step is a concrete artifact you can read, diff and roll back. Below: a real agent being built — a lead qualifier for a property-management firm.

The seven steps in detail

Every step has an artifact. Every artifact is versioned.

No build hides behind a black box. Each step writes a concrete file you can read, diff against the previous version and roll back if it breaks the bot.

1

Spec

The plain-English brief the bot starts from
artifact: spec.md

Write what you'd tell a new hire on day one.

The spec is a markdown brief: who the customer is, what the agent is for, what success looks like, what NOT to do. No XML schema, no JSON DSL. Just the same conversation you'd have if you were onboarding a junior teammate.

Promoting a spec runs a checklist — required fields, scope boundaries, language & currency — before it can move forward.

spec.md · v3
# lead-qualifier
Customer: Brunner Immobilien AG, Zug
Size: 14 people, 320 properties

## Job
Read every inbound leasing enquiry, qualify it
(budget, move-in date, household), draft a
reply in the operator's voice, and route to
Lukas via Telegram for one-tap approval.

## Must not
- Quote rent. Ever.
- Send any email without operator approval.
- Schedule viewings outside business hours.
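
A promotion checklist of this kind could be sketched as follows. The required section names and the `check_spec` helper are assumptions made for illustration; the platform's actual checklist (which also covers scope boundaries, language and currency) is not shown on this page.

```python
import re

# Hypothetical checklist: section headings a spec must contain before promotion.
REQUIRED_SECTIONS = ("Job", "Must not")


def check_spec(markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the spec may be promoted."""
    problems = []
    # A top-level "# title" line must exist.
    if not re.search(r"^# \S", markdown, flags=re.MULTILINE):
        problems.append("missing title")
    # Collect every "## heading" and compare against the required set.
    headings = set(re.findall(r"^## (.+?)\s*$", markdown, flags=re.MULTILINE))
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")
    return problems


spec = """# lead-qualifier
Customer: Brunner Immobilien AG, Zug

## Job
Read every inbound leasing enquiry and qualify it.

## Must not
- Quote rent. Ever.
"""

print(check_spec(spec))                    # []
print(check_spec("# draft\n\n## Job\n"))   # ['missing section: Must not']
```

Returning a list of problems rather than a single pass/fail flag lets the UI show everything that blocks promotion at once.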
2

Workspace

The agent's long-term memory and identity files
artifact: 7 files

This is what the agent reads every time it wakes up.

System prompt, customer profile, workflow rulebook, heartbeat reminders. Owning these as plain files instead of database rows means you can review them like code — pull requests, blame, diffs.

Patching a single file doesn't restart the bot. The sync plugin picks the change up on the next loop. Iteration is 5 seconds, not 5 minutes.

workspace files
📁 lead-qualifier/
  📄 AGENTS.md 1.8 KB
  📄 USER.md 624 B
  📄 HEARTBEAT.md 412 B
  📁 KB/
    📄 profile.md 2.1 KB
    📄 workflow.md 3.4 KB
    📄 tone-guide.md 980 B
    📄 properties.md 5.6 KB
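
Picking up a patched file on the next loop, without a restart, can be approximated by comparing content digests between loops. This is a minimal sketch under that assumption; `WorkspaceSync` and `changed_files` are hypothetical names, not the sync plugin's real interface.

```python
import hashlib


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


class WorkspaceSync:
    """Tracks file digests so each loop reloads only what changed."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}

    def changed_files(self, files: dict[str, bytes]) -> list[str]:
        """Compare current contents to the previous loop; return changed paths."""
        changed = []
        for path, content in sorted(files.items()):
            d = digest(content)
            if self._seen.get(path) != d:
                self._seen[path] = d
                changed.append(path)
        return changed


sync = WorkspaceSync()
files = {"AGENTS.md": b"v1", "KB/tone-guide.md": b"formal"}
first = sync.changed_files(files)         # first loop: every file is new
files["KB/tone-guide.md"] = b"friendly"   # operator patches one file
second = sync.changed_files(files)        # next loop: only the patched file
print(first)    # ['AGENTS.md', 'KB/tone-guide.md']
print(second)   # ['KB/tone-guide.md']
```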
3

Plugins

How the agent talks to the outside world
8 enabled · 6 available

Channels in. Sync out. Nothing else.

Each plugin is a way the agent reaches the world — email in, Telegram in, dashboard chat in — plus the internal sync mechanisms that pull the latest spec and workspace from the platform.

Channels you don't enable can't be reached. There's no "default-on" surprise — a fresh agent talks to nobody until you tick the box.

plugins · lead-qualifier
email-channel inbound
telegram-channel operator
dashchat-channel live ops
bundle-sync platform
system-prompt-sync platform
workspace-sync platform
tool-loop-compactor runtime
dev-channel internal
sms-channel disabled
call-channel disabled
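
The "nothing is default-on" rule amounts to a default-deny allowlist. A minimal sketch, assuming a hypothetical `ChannelRouter` class (the real plugin loader is not described here):

```python
from dataclasses import dataclass, field


@dataclass
class ChannelRouter:
    """Default-deny: a channel is reachable only if explicitly enabled."""
    enabled: set[str] = field(default_factory=set)

    def enable(self, channel: str) -> None:
        self.enabled.add(channel)

    def deliver(self, channel: str, message: str) -> bool:
        """Return True if the message is accepted, False if it is dropped."""
        # The message body is irrelevant to the decision; only the channel counts.
        return channel in self.enabled


router = ChannelRouter()                           # a fresh agent talks to nobody
print(router.deliver("email-channel", "hello"))    # False
router.enable("email-channel")
print(router.deliver("email-channel", "hello"))    # True
print(router.deliver("sms-channel", "hello"))      # False
```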
4

Skills

Procedural knowledge bundled per workflow
5 skills enabled

The agent doesn't reinvent your workflow on every message.

A skill is a tiny SOP — "when an inbound lead arrives, do X, then Y, then Z, never W." It's loaded on demand when the agent recognises the trigger, not stuffed into the system prompt forever.

Smaller context, more reliable behaviour. The agent for a bookkeeper doesn't carry around the skills for a property manager.

skills · lead-qualifier
lead-intake: Parse new email, extract budget + dates
lead-qualify: Score 1–5 against ideal-tenant profile
draft-reply: Compose answer in operator's voice
viewing-suggest: Offer 3 slots from Lukas's calendar
operator-notify: Ping Lukas on Telegram, await ✅/✗
followup-7d: Re-check leads that ghosted
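
Loading a skill on demand, rather than keeping every SOP in the system prompt, might look like the sketch below. Keyword matching stands in for however the agent actually recognises a trigger; the `Skill` fields, trigger words and `build_context` are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    triggers: tuple[str, ...]   # hypothetical keyword triggers
    sop: str                    # procedure text added to the context


SKILLS = [
    Skill("lead-intake", ("enquiry", "apartment"),
          "Parse the email, extract budget and dates."),
    Skill("followup-7d", ("no reply",),
          "Re-check leads that ghosted."),
]


def build_context(system_prompt: str, message: str) -> str:
    """Append only the SOPs whose trigger appears in the incoming message."""
    text = message.lower()
    matched = [s for s in SKILLS if any(t in text for t in s.triggers)]
    return "\n\n".join([system_prompt] + [f"[{s.name}] {s.sop}" for s in matched])


ctx = build_context("You are a lead qualifier.", "New enquiry about the flat in Zug")
print(ctx)
```

Here only the lead-intake procedure is appended; the follow-up skill stays out of the context until a message actually triggers it.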
5

Tools

The verbs the agent is allowed to call
7 tools · 3 gated

Smaller toolbox, smaller blast radius.

Each tool is a function the LLM can invoke. Read-only tools (lookup CRM, read inbox) run straight through. Write tools that touch the real world — sending email, charging a card, moving a record — are gated by default. The agent drafts; you ship.

The arguments are validated at the gate, too. A hallucinated customer_id: 9999 never makes it through.

tools · lead-qualifier
gmail_read: Fetch inbox messages
gmail_send: Operator-approve before send
crm_lookup: Read tenant + property records
crm_update: Operator-approve before write
calendar_read: Free/busy on Lukas's calendar
calendar_create: Operator-approve before book
kb_search: Read workspace KB files
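
The gate described in this step, where read-only tools run straight through and write tools are validated and then held for approval, can be sketched as below. The `Gate` class, the `KNOWN_CUSTOMERS` table and the lambda tools are hypothetical stand-ins, not the product's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable

KNOWN_CUSTOMERS = {"brunner-immobilien"}   # hypothetical lookup table


@dataclass
class Gate:
    """Runs read-only tools directly; validates and queues gated ones."""
    tools: dict[str, Callable[..., str]]
    gated: set[str]
    pending: list[tuple[str, dict]] = field(default_factory=list)

    def call(self, name: str, **args) -> str:
        if name not in self.tools:
            raise PermissionError(f"tool not in toolbox: {name}")
        if name in self.gated:
            # Validate arguments before anything reaches the operator.
            if args.get("customer_id") not in KNOWN_CUSTOMERS:
                raise ValueError(f"unknown customer_id: {args.get('customer_id')!r}")
            self.pending.append((name, args))
            return "pending approval"
        return self.tools[name](**args)

    def approve(self, index: int = 0) -> str:
        """Operator approval: only now does the gated tool actually run."""
        name, args = self.pending.pop(index)
        return self.tools[name](**args)


gate = Gate(
    tools={
        "crm_lookup": lambda customer_id: f"record for {customer_id}",
        "crm_update": lambda customer_id, note: f"updated {customer_id}: {note}",
    },
    gated={"crm_update"},
)
print(gate.call("crm_lookup", customer_id="brunner-immobilien"))
print(gate.call("crm_update", customer_id="brunner-immobilien", note="qualified"))
print(gate.approve())
```

A call with `customer_id=9999` raises `ValueError` at the gate and never enters the approval queue, so the operator is not asked to approve a hallucinated argument.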
6

Resolved

What the prior steps compiled to
bundle v3 · diff-able

The read-back. Not an editing surface — a verification one.

Spec + workspace + plugins + skills + tools resolve into one versioned bundle. Every bundle has a digest, a created-at, and a parent. You can diff v3 → v2 in the UI and revert in one click.

This is the file the bot actually loads at boot. What you see here is what runs.

bundle v3 · resolved openclaw.json
"_meta": {
  "bundle_version": 3,
  "bundle_digest": "sha256:0cec2d…bbcd10",
  "openclaw_version": "2026.5.7",
  "compiled_at": "2026-05-09T22:01Z"
},
"agent": {
  "id": "lead-qualifier",
  "customer_id": "brunner-immobilien",
  "model": "middleware/claude-sonnet-4-6"
},
"plugins": {
  "allow": [
    "email-channel",
    "telegram-channel",
    "dashchat-channel",
    "bundle-sync", // +4 more
  ]
},
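
One way a bundle digest and a version-to-version diff could work is sketched below. Hashing a canonical JSON serialisation is an assumption; the page only shows that the digest carries a `sha256:` prefix. `bundle_digest` and `diff_bundles` are hypothetical helpers, and the two bundles are trimmed-down examples.

```python
import hashlib
import json


def bundle_digest(bundle: dict) -> str:
    """Digest of a canonical serialisation, so equal content gives equal digests."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_bundles(old: dict, new: dict, prefix: str = "") -> list[str]:
    """List dotted paths whose values differ between two bundles."""
    changes = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}{key}"
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            changes += diff_bundles(a, b, path + ".")   # recurse into sections
        elif a != b:
            changes.append(path)
    return changes


v2 = {"agent": {"id": "lead-qualifier"},
      "plugins": {"allow": ["email-channel"]}}
v3 = {"agent": {"id": "lead-qualifier"},
      "plugins": {"allow": ["email-channel", "telegram-channel"]}}

print(diff_bundles(v2, v3))                     # ['plugins.allow']
print(bundle_digest(v2) == bundle_digest(v3))   # False
```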
7

Deploy

Ship it. Watch it land.
live · v3

An eight-phase deploy that reports back as it runs.

Provisions a dedicated machine for this agent, copies the bundle, boots the runtime, runs a smoke check that fires a real tool call and waits for the side effect to land — only then is the bundle marked live. Container-started is not deployed.

If anything fails, you stay on the prior bundle. No partial deploy, no "well, it crashed in production".

deploy lead-qualifier · region fra
bundle: v3 (latest)
region: fra
machine: deflation-lead-qualifier
rollback: v2 (one click)
deploy is async · polls /state every 3s · ~5 min worst case
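
The rule that a bundle is marked live only after the smoke check passes, and that any failure leaves the prior bundle in place, reduces to the sketch below. It collapses the eight phases into two hypothetical callbacks, `boot` and `smoke_check`, and is not the actual deploy code.

```python
from typing import Callable


def deploy(live: str, candidate: str,
           boot: Callable[[str], None],
           smoke_check: Callable[[str], bool]) -> str:
    """Return the bundle version that is live after the attempt."""
    try:
        boot(candidate)               # container started, not yet deployed
        if smoke_check(candidate):    # real tool call, side effect observed
            return candidate          # only now is the candidate marked live
    except Exception:
        pass                          # any failure falls through to the prior bundle
    return live


def boot_ok(version: str) -> None:
    pass


def boot_crash(version: str) -> None:
    raise RuntimeError("runtime failed to boot")


print(deploy("v2", "v3", boot_ok, lambda v: True))      # v3
print(deploy("v2", "v3", boot_ok, lambda v: False))     # v2
print(deploy("v2", "v3", boot_crash, lambda v: True))   # v2
```
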
What's in the harness

The four things every production agent needs.

None of these are novel ideas — they're what you end up building if you run an agent for long enough. Bundling them into one tool was the actual unlock.

Specialized tools

The agent only gets the verbs it actually needs for its job. A bookkeeper sees the books; a lead qualifier sees the CRM. Smaller toolbox, smaller blast radius.

Approval gates

Anything that touches the real world pauses for a one-tap approval before it runs. The agent drafts; you ship. Arguments are validated at the gate, so a hallucinated customer ID never makes it through.

Real-time cockpit

Every tool call appears in a live feed — what it called, what it got back, how long it took. No more reading logs after the damage.
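
A feed entry of this shape (what was called, what came back, how long it took) could be captured with a simple wrapper around each tool. The `traced` decorator and the in-memory `FEED` list are assumptions for illustration; a real cockpit would stream events rather than append to a list.

```python
import time
from functools import wraps

FEED: list[dict] = []   # stand-in for the live feed


def traced(tool):
    """Record each successful call's name, arguments, result and duration."""
    @wraps(tool)
    def wrapper(**args):
        start = time.perf_counter()
        result = tool(**args)   # an exception here propagates and is not recorded
        FEED.append({
            "tool": tool.__name__,
            "args": args,
            "result": result,
            "ms": round((time.perf_counter() - start) * 1000, 2),
        })
        return result
    return wrapper


@traced
def crm_lookup(customer_id: str) -> str:
    return f"record for {customer_id}"


crm_lookup(customer_id="brunner-immobilien")
print(FEED[-1]["tool"], FEED[-1]["result"])
```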

Hosted with isolation

Each agent gets its own dedicated machine, its own private storage bucket, its own deploy slot. Privacy isolation is the default, not the upgrade tier.

— Early access —

Want to know when this opens up?

I'm using it on my own agents first while I find what's still broken. When it's ready for one more set of hands, I'll email you. No pitch deck, no waitlist drip — one email, when it's real.

One email per signup. No newsletter, no marketing automation.