Our Practice Engineering · 2026
Working With Agents

We build with agents — specs lead, we stay in control.

First-class specs, a spec-first flow, every role in the loop, and a harness that closes it end to end.
High-level spec → API + UX spec → Technical spec → Code · top-down · traceable
01
PART 01

New First-Class Citizens
Multi-Level Artifacts

The specs are the asset. Everything else is derived from them.
02
First-Class Citizens

Beyond code — documents go first-class.

Code was always the one artifact we version, review, and trust as the source of truth. Now these documents earn the same standing:

High-level spec · the what & why
API spec · the service contract
UX spec · the interaction contract
Technical spec · the how, per platform
03
Multi-level spec stack

Top-down, traceable back.

derives
traces back
00
Intent
The problem to solve — a bug, a request, a refactor
From Jira / Confluence / Redmine
01
High-level spec
openspec · proposal.md · design.md
What & why — behavior and requirements
02
API spec
OpenAPI schema
API contract — frontend and backend build to it
UX spec
markdown — screens, layouts, interactions, flows
+ HTML prototype
Interaction contract — screens and behavior
→ lives in references/
03
Technical specs (in openspec format)
Backend
Web
iOS
Android
Code & tests
Derived — and traceable back to every layer
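One way to picture the stack above is as a chain of artifacts, each recording the layer it was derived from. The sketch below is illustrative only: the `Artifact` class, its fields, and the sample names are assumptions made for this example, not part of openspec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    """One layer of the spec stack (hypothetical model, not an openspec type)."""
    level: str
    name: str
    derived_from: Optional["Artifact"] = None


def trace_back(artifact: Artifact) -> list[str]:
    """Walk from an artifact up to the intent it was derived from."""
    chain = []
    current: Optional[Artifact] = artifact
    while current is not None:
        chain.append(f"{current.level}: {current.name}")
        current = current.derived_from
    return chain


# Sample chain; the names are placeholders for illustration.
intent = Artifact("00 Intent", "site management request")
high_level = Artifact("01 High-level spec", "proposal.md", intent)
api_spec = Artifact("02 API spec", "OpenAPI schema", high_level)
tech_spec = Artifact("03 Technical spec", "Backend spec.md", api_spec)
code = Artifact("Code & tests", "backend implementation", tech_spec)

for step in trace_back(code):
    print(step)
```

Storing only the upward link keeps "derives" and "traces back" as two views of the same relation.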
04
Example · Site Management

What the artifacts actually look like.

One real feature in openspec — proposal, contracts, UX, and per-endpoint specs, all traceable.
High-level spec
openspec · site-management-for-portal/
proposal.md
specs/site-hierarchy/spec.md
specs/site-permissions/spec.md
specs/site-device-filtering/spec.md
specs/floor-plan-display/spec.md
specs/view-module/spec.md
API spec
OpenAPI · api-ssot
Site CRUD endpoints
Device ↔ site assignment
Permission scopes
the contract both ends build to
UX spec
artifacts/
ux-spec.md
prototype.html
filter-combo/
device-picker-dialog/
tree-view-list/
key-screens/ ×13
Technical spec
openspec · per screen / endpoint
spec.md
proposal.md
design.md
tasks.md
↓ code + tests
Filter Combo prototype
filter-combo/ · prototype.html
Device Picker prototype
device-picker-dialog/ · prototype.html
05
PART 02

Team
Workflow

Most stages pair a technique with the artifact it produces.
06
The Flow

From intent to implementation.

01 Clarify intent · Humans align on the problem
02 Write the spec · Finalize + PR / review
03 Break down · API + UX spec, discuss
04 Build · Per endpoint, implement
05 Fix in production · Fix issues after go-live
04 fans out to Backend · Web · iOS · Android
07
Team Workflow · Stage 01

Clarify intent

PO · PM · RD · UX · Architect · no AI
Human · PO presents the problem & context · framing
Human · Team discusses to understand it · facilitation
Human · Agree scope: in / out · scoping
Output · Clarified intent · agreed scope
project bg · lore
misc knowledge
constraints
References · superpowers · AI Fluency — Delegation · Anthropic
AI Fluency
Delegation · Description · Discernment · Diligence
08
In practice · Stage 01

Clarify intent

Site Management · in development
used  brainstorming  ·  probing the PM  — FE · BE · UX, no agent yet
PM · asks
“Build a Site module.” A feature request handed to the team — a solution, not yet a problem.
FE · BE · UX · probe
Pushed back on the PM — what are we actually trying to solve? Don't build the feature as stated; find the problem behind it.
FE · BE · UX · brainstorm
With the real problem clear, brainstormed the direction together across front end, back end, and UX.
Team · agree
Aligned on the problem and a rough scope — what's in, what's out (incl. breaking removals).
Output · clarified intent
The team owns the problem, not just the ticket. No agent has touched it yet — this is the human gate.
The real problem · dug out by probing
P1 Devices aren't tied to a physical site
P2 Management complexity grows with device count
P3 No inventory basis when expanding / replacing
P4 Monitoring vs. management roles blur at scale
P5 No site-level base for future features
Agreed scope · in / out
IN · unified Site module · site-first workflow · multi-level Site › Subsite · single assignment
OUT · breaking · View: Edit site · Device: Create site, Add device
09
Team Workflow · Stage 02

Write the spec

PM · RD + coding agent
Human · Hand off the refined intent · handoff
Agent · Formalize into openspec · openspec workflow
Agent · Draft proposal · design · specs · extract-behavior
Human · Validate & PR review · spec review
Output · High-level spec · proposal.md · design.md · specs/
clarified intent
conventions · project bg
architecture · the codebase
conventions
References · OpenSpec · AI Fluency — Description · Anthropic
10
In practice · Stage 02

Write the spec

Site Management · in development
used  openspec workflow  ·  adversarial review  — the whole team shaped it with the agent, then hunted for gaps
PM · input
A high-level request, not a spec — a Confluence page. “Site / Subsite — High Level Spec for AI-coding” v0.9
Team ⇆ Agent · shape together
PM · RD · UX · Architect shape it into openspec format together with the agent — proposal + capabilities + scenarios, gated by openspec validate.
Agent · adversarial
Ran an adversarial review — the reviewer must find issues, “looks good” not allowed — hunting the spec for conflicts, missing constraints, undefined edge cases.
Team ⇆ Agent · revise ×5
Every gap became a real change to the specs, before any code:
+ error-handling + Google Maps · fix conflicts, add constraints · rename Subsite → Area · map: center-pin drag · device-move warning
Output · source of truth
A validated openspec change — 8 capabilities, 2 breaking module changes. Code follows this, not the Confluence page.
proposal.md · ## Why
Currently, site management is fragmented across View and Device modules, leading to inconsistent experiences and duplicated functionality. A dedicated Site module will provide a single source of truth for organizational hierarchy…
specs/site-hierarchy/spec.md
### Requirement: Site hierarchy structure
The system SHALL support Site > Area (L1–L4) > Device.
#### Scenario: Prevent exceeding depth
WHEN user adds an Area under a Layer-4 Area
THEN system rejects — max depth reached
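The depth scenario above could be enforced along these lines. This is a minimal sketch, assuming an in-memory tree; `Node`, `add_area`, and `MaxDepthReached` are hypothetical names introduced for illustration, and devices are left out.

```python
from dataclasses import dataclass, field

MAX_AREA_DEPTH = 4  # from the requirement: Area layers L1-L4


class MaxDepthReached(Exception):
    """Raised when an Area would be created below a Layer-4 Area."""


@dataclass
class Node:
    name: str
    kind: str        # "site" or "area" (devices omitted in this sketch)
    layer: int = 0   # 0 for a Site, 1-4 for an Area
    children: list["Node"] = field(default_factory=list)


def add_area(parent: Node, name: str) -> Node:
    """Add an Area under a Site or Area, enforcing the depth limit."""
    if parent.layer >= MAX_AREA_DEPTH:
        raise MaxDepthReached(f"cannot add '{name}': max depth reached")
    area = Node(name=name, kind="area", layer=parent.layer + 1)
    parent.children.append(area)
    return area


# Build Site > L1 > L2 > L3 > L4, then try one level too many.
site = Node("HQ", "site")
current = site
for i in range(1, 5):
    current = add_area(current, f"Area L{i}")

try:
    add_area(current, "Area L5")
except MaxDepthReached as err:
    print(err)
```

The check sits in one function so a test for the scenario can map directly onto the WHEN / THEN pair.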
References · OpenSpec · Adversarial Review · BMAD-Method
11
Team Workflow · Stage 03

Break down

RD · UX + coding agent
Base · High-level spec
RD · Agent · Derive the API contract · shape & validate / OpenAPI
UX · Agent · Draft UX spec + HTML prototype · review the flow / headless UI
Output · Contracts · API spec · UX spec · HTML prototype
API design principles · existing contracts
design references · component library
12
In practice · Stage 03

Break down (1)

Floor Plan · shipped
used  hadlc-breakdown  (api-designer · openspec-prototype)
Base · The floor-plan high-level spec, from Stage 02.
RD · Agent · API contract
Derive the OpenAPI
floor-plan CRUD + device-position / FOV — the shape both ends build to
UX · Agent · prototype + UX
Build the clickable prototype
the screens as something you can click — the tree and the canvas
built in parallel — the prototype surfaces what the API needs; the contract shapes the prototype
and it loops back up — half-way through, the contracts can reveal the high-level spec itself is wrong. Change the spec, then re-derive.
Output · API spec · UX spec · clickable prototype — reconciled and reviewable before a line of code.
API contract · NEW · OpenAPI · api-ssot
GET POST /sites/{siteId}/floor-plans
GET PATCH DELETE /floor-plans/{id}
POST /floor-plans/{id}/upload
GET POST /floor-plans/{id}/device-positions
the API spec that didn't exist before
UX spec · floor-plan screens
· Upload floor-plan image
· Place devices on the plan
· Set camera FOV & direction
· Folder overview (recursive thumbnails)
13
In practice · Stage 03

Break down (2)

Site Management · in development
used  web-prototype-builder-vue  — the prototype kept moving, round after round against the Figma frames
Base · UX spec
The hard screens to prototype — the filter toolbar and the device picker.
UX ⇆ Agent · build
Scaffolded the interactions — toolbar, filter menus, trigger pipeline, empty and lazy-load states.
UX ⇆ Agent · align to Figma
Round after round against the Figma frames; tokenized to the VORTEX 1.0 design system.
UX ⇆ Agent · sync with shipped
Matched the prototype to the shipped behaviour, across all three toolbar variants.
Output · signed off
20+ revisions before it matched — a working screen the team clicks and signs off, not a static mockup.
Filter Combo prototype
filter-combo/ · clickable prototype.html
Device Picker prototype
device-picker-dialog/ · clickable prototype.html
14
Team Workflow · Stage 04

Build

RD + coding agent
Agent · Generate per-endpoint technical spec · openspec
Human · Review the technical spec · api-and-interface-review
Agent · Implement with tests · TDD
Human · Code review · code-reviewer
Output · Working code · technical spec · code + tests
API + UX spec · the codebase
API + UX spec · conventions
the codebase · tests · tech stack
conventions
15
In practice · Stage 04

Build (1)

Floor Plan · shipped
used  openspec proposal → apply  ·  TDD  ·  code-reviewer
Base · from breakdown
The floor-plan API contract + UX spec, handed down from Stage 03.
Agent · openspec proposal
Ran /openspec:proposal — scaffolded the change: proposal.md · design.md · tasks.md.
RD · review the tech spec
Validated the design & tasks before any code — the gate that catches problems early.
Agent · openspec apply
Ran /openspec:apply — implemented the tasks, test-first, on each surface.
RD · review the code
Five-axis code review — correctness, readability, architecture, security, performance — before merge.
Output · shipped
Merged, live — on every surface. Spec → code, traceable.
the contract · OpenAPI · api-ssot
POST /sites/{siteId}/floor-plans
GET /floor-plans/{id}
PATCH DELETE /floor-plans/{id}
POST /floor-plans/{id}/upload
FloorPlan: id · siteId · name · imageUrl
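To make "the contract both ends build to" concrete, the routes above can be held as data and checked against a concrete request. This is only a sketch: the route table is copied from the contract excerpt, while `in_contract`, `_template_to_regex`, and the string field types on `FloorPlan` are assumptions for illustration, not generated from the real OpenAPI file.

```python
import re
from dataclasses import dataclass

# Routes copied from the contract excerpt; everything else here is illustrative.
CONTRACT = {
    "/sites/{siteId}/floor-plans": {"POST"},
    "/floor-plans/{id}": {"GET", "PATCH", "DELETE"},
    "/floor-plans/{id}/upload": {"POST"},
}


@dataclass
class FloorPlan:
    # Field names come from the contract; the string types are assumed.
    id: str
    siteId: str
    name: str
    imageUrl: str


def _template_to_regex(template: str) -> re.Pattern:
    """Turn '/floor-plans/{id}' into a regex matching one path segment per parameter."""
    parts = re.split(r"\{[^}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def in_contract(method: str, path: str) -> bool:
    """True if the method and concrete path are covered by the contract."""
    for template, methods in CONTRACT.items():
        if _template_to_regex(template).match(path):
            return method.upper() in methods
    return False


print(in_contract("POST", "/floor-plans/42/upload"))
print(in_contract("DELETE", "/sites/1/floor-plans"))
```

A check like this could run in a client test suite so that a call outside the contract fails before it reaches the backend.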
the technical spec · openspec · floor-plan-management
### Requirement: Floor Plan CRUD
#### Scenario: Create new floor plan
#### Scenario: Invalid floor plan name (pattern)
#### Scenario: Delete floor plan
16
In practice · Stage 04

Build (2)

Floor Plan · shipped
The same OpenAPI contract — but each surface does a different job, all built test-first and reviewed before merge.
Backend
vortex-backend · Go
Owns the data & contract
floor-plan CRUD · device positions · FOV, persisted
floor_plan_controller.go
floor_plan_repository.go
Web
app-vsaas-portal · Vue 3
The editor
drag devices onto the plan, draw each camera's FOV on a canvas
FloorPlanCanvas.vue · renderers/
FloorPlanCanvas.test.js
iOS
ioscharmander · Swift
The viewer
fetch & render the plan, tap a camera marker → live view
FloorPlanManager.swift
FloorPlanManagerTest.swift
Android
AndroidCharmander · Kotlin
The viewer
plan + camera-marker overlay, search across floor plans
FloorPlanViewerScreen.kt · CameraMarkerOverlay.kt
FloorPlanViewerViewModelTest.kt
17
Team Workflow · Stage 05

Fix in production

RD + coding agent
Agent · Monitor site & logs · CI runner / monitor
QA · human · Verify & reproduce · manual + regression
Agent · Triage the issue · log / issue triage
Human · Judge & decide the fix · investigation
Output · Fix shipped · via the spec · post-mortem
logs · error tracking
the specs · test cases
References · Postmortem culture · Google SRE · AI Fluency — Diligence
18
In practice · Stage 05

Fix in production (1)

Floor Plan · shipped
used  Redmine  ·  systematic-debugging  — trace the symptom back to the spec
Live · shipped
Floor Plan running in production on Portal, iOS and Android.
QA · reproduced
Moved a device to another Site — the camera count diverged: Portal 2 · iOS 2 · Android 1. Redmine #59610
Team · diagnose
Not a rendering bug on any one platform — the spec never defined what happens when a device moves to another Site.
Team ⇆ Agent · fix the spec
Three symptom tickets closed Won't Fix — define the behavior in the spec, not three local patches, then re-walk the flow.
Output · one definition
One spec change — every platform reads the same rule. Fix the spec, not the symptom.
Redmine #59610 · tracker: Spec
[Spec][Floor Plan] Spec doesn't define device “move to other sites” behaviour
move 1 of 2 cameras to another Site → Portal 2 · iOS 2 · Android 1
one gap → three symptom tickets
#59495 Android — count differs · Won't Fix
#59533 Android/iOS — Site count · Won't Fix
#59727 Portal — camera wrong · Won't Fix
patch each = 3 patches that drift again. fix the spec = one source of truth.
19
In practice · Stage 05

Fix in production (2)

Floor Plan · shipped
used  Redmine  ·  openspec  — trace a cluster of bugs back to one component spec
Live · shipped
The Floor Plan left-hand navigation tree, running in production.
QA · symptoms
A run of tree tickets — search broken, wrong device counts, new sites not listed, hierarchy inconsistent. Redmine #59730 · #59725 · #59726 · #59627 …
Team · diagnose
Not seven separate bugs — one broken tree component: clicking a node expanded instead of selecting; a fixed shallow depth when the data nests far deeper; every feature reinvented its own tree.
Team ⇆ Agent · respec & re-walk
Instead of seven patches — a new openspec change for a generic tree component, then back through breakdown and build from the top.
Output · one component
A shared tree-view component — floorplan consumes it now; device list and views are next. Fix the spec, not the symptom.
Rebuilt tree-view component prototype
TreeView · the rebuilt component · 7 levels
the spec change · openspec · add-tree-view-component
NEW tree-view-component spec
click label = select, chevron = expand
behavior by node kind, not depth
search: auto-expand + highlight
up to 7 levels · mixed siblings
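The spec rules above can be read as a small state model. The sketch below is written in Python for brevity, not as the shipped Vue component; `TreeNode`, `TreeView`, and the example node kinds are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TreeNode:
    label: str
    kind: str  # e.g. "site", "area", "device" (assumed kinds)
    children: list["TreeNode"] = field(default_factory=list)
    expanded: bool = False
    highlighted: bool = False


class TreeView:
    def __init__(self, roots: list[TreeNode]):
        self.roots = roots
        self.selected: Optional[TreeNode] = None

    def click_label(self, node: TreeNode) -> None:
        """Clicking the label selects; it never toggles expansion."""
        self.selected = node

    def click_chevron(self, node: TreeNode) -> None:
        """Clicking the chevron toggles expansion; selection is untouched."""
        node.expanded = not node.expanded

    def search(self, query: str) -> int:
        """Highlight matches and expand their ancestors; return the match count."""
        q = query.lower()

        def visit(node: TreeNode) -> int:
            node.highlighted = q in node.label.lower()
            below = sum(visit(child) for child in node.children)
            if below:
                node.expanded = True  # auto-expand so the match is visible
            return below + int(node.highlighted)

        return sum(visit(root) for root in self.roots)
```

Because the traversal is recursive and never inspects depth, the same code handles seven levels or mixed sibling kinds without special cases.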
the code change · app-vsaas-portal
NEW components/TreeView/TreeView.vue
one component, node-kind driven
floorplan/…/FloorPlanTreeSidebar.vue
now consumes the shared TreeView
replaced the per-feature trees
20
In practice · Beyond the five stages

Beyond the flow (1)

Dev harness · shipped
used  superpowers:brainstorming  ·  writing-skills  — the skill that test-drives the skills it writes
BE dev · the itch
“The front end has no throwaway URL to validate a branch against.” Not a feature — just friction.
FE ⇆ Agent · brainstorm
Used superpowers:brainstorming to design a per-branch preview harness — a design spec, not code first.
FE ⇆ Agent · writing-skills
Turned the design into a reusable skill — a /harness-deploy command + provisioning scripts the whole team runs.
Agent · self-verify
writing-skills is TDD for skills — it pressure-tests the new skill with subagents, confirms it triggers and complies, and closes the loopholes before anyone relies on it.
UAT · spec amended
The target app rejected manual deploys (git-connected Amplify) — retargeted mid-build to oblivionis-preview. The spec moved, like any other.
Output · a new skill
pnpm harness:deploy <slug> → a per-branch HTTPS preview URL. The team taught the harness a new trick.
the design · superpowers:brainstorming
specs/2026-04-21-portal-harness-preview-design.md
pnpm harness:deploy <slug> → per-branch HTTPS URL
pnpm harness:destroy <slug> → clean up
guard rails: only harness- slugs, never protected branches
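The guard rails in the design could be expressed as a single validation step that runs before deploy or destroy. This is a sketch under stated assumptions: the design only says "harness- slugs, never protected branches", so the list of protected names and the allowed slug characters below are guesses for illustration, and `check_slug` is a hypothetical helper, not the real script.

```python
import re

SLUG_PREFIX = "harness-"                    # from the design: only harness- slugs
PROTECTED = {"main", "master", "develop"}   # assumed list; the design does not name them
_SLUG_RE = re.compile(r"^harness-[a-z0-9]+(?:-[a-z0-9]+)*$")  # assumed slug shape


def check_slug(slug: str) -> str:
    """Return the slug if it may be deployed or destroyed, else raise ValueError."""
    if slug in PROTECTED:
        raise ValueError(f"'{slug}' is a protected branch")
    if not _SLUG_RE.match(slug):
        raise ValueError(
            f"'{slug}' must start with '{SLUG_PREFIX}' and use only a-z, 0-9 and '-'"
        )
    return slug


print(check_slug("harness-site-tree"))
```

Failing closed (raise unless the slug matches) means a typo cannot tear down an environment the harness does not own.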
the skill · writing-skills
/harness-deploy · /harness-destroy
scripts/provision-harness.sh
scripts/deploy-harness.sh · destroy-harness.sh
agent contract: ::preview-url::<url>
pressure-tested with subagents before ship
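The `::preview-url::<url>` marker gives an agent one stable line to look for in the deploy output. A minimal parsing sketch follows; it assumes the marker starts its own line and that the last occurrence wins, neither of which the source states, and the sample log and URL are made up.

```python
from typing import Optional

MARKER = "::preview-url::"


def extract_preview_url(output: str) -> Optional[str]:
    """Return the URL from the last marker line in the script output, if any."""
    url = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(MARKER):
            # Text after the marker is the URL; an empty remainder counts as missing.
            url = line[len(MARKER):].strip() or None
    return url


# Hypothetical deploy log for illustration.
log = """provisioning...
uploading build...
::preview-url::https://harness-demo.example.com
done
"""
print(extract_preview_url(log))
```

A fixed marker keeps the agent from guessing at URLs in free-form log text.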
21
In practice · Beyond the five stages

Beyond the flow (2)

extract-behavior · in the skill library
used  writing-skills  — capture the method once, every agent runs the same rigor
Pattern · noticed
Every brownfield domain hits the same wall — “the code is the spec” — and someone re-derives it by hand each time.
RD ⇆ Agent · design the method
Rather than repeat the work, designed one repeatable method: scope → discover → draft → audit → review → register.
RD ⇆ Agent · the hard part
Baked in a leak audit so specs stay behavioural, not code — MQTT → “streaming”, S3 → “the system stores”, JWT → “token”.
RD ⇆ Agent · capture as a skill
Wrote it up as extract-behavior — the method + OpenSpec format — so any agent runs the same rigor, not just its author.
Output · reusable
A team skill — applied since to Permission (4 scattered sources → one spec) and every new brownfield domain.
the design · extract-behavior/SKILL.md
workflow · scope → discover → draft → audit → review
rule · one capability = one spec.md
discover · parallel backend + frontend agents
audit · strip impl detail — keep behaviour
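The audit step can be approximated with a simple scan for implementation terms. The sketch below uses only the three term-to-wording pairs named in the text (MQTT, S3, JWT); the function name, the output shape, and the sample spec text are assumptions, and the real skill may work quite differently.

```python
import re

# Term -> behavioural wording, taken from the examples in the text.
LEAK_TERMS = {
    "MQTT": "streaming",
    "S3": "the system stores",
    "JWT": "token",
}


def audit_leaks(spec_text: str) -> list[tuple[int, str, str]]:
    """Return (line number, leaked term, suggested wording) for each hit."""
    findings = []
    for lineno, line in enumerate(spec_text.splitlines(), start=1):
        for term, suggestion in LEAK_TERMS.items():
            # Whole-word match so e.g. "S3" does not fire inside a longer token.
            if re.search(rf"\b{re.escape(term)}\b", line):
                findings.append((lineno, term, suggestion))
    return findings


# Made-up spec text for illustration.
sample = (
    "The system SHALL publish events over MQTT.\n"
    "The user signs in.\n"
    "Snapshots are written to S3 with a JWT."
)
for finding in audit_leaks(sample):
    print(finding)
```

The scan only flags; rewriting the sentence into behavioural wording stays with the reviewer.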
proof · applied to Permission
4 sources: Excel · Casbin · Vue · iOS → openspec/specs/permission/spec.md
source: extracted+authored
one skill → every brownfield domain after
22
PART 03

Roles
in the Loop

How each role works now — and stays in the loop with the others.
23
In the Loop · Role 01

RD — owns the spec, delegates the build.

Before, RD’s job ended at merge — now it never really does.
RD
Research & Development
The shift — RD moved up a loop
Outer loop · CI/CD, deploy & ops · days–weeks
Middle loop · RD now · Supervise agents — decompose work, calibrate trust, catch plausible-but-wrong output, keep architecture coherent · hours–days
Inner loop · Write, test, debug · minutes–hours
Inner / middle / outer loops · ThoughtWorks — Future of Software Engineering
Skills RD drives
brainstorming openspec api-and-interface-review TDD code-reviewer harness-deploy dogfood smoke-test
Collaboration loops — not hand-off-and-forget
Agent
Delegate the build; review the technical spec and the diff — the agent proposes, RD stays accountable.
PM
Co-shape the high-level spec — feasibility, cost, and risk flow back before scope locks.
UX
Build to the UX spec and prototype; the prototype ⇆ API loop keeps design and contract in sync.
QA
Ship with tests already written; triage production issues together, then patch the spec.
24
In the Loop · Role 02

PM — owns intent, keeps the spec alive.

Before, PM wrote a doc and moved on — now the spec is a living contract.
PM
Product
The shift
  • Own the high-level spec — the WHAT, not the HOW.
  • Turn vague ideas into an agreed, testable spec.
  • Keep the spec true as reality changes.
Skills PM drives
brainstorming openspec adversarial-review
Collaboration loops — not hand-off-and-forget
Agent
Refine intent into a structured spec with the agent; validate every draft, then iterate.
RD
Hand over intent and scope; get feasibility, cost, and risk back before committing.
UX
Align on user flows early — the UX spec grows from shared intent.
QA
Write acceptance criteria into the spec so QA can verify against it.
25
In the Loop · Role 03

QA — verifies against the spec, from day one.

Before, QA arrived at the end — now QA is in from the spec.
QA
Quality Assurance
The shift
  • Verify against the spec — not ad-hoc clicking.
  • Write test cases from the spec, early.
  • Watch production with the agent; loop bugs back to the spec.
Skills QA drives
brainstorming generate-test-cases smoke-test
Collaboration loops — not hand-off-and-forget
Agent
The agent monitors site and logs; QA verifies and reproduces — findings converge to triage.
PM
Turn acceptance criteria in the spec into concrete, verifiable cases.
RD
Tests are written before hand-off; triage production issues together.
UX
Verify the build against the UX spec and prototype — not a screenshot.
26
In the Loop · Role 04

UX — ships a spec others can build and verify.

Before, UX threw a mockup over the wall — now the prototype is a living contract.
UX
Experience Design
The shift
  • Write the UX spec and build an HTML prototype.
  • Iterate on headless components — prototype ⇆ API.
  • Hand off a spec QA can verify against.
Skills UX drives
brainstorming openspec prototype generate-flow-spec
Collaboration loops — not hand-off-and-forget
Agent
Draft the UX spec and generate the prototype with the agent on our headless components.
PM
Translate intent into flows — the UX spec grows from the high-level spec.
RD
The prototype ⇆ API loop keeps design and contract honest as both evolve.
QA
Deliver a verifiable UX spec into references/ so QA can check against it.
27
PART 04

Context
& Harness

What we feed the agents — and the loop we're automating.
28
Context Engineering

Engineering the agent's context.

Fill the agent's window with the right information — durable knowledge it always needs, plus what's pulled in just-in-time as it works.

Durable knowledge
Project background · What & why
Tech stack · Languages, frameworks, infra
Conventions & principles · Style, patterns, architectural rules
Architecture & decisions · Boundaries, ADRs, the why
Constraints & NFRs · Performance, security, compliance
Specs & contracts · behavioral · API spec · technical ↳ slides 3–5
Lore · History, domain terms, tribal knowledge
Misc knowledge · Gotchas, edge cases, references
Pulled in at runtime
The codebase · Files by path / grep
Tools & integrations · MCP · Jira · Redmine · CI
Tests · Expected behavior
Memory · Cross-session state
Examples · Few-shot, reference implementations
Agent skills · Reusable procedures the agent invokes
29
In progress
What We're Building

Closing the loop, end to end.

Detect — the loop runs itself
01 CI runner · Triggers the loop on every change
02 Agent · Picks up and does the work
03 Monitor · Watches the live site & logs
04 Generate issues · Files what it finds for triage
Resolve — a human decides every change
05 Monitor issues · Jira · Redmine · GitHub
06 Triage · Cluster, dedupe, prioritize
07 Propose change · Agent drafts the fix
spec change · code change · wontfix
08 Review · Human approves before it lands
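The triage step (cluster, dedupe, prioritize) might look roughly like the sketch below. It is an illustration only: the `Issue` fields, the use of a component name as the clustering key, and ranking by cluster size are all assumptions, not the harness as built.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    id: int
    component: str  # hypothetical field used as the clustering key
    title: str


def triage(issues: list[Issue]) -> list[tuple[str, list[Issue]]]:
    """Dedupe by normalized title, cluster by component, largest cluster first."""
    seen = set()
    clusters = defaultdict(list)
    for issue in issues:
        key = (issue.component, " ".join(issue.title.lower().split()))
        if key in seen:
            continue  # duplicate report of the same symptom
        seen.add(key)
        clusters[issue.component].append(issue)
    # Bigger clusters first; ties broken by component name for stable output.
    return sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Made-up issues for illustration.
issues = [
    Issue(1, "tree", "Search broken"),
    Issue(2, "tree", "search  broken"),
    Issue(3, "tree", "Wrong device count"),
    Issue(4, "map", "Pin drifts"),
]
for component, group in triage(issues):
    print(component, [i.id for i in group])
```

A large cluster on one component is the signal to look for a spec gap rather than patch each ticket, and the human review step still decides the change.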
30
Working With Agents

Mindset: AI Fluency

These run underneath every stage above — the human skills of Anthropic’s AI Fluency framework, in our own words.
01 Delegation · Decide what to hand the agent — and what stays human.
In our practice · Humans own intent and the spec; agents draft, derive, and implement.
02 Description · Say what you want clearly enough to build from.
In our practice · The spec is the description — high-level intent, API contract, UX spec.
03 Discernment · Judge the output, the process, and the behavior.
In our practice · Code review, api-and-interface review, and human validation gates.
04 Diligence · Stay accountable — verify, trace, own the result.
In our practice · Tests, spec provenance, and fixing the spec — not the symptom.
31
The Work

What we keep investing in.

01 Build an agent-first environment and process.
02 Design the context we feed the agents.
03 Design the harness — the loop that runs the work.
04 Keep fixing what blocks the flow — the harness of the harness.
05 Judge and deliver with the AI Fluency 4D framework.
Questions & discussion
32