DIEGO E. CASSO SOSA
AI AGENT ENGINEER · FORWARD DEPLOYED
REV. 2026
OPEN — US REMOTE (CT/MT)
00 / THESIS
I build
software factories.
Engineered for the long run.
PRODUCTION SYSTEM · NOT A DEMO
// HARNESS ENGINEERING
long agent runs ·
self-verifying ·
human merge gate
HOURS-LONG MULTI-WAVE RUNS · TWO EXECUTION METHODS · HUMAN MERGE GATE
01 / THE PROOF
PROOF BEFORE CLAIMS

Not a side project. A production system.

// The numbers — verified in-code
Hours
Sustained multi-wave runs
per-story window up to 2h
2
Execution methods
local-tmux · remote fleet
0
DoD retry cap
doom-loop guarded
// Walkthrough
01·B / CENZONTLE — FILM
02 / ARCHITECTURE
CENZONTLE: HARNESS FOR LONG AGENT RUNS
§00 · VAULT
Git-backed knowledge base
The brain. Every agent reads it for context and writes back — definitions, epics, stories, ADRs, QA matrices. Every write is a commit: versioned, access-scoped.
vault_glob · vault_grep · vault_read
grounds every step · context ↑ · artifacts ↓
  1. §01 · IN
    Epic
    business intent in
  2. §02
    Story Gen
    DAG of stories
  3. §03 · AGENT
    Forge · Sonnet
    writes code in an isolated worktree
  4. §04 · VERIFY
    4-Layer Eval
    independent model audits the repo
  5. §05 · GATE
    PR · In Review
    human merge gate
EPIC → REVIEW · GROUNDED BY THE VAULT · HUMAN-GATED
// How it actually works
N1

Harness, not prompts. Invariants live in code that always runs — mandatory .gitignore, build-artifact stripping, secret redaction. The model is never trusted to remember them.
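
A minimal sketch of what "invariants live in code" could look like, assuming a harness step that runs on every worktree regardless of what the model did. The entry lists, the regex, and the function names are illustrative assumptions, not the real Cenzontle rules.

```python
import re
import shutil
from pathlib import Path

# Hypothetical invariant sets; the real harness's lists are not shown here.
MANDATORY_IGNORES = ("node_modules/", "dist/", ".env")
ARTIFACT_DIRS = ("dist", "build")
SECRET_PATTERN = re.compile(
    r"(?i)\b(api[_-]?key|token|secret)\b(\s*[:=]\s*)(\S+)"
)


def ensure_gitignore(worktree: Path) -> None:
    """Append any mandatory entries that .gitignore is missing."""
    path = worktree / ".gitignore"
    existing = path.read_text().splitlines() if path.exists() else []
    missing = [entry for entry in MANDATORY_IGNORES if entry not in existing]
    if missing:
        path.write_text("\n".join(existing + missing) + "\n")


def strip_artifacts(worktree: Path) -> None:
    """Delete build output directories so they never reach a commit."""
    for name in ARTIFACT_DIRS:
        shutil.rmtree(worktree / name, ignore_errors=True)


def redact(text: str) -> str:
    """Replace the value of anything that looks like a credential."""
    return SECRET_PATTERN.sub(
        lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text
    )


def enforce_invariants(worktree: Path) -> None:
    """Run every filesystem invariant; the model is never asked to."""
    ensure_gitignore(worktree)
    strip_artifacts(worktree)
```

The point of the design is that `enforce_invariants` is called by the harness after every agent step, so a forgotten instruction in a prompt cannot skip it.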

N2

Built for long runs. An epic runs as a multi-wave DAG — waves execute in dependency order, so a full run spans hours. Per-story windows are configurable (default 30 min, up to 2 h) with stuck/hang detection and a doom-loop detector (regression vs stagnation) that stops the run and escalates to a human.
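
A sketch of the two ideas above, under stated assumptions: waves are derived from story dependencies with the standard-library `graphlib`, and the doom-loop check classifies a retry history of failing-check counts. The names `plan_waves` and `classify_loop`, the `patience` parameter, and the exact definitions of regression and stagnation are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group stories into waves; each wave depends only on earlier waves.

    `deps` maps a story to the set of stories it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)
    return waves


def classify_loop(failures: list[int], patience: int = 2) -> str:
    """Classify a retry history of failing-check counts.

    Assumed definitions: 'regression' means the latest attempt failed more
    checks than the one before it; 'stagnation' means the count has not
    changed across the last `patience` retries.
    """
    if len(failures) >= 2 and failures[-1] > failures[-2]:
        return "regression"
    recent = failures[-(patience + 1):]
    if len(recent) == patience + 1 and len(set(recent)) == 1:
        return "stagnation"
    return "progress"
```

In this sketch a harness would stop the run and escalate to a human on either `"regression"` or `"stagnation"`, and keep retrying only on `"progress"`.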

N3

Runs two ways. Auto-routed between in-process local-tmux and a remote compute-node fleet (claim + slots, wave-scheduled). Live output streams to the dashboard over tmux + ttyd; dropped completions persist and replay on reboot.
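
One way to picture "claim + slots" routing, as a sketch only. The `Node` type, the `route` function, and the policy of preferring the fleet and falling back to local-tmux are all assumptions; the actual routing rules are not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical remote compute node with a fixed number of slots."""
    name: str
    slots: int
    claims: list[str] = field(default_factory=list)

    def try_claim(self, story_id: str) -> bool:
        """Claim a slot for a story if one is free."""
        if len(self.claims) < self.slots:
            self.claims.append(story_id)
            return True
        return False


def route(story_id: str, fleet: list[Node]) -> str:
    """Return where a story runs: a fleet node name, else 'local-tmux'.

    Assumed policy: first node with a free slot wins; with no free slot
    anywhere, the story runs in-process on local tmux.
    """
    for node in fleet:
        if node.try_claim(story_id):
            return node.name
    return "local-tmux"
```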

N4

4-layer eval + human gate. Deterministic exit-code gates → differential security scan → LLM-as-judge semantic check → OWASP-scoped review. Then every story lands in review for a human to merge.
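
The layered gate can be sketched as an ordered pipeline that runs the cheap deterministic check first and stops at the first failure. Whether the real harness short-circuits this way is an assumption, as are the names `LayerResult`, `run_eval`, `exit_code_gate`, and `ready_for_review`; only the exit-code layer is implemented, the other three would be plugged in as callables.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class LayerResult:
    layer: str
    passed: bool
    detail: str = ""


# A layer takes a repo path and reports pass/fail.
Layer = Callable[[str], LayerResult]


def exit_code_gate(cmd: list[str]) -> Layer:
    """Deterministic layer: passes only if the command exits with 0."""
    def layer(repo: str) -> LayerResult:
        proc = subprocess.run(cmd, cwd=repo, capture_output=True, text=True)
        return LayerResult("exit-code", proc.returncode == 0, proc.stderr.strip())
    return layer


def run_eval(repo: str, layers: list[Layer]) -> list[LayerResult]:
    """Run layers in order and stop at the first failure."""
    results: list[LayerResult] = []
    for layer in layers:
        result = layer(repo)
        results.append(result)
        if not result.passed:
            break  # later, more expensive layers never run
    return results


def ready_for_review(results: list[LayerResult], expected_layers: int) -> bool:
    """A story reaches human review only if every layer ran and passed."""
    return len(results) == expected_layers and all(r.passed for r in results)
```

Passing all layers only moves the story to review in this sketch; the merge itself stays a human decision.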

N5

Multi-agent orchestration. Generator + independent verifier (different models, to limit confirmation bias); a Sonnet planner consults an Opus advisor before the coder runs.

N6

The Vault is the brain. A git-backed markdown knowledge base — definitions, epics, stories, ADRs, QA matrices — that every agent reads for context (vault_grep) and writes back to. Versioned, access-scoped, and the reason long runs stay grounded instead of hallucinating.
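
A read-side sketch of the three vault tools over a local checkout. The tool names come from the page; their signatures, return shapes, and the path-containment check used here to stand in for "access-scoped" are assumptions. Writes (one commit per write) are left out.

```python
from pathlib import Path


class Vault:
    """Read-only view of a markdown knowledge base rooted at `root`."""

    def __init__(self, root: str) -> None:
        self.root = Path(root).resolve()

    def _scoped(self, rel: str) -> Path:
        """Resolve a relative path and refuse anything outside the vault."""
        path = (self.root / rel).resolve()
        if not path.is_relative_to(self.root):
            raise PermissionError(f"{rel} escapes the vault")
        return path

    def vault_glob(self, pattern: str) -> list[str]:
        """List vault files matching a glob, as vault-relative paths."""
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in self.root.glob(pattern)
            if p.is_file() and p.resolve().is_relative_to(self.root)
        )

    def vault_read(self, rel: str) -> str:
        """Return the full text of one vault file."""
        return self._scoped(rel).read_text(encoding="utf-8")

    def vault_grep(self, needle: str, pattern: str = "**/*.md") -> list[tuple[str, int, str]]:
        """Return (path, line number, line) for every line containing `needle`."""
        hits = []
        for rel in self.vault_glob(pattern):
            for number, line in enumerate(self.vault_read(rel).splitlines(), start=1):
                if needle in line:
                    hits.append((rel, number, line))
        return hits
```

An agent would typically call `vault_grep` to find the relevant definition or ADR, then `vault_read` to pull the whole file into context before it plans or writes code.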

03 / TRACK RECORD
IN PRODUCTION, BEFORE CENZONTLE
PLATE C / PRODUCT LEADERSHIP

Pinnacle Data Hub

Chief Product Officer at an Agents-as-a-Service (AaaS) company. We build robust analysis layers and agentic workflows for industry — fusing autonomous agents with industrial sensor data.

Role: Chief Product Officer
Scope: AaaS platform · agentic workflows · roadmap
Span: Jan 2025 – Present
PLATE A / FULL-STACK · SOLO

Telas y Tejidos Luciana

7-module ERP, built solo, full-stack. Excel → production platform. 10 years of historical data migrated and made queryable.

Role: Technical PO + Full-Stack Dev
Stack: Next.js · NestJS · PostgreSQL
Span: Dec 2024 – Aug 2025
PLATE B / PAYMENTS · ANTI-FRAUD

Aeroméxico / Orion Gonet

Anti-fraud engine orchestrator. 3 specialized engines, parameter-based routing. Technical + business requirements across payments infrastructure.

Role: Technical Product Owner
Stack: [ domain knowledge, not code ]
Span: Nov 2025 – Present
04 / HOW I WORK
PRINCIPLES · NOT TOOLS
01
Ship proof, not promises
Every claim I make links to code or a run you can inspect. Talk is cheap; the repo isn't.
02
Read the code before you trust the claim
I audit what an agent — or a vendor — tells me before it ships. "It works" is a hypothesis, not a result.
03
Boring tech where it counts, bleeding edge where it wins
Postgres and Docker hold the floor so agents and MCP can push the ceiling without the whole thing wobbling.
04
Own it end to end
From the schema to the pixel to the deploy. No handoffs to hide behind, no "that's not my layer."
05
Autonomy needs a gate, not a leash
Let the system run for hours on its own — but keep a human at the merge. That's how you scale trust.
05 / CREDENTIALS
STAMPED · VERIFIED
§ A0 · FLAGSHIP CREDENTIAL
CCAF
Claude Certified Architect — Foundations
Anthropic · 2025
Verified

Certifies production-grade architecture on Claude.

§ A1
Professional Scrum Product Owner I
Scrum.org · 2025
§ A2
Full-Stack Development Bootcamp
TripleTen · 2024
§ A3
B.A. Business Administration
UVM · 2023
§ A4
English — C1 Advanced
CEFR
06 / LET'S BUILD
ONE CALL · ONE CTA
Let's build something
that actually ships.

Open to AI Engineer · Agent Engineer · Forward Deployed Engineer roles with US-based companies. English C1.