Your agent has all the words and none of the grammar. Steps chain via prose: step three writes a sentence, step four re-reads it and hopes it understood. Nothing checks, before the workflow runs, that the chain even fits together. BROCA gives recipes the missing grammar — a gradual typed value model, compile-time contracts, a sandboxed control grammar, a combinator algebra, a skill standard library, catchable recovery, and derived-not-frozen budgets — while every untyped step stays exactly the prose it was. Part of our open Building Jarvis series.
Abstract — v1.1, 14 June 2026
Broca’s area is the brain’s grammar engine — not where words are stored, not where meaning is felt, but the structure that composes a finite vocabulary into an unbounded set of well-formed utterances and rejects the ill-formed ones before they are spoken. We take that as the design thesis of this paper: a workflow language is only as powerful as its grammar — the rules by which small typed pieces compose into larger ones, checked before they run.
An agent recipe — an ordered, portable workflow of steps — is a good unit of operational knowledge, but in its plain form it is a sequence of English instructions. A step’s result reaches the next step as prose that the next step must re-parse, re-interpret, and hope it understood; nothing checks, before the run, that step seven actually consumes a field step three actually produces. This paper describes seven primitives that turn that lossy English-passing pipeline into a typed, composable programming language that an agent can author, type-check, compose, run, recover from failure inside, and sharpen against its own usage.
The core is seven things: a gradual typed value model with named ports; compile-time contracts that check every port and guard reference before a single step is dispatched; a sandboxed boolean control grammar (when:/return:/done:) that is explicitly not a JavaScript eval; a combinator algebra (map/filter/compose/if-then-else) over typed array edges with dynamic worker resolution; a searchable skill standard library whose output schema is adopted at compile so downstream ports type-check against the real contract; catchable recovery with a classified-error taxonomy and an onError: policy router; and all bounds — re-dispatch, fan-out, depth, recovery-retry — derived from live signals, never frozen to a constant.
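To make the compile-time contract concrete, here is a minimal sketch of a port check that runs before any step is dispatched. The recipe shape is an assumption for illustration only: `Step`, its `ins` mapping of port name to (step index, field), and `check_recipe` are hypothetical names, not BROCA's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Step:
    """Hypothetical step shape: an optional JSON-Schema `out` and named `ins` ports."""
    name: str
    out: Optional[dict] = None  # JSON-Schema object; None means an untyped prose step
    ins: Dict[str, Tuple[int, str]] = field(default_factory=dict)  # port -> (step index, field)


def check_recipe(steps: List[Step]) -> List[str]:
    """Return human-readable wiring errors; an empty list means the recipe may run."""
    errors = []
    for i, step in enumerate(steps):
        for port, (src, fld) in step.ins.items():
            where = f"step {i} ({step.name}), port '{port}'"
            if not 0 <= src < len(steps):
                errors.append(f"{where}: step {src} does not exist")
            elif src >= i:
                errors.append(f"{where}: step {src} is not strictly earlier")
            elif steps[src].out is None:
                errors.append(f"{where}: step {src} declares no out: schema")
            elif fld not in steps[src].out.get("properties", {}):
                errors.append(f"{where}: step {src} out: schema has no field '{fld}'")
    return errors


recipe = [
    Step("fetch_pr", out={"type": "object", "properties": {"diff": {"type": "string"}}}),
    Step("summarize"),  # untyped prose step, left exactly as it was
    Step("review", ins={"patch": (0, "diff"), "notes": (1, "summary")}),
]
for message in check_recipe(recipe):
    print(message)
```

In this sketch the `patch` port passes all three checks, while `notes` is refused because it binds to a step that declares no typed output. The refusal happens while the recipe is still a document, which is the point of checking at seed time.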
This is no longer a design that promises a target; it is a shipped language whose seven pillars are implemented and tested in the running substrate. This revision also situates the language against the live open-source ecosystem it now interoperates with — a context-compression layer (chopratejas/headroom) that answers the same lossy-prose problem from the opposite direction, an engineering-discipline plugin (addyosmani/agent-skills) and a marketing-skills corpus (coreyhaines31/marketingskills) whose authoring conventions sharpen ours, and a distribution registry (Journey / journeykits.ai) whose kit/1.0 schema is the same grammar BROCA emits, making recipes a cross-registry interchange format rather than a private DSL.
How it works, in one minute
- Typing is opt-in and additive, never a migration. A step may declare out: (a JSON-Schema object) and in: (named ports bound from earlier steps’ typed outputs). A step that declares neither keeps its exact prose behavior. The directive parser enforces this precisely: a line that begins out: but is not JSON-shaped is treated as ordinary prose, not an error. Three similar steps stay three similar steps; you type one when the guarantee matters.
- Compile-time contracts catch mis-wired recipes before a single subagent is spawned. At seed time the runner checks every in: port and every when: guard reference: is the referenced step real? Strictly earlier? Does its out: schema declare the field? If any check fails, the run is refused with a precise, human-readable message. A skill’s output schema is adopted at compile so even invoke skill: steps type-check downstream. Integration errors that used to surface deep in a live run surface at authoring time instead.
- A sandboxed boolean grammar, not eval. The when: directive is an OR-of-AND-of-comparisons over earlier steps’ typed fields and JSON literals. No parentheses, no arithmetic, no function calls, no path to arbitrary code. A workflow document that an agent authors and a nightly loop mutates must never become an injection vector; the closed grammar is the guarantee, not a lint rule.
- A combinator algebra over typed array edges with dynamic workers. map: and filter: iterate a worker recipe over a typed array; the worker can be a {{steps.n.out.worker}} reference resolved at dispatch, so the recipe applied across a collection is selected by an earlier step’s structured decision. Composition depth ticks once for the whole map step, not once per element. if-then-else is the when: guard plus the return:/done: exit pair; no new machinery needed.
- A skill standard library whose contracts propagate at compile. An invoke skill: step resolves against a versioned SkillLibrary (surface-embedding search with Jaccard fallback, Laplace-smoothed fitness ranking); the resolved skill’s outputSchema is adopted at compile so downstream in: ports type-check against the real contract. A recipe.compose RPC mechanically assembles a recipe from the top-ranked skill hits, so the agent composes workflows from the library rather than re-deriving them in prose.
- Every failure is classified, not raw. A closed ErrorKind set (schema-mismatch, spawn-failure, timeout, budget-exceeded, guard-eval-error, sub-kit-failure, and more) drives the onError: policy router: retry (honored only for recoverable errors, clamped by a derived budget), fallback to a recovery recipe (depth-guarded to catch cycles), or continue-partial (a survivable failure is a recorded done-partial, not a crash). In a map/filter step, continue-partial drops failed elements and keeps aggregating survivors.
- All bounds are derived, never frozen. Re-dispatch budget, combinator fan-out, composition depth, and recovery-retry budget are each a small pure function of live signals (value-of-work, historical confidence, prior effort spent, and affordability), never a hardcoded constant. More structure at stake earns more correction attempts; a thin budget cuts them. A floored constant, where it appears at all, is a safety ceiling on a pathological derivation, never the working value.
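The closed when: grammar is small enough to sketch in full. The evaluator below is an illustration under stated assumptions, not BROCA's parser: the surface syntax (`steps.N.out.field` references, the six comparison operators, lowercase `and`/`or`) and the name `eval_guard` are hypothetical. What it does show is the shape of the guarantee: anything outside the token set fails to parse, so there is nothing to escape from.

```python
import json
import operator
import re

# Assumed token set: step-output references, comparison operators,
# the keywords and/or, and JSON literals. Everything else is rejected.
TOKEN = re.compile(r'''\s*(?:
    (?P<ref>steps\.\d+\.out\.[A-Za-z_]\w*)
  | (?P<op>==|!=|<=|>=|<|>)
  | (?P<kw>(?:and|or)\b)
  | (?P<lit>"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?|true\b|false\b|null\b)
)''', re.VERBOSE)

OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def tokenize(text):
    """Turn the guard into (kind, text) tokens, refusing any unknown character."""
    text, pos, tokens = text.strip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"illegal guard syntax at: {text[pos:]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def split(tokens, keyword):
    """Split a token list on one keyword (there are no parentheses to respect)."""
    groups, current = [], []
    for tok in tokens:
        if tok == ("kw", keyword):
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)
    return groups


def value(tok, outputs):
    """Resolve an operand: a JSON literal or a field of an earlier step's output."""
    kind, text = tok
    if kind == "lit":
        return json.loads(text)
    if kind == "ref":
        _, index, _, fld = text.split(".")
        return outputs[int(index)][fld]
    raise ValueError(f"expected an operand, got {text!r}")


def compare(tokens, outputs):
    """Evaluate one leaf, which must be exactly: operand op operand."""
    if len(tokens) != 3 or tokens[1][0] != "op":
        raise ValueError("expected: operand op operand")
    return OPS[tokens[1][1]](value(tokens[0], outputs), value(tokens[2], outputs))


def eval_guard(guard, outputs):
    """OR of ANDs of comparisons; lists (not generators) so every clause is validated."""
    return any([all([compare(c, outputs) for c in split(conj, "and")])
                for conj in split(tokenize(guard), "or")])


outputs = {1: {"score": 0.9, "label": "bug"}, 2: {"ok": False}}
print(eval_guard('steps.1.out.score >= 0.8 and steps.1.out.label == "bug" '
                 'or steps.2.out.ok == true', outputs))
```

A guard containing arithmetic, parentheses, or a call such as `__import__("os")` raises `ValueError` in `tokenize` before any value is read. The sketch never hands text to the host language's evaluator, which is the difference between a closed grammar and a filtered `eval`.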
Composable, not competing: how this fits the OSS ecosystem
BROCA shares a problem space with three notable open-source projects and a distribution registry. All four are treated as composable pieces the language folds in, not as alternatives to route around.
🗜 chopratejas/headroom (~24.7k ★)
Reversible Compress-Cache-Retrieve: compresses context artifacts with content-type-aware compressors (statistical JSON-array crushing, AST-aware code compression), caches the originals, retrieves them on demand. headroom attacks the lossy-prose problem from the opposite direction — keep the prose, make it smaller and recoverable. BROCA’s gradual typing is exactly why the two compose: most steps in a living library stay prose by design, and for those untyped steps a CCR layer is the right tool. headroom’s reproducible eval suite (GSM8K, TruthfulQA, SQuAD, BFCL) also names the benchmark BROCA still owes for its own typed-edge efficiency claim.
🧠 addyosmani/agent-skills (~56.8k ★)
Anti-trigger + Loading Constraints authoring discipline: skills declare what they must not match and where they may safely load. BROCA’s recipe matcher already supports an anti-trigger field that subtracts from a recipe’s match score on an exact look-alike phrase, the runtime counterpart of Osmani’s “When NOT to use.” Adopting the full convention (declared triggers and anti-triggers, an evals set, a load-site constraint) as part of validateRecipeSpec imports a discipline that two independent corpora already practice, rather than reinventing it.
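As a rough illustration of how an anti-trigger can subtract from a match score, consider the sketch below. The scoring rule (one point per trigger phrase present, a fixed penalty per anti-trigger phrase present) and the names `match_score` and `penalty` are assumptions made for the example; BROCA's actual matcher and weights are not described here.

```python
def match_score(request, triggers, anti_triggers, penalty=1.0):
    """Hypothetical scorer: +1 per trigger phrase found, minus `penalty` per anti-trigger found."""
    text = request.lower()
    hits = sum(1 for phrase in triggers if phrase.lower() in text)
    look_alikes = sum(1 for phrase in anti_triggers if phrase.lower() in text)
    return hits - penalty * look_alikes


# A code-review recipe should not fire on an HR request that merely shares a word.
triggers = ["review", "pull request"]
anti_triggers = ["performance review"]

print(match_score("Review this pull request", triggers, anti_triggers))
print(match_score("Write my performance review", triggers, anti_triggers))
```

The second request still contains the trigger word "review", but the exact look-alike phrase cancels it, so the recipe no longer outranks a better-suited one.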
📣 coreyhaines31/marketingskills (~33k ★)
A 44-skill, 51-CLI marketing corpus whose per-skill user-phrase triggers and evals/evals.json acceptance sets are the second source for the authoring standard BROCA is tightening. Because this corpus arrived at declared triggers and evals sets independently, folding those conventions into validateRecipeSpec adopts a practice already proven elsewhere instead of inventing a new one.
🗄 Journey / journeykits.ai (distribution registry)
Journey publishes workflows as kit/1.0 kits — the same schema BROCA recipes are written in. A real license-gated, attribution-stamped import pipeline distilled four external kits into the live library. Journey answers where a recipe travels; BROCA answers what guarantees it acquires once it is here. That a second producer emits this grammar is the strongest evidence BROCA is a language, not a private DSL.
🔧 Building Jarvis — open source
The substrate, the skills, and the surrounding cognitive stack are built in the open. Follow along or contribute.
Read the paper
📄 PDF upload pending — check back soon for the downloadable paper.
The full paper (v1.1, 14 June 2026) covers all seven pillars, the worked PR-review example, limitations (including the value-model efficiency claim that is argued but not yet benchmarked), and the cross-registry interop proof via a real Journey import pipeline.
Was this useful?
We’re building these in the open and we want your read on them. Did this land — 👍 or 👎? What would you want the next paper to dig into? Tell us in the comments below.

