Humor Embeddings: Laughter from Inverted Memory — Bisociation in Computational Embedding Space

Your agent can recall the nearest memory in milliseconds — but it can’t make a joke, and it can’t tell you why a joke works. Humor Embeddings argues those are the same problem run backwards: memory retrieves what’s nearest to a query; humor searches for what’s at the right distance and can still be joined by an unexpectedly valid bridge. Same vector index, inverted objective — which means wit can become a native, explainable, audience-tunable capability instead of a prompt trick. Part of our open Building Jarvis series.

📄 Read the full paper (PDF) →

Abstract

We present a formal computational framework for humor generation that operationalizes Koestler’s (1964) bisociation theory as geometry in vector embedding spaces. The central claim is that humor and memory retrieval are dual operations on the same semantic machinery: memory retrieves what lies nearest to a query; humor searches for what lies at the right distance and can still be joined by an unexpectedly valid bridge. We formalize this memory–humor correspondence and define a computable humor potential function grounded in Suls’ (1972) two-stage incongruity-resolution model.

We identify 12 humor-generating semantic patterns organized into five meta-categories, each defined as a specific embedding-space operation, and propose humor associations as a first-class relationship type in agent memory architectures. A preliminary computational pilot (n = 15 concept triplets) falsifies the initial formulation, revealing that naive coherence conflates semantic proximity with comedic validity. This negative result motivates a revised surprise-weighted formulation that decomposes bridge quality into independent validity and surprise components.

Independent measurements on the same frozen 384-dimensional sentence-embedding substrate this framework runs on, reported in a separate offline study of an agent-safety subsystem, find that structural channels over that substrate carry real discriminative signal. A clause-cosine incongruity channel, the geometric primitive at the core of this framework, reaches AUROC 0.896, and a k-NN novelty channel reaches AUROC 0.875. By contrast, a supervised head trained on the same frozen embeddings to predict a semantic property collapses to AUROC 0.286, below the 0.5 chance level. We use that split to bound which parts of our pipeline the substrate can be trusted to carry.

We provide reproducible experimental protocols with power analysis (N ≥ 64 raters) for validation against human ratings. The framework is designed as the affective layer of a larger agent architecture: it draws its concept neighborhoods from a semantic-memory index and is gated by a persona layer that decides when humor is appropriate. Within that architecture, the humor layer is the computational counterpart of the limbic reward circuit that makes humor feel good in humans.

The framework, in numbers

12

humor-generating semantic patterns, each a concrete embedding-space operation (grouped into 5 meta-categories)

[0.6, 0.95]

the proposed cosine-distance “sweet spot” where a pair is far enough to surprise but still bridgeable

n = 15

triplet pilot that falsified the naive formula — an honest negative result that drove the redesign

0.896

AUROC of the clause-cosine incongruity channel on the identical frozen MiniLM substrate — the geometric primitive this framework is built on

Claims from the paper (v4.8), stated here without the proofs — the definitions, the pilot, and the validation protocol are all in the PDF.

How it works, in one minute

Memory and humor run on the same index, backwards. Retrieval returns the nearest concept to a query. Humor returns the concept that is far away but still joinable — same embeddings, same distance metric, opposite optimization target. No separate “joke database.”
A joke is two distant concepts plus a bridge. The score multiplies three things: how far apart the two concepts are (the incongruity), how validly a bridge concept connects to both (the resolution), and how unexpected that bridge is (the surprise). If any one factor is zero, the whole score is zero; that multiplicative shape is Suls’ two-stage “incongruity then resolution.”
There’s a distance sweet spot. Too close and the connection is obvious and boring; too far (near-orthogonal vectors) and no coherent bridge exists. The paper proposes a productive band of cosine distance and shows the unfunny pairs sit just outside it.
The naive version failed — on purpose, in public. The first formula scored boring pairs higher than funny ones, because raw cosine coherence rewards proximity. Splitting bridge quality into independent validity and surprise terms is what fixes it.
The substrate holds up — where it matters. An independent offline study on the identical frozen 384-dimensional MiniLM encoder found that structural channels (distance, rank, incongruity geometry) reach AUROC 0.875–0.896, while a supervised head trained on the same frozen features to predict a higher-order semantic property fell to 0.286 — below chance. This validates the geometric machinery the framework relies on and sharpens the caveat around the sensitivity gate.
It plugs into an agent, not a vacuum. A semantic-memory layer supplies the candidate concepts; a persona layer flips a humor_enabled flag to decide whether a joke is even appropriate; this layer is the reward circuit that scores the connection. Read the room first, then be funny.
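
To make the inversion concrete, here is a minimal sketch of the steps above in plain Python. It is not the paper's humor potential function: the function names, the toy 3-dimensional vectors, and the way validity is approximated (the weaker of the bridge's two cosine similarities) are all assumptions made for illustration, and surprise is passed in as a number because this post does not say how the paper computes it. Only the [0.6, 0.95] band and the three-factor product come from the text.

```python
import math

SWEET_SPOT = (0.6, 0.95)  # cosine-distance band quoted in this post


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def recall_nearest(query, index):
    """Memory objective: the concept closest to the query."""
    return min(index, key=lambda name: cosine_distance(query, index[name]))


def humor_candidates(query, index, band=SWEET_SPOT):
    """Inverted objective: concepts whose distance falls inside the band."""
    low, high = band
    return [name for name, vec in index.items()
            if low <= cosine_distance(query, vec) <= high]


def humor_score(a, b, bridge, surprise):
    """Toy product of distance, bridge validity, and bridge surprise.

    Validity is approximated (an assumption) by the weaker of the bridge's
    two cosine similarities, clipped at zero. Surprise is passed in as a
    number in [0, 1] because the post does not say how it is computed.
    """
    distance = cosine_distance(a, b)
    validity = max(0.0, min(1.0 - cosine_distance(bridge, a),
                            1.0 - cosine_distance(bridge, b)))
    return distance * validity * surprise


# Hypothetical 3-d vectors standing in for 384-d sentence embeddings.
index = {
    "cat": (1.0, 0.0, 0.0),
    "kitten": (0.9, 0.1, 0.0),
    "tax audit": (0.3, 0.9, 0.3),
    "void": (0.0, 0.0, 1.0),
}
query = (1.0, 0.05, 0.0)

print(recall_nearest(query, index))    # memory: the nearest neighbour
print(humor_candidates(query, index))  # humor: far, but inside the band
print(humor_score(index["cat"], index["tax audit"], (0.7, 0.7, 0.1), 0.5))
```

Both lookups scan the same dictionary with the same distance function; only the selection rule changes. Setting surprise to zero, or choosing a bridge orthogonal to either concept, drives the product to zero, which is the multiplicative behaviour described above. Note that a cosine-based validity term is exactly the kind of proximity signal the pilot found insufficient on its own, so treat it as a placeholder.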

In the ecosystem

Two adjacent efforts exercise the same substrate and enforcement questions this paper depends on — not because they study humor, but because they share the load-bearing machinery.

chopratejas/headroom — reversible context compression. Section 8.7 of the paper notes that humor associations in a persistent event store are worthless if silently dropped when context is compacted. Headroom’s lossless Compress–Cache–Retrieve approach is the design pressure that motivates treating the humor-association store as recoverable rather than summarized-away. The differentiation is narrow: headroom compresses to preserve recall; the humor-association store records discrepancies to invert recall. Same substrate, different objective — composable, not competing.

The Humor Embeddings framework is part of Building Jarvis, an open series on persistent agent architecture. Follow the work and contribute at github.com/globalcaos/tinkerclaw.

Read the paper


📄 Read the full paper (PDF) →

31 pages · the full humor potential function, the 12-pattern taxonomy, five bridge-discovery algorithms, the falsified pilot, and the human-rating validation protocol

Was this useful?

We’re building these in the open and we want your read on them. Did this land — 👍 or 👎? Would you wire humor onto your agent’s existing memory index this way? Tell us in the comments below.
