History

Coleman Dimensional Encoding is the result of asking myself one question:

Why are computers still so slow in 2026?

I've been programming since I was six, starting with BASIC and 6502 assembly. Decades later I'm playing XCOM 2, and the AI takes longer to decide its turn than it takes ZORK to load on my Commodore 64. What was the loading screen for if every turn computes everything from scratch? Cover, elevation, sight lines, and threat levels are all known, all structured, and all recomputed for every unit, for every turn. My rig is generating waste heat from the same questions it has asked over and over before, funneling electrons into the same answers out of an assumption: this is the way it is, and computers are just slow at these sorts of problems. But the machine isn't slow; the work is redundant. The structure was never designed to answer the questions being asked of it, and not only can we fix that, we can codify that structure using dimensional thinking.

The idea that dimensions hide things in plain sight has been with me as long as I can remember. As a little kid I watched Quantum Leap every evening: Sam Beckett leaping through time, trying to make right what once went wrong (cue the theme music). The premise was that time is not a wall but a corridor, something you can move through rather than something that binds you. It's a dimension, and dimensions are things you can navigate.

Then I read Flatland just a few years later, a story about a square living in two dimensions who is unable to see the sphere passing through his world. Not because the sphere isn't there, but because the square doesn't have the axis to perceive it. The information exists, but the dimension to see it does not.

Then I read about Kaluza Klein theory, and the intuition became physics. In 1921 Theodor Kaluza took Einstein's field equations for general relativity, which describe gravity as curvature in four dimensional spacetime, and extended them to five dimensions. He didn't add new forces or new particles. He just added one more axis to the geometry.

When he worked out the math, Maxwell's equations for electromagnetism appeared automatically as components of the five dimensional metric. Five years later, Oskar Klein showed how this fifth dimension could be compactified, curled into a circle so small it's unobservable, while its effects remain everywhere. Electromagnetism wasn't a separate force bolted onto gravity. It was curvature in a dimension that had always been there, one that the four dimensional model simply couldn't represent.

Framework

The pattern is always the same: information that looks missing from one perspective is already there, encoded in a dimension that the perspective doesn't include. What if data systems worked the same way? What if you could add the right dimensions to a dataset so that every query followed a geodesic, the shortest possible path through the structure to the answer?

WASP: Workload Aware Sufficient Placement

Defines the problem. Every true answer is found, every false positive is filtered, work scales with the answer rather than the dataset, and no dimension is wasted.

CDE: Coleman Dimensional Encoding

Solves it. Analyze the workload, encode each record as a coordinate, build the index, translate queries into bounded probes.

MSS: Minimally Sufficient Statistics

Keeps it honest. Every claim is classified as a definition, guarantee, assumption, or unknown. Nothing is stated without knowing which category it belongs to.

Formal Definitions

WASP

Let $D$ be a finite set of records and $Q$ a set of queries, where each query is a predicate $q: D \to \{0, 1\}$. Let the workload be $W \subseteq Q$. A WASP instance consists of five components:

  1. $k \in \mathbb{N}$: number of encoding dimensions
  2. $E: D \to \mathbb{Z}^k$: maps each record to a coordinate
  3. $I: \mathbb{Z}^k \to \mathcal{P}(D)$: maps each coordinate to the records stored there
  4. $T: Q \to \mathcal{P}(\mathbb{Z}^k)$: maps each query to coordinates to probe
  5. $F: Q \times D \to \{0, 1\}$: per query filter that removes false positives

Index well formedness: $\forall r \in D: r \in I(E(r))$.

Five building blocks. $k$ sets how many dimensions the space has. $E$ places each record at a coordinate. $I$ retrieves records from a coordinate. $T$ converts a query into coordinates to check. $F$ removes false positives from the results; $F_q(r)$ is shorthand for $F(q, r)$. The well formedness constraint says that every record is retrievable from the coordinate it was assigned to.

These define three quantities for any query $q$:

$$\mathrm{Ans}(q) = \{r \in D : q(r) = 1\}$$

$$\mathrm{Cand}(q) = \bigcup_{c \in T(q)} I(c)$$

$$\mathrm{Work}(q) = |T(q)| + |\mathrm{Cand}(q)|$$

$\mathrm{Ans}(q)$ is the perfect answer, every record that truly matches. $\mathrm{Cand}(q)$ is what the index returns; it may include extras. $\mathrm{Work}(q)$ is the total cost, coordinates probed plus records examined. Evaluating $F_q(r)$ is assumed O(1) per candidate.
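The three quantities can be computed directly on a toy instance. The sketch below is illustrative only: the records, the single "age decade" dimension, and the helper `make_range_query` are assumptions invented for the example, not part of any CDE repository.

```python
from collections import defaultdict

# Hypothetical toy data: records are (id, age) pairs.
D = [(0, 23), (1, 37), (2, 41), (3, 38), (4, 65)]

def E(r):
    """Map a record to a coordinate in Z^1: the decade of its age."""
    return (r[1] // 10,)

# Build I so that every record r is stored at E(r) (index well formedness).
index = defaultdict(set)
for r in D:
    index[E(r)].add(r)

def I(c):
    """Records stored at coordinate c (empty set if none)."""
    return index.get(c, set())

def make_range_query(lo, hi):
    """A query is a predicate q: D -> {0, 1}; this one tests lo <= age <= hi."""
    def q(r):
        return int(lo <= r[1] <= hi)
    q.lo, q.hi = lo, hi          # keep the bounds so T can read them
    return q

def T(q):
    """Coordinates to probe: every decade bucket overlapping [lo, hi]."""
    return {(d,) for d in range(q.lo // 10, q.hi // 10 + 1)}

def ans(q):
    """Ans(q): the records that truly match, found by brute force."""
    return {r for r in D if q(r) == 1}

def cand(q):
    """Cand(q): the union of I(c) over the probed coordinates."""
    out = set()
    for c in T(q):
        out |= I(c)
    return out

def work(q):
    """Work(q): coordinates probed plus candidates examined."""
    return len(T(q)) + len(cand(q))

q = make_range_query(35, 40)
```

For this query the index probes two buckets and returns three candidates, one of which (age 41) is an extra that the filter would later remove.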

A valid solution satisfies four properties:

Sufficiency

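$$\forall q \in W: \mathrm{Ans}(q) \subseteq \mathrm{Cand}(q)$$

Given index well formedness, it suffices that $\forall q \in W, \forall r \in D: q(r) = 1 \Rightarrow E(r) \in T(q)$. The two statements are equivalent when $I(c) = \{r \in D : E(r) = c\}$, that is, when each coordinate stores exactly the records encoded to it.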

Exactness (axiom)

$$\forall q \in W, \forall r \in \mathrm{Cand}(q): F_q(r) = q(r)$$

Theorem (from Sufficiency + Exactness): $\mathrm{Ans}(q) = \{r \in \mathrm{Cand}(q) : F_q(r) = 1\}$.
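
A short derivation of the theorem, sketched for an arbitrary fixed $q \in W$ and using only the two properties it names:

$$r \in \mathrm{Ans}(q) \;\Rightarrow\; q(r) = 1 \,\wedge\, r \in \mathrm{Cand}(q) \;\Rightarrow\; F_q(r) = q(r) = 1$$

$$r \in \mathrm{Cand}(q) \,\wedge\, F_q(r) = 1 \;\Rightarrow\; q(r) = F_q(r) = 1 \;\Rightarrow\; r \in \mathrm{Ans}(q)$$

The first line applies Sufficiency and then Exactness, giving $\mathrm{Ans}(q) \subseteq \{r \in \mathrm{Cand}(q) : F_q(r) = 1\}$. The second applies Exactness alone and gives the reverse inclusion.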

Bounded work

$$\exists\,\gamma, \beta \ge 0 \text{ such that } \forall q \in W: \mathrm{Work}(q) \le \gamma\,|\mathrm{Ans}(q)| + \beta$$

Minimality

Removing dimension $i$ means projecting $\pi_i: \mathbb{Z}^k \to \mathbb{Z}^{k-1}$ and deriving $E', I', T'$ on $\mathbb{Z}^{k-1}$. Minimality holds when $\forall i \in \{1, \ldots, k\}$, the projected scheme violates at least one of Sufficiency, Exactness, or Bounded work on $W$.

You never miss a correct answer. The filter agrees with the query on every candidate. Effort grows with the answer size, not the dataset size. Every dimension earns its keep; remove one, project the coordinate space down one axis, rederive the scheme, and a guarantee breaks.
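
The first three properties can be checked mechanically on a finite workload. The following is a minimal sketch under stated assumptions: `check_wasp`, the `InRange` query type, and the toy data are hypothetical names made up for illustration, and the constants `gamma` and `beta` are supplied by the caller rather than discovered. Minimality is not checked here, since it requires rederiving the scheme for each projection.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class InRange:
    """Hypothetical query type: the predicate lo <= r <= hi over integer records."""
    lo: int
    hi: int
    def __call__(self, r):
        return int(self.lo <= r <= self.hi)

def check_wasp(D, W, I, T, F, gamma, beta):
    """Return the names of the properties violated by at least one query in W."""
    violated = set()
    for q in W:
        probes = T(q)
        cand = set().union(*(I(c) for c in probes))
        ans = {r for r in D if q(r) == 1}
        if not ans <= cand:                               # Sufficiency
            violated.add("sufficiency")
        if any(F(q, r) != q(r) for r in cand):            # Exactness
            violated.add("exactness")
        if len(probes) + len(cand) > gamma * len(ans) + beta:
            violated.add("bounded work")                  # Bounded work
    return violated

# Toy instance: integer records, one dimension, E(r) = (r // 10,).
D = [3, 12, 17, 25, 31, 38, 44]
buckets = defaultdict(set)
for r in D:
    buckets[(r // 10,)].add(r)

def I(c):
    return buckets.get(c, set())

def F(q, r):
    return q(r)                      # the filter re-evaluates the predicate

def T_full(q):
    return {(d,) for d in range(q.lo // 10, q.hi // 10 + 1)}

def T_first_only(q):
    return {(q.lo // 10,)}           # deliberately broken: ignores later buckets

W = [InRange(10, 19), InRange(15, 32)]
good = check_wasp(D, W, I, T_full, F, gamma=2, beta=2)
bad = check_wasp(D, W, I, T_first_only, F, gamma=2, beta=2)
```

With the full translation no property is violated; with the truncated one the second query misses the records 25 and 31, so sufficiency fails.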

CDE

Given a workload $W$, construct the five WASP components in four phases:

  1. Workload analysis: derive candidate dimensions from $W$
  2. Coordinate encoding: define discretizers per dimension, compute $E(r)$
  3. Index construction: build $I(c)$ from observed coordinates
  4. Query translation: implement $T(q)$ and $F_q$ so the four properties hold

Study the questions to discover natural axes. Assign each record a position on those axes. Build the lookup from positions to records. Convert incoming queries into bounded coordinate scans.
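
The four phases can be walked through on a small example. This is a sketch, not the author's implementation: the records, the workload format (a field mapped to an exact value or a `(lo, hi)` range), the bucket width, and every function name are assumptions chosen for illustration, and each query is assumed to constrain every dimension.

```python
from collections import defaultdict
from itertools import product

# Hypothetical records and workload, invented for illustration.
RECORDS = [
    {"id": 1, "dept": "ops", "salary": 52_000},
    {"id": 2, "dept": "ops", "salary": 71_000},
    {"id": 3, "dept": "eng", "salary": 68_000},
    {"id": 4, "dept": "eng", "salary": 95_000},
]
WORKLOAD = [
    {"dept": "eng", "salary": (60_000, 80_000)},
    {"dept": "ops", "salary": (50_000, 59_999)},
]

# Phase 1: workload analysis. Candidate dimensions are the fields queries touch.
dims = sorted({field for q in WORKLOAD for field in q})

# Phase 2: coordinate encoding. One discretizer per dimension.
DEPT_CODE = {d: i for i, d in enumerate(sorted({r["dept"] for r in RECORDS}))}
BUCKET = 10_000
discretize = {
    "dept": lambda v: DEPT_CODE[v],
    "salary": lambda v: v // BUCKET,
}

def E(r):
    return tuple(discretize[d](r[d]) for d in dims)

# Phase 3: index construction, from observed coordinates only.
index = defaultdict(list)
for r in RECORDS:
    index[E(r)].append(r)

# Phase 4: query translation.
def T(q):
    """Bounded probe set: cross product of per dimension coordinate ranges."""
    axes = []
    for d in dims:
        c = q[d]
        if isinstance(c, tuple):
            lo, hi = c
            axes.append(range(discretize[d](lo), discretize[d](hi) + 1))
        else:
            axes.append([discretize[d](c)])
    return list(product(*axes))

def F(q, r):
    """Exact filter: removes false positives admitted by coarse buckets."""
    for d in dims:
        c = q[d]
        ok = c[0] <= r[d] <= c[1] if isinstance(c, tuple) else r[d] == c
        if not ok:
            return 0
    return 1

def run(q):
    cand = [r for c in T(q) for r in index.get(c, [])]
    return [r["id"] for r in cand if F(q, r)]
```

A query such as `{"dept": "ops", "salary": (55_000, 75_000)}` probes three coordinates, picks up record 1 as a bucket edge false positive, and the filter drops it.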

MSS

Given a statement set $S$, define a labeling function $L: S \to \{\text{Def}, \text{Gua}, \text{Asm}, \text{Unk}\}$ such that:

  1. Partition: every $s \in S$ gets exactly one label
  2. Traceability: every Guarantee is derivable from Definitions and Assumptions (not from Unknowns)
  3. Independence: no Assumption is derivable from the other Assumptions and Definitions
  4. No laundering: no Unknown is used as if it were a Guarantee

A description is minimally sufficient when $L$ satisfies all four criteria.

Every sentence in the system has a label. Definitions are choices we made. Guarantees follow from those choices. Assumptions are bets; if one is wrong, every guarantee that depends on it breaks. Unknowns are honest gaps. The description is minimally sufficient when nothing is mislabeled and nothing is redundant.
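
Some of these criteria can be checked mechanically if each statement records what it was derived from. The sketch below assumes a hypothetical representation (a `labels` dict and a `derives_from` dict) and a hypothetical function `check_mss`. It only approximates Independence, flagging an Assumption that has a recorded derivation, because true derivability would need a proof system.

```python
LABELS = {"Def", "Gua", "Asm", "Unk"}

def check_mss(labels, derives_from):
    """labels: statement -> label; derives_from: statement -> list of premises.

    Returns a list of human readable problems; an empty list means no
    violation was detected by these (partial) checks.
    """
    problems = []
    # Partition: a dict gives each statement one label; check it is a valid one.
    for s, lab in labels.items():
        if lab not in LABELS:
            problems.append(f"partition: {s} has invalid label {lab!r}")
    # Partition: every premise used in a derivation must itself be labeled.
    for s, premises in derives_from.items():
        for p in premises:
            if p not in labels:
                problems.append(f"partition: {p} is unlabeled")
    for s, lab in labels.items():
        premises = derives_from.get(s, [])
        if lab == "Gua":
            # Traceability: a Guarantee needs a recorded derivation.
            if not premises:
                problems.append(f"traceability: {s} has no derivation")
            # No laundering: the derivation must not rest on an Unknown.
            for p in premises:
                if labels.get(p) == "Unk":
                    problems.append(f"laundering: {s} relies on unknown {p}")
        # Independence (approximation): an Assumption should not be derived.
        if lab == "Asm" and premises:
            problems.append(f"independence: {s} is recorded as derivable")
    return problems

# Hypothetical labeling, chosen only to exercise the checks.
labels = {
    "sufficiency": "Asm",
    "exactness": "Asm",
    "answer_theorem": "Gua",
    "speedup_on_real_data": "Unk",
}
clean = check_mss(labels, {"answer_theorem": ["sufficiency", "exactness"]})
dirty = check_mss(labels, {"answer_theorem": ["speedup_on_real_data"]})
```

The first call reports nothing; the second reports that the guarantee was laundered from an unknown.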

Repositories

Working implementations. Each one applies the same dimensional thinking to a different constraint.

chronoforth: Minimal Forth for the Commodore 64, built on DurexForth: a bare kernel with no graphics, sound, or float. It is subroutine threaded with tail call elimination and open coded hot primitives. DROP compiles to one INX, and inlining hot words runs 1.29 to 1.44x faster on stack heavy code. Every cycle count is measured against an exact 6502 core, never estimated.
chronosat: A satisfiability verifier running on an unmodified Commodore 64, three literals per clause. Each clause packs into six bytes with bit 15 marking negation, 700:1 against a dense matrix. A hand written native 6502 kernel checks 2048 variables across 1024 clauses in 183K cycles, about 0.18 seconds at 1 MHz, near 5,700 clause checks a second. Roughly 13x the plain Forth version, and bit identical to it.
chrono6502: A headless, exact MOS 6502 and 6510 core in Rust with zero dependencies. It boots the real ChronoForth kernel in process, runs the full Forth 2012 suite in about 0.45 seconds, and reports any word's exact cycle cost. It is the instrument every chronoforth and chronosat number is verified against.
chronosynthea: A synthetic patient generator in Rust. It samples a precomputed statistical fingerprint, calibrated to Java Synthea's marginal prevalence rates, instead of stepping a state machine for every patient. Archetype choice is one O(1) alias table draw, and conditions come from SIMD threshold comparisons. Millions of patients a second on the stats path, and 88 to 92K a second writing full records to Parquet. It matches the fingerprint's condition rates to within sampling noise, not Java's full causal structure.
chronohipaa: A Python reference encoder that maps each health record to a fixed 20 byte (154 bit) vector across six dimensions: temporal, demographic, clinical, geographic, treatment, status. Compact and lossy on purpose. The clinical and treatment dimensions are noninvertible SHA256 fingerprints, and only the temporal field is encrypted (AES GCM under an HKDF derived key). A research encoder, not a certified deidentification or HIPAA compliance product.
chronoscribe: A seven stage Python pipeline that turns Internet Archive hOCR scans into clean Markdown. CDE makes each token correction a deterministic O(1) lookup in a 96 entry decision table over four quantized dimensions. Every stage is linear in the token count, so it cleans the 322 page test book in about 2.9 seconds, roughly 110 pages a second.
chronocom: A combat and tactical AI overhaul for XCOM 2: War of the Chosen, in UnrealScript. Hit chances become honest, so the number shown is the number rolled, and the AI adapts across missions. The guarantee is structural, not a measured speedup. Combat math is O(1), and the AI work per turn is bounded by a fixed 32 by 32 influence grid plus fixed capacity caches, so it does not grow with the number of units on the map.
chronoquit: A macOS menu bar utility that encodes each running app as a seven dimensional state vector (activity, system impact, protection, interaction, priority, lifecycle, user preference) and runs a small lifecycle state machine. It quits idle, low priority apps once they cross a threshold, rather than on a blanket timer.
chronoboiler: Zero dependency bash scaffolding with per language YAML configs and shared templates. It gives the chronomancy repositories one consistent structure and checks them against that standard. The four CDE dimensions here are a way to describe the scripts, not a queryable index.

Coda

The symbol at the top of this page is the nabla, ∇. It is the gradient operator. Applied to a scalar field, it returns a vector pointing in the direction of steepest ascent. Applied to a dataset with the right dimensions, it does the same thing; the shortest path to the answer reveals itself.

The shape of the symbol is not incidental. A T shaped person has one deep vertical specialty and a horizontal bar of broad but shallow knowledge. The nabla is the next step. It is the shape of a polyglot, a converging geometry where many domains, languages, and perspectives flow inward and focus into a single, decisive direction of movement.

This is the Coleman Dimensional Encoding framework.

Not faster hardware. Not cleverer algorithms acting on the same flat structures. Just the precise dimensions, derived from the questions themselves, encoded directly into the geometry of the data.

When the dimensions are right, the gradient does not need to be forced. The steepest path simply reveals itself. This happens not because the answer was hidden, but because the shape of the question finally possessed the capacity to find it.