DagDB

A 6-Bounded Ranked DAG Database

Built on the Savanna Engine · Apple M5 Max · Metal GPU
Norayr Matevosyan · 2026

DagDB. A six-bounded ranked directed acyclic graph database. Built from first principles on the Savanna engine. Running on Apple M five Max with Metal GPU compute. Every node connects to at most six edges. That single constraint makes everything possible.

1 / 24

The Problem

Traditional graph databases: unbounded degree
Cache misses, unpredictable memory, no parallelism guarantees
Can't fit on GPU. Can't deterministically replay.
What if we bounded every node to 6 edges?

Traditional graph databases allow unlimited edges per node. This sounds flexible, but it's a trap. Unbounded degree means unpredictable memory access, cache misses everywhere, and no parallelism guarantees. You can't fit that on a GPU. You can't replay it deterministically. What if we accepted a constraint: maximum six edges per node? That single rule changes everything.

2 / 24

The Rule of 6

Maximum directed edges per node

Domain	Why 6?
Physics	6 degrees of freedom describe any rigid body in 3D
Geometry	Hexagons pack space optimally (honeycombs, graphene)
Cognition	Miller's Law: 7 ± 2 items in working memory
Hardware	2⁶ = 64 = one UInt64 machine word
Reach	6¹⁰ = 60 million nodes in 10 hops

The Rule of Six. Every node connects to at most six directed edges. This isn't arbitrary. Six appears everywhere. Six degrees of freedom in physics. Hexagonal packing in geometry, the most efficient way to tile a surface. Miller's Law says working memory handles seven plus or minus two items. And critically: two to the sixth is sixty-four, one machine word. Every possible Boolean function of six inputs fits in a single sixty-four-bit integer. Six to the tenth is sixty million, enough reach for most real-world graphs in ten hops.

3 / 24

LUT6: The Core Primitive

Presets built in: AND6 0x8000000000000000 all inputs must be true OR6 0xFFFFFFFFFFFFFFFE any input triggers output XOR6 (computed parity) odd number of true inputs MAJ6 (computed majority) 4+ of 6 inputs true IDENT pass-through ghost node / elevator CONST 0x0 or 0xFFFF... fixed premises

LUT six. The core primitive. Six input bits form a six-bit index, zero through sixty-three. That index looks up one bit in a sixty-four-bit integer. One clock cycle. Zero branching. Zero warp divergence. The beauty: by choosing different sixty-four-bit values, each node independently becomes any Boolean gate. AND, OR, XOR, majority vote, identity pass-through, or any of the two to the sixty-four possible functions. This is how FPGAs work. We brought it to graph databases.

4 / 24

The Ranked DAG

Edges flow upward only: leaves to root. Topologically identical to a multilayer perceptron.

The ranked DAG. Every node lives at a rank. Leaf nodes at the bottom hold raw evidence, atomic facts. Intermediate ranks hold reasoning, arguments, aggregation. The root at rank zero holds the final decision. Edges flow strictly upward: leaves to root. This is topologically identical to a multilayer perceptron in neural networks. Input layer, hidden layers, output. The rank structure is the execution schedule.

5 / 24

Dual-Time Execution

Dual-time execution. Two clocks running at different scales. Micro-time is horizontal, within a single rank. Nodes debate locally, like neurons in a layer. Hex-coloring prevents race conditions. The cluster resonates until it reaches consensus. For monotone functions, Knaster-Tarski theorem guarantees convergence. For non-monotone, we cap at a hundred micro-ticks and output undefined, the paradox horizon. Macro-time is vertical: once a rank stabilizes, it locks and passes results strictly upward. One rank per tick. Perfectly acyclic. Perfectly deterministic.

6 / 24

Morton Z-Curve Memory Layout

Memory Address (64-bit): [ 8 bits: Rank MSB ] | [ 56 bits: Morton Z-Curve LSB ] Rank MSB: all nodes of same rank physically adjacent Z-curve LSB: spatially connected nodes in same cache line Bit-shift right = zoom out one octave (1 CPU instruction)

Morton Z-curve memory layout. Every sixty-four-bit address splits into eight rank bits on top and fifty-six Morton Z-curve bits below. The rank bits keep all nodes of the same rank physically adjacent on NVMe. The Z-curve bits ensure spatially connected nodes land in the same cache line. One bit-shift right zooms out one octave, a single CPU instruction. We already proved this works: two point one one times faster at one billion cells. The Z-curve is natively fractal, which means the hub node splitting tree maps directly to cache lines with zero extra work.

7 / 24

Octaves: Fractal Self-Similarity

Each note in a diatonic scale connects to 6 peers within its octave. When a cluster reaches consensus, it abstracts into a single macro-node one octave up. Renormalization in physics. Hierarchical modular networks in graph theory.

Octaves. Inspired by music theory. Each note in a diatonic scale connects to six peers within its octave. In DagDB, when a local cluster at rank N reaches consensus, it abstracts into a single macro-node at rank N minus one. This is renormalization in physics, hierarchical modular networks in graph theory. The octave is the rank boundary. Morton Z-curves make octaves native to hardware: same rank prefix means physically adjacent on disk.

8 / 24

The Engine: Savanna

GCUPS

128

GB Unified Memory

Apple Silicon

Component	What It Does
ELLPACK sparse matrix	Fixed 6 columns, perfect for GPU
Morton Z-curve	Connected nodes adjacent in cache
SpMV core	Adjacency x state = one tick
7-coloring	Lock-free parallel update
Carlos Delta	50x compression, ACID persistence

The engine. Savanna. Already built and battle-tested. Fourteen giga cell updates per second on Apple M five Max with one hundred twenty-eight gigabytes of unified memory. ELLPACK sparse matrix format with fixed six columns. Morton Z-curve ordering so connected nodes sit in the same cache line. Sparse matrix-vector multiplication as the core operation. Seven-coloring for lock-free parallel updates. And Carlos Delta Transport for persistence with fifty times compression. We do not build a new engine. We re-label the concepts.

9 / 24

Mathematical Isomorphism

Do not build a new engine. Re-label the concepts.

Savanna Simulation		DagDB Logic
Cell (Zebra)	=	Node (Premise)
Neighbor (Lion)	=	Dependency (Counter-argument)
State (Alive/Dead)	=	Truth-state (Valid/Invalid)
6 spatial neighbors	=	6 logical edges
Evaluation tick	=	Inference step
Scent diffusion	=	Truth propagation
Morton Z-curve	=	Morton Z-curve
Carlos Delta	=	Carlos Delta

The mathematical isomorphism. This is why DagDB exists. A zebra cell in Savanna is a premise node in DagDB. A lion neighbor is a counter-argument dependency. Alive or dead state maps to valid or invalid truth state. Six spatial neighbors become six logical edges. An evaluation tick becomes an inference step. Scent diffusion becomes truth propagation. Even the infrastructure carries over unchanged: Morton Z-curve, Carlos Delta, the Metal shaders. Same engine. Different interpretation.

10 / 24

Hub Nodes: 6-ary Fractal Fan-In

Hub nodes. Real knowledge graphs have power-law distributions. A node like water or substation might have forty-six thousand edges. That violates the rule of six. The resolution: six-ary fractal fan-in. A hub with six to the sixth edges compiles into a six-deep micro-tree of virtual nodes. Each virtual node has exactly six edges, all within bounds. The Morton Z-curve is natively fractal. The virtual nodes share the same rank prefix and sequential Z-curve bits, so they land in the same L one cache line. The sparse matrix engine doesn't even know it's a hub.

11 / 24

Convergence: Knaster-Tarski + Paradox Horizon

Reality doesn't always cleanly resolve. Sometimes the argument is a paradox. Let the DAG natively highlight the paradox.

Convergence. How do we know micro-time terminates? Two tiers. For monotone Boolean functions like AND, OR, and majority, the Knaster-Tarski theorem mathematically guarantees a fixed point. They always converge. For non-monotone functions like XOR rings or NOT gates, oscillation is possible. Here we enforce a paradox horizon: a maximum of one hundred micro-ticks. If the cluster hasn't stabilized, it outputs undefined, the third truth state. This is elegant: reality doesn't always cleanly resolve. Sometimes the argument is genuinely paradoxical. Let the database say so.

12 / 24

Skip Connections: Ghost Nodes

Memory is cheap. Execution stalls are fatal.

Skip connections. Real arguments skip ranks. Evidence at rank ten might directly support a rank zero conclusion. But direct rank-skipping breaks the layer-parallel schedule. The resolution: identity ghost nodes. The compiler pads intermediate ranks with ghost nodes whose LUT is set to identity, pure pass-through. Truth state rides the elevator up through ranks, migrating physically closer in memory at each step. Memory is cheap. Execution stalls are fatal. The M five processes millions of identity gates in nanoseconds because ELLPACK stays dense and predictable.

13 / 24

Orthogonal Wavefronts

Orthogonal wavefronts. The Fisher K P P equation from reaction-diffusion has a direct analog in DagDB. Vertical: macro-time propagation speed is exactly one rank per tick. Deterministic, universal. Horizontal: micro-time speed depends on the sensitivity of local LUT gates. OR gates are permissive: one true input fires immediately, truth flashes at maximum speed. AND gates are skeptical: all six inputs must be true, propagation is slow and requires massive corroboration. The logic topology IS the information velocity. Choose your gates. Choose your speed.

14 / 24

Persistence: Carlos Delta Transport

50x

Compression

ACID

Guarantees

CPU Copies

Property	How
Atomicity	Each delta block = one complete tick
Consistency	6-degree bound prevents invalid graphs
Isolation	Hex coloring + rank = lock-free
Durability	Compressed deltas on NVMe, bidirectional

GPU to NVMe direct (GPUDirect Storage style)
~98% of nodes static per tick = massive delta compression
Backward replay: stream in reverse for time-travel queries

Carlos Delta Transport. Already built and shipped. Fifty times delta compression because about ninety-eight percent of nodes remain static each tick. GPU writes directly to NVMe, bypassing the CPU entirely. Full ACID: atomicity from one delta block per tick. Consistency from the six-degree bound. Isolation from hex coloring and rank scheduling, all lock-free. Durability from compressed deltas on NVMe. And it's bidirectional: stream forward to persist, stream backward to reconstruct any historical state. Time-travel queries with zero extra infrastructure.

15 / 24

SQL Interface: PostgreSQL

[Any SQL Client] | v [Postgres Wire Protocol] | v [Berkeley Cost-Based Optimizer] (40 years of query planning) | v [CustomScan / ExecutorRun_hook] (intercept BEFORE Postgres CPU touches data) | v [Apple Metal API] (GPU executes SpMV in unified memory) | v [Virtual Tuples returned]

S Q L interface. PostgreSQL as query planner. Any S Q L client connects via the standard Postgres wire protocol. The Berkeley cost-based optimizer, forty years of query planning expertise, generates the execution plan. Then our custom scan hook intercepts before Postgres's CPU touches any data. The hook translates relational operations into Metal G P U graph operations. Sparse matrix-vector multiplication executes entirely in unified memory. Virtual tuples return to Postgres. Standard S Q L in, G P U-accelerated graph traversal out.

16 / 24

SQL to GPU Graph Operations

Postgres Op	GPU Equivalent
SeqScan	Color-wave sweep with predicate filter
IndexScan	Direct array index O(1)
Filter	Per-node predicate, pure SIMD
Join	Follow edges = graph traversal (6-bounded = O(1))
Aggregate	Parallel reduction kernels
Sort/OrderBy	Parallel radix sort
Group By	Parallel hash-based grouping

Every relational operation has a direct GPU graph equivalent. 6-bounded degree makes JOIN cost constant, not variable.

S Q L to G P U graph operations. Every relational operation maps directly. Sequential scan becomes a color-wave sweep. Index scan is a direct array index, O of one. Filter is per-node predicate in pure SIMD. The critical one: JOIN. In traditional databases, join cost depends on table size. In DagDB, a join is following edges in the graph. Six-bounded degree means join cost is constant, not variable. Aggregate becomes parallel reduction. Sort becomes parallel radix sort. The G P U does all of it in unified memory.

17 / 24

Platform Architecture

NVIDIA translates unstructured reality into 6-bounded Boolean leaves. The Mac reasons over the structured graph.

Platform architecture. Three machines, each with a distinct role. Agora, the M five Max with one hundred twenty-eight gigs of unified memory: the reasoner. Hot path. Active DAG evaluation, sparse matrix-vector multiplication, Carlos Delta persistence. Unified memory means zero-copy between GPU and CPU, critical for sparse graph traversal. Sparta, an NVIDIA forty-ninety with twenty-four gigs: the translator. Dense L L M inference, P D F ingestion, fact extraction, natural language to six-bounded Boolean leaves. Athens, a thirty-ninety: additional translation capacity. The NVIDIA boxes translate reality. The Mac reasons over the result.

18 / 24

Deterministic Replay

Seed = Murmur3( Genesis_Epoch_Hash + Tick_Number ) Carlos Delta guarantees tick atomicity Replay hashes tick number Perfectly predicts color shuffle Zero bytes added to delta stream

7-color dispatch order shuffled per tick (prevents chromatic advection)
Shuffle seed derived from tick number via Murmur3 hash
No need to log seed in Carlos stream
Any tick can be exactly reconstructed from genesis + tick number
Cryptographic-grade determinism at zero storage cost

Deterministic replay. We shuffle the seven-color dispatch order every tick to prevent chromatic advection, a subtle directional bias. But replay needs exact determinism. The solution: tick-derived pseudo-random number generation. Hardcode Murmur three in Metal. The seed is the genesis epoch hash plus the tick number. Carlos Delta guarantees tick atomicity, so replaying just hashes the tick number and perfectly predicts the shuffle. Zero bytes added to the delta stream. Any tick can be exactly reconstructed from genesis plus the tick number. Cryptographic-grade determinism at zero storage cost.

19 / 24

Review: 3 Passes, 621 Viewpoints

Review Passes

621

Viewpoints

Resolutions

Pass	Method	Result
Deep Research	8 areas analyzed	6 green, 1 yellow, 1 novel
Harmonic Resolver	3 models x 9 rounds	69 claims, Asymmetric Veto
Gemini Deep Think	5 critical questions	5/5 resolved

Models: Gemma 4 31B · Llama 4 Scout · Qwen Coder 30B · Gemini Deep Think

The review process. Before writing a single line of code, this architecture went through three independent review passes. First, Deep Research analyzed eight areas of feasibility: six green, one yellow for hub nodes, one flagged as novel. Second, the Harmonic Resolver: three local models, Gemma four, Llama Scout, and Qwen Coder, debated sixty-nine claims across nine rounds with Asymmetric Veto, generating six hundred twenty-one total viewpoints. Third, Gemini Deep Think resolved all five remaining critical questions. Only after all three passes gave green light did we build.

20 / 24

Phase 0.1: First Run

DagDB v0.1 -- 6-Bounded Ranked DAG Evaluator Grid: 16x16 = 256 nodes 7-coloring: [37, 37, 36, 36, 36, 37, 37] Engine ready. GPU: Apple M5 Max Running tick 0 (leaves-up evaluation)... Elapsed: 4.19 ms LUT6 unit tests: PASS AND6(0x3F) = 1 PASS AND6(0x3E) = 0 PASS OR6(0x01) = 1 PASS OR6(0x00) = 0 PASS XOR6(0x03) = 0 (even parity) PASS XOR6(0x07) = 1 (odd parity) PASS MAJ6(0x0F) = 1 (4 of 6) PASS MAJ6(0x07) = 0 (3 of 6) Build 0.1: COMPLETE

Phase zero point one: first run. The engine compiles and runs on Apple M five Max. Two hundred fifty-six nodes on a sixteen by sixteen hex grid. Seven-coloring balanced across thirty-six to thirty-seven nodes per group. Tick zero completes in four point one nine milliseconds with leaves-up rank evaluation. All eight LUT six unit tests pass: AND, OR, XOR parity, majority vote. Build zero point one is complete. The foundation works.

21 / 24

Verification Gates

Gate	Criterion	Status
G1	swift build passes with DagDB target	PASS
G2	LUT6 correct for AND/OR/XOR/MAJ	8/8 PASS
G3	Leaves-up execution on ranked DAG	PASS (4.19ms)
G4	Carlos Delta save/restore	Phase 2
G5	Postgres SELECT * returns data	Phase 3
G6	1M-node graph < 10ms	Phase 2
G7	Time-travel query reconstructs state	Phase 2

Files Built

DagDB/ Package.swift Sources/ DagDB/ DagDBState.swift (truth, rank, LUT6 buffers) DagDBEngine.swift (Metal compute engine) HexGrid.swift (reused from Savanna) CarlosDelta.swift (reused -- persistence) Shaders/dagdb.metal (LUT6 tick kernel) DagDBCLI/main.swift (test harness)

Verification gates. Three of seven gates are green. Gate one: swift build passes. Gate two: LUT six correct for all eight test cases. Gate three: leaves-up execution works at four point one nine milliseconds. Gates four through seven are pending for later phases: Carlos Delta persistence, Postgres integration, million-node scaling, and time-travel queries. The file structure is clean: DagDB state, engine, hex grid reused from Savanna, Carlos Delta reused, the Metal shader, and a C L I test harness.

22 / 24

Roadmap

Phase 0.1: Fork Savanna Engine
Phase 0.2: Edge weights + generalized neighbor table
Phase 1: Full ranked DAG kernel with intra-rank resonance
Phase 2: Carlos Delta persistence for DagDB
Phase 3: PostgreSQL CustomScan executor hook
Phase 4: SwiftUI application
Phase 5: MCP server + Sparta/Athens integration

Phases Remaining

Verification Gates

3/7

Gates Passed

Roadmap. Phase zero point one is done. Five phases remain. Phase zero point two: generalize the neighbor table to support edge weights, enabling continuous mode. Phase one: full ranked DAG kernel with intra-rank resonance and the paradox horizon. Phase two: wire Carlos Delta Transport for DagDB persistence and time-travel. Phase three: the PostgreSQL custom scan executor hook for S Q L access. Phase four: SwiftUI application for visualization. Phase five: M C P server integration with the NVIDIA translation cluster. Seven verification gates, three passed so far.

23 / 24

DagDB

Constraints do not limit a system. They make it viable.

6 edges · 64-bit LUT · Morton Z-curve · 14 GCUPS
Carlos Delta · PostgreSQL · Apple M5 Max

This is an amateur engineering project. We are not HPC or database professionals
and make no competitive claims. Numbers speak; ego does not. Errors likely.

Norayr Matevosyan · 2026

DagDB. A six-bounded ranked directed acyclic graph database. One constraint: six edges per node. From that, everything follows. LUT six gives you any Boolean function in one clock cycle. Morton Z-curve puts connected nodes in the same cache line. Savanna engine delivers fourteen giga cell updates per second. Carlos Delta provides ACID persistence with fifty times compression. PostgreSQL gives you forty years of query planning. And Apple's unified memory architecture makes sparse graph traversal viable at scale. Constraints do not limit a system. They make it viable. Thank you.

24 / 24