DagDB

A 6-Bounded Ranked DAG Database
Built on the Savanna Engine · Apple M5 Max · Metal GPU
Norayr Matevosyan · 2026
DagDB. A six-bounded ranked directed acyclic graph database. Built from first principles on the Savanna engine. Running on Apple M five Max with Metal GPU compute. Every node connects to at most six edges. That single constraint makes everything possible.
1 / 24

The Problem

Hub Unbounded Node 6-Bounded
Traditional graph databases allow unlimited edges per node. This sounds flexible, but it's a trap. Unbounded degree means unpredictable memory access, cache misses everywhere, and no parallelism guarantees. You can't fit that on a GPU. You can't replay it deterministically. What if we accepted a constraint: maximum six edges per node? That single rule changes everything.
2 / 24

The Rule of 6

6
Maximum directed edges per node
DomainWhy 6?
Physics6 degrees of freedom describe any rigid body in 3D
GeometryHexagons pack space optimally (honeycombs, graphene)
CognitionMiller's Law: 7 ± 2 items in working memory
Hardware26 = 64 = one UInt64 machine word
Reach610 = 60 million nodes in 10 hops
The Rule of Six. Every node connects to at most six directed edges. This isn't arbitrary. Six appears everywhere. Six degrees of freedom in physics. Hexagonal packing in geometry, the most efficient way to tile a surface. Miller's Law says working memory handles seven plus or minus two items. And critically: two to the sixth is sixty-four, one machine word. Every possible Boolean function of six inputs fits in a single sixty-four-bit integer. Six to the tenth is sixty million, enough reach for most real-world graphs in ten hops.
3 / 24

LUT6: The Core Primitive

in0 in1 in2 in3 in4 in5 6-bit index (0..63) UInt64 LUT 64 programmable bits = ANY Boolean function of 6 inputs 0|1
Presets built in: AND6 0x8000000000000000 all inputs must be true OR6 0xFFFFFFFFFFFFFFFE any input triggers output XOR6 (computed parity) odd number of true inputs MAJ6 (computed majority) 4+ of 6 inputs true IDENT pass-through ghost node / elevator CONST 0x0 or 0xFFFF... fixed premises
LUT six. The core primitive. Six input bits form a six-bit index, zero through sixty-three. That index looks up one bit in a sixty-four-bit integer. One clock cycle. Zero branching. Zero warp divergence. The beauty: by choosing different sixty-four-bit values, each node independently becomes any Boolean gate. AND, OR, XOR, majority vote, identity pass-through, or any of the two to the sixty-four possible functions. This is how FPGAs work. We brought it to graph databases.
4 / 24

The Ranked DAG

Rank 0 Rank 1 Rank 2 Rank N ROOT Decision Reasoning Arguments Evidence
Edges flow upward only: leaves to root. Topologically identical to a multilayer perceptron.
The ranked DAG. Every node lives at a rank. Leaf nodes at the bottom hold raw evidence, atomic facts. Intermediate ranks hold reasoning, arguments, aggregation. The root at rank zero holds the final decision. Edges flow strictly upward: leaves to root. This is topologically identical to a multilayer perceptron in neural networks. Input layer, hidden layers, output. The rank structure is the execution schedule.
5 / 24

Dual-Time Execution

Micro-Time Horizontal / Intra-Rank Nodes debate locally Hex-coloring prevents races Resonate until stable Knaster-Tarski for monotone Paradox Horizon for cycles Output: consensus or UNDEFINED Macro-Time Vertical / Inter-Rank Rank stabilizes, state locks Pass upward to rank N-1 One-way directed edge Perfectly acyclic Deterministic V = 1 rank / tick
Dual-time execution. Two clocks running at different scales. Micro-time is horizontal, within a single rank. Nodes debate locally, like neurons in a layer. Hex-coloring prevents race conditions. The cluster resonates until it reaches consensus. For monotone functions, Knaster-Tarski theorem guarantees convergence. For non-monotone, we cap at a hundred micro-ticks and output undefined, the paradox horizon. Macro-time is vertical: once a rank stabilizes, it locks and passes results strictly upward. One rank per tick. Perfectly acyclic. Perfectly deterministic.
6 / 24

Morton Z-Curve Memory Layout

Memory Address (64-bit): [ 8 bits: Rank MSB ] | [ 56 bits: Morton Z-Curve LSB ] Rank MSB: all nodes of same rank physically adjacent Z-curve LSB: spatially connected nodes in same cache line Bit-shift right = zoom out one octave (1 CPU instruction)
Benefits: Neighbors land in same cache line NVMe writes are sequential Fractal: Z-curve IS the hub tree One shift = one octave zoom Shipped to production: 2.11x at 1B
Morton Z-curve memory layout. Every sixty-four-bit address splits into eight rank bits on top and fifty-six Morton Z-curve bits below. The rank bits keep all nodes of the same rank physically adjacent on NVMe. The Z-curve bits ensure spatially connected nodes land in the same cache line. One bit-shift right zooms out one octave, a single CPU instruction. We already proved this works: two point one one times faster at one billion cells. The Z-curve is natively fractal, which means the hub node splitting tree maps directly to cache lines with zero extra work.
7 / 24

Octaves: Fractal Self-Similarity

Octave 0 (Root) 1 macro-node: final decision Octave 1 6 nodes: high-level conclusions Octave 2 36 nodes: intermediate reasoning Octave N (Leaves) 6^N nodes: raw evidence, facts
Each note in a diatonic scale connects to 6 peers within its octave. When a cluster reaches consensus, it abstracts into a single macro-node one octave up. Renormalization in physics. Hierarchical modular networks in graph theory.
Octaves. Inspired by music theory. Each note in a diatonic scale connects to six peers within its octave. In DagDB, when a local cluster at rank N reaches consensus, it abstracts into a single macro-node at rank N minus one. This is renormalization in physics, hierarchical modular networks in graph theory. The octave is the rank boundary. Morton Z-curves make octaves native to hardware: same rank prefix means physically adjacent on disk.
8 / 24

The Engine: Savanna

14
GCUPS
128
GB Unified Memory
M5
Apple Silicon
ComponentWhat It Does
ELLPACK sparse matrixFixed 6 columns, perfect for GPU
Morton Z-curveConnected nodes adjacent in cache
SpMV coreAdjacency x state = one tick
7-coloringLock-free parallel update
Carlos Delta50x compression, ACID persistence
The engine. Savanna. Already built and battle-tested. Fourteen giga cell updates per second on Apple M five Max with one hundred twenty-eight gigabytes of unified memory. ELLPACK sparse matrix format with fixed six columns. Morton Z-curve ordering so connected nodes sit in the same cache line. Sparse matrix-vector multiplication as the core operation. Seven-coloring for lock-free parallel updates. And Carlos Delta Transport for persistence with fifty times compression. We do not build a new engine. We re-label the concepts.
9 / 24

Mathematical Isomorphism

Do not build a new engine. Re-label the concepts.
Savanna SimulationDagDB Logic
Cell (Zebra)=Node (Premise)
Neighbor (Lion)=Dependency (Counter-argument)
State (Alive/Dead)=Truth-state (Valid/Invalid)
6 spatial neighbors=6 logical edges
Evaluation tick=Inference step
Scent diffusion=Truth propagation
Morton Z-curve=Morton Z-curve
Carlos Delta=Carlos Delta
The mathematical isomorphism. This is why DagDB exists. A zebra cell in Savanna is a premise node in DagDB. A lion neighbor is a counter-argument dependency. Alive or dead state maps to valid or invalid truth state. Six spatial neighbors become six logical edges. An evaluation tick becomes an inference step. Scent diffusion becomes truth propagation. Even the infrastructure carries over unchanged: Morton Z-curve, Carlos Delta, the Metal shaders. Same engine. Different interpretation.
10 / 24

Hub Nodes: 6-ary Fractal Fan-In

Problem 46,656 edges Solution: Hex-Tree hub Each virtual node: exactly 6 edges 46,656 = 6^6 depth-6 tree Morton Z: same cache line
Hub nodes. Real knowledge graphs have power-law distributions. A node like water or substation might have forty-six thousand edges. That violates the rule of six. The resolution: six-ary fractal fan-in. A hub with six to the sixth edges compiles into a six-deep micro-tree of virtual nodes. Each virtual node has exactly six edges, all within bounds. The Morton Z-curve is natively fractal. The virtual nodes share the same rank prefix and sequential Z-curve bits, so they land in the same L one cache line. The sparse matrix engine doesn't even know it's a hub.
11 / 24

Convergence: Knaster-Tarski + Paradox Horizon

Monotone Functions (AND, OR, MAJORITY) Guaranteed fixed point (Knaster-Tarski theorem) Non-Monotone (XOR, NOT, cycles) MAX_MICRO_TICKS = 100 Output: UNDEFINED
Reality doesn't always cleanly resolve. Sometimes the argument is a paradox. Let the DAG natively highlight the paradox.
Convergence. How do we know micro-time terminates? Two tiers. For monotone Boolean functions like AND, OR, and majority, the Knaster-Tarski theorem mathematically guarantees a fixed point. They always converge. For non-monotone functions like XOR rings or NOT gates, oscillation is possible. Here we enforce a paradox horizon: a maximum of one hundred micro-ticks. If the cluster hasn't stabilized, it outputs undefined, the third truth state. This is elegant: reality doesn't always cleanly resolve. Sometimes the argument is genuinely paradoxical. Let the database say so.
12 / 24

Skip Connections: Ghost Nodes

R0R1 R2R3 Leaf Root Leaf Ghost Ghost LUT = IDENTITY (pass-through)
Memory is cheap. Execution stalls are fatal.
Skip connections. Real arguments skip ranks. Evidence at rank ten might directly support a rank zero conclusion. But direct rank-skipping breaks the layer-parallel schedule. The resolution: identity ghost nodes. The compiler pads intermediate ranks with ghost nodes whose LUT is set to identity, pure pass-through. Truth state rides the elevator up through ranks, migrating physically closer in memory at each step. Memory is cheap. Execution stalls are fatal. The M five processes millions of identity gates in nanoseconds because ELLPACK stays dense and predictable.
13 / 24

Orthogonal Wavefronts

V_macro = 1 rank/tick Deterministic V_micro ~ S_LUT / Colors OR (permissive) FAST propagation AND (skeptical) SLOW propagation The logic topology IS the velocity.
Orthogonal wavefronts. The Fisher K P P equation from reaction-diffusion has a direct analog in DagDB. Vertical: macro-time propagation speed is exactly one rank per tick. Deterministic, universal. Horizontal: micro-time speed depends on the sensitivity of local LUT gates. OR gates are permissive: one true input fires immediately, truth flashes at maximum speed. AND gates are skeptical: all six inputs must be true, propagation is slow and requires massive corroboration. The logic topology IS the information velocity. Choose your gates. Choose your speed.
14 / 24

Persistence: Carlos Delta Transport

50x
Compression
ACID
Guarantees
0
CPU Copies
PropertyHow
AtomicityEach delta block = one complete tick
Consistency6-degree bound prevents invalid graphs
IsolationHex coloring + rank = lock-free
DurabilityCompressed deltas on NVMe, bidirectional
Carlos Delta Transport. Already built and shipped. Fifty times delta compression because about ninety-eight percent of nodes remain static each tick. GPU writes directly to NVMe, bypassing the CPU entirely. Full ACID: atomicity from one delta block per tick. Consistency from the six-degree bound. Isolation from hex coloring and rank scheduling, all lock-free. Durability from compressed deltas on NVMe. And it's bidirectional: stream forward to persist, stream backward to reconstruct any historical state. Time-travel queries with zero extra infrastructure.
15 / 24

SQL Interface: PostgreSQL

[Any SQL Client] | v [Postgres Wire Protocol] | v [Berkeley Cost-Based Optimizer] (40 years of query planning) | v [CustomScan / ExecutorRun_hook] (intercept BEFORE Postgres CPU touches data) | v [Apple Metal API] (GPU executes SpMV in unified memory) | v [Virtual Tuples returned]
S Q L interface. PostgreSQL as query planner. Any S Q L client connects via the standard Postgres wire protocol. The Berkeley cost-based optimizer, forty years of query planning expertise, generates the execution plan. Then our custom scan hook intercepts before Postgres's CPU touches any data. The hook translates relational operations into Metal G P U graph operations. Sparse matrix-vector multiplication executes entirely in unified memory. Virtual tuples return to Postgres. Standard S Q L in, G P U-accelerated graph traversal out.
16 / 24

SQL to GPU Graph Operations

Postgres OpGPU Equivalent
SeqScanColor-wave sweep with predicate filter
IndexScanDirect array index O(1)
FilterPer-node predicate, pure SIMD
JoinFollow edges = graph traversal (6-bounded = O(1))
AggregateParallel reduction kernels
Sort/OrderByParallel radix sort
Group ByParallel hash-based grouping
Every relational operation has a direct GPU graph equivalent. 6-bounded degree makes JOIN cost constant, not variable.
S Q L to G P U graph operations. Every relational operation maps directly. Sequential scan becomes a color-wave sweep. Index scan is a direct array index, O of one. Filter is per-node predicate in pure SIMD. The critical one: JOIN. In traditional databases, join cost depends on table size. In DagDB, a join is following edges in the graph. Six-bounded degree means join cost is constant, not variable. Aggregate becomes parallel reduction. Sort becomes parallel radix sort. The G P U does all of it in unified memory.
17 / 24

Platform Architecture

Agora (M5) The Reasoner 128GB Unified Sparse graph traversal SpMV + Carlos Delta 14 GCUPS Zero-copy GPU+CPU Sparta (4090) Translator 24GB VRAM Dense LLM inference PDF ingestion Fact extraction NL to 6-bounded Athens (3090) Translator 24GB VRAM Async backprop Additional LLM capacity
NVIDIA translates unstructured reality into 6-bounded Boolean leaves. The Mac reasons over the structured graph.
Platform architecture. Three machines, each with a distinct role. Agora, the M five Max with one hundred twenty-eight gigs of unified memory: the reasoner. Hot path. Active DAG evaluation, sparse matrix-vector multiplication, Carlos Delta persistence. Unified memory means zero-copy between GPU and CPU, critical for sparse graph traversal. Sparta, an NVIDIA forty-ninety with twenty-four gigs: the translator. Dense L L M inference, P D F ingestion, fact extraction, natural language to six-bounded Boolean leaves. Athens, a thirty-ninety: additional translation capacity. The NVIDIA boxes translate reality. The Mac reasons over the result.
18 / 24

Deterministic Replay

Seed = Murmur3( Genesis_Epoch_Hash + Tick_Number ) Carlos Delta guarantees tick atomicity Replay hashes tick number Perfectly predicts color shuffle Zero bytes added to delta stream
Deterministic replay. We shuffle the seven-color dispatch order every tick to prevent chromatic advection, a subtle directional bias. But replay needs exact determinism. The solution: tick-derived pseudo-random number generation. Hardcode Murmur three in Metal. The seed is the genesis epoch hash plus the tick number. Carlos Delta guarantees tick atomicity, so replaying just hashes the tick number and perfectly predicts the shuffle. Zero bytes added to the delta stream. Any tick can be exactly reconstructed from genesis plus the tick number. Cryptographic-grade determinism at zero storage cost.
19 / 24

Review: 3 Passes, 621 Viewpoints

3
Review Passes
621
Viewpoints
5
Resolutions
PassMethodResult
Deep Research8 areas analyzed6 green, 1 yellow, 1 novel
Harmonic Resolver3 models x 9 rounds69 claims, Asymmetric Veto
Gemini Deep Think5 critical questions5/5 resolved
Models: Gemma 4 31B · Llama 4 Scout · Qwen Coder 30B · Gemini Deep Think
The review process. Before writing a single line of code, this architecture went through three independent review passes. First, Deep Research analyzed eight areas of feasibility: six green, one yellow for hub nodes, one flagged as novel. Second, the Harmonic Resolver: three local models, Gemma four, Llama Scout, and Qwen Coder, debated sixty-nine claims across nine rounds with Asymmetric Veto, generating six hundred twenty-one total viewpoints. Third, Gemini Deep Think resolved all five remaining critical questions. Only after all three passes gave green light did we build.
20 / 24

Phase 0.1: First Run

DagDB v0.1 -- 6-Bounded Ranked DAG Evaluator Grid: 16x16 = 256 nodes 7-coloring: [37, 37, 36, 36, 36, 37, 37] Engine ready. GPU: Apple M5 Max Running tick 0 (leaves-up evaluation)... Elapsed: 4.19 ms LUT6 unit tests: PASS AND6(0x3F) = 1 PASS AND6(0x3E) = 0 PASS OR6(0x01) = 1 PASS OR6(0x00) = 0 PASS XOR6(0x03) = 0 (even parity) PASS XOR6(0x07) = 1 (odd parity) PASS MAJ6(0x0F) = 1 (4 of 6) PASS MAJ6(0x07) = 0 (3 of 6) Build 0.1: COMPLETE
Phase zero point one: first run. The engine compiles and runs on Apple M five Max. Two hundred fifty-six nodes on a sixteen by sixteen hex grid. Seven-coloring balanced across thirty-six to thirty-seven nodes per group. Tick zero completes in four point one nine milliseconds with leaves-up rank evaluation. All eight LUT six unit tests pass: AND, OR, XOR parity, majority vote. Build zero point one is complete. The foundation works.
21 / 24

Verification Gates

GateCriterionStatus
G1swift build passes with DagDB targetPASS
G2LUT6 correct for AND/OR/XOR/MAJ8/8 PASS
G3Leaves-up execution on ranked DAGPASS (4.19ms)
G4Carlos Delta save/restorePhase 2
G5Postgres SELECT * returns dataPhase 3
G61M-node graph < 10msPhase 2
G7Time-travel query reconstructs statePhase 2

Files Built

DagDB/ Package.swift Sources/ DagDB/ DagDBState.swift (truth, rank, LUT6 buffers) DagDBEngine.swift (Metal compute engine) HexGrid.swift (reused from Savanna) CarlosDelta.swift (reused -- persistence) Shaders/dagdb.metal (LUT6 tick kernel) DagDBCLI/main.swift (test harness)
Verification gates. Three of seven gates are green. Gate one: swift build passes. Gate two: LUT six correct for all eight test cases. Gate three: leaves-up execution works at four point one nine milliseconds. Gates four through seven are pending for later phases: Carlos Delta persistence, Postgres integration, million-node scaling, and time-travel queries. The file structure is clean: DagDB state, engine, hex grid reused from Savanna, Carlos Delta reused, the Metal shader, and a C L I test harness.
22 / 24

Roadmap

5
Phases Remaining
7
Verification Gates
3/7
Gates Passed
Roadmap. Phase zero point one is done. Five phases remain. Phase zero point two: generalize the neighbor table to support edge weights, enabling continuous mode. Phase one: full ranked DAG kernel with intra-rank resonance and the paradox horizon. Phase two: wire Carlos Delta Transport for DagDB persistence and time-travel. Phase three: the PostgreSQL custom scan executor hook for S Q L access. Phase four: SwiftUI application for visualization. Phase five: M C P server integration with the NVIDIA translation cluster. Seven verification gates, three passed so far.
23 / 24

DagDB

Constraints do not limit a system. They make it viable.
6 edges · 64-bit LUT · Morton Z-curve · 14 GCUPS
Carlos Delta · PostgreSQL · Apple M5 Max
This is an amateur engineering project. We are not HPC or database professionals
and make no competitive claims. Numbers speak; ego does not. Errors likely.
Norayr Matevosyan · 2026
DagDB. A six-bounded ranked directed acyclic graph database. One constraint: six edges per node. From that, everything follows. LUT six gives you any Boolean function in one clock cycle. Morton Z-curve puts connected nodes in the same cache line. Savanna engine delivers fourteen giga cell updates per second. Carlos Delta provides ACID persistence with fifty times compression. PostgreSQL gives you forty years of query planning. And Apple's unified memory architecture makes sparse graph traversal viable at scale. Constraints do not limit a system. They make it viable. Thank you.
24 / 24
Press Space to narrate · ↑↓ to navigate · Works offline