Abstractor
System design primitives for computing — memory hierarchies, derived data, synchronization, and the fundamental concepts that underlie all software systems.
System Primitives
These constraints are rooted in physics and fundamental CS. They apply at every layer. Abstractions can defer them, transform them, or trade one for another — but never eliminate them.
Derived Data - Systems
Computation and storage are separated by distance — in the memory hierarchy, across the network, even in time. When crossing that distance repeatedly is too expensive, you store a copy closer. That copy is derived data.
Every cache, replica, index, materialized view, denormalized table, and memoized result is the same pattern: store a transform of source data closer to consumption. Pay for sync.
The Three Choices
- 1. What transform?
- Identity → CPU cache, CDN, replica
- Projection → covering index, column store
- Aggregation → materialized view, rollup
- Structure change → B-tree, hash, inverted index
- Lossy → bloom filter, HyperLogLog, sketch
- 2. Where to store?
- Memory → register → L1 → L2 → L3 → RAM → SSD → disk
- Network → same-process → machine → rack → DC → region → edge
- Time → precomputed → on-demand → lazy
- 3. How to sync?
- Sync on write → strong consistency, write pays RTT
- Invalidate → strong consistency, invalidation fanout
- TTL → bounded staleness, no coordination
- Version on read → strong consistency, read pays check
- Never → immutable source, no cost
Unification
| Name | Transform | Location | Sync |
|---|---|---|---|
| CPU L1 cache | identity | RAM → L1 | coherence protocol |
| CDN | identity | origin → edge | TTL / purge |
| Redis cache | identity / projection | disk → RAM | TTL / invalidate |
| Database replica | identity | primary → secondary | sync / async |
| Materialized view | aggregation | compute → storage | refresh / periodic |
| Index | projection + structure | scan → lookup | sync on write |
| Denormalized table | join | join-time → storage | dual write |
| Memoization | full result | compute → lookup | none (pure) |
| Bloom filter | lossy projection | set → bits | rebuild |
What This Explains
- Cache invalidation is hard — it's distributed coordination
- Immutability is powerful — no sync needed, copies are forever valid
- Indexes slow writes — sync obligation on every mutation
- Eventual consistency exists — coordination is expensive, defer it
- CDNs use TTL — bounded staleness avoids coordination
- Denormalization is dangerous — multiple sync points, easy to diverge
- Memoization is easy — pure functions have immutable inputs
Concrete Tradeoff Examples
Real-world technology choices mapped to the primitives above. Hover to see which constraints apply.
Derived Data - Languages
This pattern applies at every layer of the stack.
LANGUAGE: Value → alias → copy → register
Isomorphism
| Derived Data (System) | Derived Data (Language) |
|---|---|
| Source | Original value / memory location |
| Transform | Copy / reference / move / borrow |
| Store closer | Register, stack, local variable, cache line |
| Two representations | Multiple references to same data |
| Sync obligation | Coherence: who can read/write when? |
Layers
| Layer | Source | Derived Copy | Sync Strategy |
|---|---|---|---|
| CPU cache | RAM | L1/L2/L3 line | MESI protocol |
| Compiler | Memory | Register | Register allocation |
| Language | Original binding | Alias / reference | Borrow checker / locks / GC |
| Thread | Shared heap | Thread-local | Mutex / channels / atomics |
| Process | Shared memory | Process-local | IPC / message passing |
| Database | Primary | Replica | Replication protocol |
| Network | Origin server | CDN edge | TTL / invalidation |
| Geo-distributed | Region A | Region B | Eventual consistency |
Triangle
Three primitives rooted in physics:
TIME
(when)
△
/|\
/ | \
/ | \
/ STATE \
/ | \
/ | \
/ | \
SPACE ----+---- IDENTITY
(where) (which)
| Primitive | Physical Root | Question |
|---|---|---|
| SPACE | Locality, memory hierarchy | Where does data reside? |
| TIME | Causality, sequence | When does it exist/change? |
| IDENTITY | Equivalence, sameness | Are these the same data? |
STATE emerges from the triangle:
Dimensions of TIME
TIME is overloaded in programming:
| Dimension | Question | Example |
|---|---|---|
| Execution | What order? | Statement A before B |
| Parallel | Simultaneous? | Thread 1 and Thread 2 |
| Existence | How long? | Value lifetime, reference validity |
Coherence problems arise at the intersections:
Existence TIME mismatch → dangling references, use-after-free
Coherence Problem
Three ingredients:
Remove any one:
| Remove | Strategy |
|---|---|
| Shared Identity | Value semantics, deep copy |
| Multiple Spaces | Single source of truth |
| Time flows | Immutability |
Operations
| Operation | Meaning | Parallel Hazard |
|---|---|---|
| Read | Observe value | Stale/torn reads |
| Write | Mutate value | Lost update, race |
| Copy | Create independent duplicate | Torn copy |
| Move | Transfer ownership, invalidate source | Double-move |
| Alias | Second reference to same location | Data race |
| Sync | Reconcile divergent copies | (the solution) |
Sync Strategies
| Strategy | Language Level |
|---|---|
| Forbid the problem | Ownership (move semantics) |
| Freeze time | Immutable bindings |
| Serialize access | Mutex, RwLock |
| Hardware arbitration | Atomics, CAS |
| Compile-time proof | Borrow checker |
| Copy-on-write | CoW data structures |
| Message passing | Channels |
| Optimistic | STM, persistent structures |
| Trust the user | unsafe, raw pointers |
Language Choices
| Language | IDENTITY | TIME | Coherence |
|---|---|---|---|
| Haskell | Shared freely | Frozen | No mutation → no problem |
| Erlang | Process isolation | Frozen + messages | Copy between processes |
| Clojure | Shared freely | Frozen | Persistent structures |
| Rust | Ownership + borrowing | Controlled | Compile-time proof |
| Go | Shared | Free | Channels or locks |
| Java | Shared | Free | Locks / volatile |
| C/C++ | Unrestricted | Free | Programmer responsibility |
| JavaScript | Shared | Free | Single-threaded |
| Python | Shared | Free | GIL |
Language Constructs
| Construct | TIME | SPACE | IDENTITY | Coherence |
|---|---|---|---|---|
| Register | Runtime | Register | Unique | N/A |
| Stack variable | Runtime | Stack | Scoped | Lexical scope |
| Heap allocation | Runtime | Heap | Reference(s) | Manual / GC / ownership |
| Compile-time const | Compile | Inlined | N/A | None needed |
| Static / global | Program lifetime | Data segment | Global | Atomic / lock / immutable |
| Thread-local | Runtime | Per-thread | Thread-scoped | No sharing |
| Immutable value | Frozen | Any | Shared freely | Frozen → valid |
| Mutable + locked | Serialized | Heap | Shared | Mutex |
| Atomic | Hardware | RAM | Shared | Hardware coherence |
| Channel message | Runtime | Copied | Transferred | No shared state |
Examples
- Immutability enables safe sharing — frozen TIME allows IDENTITY to span SPACEs
- Rust's fearless concurrency — compile-time proof of IDENTITY rules
- Locks are slow — serialize TIME globally, threads wait
- Lock-free is hard — hardware IDENTITY arbitration is subtle
- GC doesn't prevent data races — GC manages SPACE, not IDENTITY+TIME
- JavaScript is single-threaded — serialize TIME globally
- Python's GIL — serialize TIME at interpreter level
- Value types are easier — copy creates new IDENTITY
- Pointers are dangerous — unrestricted IDENTITY + free TIME
- const differs across languages — different TIME/IDENTITY choices
Tradeoff
FLEXIBILITY ◄─────────────────────────► SAFETY
Unrestricted aliasing Restricted IDENTITY
Free mutation Controlled TIME
Manual management Compiler/runtime enforced
│ │
▼ ▼
Maximum power Maximum guarantees
C, unsafe Rust Haskell, Rust safe, Erlang
Primitive Interactions
Programs add an expression layer to the primitives:
| Expression | What it introduces |
|---|---|
| Variables | Names for SPACE |
| Scopes | Bounded regions of TIME |
| Functions | Reusable TIME sequences |
| Types | Constraints on SPACE contents |
| References | IDENTITY relationships |
| Threads | Parallel TIME lines |
All programming concepts emerge from interactions between primitives and expression.
Pairwise Interactions
| Interaction | Question | Concepts |
|---|---|---|
| SPACE × TIME | When does memory exist? | Allocation, deallocation, lifetime, scope, mutation |
| SPACE × IDENTITY | How many paths to this memory? | Variable, pointer, alias, copy, move, null |
| TIME × IDENTITY | When is a name valid? | Declaration, scope, shadowing, rebinding, drop |
Three-way Interaction
When SPACE × TIME × IDENTITY interact simultaneously:
SPACE ────────── TIME
\ /
\ /
\ /
\ /
IDENTITY
Center = hard problems
| Scenario | Interaction | Result |
|---|---|---|
| Parallel TIME + shared SPACE + multiple IDENTITY | Concurrent mutation | Data race |
| IDENTITY outlives SPACE in TIME | Reference to freed memory | Dangling pointer |
| SPACE freed, IDENTITY used later | Access after deallocation | Use-after-free |
| SPACE freed twice in TIME | Double deallocation | Double free |
| IDENTITY transferred, old used in TIME | Access after move | Use-after-move |
| Multiple IDENTITY + mutation + overlapping TIME | Writes interleave | Race condition |
Bugs as Interaction Failures
| Bug | Failed Interaction | What went wrong |
|---|---|---|
| Memory leak | SPACE × TIME | SPACE exists past needed TIME |
| Use-after-free | SPACE × TIME × IDENTITY | IDENTITY used after SPACE's TIME ends |
| Dangling pointer | TIME × IDENTITY | IDENTITY outlives referent |
| Data race | SPACE × TIME × IDENTITY | Parallel TIME + shared SPACE + mutation |
| Double free | SPACE × TIME | SPACE deallocated twice |
| Null dereference | SPACE × IDENTITY | IDENTITY points to no SPACE |
| Buffer overflow | SPACE × IDENTITY | IDENTITY exceeds SPACE bounds |
| Uninitialized read | SPACE × TIME × IDENTITY | IDENTITY used before SPACE has value |
Features as Interaction Solutions
SPACE × TIME solutions:
| Feature | Mechanism |
|---|---|
| Garbage collection | Runtime tracks SPACE, frees when unreachable |
| RAII | Tie SPACE lifetime to scope (TIME) |
| Reference counting | Track IDENTITY count, free when zero |
| Stack allocation | SPACE lifetime = function TIME |
SPACE × IDENTITY solutions:
| Feature | Mechanism |
|---|---|
| Ownership | Unique IDENTITY to SPACE |
| Move semantics | Transfer IDENTITY, invalidate source |
| Value types | Copy creates new SPACE, new IDENTITY |
| Nullable types | Explicit "IDENTITY to no SPACE" |
TIME × IDENTITY solutions:
| Feature | Mechanism |
|---|---|
| Lexical scope | IDENTITY valid in TIME region |
| Lifetimes | Explicit IDENTITY validity bounds |
| Closures | Extend IDENTITY across TIME boundaries |
| Drop order | Defined IDENTITY end sequence |
SPACE × TIME × IDENTITY solutions:
| Feature | Mechanism |
|---|---|
| Borrow checker | Prove all three consistent at compile time |
| Locks/Mutex | Serialize TIME access to SPACE |
| Atomics | Hardware-arbitrated SPACE × TIME |
| Channels | Transfer IDENTITY, no shared SPACE |
| Immutability | Freeze TIME dimension, sharing safe |
| Actor model | Isolate SPACE per actor, message only |
| Linear types | IDENTITY used exactly once |
Paradigms as Interaction Strategies
| Paradigm | Strategy | Constrains | Tradeoff |
|---|---|---|---|
| Functional | Freeze mutation | TIME (no state change) | Easy concurrency ↔ efficiency |
| OOP | Encapsulate memory | SPACE (hide behind interface) | Modularity ↔ aliasing complexity |
| Rust | Restrict aliasing | IDENTITY (ownership) | Safety + performance ↔ learning curve |
| Actor | Isolate memory | SPACE (no sharing) | Fault isolation ↔ message overhead |
| Linear | Single use | IDENTITY (exactly once) | Resource safety ↔ flexibility |
Representation Constraints
Programs are expressed in a medium: text files, ASTs, bytecode. The medium has constraints independent of SPACE/TIME/IDENTITY.
(SPACE/TIME/IDENTITY) (representation) (workarounds)
Many features exist not because of computational necessity, but because the representation can’t express what we want directly.
Constraints
| Constraint | What it means |
|---|---|
| Forward-only TIME | Execution proceeds forward; can't undo a statement |
| Names persist | Once declared, a name exists until scope ends |
| Values persist | A value exists until scope ends; can't delete mid-scope |
| Stack is LIFO | Can only deallocate top of stack |
| Text is sequential | One statement after another |
Features as Workarounds
| Want to... | Can't because... | Workaround |
|---|---|---|
| Delete a name | Names persist in scope | Shadowing |
| Delete a value | Values persist until scope end | Move + invalidation |
| Free mid-stack | Stack is LIFO | Heap allocation |
| Go back in TIME | Forward-only | Loops |
| Undo mutation | Forward-only | Immutability |
| Parallel execution | Text is sequential | Explicit threads/async |
Shadowing — Can’t Delete Names
let x = 5; // Want: delete x, reclaim the name // Can't: name persists until scope ends // Workaround: shadow let x = "hello"; // New IDENTITY, same name // Old x still in memory, just unreachable
Shadowing exists because names can’t be undeclared. The old binding still exists — destructors run at scope end, not at shadow point.
Move — Can’t Delete Values
let x = vec![1, 2, 3]; let y = x; // Want: delete x entirely after transfer // Can't: name 'x' persists in scope // Workaround: invalidate the IDENTITY // x still exists as a name, but IDENTITY is severed // Compiler tracks: "name exists, IDENTITY gone"
Move semantics exist because values can’t be deleted mid-scope. We transfer IDENTITY and mark the source as invalid.
Heap — Can’t Free Mid-Stack
Stack (LIFO):
┌─────┐
│ c │ ← top, can free
├─────┤
│ b │ ← can't free until c is gone
├─────┤
│ a │ ← can't free until b, c are gone
└─────┘
Heap exists because stack forces LIFO TIME on SPACE. Heap allows independent lifetimes, shared IDENTITY, dynamic size.
SSA — Making Constraints Explicit
Compilers transform to SSA (Static Single Assignment):
// Source let mut x = 5; x = x + 1; x = x * 2; // SSA let x1 = 5; let x2 = x1 + 1; let x3 = x2 * 2;
SSA reveals the truth: we never “modified” x. We created new values and reused the name.
- Can’t delete names → each assignment is a new IDENTITY
- Can’t go back → values flow forward
- Mutation is illusion → it’s name rebinding
Alternative Representations
| Representation | TIME | IDENTITY | SPACE | Key Workarounds |
|---|---|---|---|---|
| Imperative text | Forward-only | Names persist | Stack LIFO | Shadow, move, heap, loops |
| SSA | Forward-only | Unique names | Explicit | Phi nodes |
| Stack-based | Forward-only | Position, not name | Explicit | Stack shuffling |
| Dataflow graph | By dependency | Nodes | Edges | Control dependencies |
| Logic | Declarative | Unification | Automatic | Cut, clause order |
Abstraction Layers
Every layer in computing—hardware or software—can be understood as an abstractor: hiding complexity below while exposing a simpler interface above.
Click a concept to see its dependency thread. Hover to see connections.
Layer Details
Expand each layer to see what it abstracts (hides), exposes (provides), leaks (breaks through), and escapes (workarounds).