Abstractor

System Primitives

These constraints are rooted in physics and fundamental CS. They apply at every layer. Abstractions can defer them, transform them, or trade one for another — but never eliminate them.


Derived Data - Systems

Physics creates distance. Distance forces copies. Copies require coherence.

Computation and storage are separated by distance — in the memory hierarchy, across the network, even in time. When crossing that distance repeatedly is too expensive, you store a copy closer. That copy is derived data.

Source → transform → store closer → faster access → two representations → sync obligation

Every cache, replica, index, materialized view, denormalized table, and memoized result is the same pattern: store a transform of source data closer to consumption. Pay for sync.

The Three Choices

  • 1. What transform?
    • Identity → CPU cache, CDN, replica
    • Projection → covering index, column store
    • Aggregation → materialized view, rollup
    • Structure change → B-tree, hash, inverted index
    • Lossy → bloom filter, HyperLogLog, sketch
  • 2. Where to store?
    • Memory → register → L1 → L2 → L3 → RAM → SSD → disk
    • Network → same-process → machine → rack → DC → region → edge
    • Time → precomputed → on-demand → lazy
  • 3. How to sync?
    • Sync on write → strong consistency, write pays RTT
    • Invalidate → strong consistency, invalidation fanout
    • TTL → bounded staleness, no coordination
    • Version on read → strong consistency, read pays check
    • Never → immutable source, no cost
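The three choices compose into one concrete design. A minimal sketch (the `TtlCache` type is illustrative, not a library API): an in-process cache that picks transform = identity, location = local RAM, sync = TTL.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// A minimal derived-data store: transform = identity, location = in-process
// RAM (a HashMap), sync = TTL (bounded staleness, no coordination).
struct TtlCache<V> {
    entries: HashMap<String, (V, Instant)>,
    ttl: Duration,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    // Return the cached copy only while it is within its staleness bound.
    fn get(&self, key: &str) -> Option<V> {
        self.entries.get(key).and_then(|(v, written)| {
            (written.elapsed() < self.ttl).then(|| v.clone())
        })
    }

    // Storing a copy creates the sync obligation; TTL discharges it by expiry.
    fn put(&mut self, key: &str, value: V) {
        self.entries.insert(key.to_string(), (value, Instant::now()));
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(60));
    cache.put("user:42", "Ada".to_string());
    assert_eq!(cache.get("user:42"), Some("Ada".to_string()));
    assert_eq!(cache.get("user:7"), None); // miss: fall back to the source
}
```

Swapping the TTL check for an explicit `invalidate` call, or the HashMap for a remote store, moves the design to a different row of the table below without changing the pattern.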

Unification

| Name | Transform | Location | Sync |
|---|---|---|---|
| CPU L1 cache | identity | RAM → L1 | coherence protocol |
| CDN | identity | origin → edge | TTL / purge |
| Redis cache | identity / projection | disk → RAM | TTL / invalidate |
| Database replica | identity | primary → secondary | sync / async |
| Materialized view | aggregation | compute → storage | refresh / periodic |
| Index | projection + structure | scan → lookup | sync on write |
| Denormalized table | join | join-time → storage | dual write |
| Memoization | full result | compute → lookup | none (pure) |
| Bloom filter | lossy projection | set → bits | rebuild |
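The "Index" row above can be sketched in a few lines of Rust (the `Table` type is illustrative): the inverted index is derived data with transform = projection + structure change, and its sync strategy is "on write", which is exactly why indexes slow writes.

```rust
use std::collections::{HashMap, HashSet};

// An index is derived data: transform = projection + structure change,
// sync = on write — every mutation of the source must update the copy.
#[derive(Default)]
struct Table {
    rows: HashMap<u32, String>,             // source of truth
    by_word: HashMap<String, HashSet<u32>>, // derived: word -> row ids
}

impl Table {
    fn insert(&mut self, id: u32, text: &str) {
        self.rows.insert(id, text.to_string());
        // The sync obligation: writes pay for index maintenance.
        for word in text.split_whitespace() {
            self.by_word.entry(word.to_string()).or_default().insert(id);
        }
    }

    fn find(&self, word: &str) -> Vec<u32> {
        self.by_word
            .get(word)
            .map(|ids| ids.iter().copied().collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut t = Table::default();
    t.insert(1, "hello world");
    t.insert(2, "hello rust");
    let mut hits = t.find("hello");
    hits.sort();
    assert_eq!(hits, vec![1, 2]);
}
```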

What This Explains

  • Cache invalidation is hard — it's distributed coordination
  • Immutability is powerful — no sync needed, copies are forever valid
  • Indexes slow writes — sync obligation on every mutation
  • Eventual consistency exists — coordination is expensive, defer it
  • CDNs use TTL — bounded staleness avoids coordination
  • Denormalization is dangerous — multiple sync points, easy to diverge
  • Memoization is easy — pure functions have immutable inputs
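The last point — memoization needs no sync because pure functions have immutable inputs — is easy to see in code. A sketch in Rust:

```rust
use std::collections::HashMap;

// Memoization as derived data: transform = the full function result,
// location = a lookup table, sync = none, because a pure function's
// inputs are immutable and the source can never diverge from the copy.
fn fib(n: u64, memo: &mut HashMap<u64, u64>) -> u64 {
    if n < 2 {
        return n;
    }
    if let Some(&v) = memo.get(&n) {
        return v; // the cached copy is forever valid
    }
    let v = fib(n - 1, memo) + fib(n - 2, memo);
    memo.insert(n, v);
    v
}

fn main() {
    let mut memo = HashMap::new();
    assert_eq!(fib(40, &mut memo), 102_334_155);
}
```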

Concrete Tradeoff Examples

Real-world technology choices mapped to the primitives above.


Derived Data - Languages

Physics creates distance. Distance forces copies. Copies require coherence.

This pattern applies at every layer of the stack.

SYSTEM:   Source DB → replica → CDN edge → browser cache
LANGUAGE: Value → alias → copy → register

Isomorphism

| Derived Data (System) | Derived Data (Language) |
|---|---|
| Source | Original value / memory location |
| Transform | Copy / reference / move / borrow |
| Store closer | Register, stack, local variable, cache line |
| Two representations | Multiple references to same data |
| Sync obligation | Coherence: who can read/write when? |

Layers

| Layer | Source | Derived Copy | Sync Strategy |
|---|---|---|---|
| CPU cache | RAM | L1/L2/L3 line | MESI protocol |
| Compiler | Memory | Register | Register allocation |
| Language | Original binding | Alias / reference | Borrow checker / locks / GC |
| Thread | Shared heap | Thread-local | Mutex / channels / atomics |
| Process | Shared memory | Process-local | IPC / message passing |
| Database | Primary | Replica | Replication protocol |
| Network | Origin server | CDN edge | TTL / invalidation |
| Geo-distributed | Region A | Region B | Eventual consistency |

Triangle

Three primitives rooted in physics:

                      TIME
                     (when)
                       △
                      /|\
                     / | \
                    /  |  \
                   / STATE \
                  /    |    \
                 /     |     \
                /      |      \
             SPACE ----+---- IDENTITY
            (where)         (which)

| Primitive | Physical Root | Question |
|---|---|---|
| SPACE | Locality, memory hierarchy | Where does data reside? |
| TIME | Causality, sequence | When does it exist/change? |
| IDENTITY | Equivalence, sameness | Are these the same data? |

STATE emerges from the triangle:

STATE = f(SPACE, TIME)

Dimensions of TIME

TIME is overloaded in programming:

| Dimension | Question | Example |
|---|---|---|
| Execution | What order? | Statement A before B |
| Parallel | Simultaneous? | Thread 1 and Thread 2 |
| Existence | How long? | Value lifetime, reference validity |

Coherence problems arise at the intersections:

Parallel TIME + shared SPACE → data races
Existence TIME mismatch → dangling references, use-after-free

Coherence Problem

Three ingredients:

  SHARED IDENTITY     Multiple paths to "the same" data
+ MULTIPLE SPACES     Data exists in more than one location
+ TIME FLOWS          Mutation can occur
─────────────────────────────────────────────────────────
= COHERENCE PROBLEM   Which STATE is true? How to sync?

Remove any one:

| Remove | Strategy |
|---|---|
| Shared IDENTITY | Value semantics, deep copy |
| Multiple SPACES | Single source of truth |
| TIME flows | Immutability |
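All three removal strategies can be shown in a few lines of Rust (a sketch; the bindings are illustrative):

```rust
use std::sync::Arc;
use std::thread;

// Each strategy removes one ingredient of the coherence problem.
fn main() {
    // 1. Remove shared IDENTITY: value semantics — each thread gets a deep copy.
    let data = vec![1, 2, 3];
    let copy = data.clone(); // independent value, no path back to `data`
    thread::spawn(move || copy.len()).join().unwrap();

    // 2. Remove flowing TIME: immutability — shared across threads, but
    //    frozen, so every copy is valid forever.
    let frozen = Arc::new(vec![1, 2, 3]);
    let handle = {
        let frozen = Arc::clone(&frozen);
        thread::spawn(move || frozen.iter().sum::<i32>())
    };
    assert_eq!(handle.join().unwrap(), 6);

    // 3. Remove multiple SPACES: single source of truth — transfer the one
    //    value instead of copying it, so only one location ever holds it.
    let owned = String::from("only copy");
    let handle = thread::spawn(move || owned.len()); // ownership moves in
    assert_eq!(handle.join().unwrap(), 9);
}
```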

Operations

| Operation | Meaning | Parallel Hazard |
|---|---|---|
| Read | Observe value | Stale/torn reads |
| Write | Mutate value | Lost update, race |
| Copy | Create independent duplicate | Torn copy |
| Move | Transfer ownership, invalidate source | Double-move |
| Alias | Second reference to same location | Data race |
| Sync | Reconcile divergent copies | (the solution) |

Sync Strategies

| Strategy | Language Level |
|---|---|
| Forbid the problem | Ownership (move semantics) |
| Freeze time | Immutable bindings |
| Serialize access | Mutex, RwLock |
| Hardware arbitration | Atomics, CAS |
| Compile-time proof | Borrow checker |
| Copy-on-write | CoW data structures |
| Message passing | Channels |
| Optimistic | STM, persistent structures |
| Trust the user | unsafe, raw pointers |
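"Serialize access" is the most familiar strategy. A sketch: a mutex forces all writers into a single TIME line, so the shared SPACE only ever holds one coherent STATE.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // Only one thread holds the lock at a time: TIME is
                    // serialized, so no update is lost.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*counter.lock().unwrap(), 8000);
}
```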

Language Choices

| Language | IDENTITY | TIME | Coherence |
|---|---|---|---|
| Haskell | Shared freely | Frozen | No mutation → no problem |
| Erlang | Process isolation | Frozen + messages | Copy between processes |
| Clojure | Shared freely | Frozen | Persistent structures |
| Rust | Ownership + borrowing | Controlled | Compile-time proof |
| Go | Shared | Free | Channels or locks |
| Java | Shared | Free | Locks / volatile |
| C/C++ | Unrestricted | Free | Programmer responsibility |
| JavaScript | Shared | Free | Single-threaded |
| Python | Shared | Free | GIL |

Language Constructs

| Construct | TIME | SPACE | IDENTITY | Coherence |
|---|---|---|---|---|
| Register | Runtime | Register | Unique | N/A |
| Stack variable | Runtime | Stack | Scoped | Lexical scope |
| Heap allocation | Runtime | Heap | Reference(s) | Manual / GC / ownership |
| Compile-time const | Compile | Inlined | N/A | None needed |
| Static / global | Program lifetime | Data segment | Global | Atomic / lock / immutable |
| Thread-local | Runtime | Per-thread | Thread-scoped | No sharing |
| Immutable value | Frozen | Any | Shared freely | Frozen → valid |
| Mutable + locked | Serialized | Heap | Shared | Mutex |
| Atomic | Hardware | RAM | Shared | Hardware coherence |
| Channel message | Runtime | Copied | Transferred | No shared state |
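The "Thread-local" row is worth seeing concretely: giving each thread its own SPACE dissolves the coherence problem because there is no shared IDENTITY to coordinate. A sketch in Rust:

```rust
use std::cell::Cell;
use std::thread;

// Thread-local storage: same name, but each thread gets its own SPACE,
// so there is no shared IDENTITY and nothing to synchronize.
thread_local! {
    static COUNTER: Cell<u32> = Cell::new(0);
}

fn bump() -> u32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

fn main() {
    bump();
    bump();
    // The spawned thread sees a fresh copy, not the main thread's count.
    let other = thread::spawn(bump).join().unwrap();
    assert_eq!(other, 1);
    assert_eq!(bump(), 3);
}
```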

Examples

  • Immutability enables safe sharing — frozen TIME allows IDENTITY to span SPACEs
  • Rust's fearless concurrency — compile-time proof of IDENTITY rules
  • Locks are slow — serialize TIME globally, threads wait
  • Lock-free is hard — hardware IDENTITY arbitration is subtle
  • GC doesn't prevent data races — GC manages SPACE, not IDENTITY+TIME
  • JavaScript is single-threaded — serialize TIME globally
  • Python's GIL — serialize TIME at interpreter level
  • Value types are easier — copy creates new IDENTITY
  • Pointers are dangerous — unrestricted IDENTITY + free TIME
  • const differs across languages — different TIME/IDENTITY choices

Tradeoff

FLEXIBILITY ◄─────────────────────────► SAFETY

  Unrestricted aliasing         Restricted IDENTITY
  Free mutation                 Controlled TIME
  Manual management             Compiler/runtime enforced
         │                              │
         ▼                              ▼
  Maximum power                 Maximum guarantees
  C, unsafe Rust                Haskell, Rust safe, Erlang

  • 1. Which axis to constrain?
    • TIME → immutability
    • SPACE → single location
    • IDENTITY → ownership, value semantics
  • 2. Who enforces it?
    • Programmer
    • Compiler
    • Runtime
    • Hardware
  • 3. When to check?
    • Compile-time
    • Runtime
    • Never

Primitive Interactions

Programs add an expression layer to the primitives:

| Expression | What it introduces |
|---|---|
| Variables | Names for SPACE |
| Scopes | Bounded regions of TIME |
| Functions | Reusable TIME sequences |
| Types | Constraints on SPACE contents |
| References | IDENTITY relationships |
| Threads | Parallel TIME lines |

All programming concepts emerge from interactions between primitives and expression.

Pairwise Interactions

| Interaction | Question | Concepts |
|---|---|---|
| SPACE × TIME | When does memory exist? | Allocation, deallocation, lifetime, scope, mutation |
| SPACE × IDENTITY | How many paths to this memory? | Variable, pointer, alias, copy, move, null |
| TIME × IDENTITY | When is a name valid? | Declaration, scope, shadowing, rebinding, drop |

Three-way Interaction

When SPACE × TIME × IDENTITY interact simultaneously:

    SPACE ────────── TIME
        \           /
         \         /
          \       /
           \     /
          IDENTITY

    Center = hard problems

| Scenario | Interaction | Result |
|---|---|---|
| Parallel TIME + shared SPACE + multiple IDENTITY | Concurrent mutation | Data race |
| IDENTITY outlives SPACE in TIME | Reference to freed memory | Dangling pointer |
| SPACE freed, IDENTITY used later | Access after deallocation | Use-after-free |
| SPACE freed twice in TIME | Double deallocation | Double free |
| IDENTITY transferred, old used in TIME | Access after move | Use-after-move |
| Multiple IDENTITY + mutation + overlapping TIME | Writes interleave | Race condition |
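Use-after-move is the one of these that Rust turns into a compile error rather than a runtime bug. A sketch:

```rust
// Use-after-move, caught at compile time: the IDENTITY `x` is severed
// when ownership transfers, so any later use in TIME is rejected.
fn main() {
    let x = vec![1, 2, 3];
    let y = x; // IDENTITY transferred; the SPACE is untouched
    // println!("{:?}", x); // error[E0382]: borrow of moved value: `x`
    assert_eq!(y.len(), 3);
}
```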

Bugs as Interaction Failures

| Bug | Failed Interaction | What went wrong |
|---|---|---|
| Memory leak | SPACE × TIME | SPACE exists past needed TIME |
| Use-after-free | SPACE × TIME × IDENTITY | IDENTITY used after SPACE's TIME ends |
| Dangling pointer | TIME × IDENTITY | IDENTITY outlives referent |
| Data race | SPACE × TIME × IDENTITY | Parallel TIME + shared SPACE + mutation |
| Double free | SPACE × TIME | SPACE deallocated twice |
| Null dereference | SPACE × IDENTITY | IDENTITY points to no SPACE |
| Buffer overflow | SPACE × IDENTITY | IDENTITY exceeds SPACE bounds |
| Uninitialized read | SPACE × TIME × IDENTITY | IDENTITY used before SPACE has value |

Features as Interaction Solutions

SPACE × TIME solutions:

| Feature | Mechanism |
|---|---|
| Garbage collection | Runtime tracks SPACE, frees when unreachable |
| RAII | Tie SPACE lifetime to scope (TIME) |
| Reference counting | Track IDENTITY count, free when zero |
| Stack allocation | SPACE lifetime = function TIME |
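RAII is the clearest of these: a value's SPACE is tied to its scope's TIME, so cleanup runs deterministically, not at some later GC pause. A sketch (the `Logged` type is illustrative):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// RAII: Drop runs exactly when the owning binding's scope (TIME) ends.
struct Logged(Rc<RefCell<Vec<&'static str>>>);

impl Drop for Logged {
    fn drop(&mut self) {
        self.0.borrow_mut().push("dropped");
    }
}

fn main() {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _guard = Logged(Rc::clone(&log)); // SPACE acquired here
        log.borrow_mut().push("in scope");
    } // scope's TIME ends: Drop runs, SPACE reclaimed
    log.borrow_mut().push("after scope");
    assert_eq!(*log.borrow(), vec!["in scope", "dropped", "after scope"]);
}
```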

SPACE × IDENTITY solutions:

| Feature | Mechanism |
|---|---|
| Ownership | Unique IDENTITY to SPACE |
| Move semantics | Transfer IDENTITY, invalidate source |
| Value types | Copy creates new SPACE, new IDENTITY |
| Nullable types | Explicit "IDENTITY to no SPACE" |

TIME × IDENTITY solutions:

| Feature | Mechanism |
|---|---|
| Lexical scope | IDENTITY valid in TIME region |
| Lifetimes | Explicit IDENTITY validity bounds |
| Closures | Extend IDENTITY across TIME boundaries |
| Drop order | Defined IDENTITY end sequence |
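Closures are the subtle row here: a capture extends an IDENTITY past the TIME boundary of its original scope. A sketch:

```rust
// A closure extends an IDENTITY across a TIME boundary: the captured
// value moves into the closure and stays valid after the scope that
// created it has ended.
fn make_counter() -> impl FnMut() -> u32 {
    let mut count = 0; // local SPACE...
    move || {          // ...kept alive by the capture
        count += 1;
        count
    }
}

fn main() {
    let mut next = make_counter(); // `count`'s scope is gone, yet it lives on
    assert_eq!(next(), 1);
    assert_eq!(next(), 2);
}
```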

SPACE × TIME × IDENTITY solutions:

| Feature | Mechanism |
|---|---|
| Borrow checker | Prove all three consistent at compile time |
| Locks/Mutex | Serialize TIME access to SPACE |
| Atomics | Hardware-arbitrated SPACE × TIME |
| Channels | Transfer IDENTITY, no shared SPACE |
| Immutability | Freeze TIME dimension, sharing safe |
| Actor model | Isolate SPACE per actor, message only |
| Linear types | IDENTITY used exactly once |
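The channel row deserves a concrete sketch: the sender gives up the value, so no two threads ever hold paths to the same mutable data.

```rust
use std::sync::mpsc;
use std::thread;

// Channels: transfer IDENTITY instead of sharing SPACE.
fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let owned = vec![1, 2, 3];
        tx.send(owned).unwrap(); // IDENTITY moves through the channel;
                                 // `owned` is unusable here afterwards
    });
    let received = rx.recv().unwrap(); // sole IDENTITY now lives here
    assert_eq!(received.iter().sum::<i32>(), 6);
}
```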

Paradigms as Interaction Strategies

| Paradigm | Strategy | Constrains | Tradeoff |
|---|---|---|---|
| Functional | Freeze mutation | TIME (no state change) | Easy concurrency ↔ efficiency |
| OOP | Encapsulate memory | SPACE (hide behind interface) | Modularity ↔ aliasing complexity |
| Rust | Restrict aliasing | IDENTITY (ownership) | Safety + performance ↔ learning curve |
| Actor | Isolate memory | SPACE (no sharing) | Fault isolation ↔ message overhead |
| Linear | Single use | IDENTITY (exactly once) | Resource safety ↔ flexibility |

Bugs are interaction failures. Features are interaction solutions. Paradigms are holistic bets on which axis to constrain.

Representation Constraints

Programs are expressed in a medium: text files, ASTs, bytecode. The medium has constraints independent of SPACE/TIME/IDENTITY.

PRIMITIVES × MEDIUM CONSTRAINTS = FORCED FEATURES
(SPACE/TIME/IDENTITY)     (representation)          (workarounds)

Many features exist not because of computational necessity, but because the representation can’t express what we want directly.

Constraints

| Constraint | What it means |
|---|---|
| Forward-only TIME | Execution proceeds forward; can't undo a statement |
| Names persist | Once declared, a name exists until scope ends |
| Values persist | A value exists until scope ends; can't delete mid-scope |
| Stack is LIFO | Can only deallocate top of stack |
| Text is sequential | One statement after another |

Features as Workarounds

| Want to... | Can't because... | Workaround |
|---|---|---|
| Delete a name | Names persist in scope | Shadowing |
| Delete a value | Values persist until scope end | Move + invalidation |
| Free mid-stack | Stack is LIFO | Heap allocation |
| Go back in TIME | Forward-only | Loops |
| Undo mutation | Forward-only | Immutability |
| Parallel execution | Text is sequential | Explicit threads/async |

Shadowing — Can’t Delete Names

let x = 5;
// Want: delete x, reclaim the name
// Can't: name persists until scope ends
// Workaround: shadow

let x = "hello";  // New IDENTITY, same name
                  // Old x still in memory, just unreachable

Shadowing exists because names can’t be undeclared. The old binding still exists — destructors run at scope end, not at shadow point.

Move — Can’t Delete Values

let x = vec![1, 2, 3];
let y = x;
// Want: delete x entirely after transfer
// Can't: name 'x' persists in scope
// Workaround: invalidate the IDENTITY

// x still exists as a name, but IDENTITY is severed
// Compiler tracks: "name exists, IDENTITY gone"

Move semantics exist because values can’t be deleted mid-scope. We transfer IDENTITY and mark the source as invalid.

Heap — Can’t Free Mid-Stack

Stack (LIFO):

    ┌─────┐
    │  c  │  ← top, can free
    ├─────┤
    │  b  │  ← can't free until c is gone
    ├─────┤
    │  a  │  ← can't free until b, c are gone
    └─────┘

Heap exists because stack forces LIFO TIME on SPACE. Heap allows independent lifetimes, shared IDENTITY, dynamic size.
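In Rust this escape hatch is `Box`: the heap SPACE can outlive the frame that created it, because ownership moves out while the stack frame dies in LIFO order. A sketch:

```rust
// Heap allocation escapes the stack's LIFO discipline: the Box's SPACE
// can outlive the function frame that created it.
fn make_on_heap() -> Box<[i32; 3]> {
    let b = Box::new([1, 2, 3]); // array on the heap, handle on the stack
    b // the handle moves out; the heap SPACE survives this frame
}

fn main() {
    let data = make_on_heap(); // frame of make_on_heap is gone; data lives
    assert_eq!(data[2], 3);
}
```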

SSA — Making Constraints Explicit

Compilers transform to SSA (Static Single Assignment):

// Source
let mut x = 5;
x = x + 1;
x = x * 2;

// SSA
let x1 = 5;
let x2 = x1 + 1;
let x3 = x2 * 2;

SSA reveals the truth: we never “modified” x. We created new values and reused the name.

  • Can’t delete names → each assignment is a new IDENTITY
  • Can’t go back → values flow forward
  • Mutation is illusion → it’s name rebinding

Alternative Representations

| Representation | TIME | IDENTITY | SPACE | Key Workarounds |
|---|---|---|---|---|
| Imperative text | Forward-only | Names persist | Stack LIFO | Shadow, move, heap, loops |
| SSA | Forward-only | Unique names | Explicit | Phi nodes |
| Stack-based | Forward-only | Position, not name | Explicit | Stack shuffling |
| Dataflow graph | By dependency | Nodes | Edges | Control dependencies |
| Logic | Declarative | Unification | Automatic | Cut, clause order |

Features are shaped by representation. The medium constrains what's expressible. Language designers create workarounds. Understanding the constraint explains the workaround.

Abstraction Layers

Every layer in computing — hardware or software — can be understood as an abstractor: it hides complexity below while exposing a simpler interface above.

Layer Details

Each layer can be characterized by what it abstracts (hides), exposes (provides), leaks (breaks through), and escapes (works around).

This post is licensed under CC BY 4.0 by the author.