Persistence Layer
Runtime state persistence for the simulation engine.
Module: babylon.persistence
Package Exports
| Export | Description |
|---|---|
| | Protocol for backend-agnostic tick persistence (5 methods) |
| | Protocol for Postgres-specific subsystem persistence (12 methods) |
| | Protocol for buffered execution trace collection |
| | Protocol for backend-agnostic vector search |
| | IntEnum controlling trace verbosity |
| | PostgreSQL implementation of both persistence protocols |
| | SQLite implementation of |
| | pgvector implementation of |
| | Buffered in-memory implementation of |
| | SQLite schema DDL statements (from |
Protocols
RuntimePersistence
from babylon.persistence.protocols import RuntimePersistence
@runtime_checkable protocol. The simulation engine interacts with storage
exclusively through this interface. Both SQLite and PostgreSQL backends
implement it.
| Method | Returns | Description |
|---|---|---|
| | | Full graph snapshot at tick. Idempotent via UPSERT. |
| | | Load state snapshot. |
| | | Record tick replay metadata (RNG state, timings). |
| | | Store key-value metadata pair. |
| | | Retrieve metadata value by key. |
Implementations: RuntimeDatabase (SQLite), PostgresRuntime (PostgreSQL).
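To make the contract concrete, here is a minimal sketch of a runtime-checkable protocol with an in-memory backend. It is not the babylon source. Only persist_tick and hydrate_graph appear in this document's examples; SketchPersistence, InMemoryPersistence, set_metadata, get_metadata, and every signature below are assumptions, and the tick-log method is left out.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class SketchPersistence(Protocol):
    """Stand-in for RuntimePersistence; all signatures are assumptions."""

    def persist_tick(self, tick: int, graph: Any) -> None: ...
    def hydrate_graph(self, tick: int) -> Any: ...
    def set_metadata(self, key: str, value: str) -> None: ...  # hypothetical name
    def get_metadata(self, key: str) -> Optional[str]: ...  # hypothetical name


class InMemoryPersistence:
    """Hypothetical backend that keeps snapshots in dictionaries."""

    def __init__(self) -> None:
        self._snapshots: dict = {}
        self._metadata: dict = {}

    def persist_tick(self, tick: int, graph: Any) -> None:
        # Writing to the same key twice leaves one entry, which mimics
        # the idempotent UPSERT behaviour described in the table.
        self._snapshots[tick] = graph

    def hydrate_graph(self, tick: int) -> Any:
        return self._snapshots[tick]

    def set_metadata(self, key: str, value: str) -> None:
        self._metadata[key] = value

    def get_metadata(self, key: str) -> Optional[str]:
        return self._metadata.get(key)


backend = InMemoryPersistence()
backend.persist_tick(0, {"nodes": 3})
backend.persist_tick(0, {"nodes": 3})  # repeated write, same result
is_backend = isinstance(backend, SketchPersistence)
```

Note that isinstance() against a runtime-checkable protocol only checks that the methods exist; it does not validate their signatures.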
PostgresRuntimeExtensions
from babylon.persistence.protocols import PostgresRuntimeExtensions
@runtime_checkable protocol for subsystem state added by Features 002,
022, 029, 032, and 036. PersistenceObserver accesses these via
isinstance() check.
| Method | Returns | Description |
|---|---|---|
| | | Graph-level metadata (economy, state finances, tick dynamics). |
| | | Hypergraph community state and membership records. |
| | | Load community state and memberships at tick. |
| | | Per-hex economic state. Bulk insert ~1,500 rows. |
| | | Feature 036 infrastructure topology state. |
| | | Feature 002 contradiction field values and edge curvatures. |
| | | OODA action resolution outcomes (Feature 032). |
| | | Pre-aggregated tick summary for time-series endpoints. |
| | | Bulk insert trace events to |
| | | Create |
| | | Drop |
| | | Export session data to Parquet files. |
Implementation: PostgresRuntime only.
TraceCollector
from babylon.persistence.protocols import TraceCollector
@runtime_checkable protocol for execution trace collection. Systems
call trace() during tick computation (in-memory buffer, no I/O).
flush() writes buffered events to storage after tick completion.
| Method / Property | Returns | Description |
|---|---|---|
| | | Buffer a trace event (no I/O). |
| | | Write buffered events to storage and clear buffer. |
| | | Configured verbosity level. |
| | | Number of events currently buffered. |
Implementation: TraceRecorder.
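The buffer-then-flush pattern can be sketched as follows. This is an illustration, not TraceRecorder itself: BufferedTracer, its enabled flag, the list used as a storage sink, buffered_count, and the return value of flush() are all hypothetical.

```python
from typing import Any, Optional


class BufferedTracer:
    """Hypothetical collector: trace() buffers, flush() writes then clears."""

    def __init__(self, enabled: bool = True, sink: Optional[list] = None) -> None:
        self.enabled = enabled
        self.sink = sink  # stands in for the storage backend
        self._buffer: list = []

    def trace(self, system: str, event: str, payload: dict) -> None:
        # No I/O here: the event is only appended to an in-memory list.
        if not self.enabled:
            return
        self._buffer.append({"system": system, "event": event, "payload": payload})

    def flush(self, tick: int) -> int:
        # Write everything buffered during the tick, then clear the buffer.
        written = 0
        if self.sink is not None:
            self.sink.extend({**entry, "tick": tick} for entry in self._buffer)
            written = len(self._buffer)
        self._buffer.clear()
        return written

    @property
    def buffered_count(self) -> int:
        return len(self._buffer)


storage: list = []
tracer = BufferedTracer(sink=storage)
tracer.trace("ImperialRentSystem", "formula_eval", {"rent": 42.0})
pending = tracer.buffered_count
written = tracer.flush(tick=0)
```

Keeping trace() free of I/O means systems can call it inside the tick loop without paying for a database round trip per event.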
VectorStoreProtocol
from babylon.persistence.protocols import VectorStoreProtocol
@runtime_checkable protocol for semantic search. The Retriever
interacts with vector storage only through this interface.
| Method | Returns | Description |
|---|---|---|
| | | Store document chunks with embeddings. |
| | | Find k most similar chunks. Returns (ids, documents, embeddings, metadatas, distances). |
| | | Delete chunks by ID. |
| | | Total number of chunks in the store. |
Implementations: VectorStore (ChromaDB, existing), PgVectorStore (pgvector).
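As an illustration of the query contract, the sketch below implements a brute-force store that returns the same five-element tuple. It is not either real backend. ListVectorStore, the dict-shaped chunks, and the delete and count method names are assumptions; add_chunks and query_similar are the names used in the PgVectorStore example.

```python
import math


def cosine_distance(a: list, b: list) -> float:
    """Return 1 - cosine similarity, so 0.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class ListVectorStore:
    """Hypothetical exhaustive-scan store; real backends use an index."""

    def __init__(self) -> None:
        self._rows: dict = {}

    def add_chunks(self, chunks: list) -> None:
        # Assumed chunk shape: {"id", "text", "embedding", "metadata"}.
        for chunk in chunks:
            self._rows[chunk["id"]] = chunk

    def query_similar(self, query_embedding: list, k: int = 5) -> tuple:
        ranked = sorted(
            (cosine_distance(query_embedding, row["embedding"]), chunk_id)
            for chunk_id, row in self._rows.items()
        )[:k]
        ids = [chunk_id for _, chunk_id in ranked]
        documents = [self._rows[i]["text"] for i in ids]
        embeddings = [self._rows[i]["embedding"] for i in ids]
        metadatas = [self._rows[i].get("metadata", {}) for i in ids]
        distances = [distance for distance, _ in ranked]
        return ids, documents, embeddings, metadatas, distances

    def delete(self, ids: list) -> None:  # hypothetical name
        for chunk_id in ids:
            self._rows.pop(chunk_id, None)

    def count(self) -> int:  # hypothetical name
        return len(self._rows)


store = ListVectorStore()
store.add_chunks([
    {"id": "a", "text": "alpha", "embedding": [1.0, 0.0]},
    {"id": "b", "text": "beta", "embedding": [0.0, 1.0]},
    {"id": "c", "text": "gamma", "embedding": [1.0, 1.0]},
])
ids, docs, embeddings, metadatas, distances = store.query_similar([1.0, 0.0], k=2)
```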
TraceLevel
from babylon.persistence.protocols import TraceLevel
IntEnum controlling trace verbosity. Each level includes everything below it.
| Name | Value | Description |
|---|---|---|
| | 0 | Tracing disabled. |
| | 1 | High-level tick summaries only. |
| | 2 | Detailed system-level events. |
| | 3 | Full per-node event logging. |
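Because the levels are ordered integers, "each level includes everything below it" reduces to a comparison. The sketch below assumes that NONE maps to 0; the member names SUMMARY, DETAIL, and FULL and the helper should_record are hypothetical and may not match the real enum.

```python
from enum import IntEnum


class SketchTraceLevel(IntEnum):
    NONE = 0     # tracing disabled (assumed to be the value-0 member)
    SUMMARY = 1  # hypothetical name: high-level tick summaries
    DETAIL = 2   # hypothetical name: system-level events
    FULL = 3     # hypothetical name: per-node events


def should_record(configured: SketchTraceLevel, event_level: SketchTraceLevel) -> bool:
    """An event is kept when its level is at or below the configured level."""
    return SketchTraceLevel.NONE < event_level <= configured


keeps_summary = should_record(SketchTraceLevel.DETAIL, SketchTraceLevel.SUMMARY)
drops_full = should_record(SketchTraceLevel.SUMMARY, SketchTraceLevel.FULL)
```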
Concrete Implementations
PostgresRuntime
from psycopg_pool import ConnectionPool
from babylon.persistence import PostgresRuntime
pool = ConnectionPool(conninfo="dbname=babylon")
with PostgresRuntime(pool) as pg:
    pg.init_schema()
    pg.persist_tick(tick=0, graph=graph, session_id=session_id)
Constructor: PostgresRuntime(pool: ConnectionPool[Connection[Any]])
Context manager: __enter__ / __exit__ (calls close()).
Additional methods:
- init_schema(): Execute all DDL statements. Safe to call multiple times (uses IF NOT EXISTS).
- close(): Close the connection pool.
- pool (property): The underlying ConnectionPool instance.
Implements: RuntimePersistence + PostgresRuntimeExtensions.
Uses psycopg 3 with executemany() for bulk writes. Batch size
capped at 1,000 rows per executemany call.
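The batch cap can be sketched with a small chunking helper. This is an assumption about how the cap might be applied, not the PostgresRuntime source; batched and MAX_BATCH are hypothetical names.

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_BATCH = 1_000  # cap stated in the text above


def batched(rows: Iterable, size: int = MAX_BATCH) -> Iterator[list]:
    """Yield consecutive lists of at most `size` rows."""
    iterator = iter(rows)
    while batch := list(islice(iterator, size)):
        yield batch


# Roughly the per-hex bulk insert size mentioned earlier (~1,500 rows).
rows = [(tick, hex_id) for tick in (0,) for hex_id in range(1_500)]
batch_sizes = [len(batch) for batch in batched(rows)]
# With psycopg 3, each batch would then go to cursor.executemany(sql, batch).
```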
RuntimeDatabase
from babylon.persistence import RuntimeDatabase
# In-memory for tests
with RuntimeDatabase(in_memory=True) as db:
    db.persist_tick(tick=0, graph=graph)
# File-based
with RuntimeDatabase(run_id="experiment_001") as db:
    graph = db.hydrate_graph(tick=5)
Constructor: RuntimeDatabase(run_id: str | None = None, in_memory: bool = False)
- run_id defaults to a timestamp-based ID.
- in_memory=True uses a :memory: SQLite database.
- File-based databases are stored in data/runs/{run_id}.sqlite.
Context manager: __enter__ / __exit__ (calls close()).
Additional methods (beyond RuntimePersistence):
- get_events(tick=None): Retrieve events for a tick or all events.
- record_tick_summary(tick, total_c, total_v, total_s, avg_consciousness, uprising_count): Record aggregate metrics (legacy SimulationDB compatible).
- get_tick_log(tick): Retrieve tick log for replay.
- transaction(): Context manager for explicit transactions.
Implements: RuntimePersistence only.
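The idempotent-UPSERT behaviour of persist_tick() can be sketched with the standard sqlite3 module. The tick_snapshot table, its columns, and the JSON encoding are hypothetical; the real schema lives in the package's DDL and may differ.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table schema keyed by tick.
conn.execute(
    "CREATE TABLE tick_snapshot (tick INTEGER PRIMARY KEY, graph_json TEXT NOT NULL)"
)


def persist_tick(conn: sqlite3.Connection, tick: int, graph: dict) -> None:
    """Insert the snapshot, or overwrite it if the tick already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO tick_snapshot (tick, graph_json) VALUES (?, ?) "
            "ON CONFLICT(tick) DO UPDATE SET graph_json = excluded.graph_json",
            (tick, json.dumps(graph)),
        )


def hydrate_graph(conn: sqlite3.Connection, tick: int) -> dict:
    row = conn.execute(
        "SELECT graph_json FROM tick_snapshot WHERE tick = ?", (tick,)
    ).fetchone()
    return json.loads(row[0])


persist_tick(conn, 0, {"nodes": 3})
persist_tick(conn, 0, {"nodes": 4})  # same tick again: updates, no duplicate row
row_count = conn.execute("SELECT COUNT(*) FROM tick_snapshot").fetchone()[0]
```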
PgVectorStore
from psycopg_pool import ConnectionPool
from babylon.persistence import PgVectorStore
pool = ConnectionPool(conninfo="dbname=babylon")
store = PgVectorStore(pool, dimension=768)
store.add_chunks(chunks)
ids, docs, embeddings, metadatas, distances = store.query_similar(
    query_embedding=embedding, k=5
)
Constructor:
PgVectorStore(pool: ConnectionPool[Connection[Any]],
              dimension: int = 1536, collection: str = "default")
- dimension: Embedding vector dimension. The document_chunk schema defines vector(768) for Ollama embeddinggemma.
- collection: Logical namespace for multi-tenant isolation.
Uses HNSW index with cosine distance (<=> operator) for approximate
nearest neighbor search. Metadata filtering via JSONB containment (@>).
Implements: VectorStoreProtocol.
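A query combining the cosine-distance operator with JSONB containment might be assembled as below. This is a sketch, not the PgVectorStore source. The chunks table, its column names, the index name, and build_similarity_query are hypothetical; only the hnsw index type, the <=> operator, and the @> operator come from the description above.

```python
import json
from typing import Optional

# Hypothetical table and index names; vector_cosine_ops pairs HNSW with <=>.
CREATE_INDEX_SQL = (
    "CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw "
    "ON chunks USING hnsw (embedding vector_cosine_ops)"
)


def build_similarity_query(
    embedding: list,
    k: int,
    collection: str,
    metadata_filter: Optional[dict] = None,
) -> tuple:
    """Return (sql, params) for a k-nearest-neighbour lookup."""
    # pgvector accepts vectors in the text form "[x1,x2,...]".
    vector_literal = "[" + ",".join(str(float(x)) for x in embedding) + "]"
    sql = (
        "SELECT id, document, metadata, embedding <=> %s::vector AS distance "
        "FROM chunks WHERE collection = %s"
    )
    params: list = [vector_literal, collection]
    if metadata_filter:
        # JSONB containment: keep rows whose metadata includes the filter.
        sql += " AND metadata @> %s::jsonb"
        params.append(json.dumps(metadata_filter))
    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"
    params.extend([vector_literal, k])
    return sql, params


sql, params = build_similarity_query([0.1, 0.2], 5, "default", {"source": "manual"})
```

Passing the values as parameters rather than formatting them into the string leaves quoting to the database driver.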
TraceRecorder
from babylon.persistence import TraceRecorder
from babylon.persistence.protocols import TraceLevel
recorder = TraceRecorder(level=TraceLevel.DEBUG, persistence=pg_runtime)
# During tick (in-memory only):
recorder.trace("ImperialRentSystem", "formula_eval", {"rent": 42.0})
# After tick (flushes to DB):
recorder.flush(session_id=session_id, tick=0)
Constructor: TraceRecorder(level: TraceLevel = TraceLevel.NONE, persistence: Any = None)
- When level is NONE, trace() is a no-op.
- When persistence is None, flush() clears the buffer without writing.
- persistence should implement PostgresRuntimeExtensions.persist_traces().
Implements: TraceCollector.
PersistenceObserver
from babylon.engine.observers.persistence_observer import PersistenceObserver
observer = PersistenceObserver(
    persistence=postgres_runtime,
    session_id=session_id,
    tracer=trace_recorder,
)
simulation.attach_observer(observer)
Module: babylon.engine.observers.persistence_observer
Constructor:
PersistenceObserver(persistence: RuntimePersistence,
                    session_id: UUID, tracer: TraceCollector | None = None)
Implements: SimulationObserver protocol.
Lifecycle:
- on_simulation_start(initial_state, config): Stores config as metadata, persists initial state (tick 0).
- on_tick(previous_state, new_state): Calls persist_tick() on the RuntimePersistence backend. If the backend also implements PostgresRuntimeExtensions, calls persist_graph_metadata() and other extended methods. Flushes tracer after each tick.
- on_simulation_end(final_state): Sets end_tick and status metadata. Flushes any remaining trace events.
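The on_tick dispatch described above can be sketched as follows. Every class here is a hypothetical stand-in, the dict-shaped state with a "tick" key is an assumption, and only persist_tick, persist_graph_metadata, and flush are names taken from this document.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class SketchExtensions(Protocol):
    """Stand-in for PostgresRuntimeExtensions, reduced to one method."""

    def persist_graph_metadata(self, tick: int, state: Any) -> None: ...


class BasicBackend:
    def __init__(self) -> None:
        self.calls: list = []

    def persist_tick(self, tick: int, state: Any) -> None:
        self.calls.append(f"persist_tick:{tick}")


class ExtendedBackend(BasicBackend):
    def persist_graph_metadata(self, tick: int, state: Any) -> None:
        self.calls.append(f"persist_graph_metadata:{tick}")


class RecordingTracer:
    def __init__(self, log: list) -> None:
        self.log = log

    def flush(self, tick: int) -> None:
        self.log.append(f"flush:{tick}")


class SketchObserver:
    def __init__(self, persistence: BasicBackend,
                 tracer: Optional[RecordingTracer] = None) -> None:
        self.persistence = persistence
        self.tracer = tracer

    def on_tick(self, previous_state: dict, new_state: dict) -> None:
        tick = new_state["tick"]
        self.persistence.persist_tick(tick, new_state)
        # Extended methods run only when the backend provides them.
        if isinstance(self.persistence, SketchExtensions):
            self.persistence.persist_graph_metadata(tick, new_state)
        # Trace events are flushed last, after the tick's state is stored.
        if self.tracer is not None:
            self.tracer.flush(tick)


basic, extended = BasicBackend(), ExtendedBackend()
SketchObserver(basic).on_tick({"tick": 0}, {"tick": 1})
SketchObserver(extended, RecordingTracer(extended.calls)).on_tick({"tick": 0}, {"tick": 1})
```

The isinstance() check lets one observer serve both backends: a SQLite-style backend simply skips the extended calls.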
Database Schema (Summary)
23 PostgreSQL tables and views across 9 layers. Full DDL in
babylon.persistence.postgres_schema.POSTGRES_SCHEMA_DDL.
| Layer | Tables | Purpose |
|---|---|---|
| Game Management (3) | | Session lifecycle, player turns, OODA action outcomes |
| Simulation State (10) | | Full graph snapshots, community hypergraph, contradiction fields, events, replay metadata, aggregated summaries |
| Spatial (3) | | H3 hex geometries (PostGIS), county-to-hex mapping, per-hex terrain |
| Snapshot (4) | | Append-only per-tick journals for county economics, sparse hex events, aggregates, and events |
| Hex Cache (2) | | Denormalized R7 frontend cache and static R8 terrain/infrastructure |
| Composition Views (5) | | Column projections from |
| Infrastructure (1) | | Per-edge infrastructure capacity and condition |
| Trace (1) | | Execution trace events (UNLOGGED, partitioned by |
| Semantic (1) | | Document embeddings for vector search (pgvector) |
All snapshot tables use composite primary keys of (game_id, tick, entity_id)
for session-scoped temporal isolation. game_session uses UUID PK.
hex_latest uses (game_id, h3_index) PK (current state only).
Required PostgreSQL extensions: postgis, vector (pgvector),
uuid-ossp.
Multi-Resolution Hex Journal (Feature 037)
The multi-resolution hex journal is a tiered persistence architecture optimized for national-scale simulation at H3 resolution 7 (~243K hexes for CONUS). It achieves a 305× storage reduction compared to flat per-hex per-tick snapshots.
Architecture
Four tables at three resolution tiers:
| Table | Resolution | Cardinality | Purpose |
|---|---|---|---|
| territory_snapshot | County | ~3,100 rows/tick | Economic state (ValueTensor4x3, profit rate, class distribution) |
| hex_activity | R7 (sparse) | ~5K rows/tick | Heat, organizations, player/AI actions (only hexes with events) |
| hex_substrate | R8 (static) | ~1.7M rows (once) | Terrain, water, broadband, surveillance (written at init) |
| hex_latest | R7 (all) | ~243K rows (current) | Denormalized cache merging all tiers for frontend O(1) reads |
Data Flow
territory_snapshot ──► Phase 1: Broadcast ──► hex_latest ──► Frontend
hex_activity ──────► Phase 2: Overlay ───┘
hex_substrate ─────► Aggregated at init ─┘
Write path (each tick):
- Systems write territory_snapshot (~3,100 rows) and hex_activity (sparse, ~5K rows).
- refresh_hex_latest() runs two SQL UPDATEs to synchronize the cache.
Read path (frontend): SELECT from hex_latest or composition
views. No JOINs required.
refresh_hex_latest (Two-Phase UPSERT)
pg.refresh_hex_latest(game_id=game_id, tick=tick)
- Phase 1, territory broadcast (~20ms for 243K hexes): UPDATE hex_latest FROM territory_snapshot JOIN hex_map. All hexes in a county receive identical economic values.
- Phase 2, hex activity overlay (~0.5ms for ~5K sparse rows): UPDATE hex_latest FROM hex_activity for heat, org, and action fields. Only hexes with activity this tick are touched.
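The two phases can be illustrated with plain dictionaries standing in for the tables. This is a toy model of the data movement, not the SQL that refresh_hex_latest() runs; the hex and county identifiers, the field names, and the refresh function are hypothetical.

```python
# Dictionaries stand in for the four tables; all ids and values are made up.
hex_map = {"hex_a": "county_1", "hex_b": "county_1", "hex_c": "county_2"}
territory_snapshot = {
    "county_1": {"profit_rate": 0.12},
    "county_2": {"profit_rate": 0.08},
}
hex_activity = {"hex_b": {"heat": 0.7}}  # sparse: only hexes with events
hex_latest = {hex_id: {"profit_rate": None, "heat": 0.0} for hex_id in hex_map}


def refresh(hex_latest: dict, hex_map: dict,
            territory_snapshot: dict, hex_activity: dict) -> tuple:
    """Apply both phases in place and return how many hexes each touched."""
    # Phase 1: broadcast county values to every hex in that county.
    for hex_id, county_id in hex_map.items():
        hex_latest[hex_id].update(territory_snapshot[county_id])
    # Phase 2: overlay per-hex activity on the hexes that had events.
    for hex_id, activity in hex_activity.items():
        hex_latest[hex_id].update(activity)
    return len(hex_map), len(hex_activity)


broadcast_count, overlay_count = refresh(
    hex_latest, hex_map, territory_snapshot, hex_activity
)
```

Phase 1 touches every hex, while Phase 2 touches only the sparse set, which matches the large difference between the two quoted timings.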
seed_hex_latest (Tick-0 ETL)
from babylon.persistence.hex_init import seed_hex_latest
inserted = seed_hex_latest(pool, game_id) # After territory_snapshot + hex_map init
INSERT...SELECT that JOINs territory_snapshot (tick 0),
hex_map, and optionally hex_substrate (R8 terrain via
MODE(), AVG(), BOOL_OR() aggregation).
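The R8-to-R7 roll-up can be approximated in plain Python with the standard library equivalents of MODE(), AVG(), and BOOL_OR(). Which column uses which aggregate is an assumption made for illustration, as are the sample values and the aggregate_children helper.

```python
from statistics import mean, mode


def aggregate_children(children: list) -> dict:
    """Collapse the R8 rows under one R7 hex into a single record."""
    return {
        # MODE(): most common categorical value among the children.
        "terrain_type": mode(child["terrain_type"] for child in children),
        # AVG(): mean of a numeric column.
        "water_coverage": mean(child["water_coverage"] for child in children),
        # BOOL_OR(): true if any child is true.
        "internet_access": any(child["internet_access"] for child in children),
    }


# Hypothetical R8 rows; an H3 cell has seven children one resolution finer.
children = [
    {"terrain_type": "forest", "water_coverage": 0.0, "internet_access": False},
    {"terrain_type": "forest", "water_coverage": 0.5, "internet_access": False},
    {"terrain_type": "water", "water_coverage": 1.0, "internet_access": False},
    {"terrain_type": "forest", "water_coverage": 0.5, "internet_access": True},
]
parent = aggregate_children(children)
```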
reconstruct_hex_state (Historical Queries)
rows = pg.reconstruct_hex_state(game_id=game_id, tick=42)
Reconstructs any past tick by JOINing the append-only
territory_snapshot and hex_activity journals via hex_map.
Uses LATERAL join for efficient sparse event lookup. Returns a
list of per-hex dicts with economic, demographic, and activity fields.
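The reconstruction join can be tried out against an in-memory SQLite database. This sketch swaps in a plain LEFT JOIN because SQLite has no LATERAL, omits game_id, and assumes activity is matched on the exact tick; the column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript(
    """
    CREATE TABLE hex_map (h3_index TEXT PRIMARY KEY, county_id TEXT);
    CREATE TABLE territory_snapshot (
        tick INTEGER, county_id TEXT, profit_rate REAL,
        PRIMARY KEY (tick, county_id)
    );
    CREATE TABLE hex_activity (
        tick INTEGER, h3_index TEXT, heat REAL,
        PRIMARY KEY (tick, h3_index)
    );
    INSERT INTO hex_map VALUES ('hex_a', 'c1'), ('hex_b', 'c1');
    INSERT INTO territory_snapshot VALUES (41, 'c1', 0.10), (42, 'c1', 0.12);
    INSERT INTO hex_activity VALUES (42, 'hex_b', 0.7);
    """
)


def reconstruct(conn: sqlite3.Connection, tick: int) -> list:
    """Rebuild per-hex state for one tick from the append-only journals."""
    rows = conn.execute(
        """
        SELECT m.h3_index, t.profit_rate, COALESCE(a.heat, 0.0) AS heat
        FROM hex_map AS m
        JOIN territory_snapshot AS t
          ON t.county_id = m.county_id AND t.tick = ?
        LEFT JOIN hex_activity AS a
          ON a.h3_index = m.h3_index AND a.tick = ?
        ORDER BY m.h3_index
        """,
        (tick, tick),
    ).fetchall()
    return [dict(row) for row in rows]


state_at_42 = reconstruct(conn, 42)
state_at_41 = reconstruct(conn, 41)
```

Hexes without an activity row still appear in the result, with the activity fields falling back to a default.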
Note
hex_latest only holds the current tick. For historical or
time-series analysis, always use reconstruct_hex_state().
Composition Views
Five views project subsets of hex_latest columns for frontend map
layers. All are trivial SELECT projections — no JOINs.
| View | Columns |
|---|---|
| | profit_rate, exploitation_rate, occ, imperial_rent, pop_total, heat |
| | mobilizable_pop, org_presence, hex_heat, dominant_class |
| | heat, heat_delta, org_count (filtered: heat > 0) |
| | internet_access, terrain_type, water_coverage |
| | faction_*, dominant_class, org_count, actions_taken |
Storage Budget
| Component | Rows / tick | Bytes / tick | 260 ticks (10-year sim) |
|---|---|---|---|
| territory_snapshot | 3,100 | ~770 KB | ~200 MB |
| hex_activity | ~5,000 | ~200 KB | ~52 MB |
| hex_substrate | 1,700,000 | ~300 MB | 300 MB (fixed) |
| hex_latest | 243,000 | ~30 MB | 30 MB (fixed) |
| Total | | | ~582 MB / session |
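The total can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 MB = 1,000 KB) and infers the row labels from the row counts given in the Architecture section.

```python
TICKS = 260  # 10-year simulation

# Per-tick journal sizes in KB (grow with every tick).
per_tick_kb = {"territory_snapshot": 770, "hex_activity": 200}
# Fixed-size tables in MB (written once or overwritten in place).
fixed_mb = {"hex_substrate": 300, "hex_latest": 30}

# Convert each journal to MB over the whole session.
journal_mb = {name: kb * TICKS / 1000 for name, kb in per_tick_kb.items()}
total_mb = sum(journal_mb.values()) + sum(fixed_mb.values())
```

The two journals account for roughly 252 MB and the fixed tables for 330 MB, so the static R8 substrate is the largest single component.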
Archival Pipeline (Stubs)
Four functions in babylon.persistence.archival. All raise
NotImplementedError (Phase 8, tasks T045-T048).
| Function | Description |
|---|---|
| | Export session data to Parquet files via PyArrow. |
| | Upload Parquet files to Cloudflare R2. |
| | Delete session data from Postgres after verified export. |
| | Query archived session data via DuckDB. |
See Also
Persistence Architecture — Design rationale for the persistence layer
Architecture: The Embedded Trinity — Embedded Trinity architecture overview