babylon.ai

AI layer for narrative generation and game mastering.

This package contains the Ideological Superstructure - AI components that observe the simulation but cannot modify its state.

The key principle: AI components are observers, not controllers. They generate narrative from state changes but never influence the Material Base (simulation mechanics).

Components:

  • NarrativeDirector: AI Game Master that observes and narrates

  • DialecticalPromptBuilder: Builds prompts following Marxist dialectical materialism

  • LLMProvider: Protocol for swappable LLM backends

  • MockLLM: Deterministic mock for testing

  • DeepSeekClient: Production DeepSeek API client

  • NarrativeCommissar: LLM-as-judge for narrative evaluation

  • JudgmentResult: Evaluation metrics from Commissar

  • MetaphorFamily: Metaphor category enum

  • Sprint 3.2: Added RAG integration for historical/theoretical context.

  • Sprint 3.3: Added LLM Provider strategy pattern for text generation.

  • Sprint 4.2: Added Persona system for customizable narrative voices.

  • Sprint 4.3: Added NarrativeCommissar for automated narrative evaluation.

class babylon.ai.NarrativeDirector(use_llm=False, rag_pipeline=None, prompt_builder=None, llm=None, persona=None)[source]

Bases: object

AI Game Master that observes simulation and generates narrative.

The Director watches state transitions and produces human-readable narrative describing the class struggle dynamics.

Sprint 3.2: Added RAG integration for “The Materialist Retrieval”. The Director can now query the Archive (ChromaDB) for historical and theoretical context to inform narrative generation.

Sprint 3.4: Added Semantic Bridge to translate simulation event keywords into theoretical query strings for better RAG retrieval.

Attributes:
name

Observer identifier (“NarrativeDirector”).

use_llm

Whether to use LLM for narrative (False = template-based).

rag_pipeline

Optional RAG pipeline for context retrieval.

SEMANTIC_MAP

Class constant mapping event keywords to theory queries.

Example

>>> from babylon.ai import NarrativeDirector
>>> from babylon.engine import Simulation
>>> from babylon.rag import RagPipeline
>>>
>>> # With RAG integration
>>> rag = RagPipeline()
>>> director = NarrativeDirector(rag_pipeline=rag)
>>> sim = Simulation(initial_state, config, observers=[director])
>>> sim.run(10)  # Director queries RAG for context
>>> sim.end()
FALLBACK_QUERY: str = 'dialectical materialism class struggle'
SEMANTIC_MAP: dict[EventType, str] = {
    EventType.CONSCIOUSNESS_TRANSMISSION: 'development of class consciousness and proletariat solidarity',
    EventType.ECONOMIC_CRISIS: 'tendency of the rate of profit to fall and capitalist crisis',
    EventType.EXCESSIVE_FORCE: 'state violence police brutality and repression',
    EventType.IMPERIAL_SUBSIDY: 'role of repression in maintaining imperialist client states',
    EventType.MASS_AWAKENING: 'leninist theory of revolutionary situation and mass strike',
    EventType.PHASE_TRANSITION: 'phase transition revolutionary organization vanguard party',
    EventType.RUPTURE: 'dialectical contradiction rupture revolutionary crisis',
    EventType.SOLIDARITY_SPIKE: 'solidarity networks mutual aid class organization',
    EventType.SURPLUS_EXTRACTION: 'marxist theory of surplus value extraction and exploitation',
    EventType.UPRISING: 'mass uprising revolutionary insurrection george floyd protests'}
SIGNIFICANT_EVENT_TYPES: frozenset[EventType] = frozenset({
    EventType.ECONOMIC_CRISIS, EventType.EXCESSIVE_FORCE, EventType.MASS_AWAKENING,
    EventType.PHASE_TRANSITION, EventType.RUPTURE, EventType.SURPLUS_EXTRACTION,
    EventType.UPRISING})
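
The Semantic Bridge amounts to a dictionary lookup with a generic fallback query. A minimal self-contained sketch (the `EventType` enum here is a simplified stand-in, and `FESTIVAL` is a hypothetical unmapped event used only to demonstrate the fallback):

```python
from enum import Enum


class EventType(Enum):
    # Simplified stand-in for babylon's EventType enum.
    RUPTURE = "rupture"
    UPRISING = "uprising"
    FESTIVAL = "festival"  # hypothetical event with no mapped theory query


FALLBACK_QUERY = "dialectical materialism class struggle"

SEMANTIC_MAP = {
    EventType.RUPTURE: "dialectical contradiction rupture revolutionary crisis",
    EventType.UPRISING: "mass uprising revolutionary insurrection george floyd protests",
}


def event_to_query(event_type: EventType) -> str:
    """Translate a simulation event type into a theoretical RAG query string."""
    return SEMANTIC_MAP.get(event_type, FALLBACK_QUERY)


print(event_to_query(EventType.RUPTURE))
print(event_to_query(EventType.FESTIVAL))  # unmapped: falls back to the generic query
```
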
__init__(use_llm=False, rag_pipeline=None, prompt_builder=None, llm=None, persona=None)[source]

Initialize the NarrativeDirector.

Parameters:
  • use_llm (bool) – If True, use LLM for narrative generation. If False, use template-based generation (default).

  • rag_pipeline (RagPipeline | None) – Optional RagPipeline for context retrieval. If None, RAG features are disabled (backward compat).

  • prompt_builder (DialecticalPromptBuilder | None) – Optional custom DialecticalPromptBuilder. If None, creates default builder.

  • llm (LLMProvider | None) – Optional LLMProvider for text generation. If None, no LLM generation occurs (backward compat).

  • persona (Persona | None) – Optional Persona for customizing narrative voice. If provided (and no custom prompt_builder), creates a DialecticalPromptBuilder with this persona.

Return type:

None

property name: str

Return observer identifier.

Returns:

The string “NarrativeDirector”.

property narrative_log: list[str]

Return generated narrative entries.

Returns a copy of the internal list to prevent external modification.

Returns:

List of generated narrative strings.

on_simulation_end(final_state)[source]

Generate summary at simulation end.

Logs the simulation end event with final state info.

Parameters:

final_state (WorldState) – The final WorldState when simulation ends.

Return type:

None

on_simulation_start(initial_state, config)[source]

Initialize narrative context at simulation start.

Logs the simulation start event with initial state info.

Parameters:
  • initial_state (WorldState) – The initial WorldState at simulation start.

  • config – Simulation configuration.

Return type:

None

on_tick(previous_state, new_state)[source]

Analyze state change and log narrative.

Detects new typed events added during this tick, retrieves RAG context, and builds the full context hierarchy for narrative generation.

Sprint 4.1: Now processes typed SimulationEvent objects from state.events instead of string-based event_log.

Parameters:
  • previous_state (WorldState) – WorldState before the tick.

  • new_state (WorldState) – WorldState after the tick.

Return type:

None
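
Detecting "new typed events added during this tick" can be sketched by comparing the event lists before and after the tick. This is a simplified stand-in assuming `state.events` is an append-only list; the real `WorldState` carries far more than shown here:

```python
from dataclasses import dataclass, field


@dataclass
class WorldState:
    # Minimal stand-in for babylon's WorldState.
    tick: int = 0
    events: list = field(default_factory=list)


def new_events(previous_state: WorldState, new_state: WorldState) -> list:
    """Return events appended during the tick (assumes an append-only event list)."""
    return new_state.events[len(previous_state.events):]


before = WorldState(tick=1, events=["SURPLUS_EXTRACTION"])
after = WorldState(tick=2, events=["SURPLUS_EXTRACTION", "RUPTURE"])
print(new_events(before, after))  # ['RUPTURE']
```
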

property rag_pipeline: RagPipeline | None

Return the RAG pipeline if configured.

Returns:

RagPipeline instance or None if not configured.

property use_llm: bool

Return whether LLM is enabled.

Returns:

True if LLM-based narrative is enabled, False otherwise.

class babylon.ai.DialecticalPromptBuilder(persona=None)[source]

Bases: object

Builds prompts following Marxist dialectical materialism.

The builder creates structured prompts that ground AI responses in material conditions and class analysis. It follows the context hierarchy defined in AI_COMMS.md.

Sprint 4.2: Added persona support for customizable narrative voices. When a persona is provided, build_system_prompt() returns the persona’s rendered prompt instead of the default.

Parameters:

persona (Persona | None)

persona

Optional Persona to use for system prompt generation.

Example

>>> builder = DialecticalPromptBuilder()
>>> system_prompt = builder.build_system_prompt()
>>> context = builder.build_context_block(state, rag_docs, events)
>>>
>>> # With persona (Sprint 4.2)
>>> from babylon.ai.persona_loader import load_default_persona
>>> percy = load_default_persona()
>>> builder = DialecticalPromptBuilder(persona=percy)
>>> system_prompt = builder.build_system_prompt()
__init__(persona=None)[source]

Initialize the DialecticalPromptBuilder.

Parameters:

persona (Persona | None) – Optional Persona to use for system prompt generation. If provided, build_system_prompt() will use the persona’s render_system_prompt() method. If None, uses the default Marxist game master prompt.

Return type:

None

build_context_block(state, rag_context, events)[source]

Assemble the Context Hierarchy.

Builds context following the AI_COMMS.md hierarchy:

  1. Material Conditions (from WorldState)

  2. Historical/Theoretical Context (from RAG)

  3. Recent Events (from tick delta)

Sprint 4.1: Now accepts typed SimulationEvent objects instead of strings.

Parameters:
  • state (WorldState) – Current WorldState for material conditions.

  • rag_context (list[str]) – Retrieved documents from RAG pipeline.

  • events (list[SimulationEvent]) – New typed events from this tick (SimulationEvent objects).

Return type:

str

Returns:

Formatted context block string.
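
The three-level hierarchy amounts to concatenating labeled sections in a fixed order. A simplified sketch (the section headers and input types are illustrative, not the actual AI_COMMS.md format):

```python
def build_context_block(material: str, rag_context: list[str], events: list[str]) -> str:
    """Assemble the context hierarchy: material conditions, then theory, then events."""
    sections = [
        "## Material Conditions\n" + material,
        "## Historical/Theoretical Context\n" + "\n".join(rag_context),
        "## Recent Events\n" + "\n".join(events),
    ]
    return "\n\n".join(sections)


block = build_context_block(
    material="wealth: 0.42, tension: 0.71",
    rag_context=["Surplus value is extracted from labor power..."],
    events=["SURPLUS_EXTRACTION at tick 9"],
)
print(block)
```

Ordering matters: material conditions come first so the LLM grounds its narrative in simulation state before theory.
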

build_system_prompt()[source]

Return the immutable core identity of the Director.

If a persona is configured (Sprint 4.2), returns the persona’s rendered system prompt. Otherwise, returns the default Marxist game master prompt.

Return type:

str

Returns:

System prompt establishing the AI’s identity and role.

property persona: Persona | None

Return the persona if configured.

Returns:

The Persona instance or None if not configured.

class babylon.ai.LLMProvider(*args, **kwargs)[source]

Bases: Protocol

Protocol for LLM text generation providers.

Follows the same pattern as SimulationObserver - loose coupling via Protocol enables easy testing and provider swapping.

SYNC API: All implementations use synchronous interfaces to avoid event loop conflicts with other asyncio.run() callers (e.g., RAG).

__init__(*args, **kwargs)
generate(prompt, system_prompt=None, temperature=0.7)[source]

Generate text from prompt (synchronous).

Parameters:
  • prompt (str) – User prompt / context

  • system_prompt (str | None) – Optional system instructions

  • temperature (float) – Sampling temperature (0.0-1.0)

Return type:

str

Returns:

Generated text response

Raises:

LLMGenerationError – On API or generation failure

property name: str

Provider identifier for logging.

class babylon.ai.MockLLM(responses=None, default_response='Mock LLM response')[source]

Bases: object

Deterministic mock LLM for testing.

Returns pre-configured responses in queue order, or a fixed default response. Synchronous API.

This is the primary testing tool for NarrativeDirector - it allows tests to verify behavior without network calls.

Parameters:
  • responses (list[str] | None)

  • default_response (str)

__init__(responses=None, default_response='Mock LLM response')[source]

Initialize MockLLM.

Parameters:
  • responses (list[str] | None) – Queue of responses to return in FIFO order

  • default_response (str) – Response when queue is empty

Return type:

None

property call_count: int

Number of times generate() was called.

property call_history: list[dict[str, Any]]

History of all calls with arguments.

Returns a copy to prevent external modification.

generate(prompt, system_prompt=None, temperature=0.7)[source]

Generate response synchronously.

Parameters:
  • prompt (str) – User prompt / context

  • system_prompt (str | None) – Optional system instructions

  • temperature (float) – Sampling temperature (ignored by mock)

Return type:

str

Returns:

Next queued response or default response

property name: str

Provider identifier for logging.

class babylon.ai.DeepSeekClient(config=None)[source]

Bases: object

DeepSeek LLM client using OpenAI-compatible API.

Primary LLM provider for Babylon narrative generation. Uses the openai Python package with custom base_url.

SYNC API: Uses synchronous OpenAI client to avoid event loop conflicts with RAG queries that use asyncio.run().

Parameters:

config (type[LLMConfig] | None)

__init__(config=None)[source]

Initialize DeepSeekClient.

Parameters:

config (type[LLMConfig] | None) – LLM configuration class (defaults to LLMConfig)

Raises:

LLMGenerationError – If API key is not configured

Return type:

None

generate(prompt, system_prompt=None, temperature=0.7)[source]

Generate text synchronously.

Uses the sync OpenAI client directly, avoiding event loop conflicts with other code that uses asyncio.run().

Parameters:
  • prompt (str) – User prompt / context

  • system_prompt (str | None) – Optional system instructions

  • temperature (float) – Sampling temperature (0.0-1.0)

Return type:

str

Returns:

Generated text response

Raises:

LLMGenerationError – On API or generation failure

property name: str

Provider identifier for logging.

class babylon.ai.Persona(**data)[source]

Bases: BaseModel

AI narrative persona for the NarrativeDirector (immutable).

Defines the complete identity and behavioral rules for an AI persona. Personas are loaded from JSON files and validated against JSON Schema.

Parameters:
id

Unique snake_case identifier.

name

Full display name of the persona.

role

Persona’s role or title.

voice

Voice characteristics (nested VoiceConfig).

obsessions

Topics the persona is obsessed with analyzing.

directives

Behavioral directives for the persona.

restrictions

Topics or behaviors to avoid (default empty).

Example

>>> from babylon.ai.persona import Persona, VoiceConfig
>>> voice = VoiceConfig(
...     tone="Clinical",
...     style="Marxist",
...     address_user_as="Architect",
... )
>>> persona = Persona(
...     id="test_persona",
...     name="Test",
...     role="Role",
...     voice=voice,
...     obsessions=["Material conditions"],
...     directives=["Analyze"],
... )
>>> persona.restrictions
[]
model_config: ClassVar[ConfigDict] = {'frozen': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.

render_system_prompt()[source]

Render the persona as an LLM system prompt.

Formats all persona attributes into a structured prompt that establishes the AI’s identity and behavioral rules.

Return type:

str

Returns:

Formatted system prompt string for LLM consumption.

Example

>>> persona = Persona(...)
>>> prompt = persona.render_system_prompt()
>>> assert persona.name in prompt
id: str
name: str
role: str
voice: VoiceConfig
obsessions: list[str]
directives: list[str]
restrictions: list[str]
class babylon.ai.VoiceConfig(**data)[source]

Bases: BaseModel

Voice characteristics for a narrative persona (immutable).

Defines the tone, style, and user address pattern for an AI persona. This is a nested model within Persona.

Parameters:
  • tone (str)

  • style (str)

  • address_user_as (str)

tone

Emotional and rhetorical tone (e.g., “Clinical, Dialectical”).

style

Writing style and theoretical framework.

address_user_as

How the persona addresses the user (e.g., “Architect”).

Example

>>> voice = VoiceConfig(
...     tone="Clinical, Dialectical, Revolutionary",
...     style="High-theoretical Marxist analysis",
...     address_user_as="Architect",
... )
>>> voice.tone
'Clinical, Dialectical, Revolutionary'
model_config: ClassVar[ConfigDict] = {'frozen': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.

tone: str
style: str
address_user_as: str
exception babylon.ai.PersonaLoadError(message, path, errors=None)[source]

Bases: Exception

Error raised when persona loading fails.

Parameters:
  • message (str)

  • path (Path)

  • errors (list[str] | None)

Return type:

None

path

Path to the persona file that failed to load.

errors

List of validation error messages (if validation failed).

__init__(message, path, errors=None)[source]

Initialize PersonaLoadError.

Parameters:
  • message (str) – Human-readable error message.

  • path (Path) – Path to the persona file.

  • errors (list[str] | None) – List of validation error messages.

Return type:

None

babylon.ai.load_persona(path)[source]

Load and validate a persona from a JSON file.

Parameters:

path (Path) – Path to the persona JSON file.

Return type:

Persona

Returns:

Validated Persona instance.

Raises:

PersonaLoadError – If file doesn’t exist, JSON is invalid, or schema validation fails.

Example

>>> from pathlib import Path
>>> from babylon.ai.persona_loader import load_persona
>>> persona = load_persona(Path("path/to/persona.json"))
>>> persona.name
"Persephone 'Percy' Raskova"
babylon.ai.load_default_persona()[source]

Load the default persona (Persephone ‘Percy’ Raskova).

Return type:

Persona

Returns:

The default Persona instance.

Raises:

PersonaLoadError – If the default persona file is missing or invalid.

Example

>>> from babylon.ai.persona_loader import load_default_persona
>>> percy = load_default_persona()
>>> percy.id
'persephone_raskova'
class babylon.ai.NarrativeCommissar(llm)[source]

Bases: object

LLM-powered judge that evaluates narrative quality.

The Commissar uses structured prompting to extract consistent metrics from narrative text. It follows the LLM-as-judge pattern where the LLM acts as a consistent evaluator rather than a generator.

This enables automated verification of narrative hypotheses like the “Dialectical U-Curve” - that narrative certainty follows a U-shape across economic conditions.

Parameters:

llm (LLMProvider)

name

Identifier for logging (“NarrativeCommissar”).

Example

>>> from babylon.ai import MockLLM
>>> from babylon.ai.judge import NarrativeCommissar
>>> mock = MockLLM(responses=['{"ominousness": 5, "certainty": 5, "drama": 5, "metaphor_family": "none"}'])
>>> commissar = NarrativeCommissar(llm=mock)
>>> result = commissar.evaluate("The workers unite.")
>>> isinstance(result.ominousness, int)
True
__init__(llm)[source]

Initialize the NarrativeCommissar.

Parameters:

llm (LLMProvider) – LLMProvider implementation for text generation. Use MockLLM for testing, DeepSeekClient for production.

Return type:

None

evaluate(text)[source]

Evaluate a narrative text and return structured metrics.

Sends the text to the LLM with a structured prompt requesting JSON output. Parses the response and returns a JudgmentResult.

Parameters:

text (str) – The narrative text to evaluate.

Return type:

JudgmentResult

Returns:

JudgmentResult containing the evaluation metrics.

Raises:
  • json.JSONDecodeError – If LLM response is not valid JSON.

  • pydantic.ValidationError – If JSON values are out of bounds.

Example

>>> from babylon.ai import MockLLM
>>> from babylon.ai.judge import NarrativeCommissar
>>> mock = MockLLM(responses=['{"ominousness": 7, "certainty": 8, "drama": 6, "metaphor_family": "biological"}'])
>>> commissar = NarrativeCommissar(llm=mock)
>>> result = commissar.evaluate("The crisis deepens.")
>>> result.ominousness
7
property name: str

Return identifier for logging.

Returns:

The string “NarrativeCommissar”.
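
Parsing the judge's structured JSON into an immutable result can be sketched with a frozen dataclass. This is a simplified stand-in: the real JudgmentResult is a frozen Pydantic model that also enforces the 1-10 bounds and uses the MetaphorFamily enum rather than a plain string:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgmentResult:
    # Simplified stand-in for babylon's Pydantic JudgmentResult.
    ominousness: int
    certainty: int
    drama: int
    metaphor_family: str


def parse_judgment(raw: str) -> JudgmentResult:
    """Parse the judge's JSON response; malformed JSON raises json.JSONDecodeError."""
    data = json.loads(raw)
    return JudgmentResult(
        ominousness=int(data["ominousness"]),
        certainty=int(data["certainty"]),
        drama=int(data["drama"]),
        metaphor_family=data["metaphor_family"],
    )


result = parse_judgment(
    '{"ominousness": 7, "certainty": 8, "drama": 6, "metaphor_family": "biological"}'
)
print(result)
```
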

class babylon.ai.JudgmentResult(**data)[source]

Bases: BaseModel

Result of narrative evaluation by the Commissar.

An immutable record of how the LLM-judge evaluated a narrative text. All metrics use a 1-10 scale for consistency.

Parameters:
ominousness

How threatening/foreboding the narrative is (1-10).

certainty

How confident/absolute the assertions are (1-10).

drama

Emotional intensity of the narrative (1-10).

metaphor_family

Dominant metaphorical domain used.

Example

>>> from babylon.ai.judge import JudgmentResult, MetaphorFamily
>>> result = JudgmentResult(
...     ominousness=8,
...     certainty=9,
...     drama=7,
...     metaphor_family=MetaphorFamily.BIOLOGICAL,
... )
>>> result.ominousness
8
model_config: ClassVar[ConfigDict] = {'frozen': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.

ominousness: int
certainty: int
drama: int
metaphor_family: MetaphorFamily
class babylon.ai.MetaphorFamily(*values)[source]

Bases: str, Enum

Categories of metaphorical language in narratives.

Narratives about economic crisis tend to cluster around certain metaphorical domains. Tracking these helps identify rhetorical patterns in how the AI describes class struggle.

BIOLOGICAL

Bodies, organs, disease, parasites, health.

PHYSICS

Pressure, tension, phase transitions, energy.

MECHANICAL

Gears, machines, breaking, grinding.

NONE

No strong metaphorical clustering detected.

BIOLOGICAL = 'biological'
PHYSICS = 'physics'
MECHANICAL = 'mechanical'
NONE = 'none'
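
Because the enum mixes in str, members compare equal to their raw values, which makes parsing category strings out of LLM JSON output straightforward. A self-contained sketch:

```python
from enum import Enum


class MetaphorFamily(str, Enum):
    BIOLOGICAL = "biological"
    PHYSICS = "physics"
    MECHANICAL = "mechanical"
    NONE = "none"


# Construct from a raw string, e.g. a value parsed out of the judge's JSON output:
family = MetaphorFamily("physics")
print(family is MetaphorFamily.PHYSICS)  # True: value lookup returns the member
print(family == "physics")               # True: str mixin compares by value
print(MetaphorFamily.NONE.value)         # none
```

An unknown string like `MetaphorFamily("oceanic")` raises ValueError, so invalid judge output fails loudly rather than silently.
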

Modules

director

Narrative Director - AI Game Master observing the simulation.

judge

Narrative evaluation via LLM-as-judge pattern.

llm_provider

LLM Provider strategy pattern for text generation.

persona

Persona models for AI narrative voice customization (Sprint 4.2).

persona_loader

Persona loader with JSON Schema validation (Sprint 4.2).

prompt_builder

Builder for Dialectical Prompts and Context Hierarchy.