babylon.metrics

Metrics collection and analysis for Babylon/Babylon.

Metrics are the nervous system of the simulation. They provide feedback on the health of all subsystems.

class babylon.metrics.MetricsCollector[source]

Bases: object

Centralized metrics collection and aggregation.

Instances are independent of one another, which makes the class suitable for dependency injection. Each ServiceContainer should have its own MetricsCollector instance.

Note: MetricsCollector is NOT thread-safe. Each ServiceContainer instance MUST have its own MetricsCollector, and access to a single collector from multiple threads at the same time is not supported (the simulation runs single-threaded).

Metrics Categories:
  • performance: Timing, throughput, latency

  • simulation: Game state metrics (P(S|A), P(S|R), Rent flow)

  • cache: Hit rates, evictions, memory usage

  • embedding: Generation times, batch sizes, errors
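The category names above also appear as dotted prefixes in the metric names used on this page (for example "simulation.p_revolution" and "embedding.generation"). A minimal sketch of grouping names by that prefix, assuming the text before the first dot is the category. The helpers category_of and group_by_category are hypothetical and not part of babylon.metrics:

```python
from collections import defaultdict


def category_of(name: str) -> str:
    """Return the text before the first dot, or "uncategorized" if there is no dot."""
    prefix, sep, _rest = name.partition(".")
    return prefix if sep else "uncategorized"


def group_by_category(names):
    """Group metric names by their category prefix, preserving input order."""
    groups = defaultdict(list)
    for name in names:
        groups[category_of(name)].append(name)
    return dict(groups)


grouped = group_by_category(
    ["simulation.p_revolution", "embedding.generation", "items_processed"]
)
```

Names without a dot, such as "items_processed", fall into a catch-all group in this sketch. Whether the real collector treats them that way is not stated here.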

__init__()[source]

Initialize the metrics collector.

Return type:

None

clear()[source]

Clear all collected metrics.

Used primarily for testing or resetting between game sessions.

Return type:

None

property enabled: bool

Check if metrics collection is enabled.

gauge(name, value)[source]

Set a gauge metric to a specific value.

Parameters:
  • name (str) – Gauge name

  • value (float) – Current value

Return type:

None

increment(name, value=1)[source]

Increment a counter metric.

Parameters:
  • name (str) – Counter name

  • value (int) – Amount to increment

Return type:

None
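The difference between gauge and increment is that a gauge is set to a value while a counter accumulates. A minimal sketch of those semantics using plain dictionaries. TinyMetrics and the metric name "queue_depth" are hypothetical, and the sketch makes no claim about how MetricsCollector stores its data:

```python
class TinyMetrics:
    """Hypothetical stand-in illustrating counter vs. gauge semantics."""

    def __init__(self) -> None:
        self.counters = {}
        self.gauges = {}

    def increment(self, name: str, value: int = 1) -> None:
        # Counters accumulate: each call adds to the running total.
        self.counters[name] = self.counters.get(name, 0) + value

    def gauge(self, name: str, value: float) -> None:
        # Gauges overwrite: only the most recent value is kept.
        self.gauges[name] = value


m = TinyMetrics()
m.increment("items_processed")      # default increment of 1
m.increment("items_processed", 4)   # running total is now 5
m.gauge("queue_depth", 10.0)
m.gauge("queue_depth", 3.0)         # replaces the earlier reading
```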

record(name, value, tags=None, metadata=None)[source]

Record a metric value.

Parameters:
  • name (str) – Metric name (e.g., “simulation.p_revolution”)

  • value (float) – Numeric value

  • tags (dict[str, str] | None) – Key-value pairs for filtering/grouping

  • metadata (dict[str, Any] | None) – Additional context

Return type:

None
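Tags are described as key-value pairs for filtering and grouping. One way such filtering might look, as a sketch only: Point, filter_by_tags, and the tag key "phase" are hypothetical, and the storage format of the real collector is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    """Hypothetical record of one metric observation."""

    name: str
    value: float
    tags: dict = field(default_factory=dict)


def filter_by_tags(points, wanted):
    """Keep points whose tags contain every key-value pair in `wanted`."""
    return [
        p for p in points
        if all(p.tags.get(key) == val for key, val in wanted.items())
    ]


points = [
    Point("simulation.p_revolution", 0.12, {"phase": "early"}),
    Point("simulation.p_revolution", 0.47, {"phase": "late"}),
    Point("simulation.p_revolution", 0.30),
]
late = filter_by_tags(points, {"phase": "late"})
```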

record_cache_event(level, hit)[source]

Record a cache hit or miss event.

Parameters:
  • level (str) – Cache level (e.g., “L1”, “L2”, “embedding”)

  • hit (bool) – Whether this was a cache hit (True) or miss (False)

Return type:

None
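Hit and miss events per level are the raw material for the hit rates listed under the cache category. A minimal sketch of deriving a per-level hit rate from a list of (level, hit) events. The function hit_rates is hypothetical, and the real collector may aggregate differently:

```python
from collections import Counter


def hit_rates(events):
    """Compute hits / (hits + misses) for each cache level seen in `events`."""
    hits = Counter()
    totals = Counter()
    for level, hit in events:
        totals[level] += 1
        if hit:
            hits[level] += 1
    # Every level in `totals` has at least one event, so no division by zero.
    return {level: hits[level] / totals[level] for level in totals}


events = [("L1", True), ("L1", False), ("L1", True), ("L2", False)]
rates = hit_rates(events)
```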

record_memory_usage(memory_bytes)[source]

Record memory usage.

Parameters:

memory_bytes (float) – Memory usage in bytes

Return type:

None

record_metric(name, value, context='', object_id=None, context_level=None)[source]

Record a named metric with context.

Parameters:
  • name (str) – Metric name

  • value (float) – Metric value

  • context (str) – Optional context string

  • object_id (str | None) – Optional object identifier

  • context_level (str | None) – Optional context level

Return type:

None

record_timing(name, duration)[source]

Record a timing measurement directly.

Parameters:
  • name (str) – Timer name

  • duration (float) – Duration in seconds

Return type:

None

record_token_usage(tokens)[source]

Record token usage.

Parameters:

tokens (int) – Number of tokens used

Return type:

None

summary()[source]

Get a summary of all collected metrics.

Return type:

dict[str, Any]

Returns:

Dict with counters, gauges, and timer statistics
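The exact keys of the summary dict are not listed on this page. As an illustration of what "timer statistics" could mean, the following sketch reduces a list of recorded durations to a few aggregates. The function timer_stats and its key names are assumptions, not the documented output format:

```python
from statistics import mean


def timer_stats(durations):
    """Summarize durations (in seconds) as count, total, mean, min, and max."""
    if not durations:
        return {"count": 0, "total": 0.0, "mean": 0.0, "min": 0.0, "max": 0.0}
    return {
        "count": len(durations),
        "total": sum(durations),
        "mean": mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


stats = timer_stats([0.5, 1.5, 1.0])
```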

time(name)[source]

Context manager for timing operations.

Parameters:

name (str) – Timer name

Return type:

TimerContext

Returns:

TimerContext that records duration on exit

Example

with collector.time("embedding.generation"):
    embeddings = generate_embeddings(texts)
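The internals of TimerContext are not shown on this page. A minimal sketch of a context manager that records a duration on exit, assuming it measures with time.perf_counter and hands the result to a callback such as record_timing(name, duration). SketchTimer is a hypothetical name:

```python
import time


class SketchTimer:
    """Hypothetical timer: measures a with-block and reports the duration."""

    def __init__(self, name, on_exit):
        self.name = name
        self._on_exit = on_exit  # called as on_exit(name, seconds)
        self._start = 0.0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Report elapsed seconds even if the block raised. Returning None
        # lets any exception propagate to the caller.
        self._on_exit(self.name, time.perf_counter() - self._start)


recorded = []

with SketchTimer("embedding.generation", lambda n, d: recorded.append((n, d))):
    sum(range(1000))  # stand-in for real work
```

Reporting in __exit__ rather than in the caller keeps the timing code out of the measured block.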

class babylon.metrics.MetricsCollectorProtocol(*args, **kwargs)[source]

Bases: Protocol

Protocol defining the contract for metrics collectors.

This protocol ensures both the real MetricsCollector and any test doubles (spies, mocks) implement the same interface.

The protocol follows the “Dumb Spy” pattern for test doubles: implementations should record what was called, not calculate statistics. Statistical analysis belongs in the production MetricsCollector only.

Example

def process_data(collector: MetricsCollectorProtocol) -> None:
    collector.increment("items_processed")
    with collector.time("processing_duration"):
        # ... do work ...
        pass
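A test double following the "Dumb Spy" pattern might look like the sketch below. SpyCollector is hypothetical and implements only part of the protocol for brevity. It records each call and deliberately computes nothing:

```python
from contextlib import contextmanager


class SpyCollector:
    """Hypothetical test double: records calls, computes nothing."""

    def __init__(self) -> None:
        self.calls = []

    def increment(self, name: str, value: int = 1) -> None:
        self.calls.append(("increment", (name, value)))

    def gauge(self, name: str, value: float) -> None:
        self.calls.append(("gauge", (name, value)))

    @contextmanager
    def time(self, name: str):
        # Note the call only; a dumb spy does not measure elapsed time.
        self.calls.append(("time", (name,)))
        yield

    def clear(self) -> None:
        self.calls.clear()


def process_data(collector) -> None:
    collector.increment("items_processed")
    with collector.time("processing_duration"):
        pass  # ... do work ...


spy = SpyCollector()
process_data(spy)
```

A test can then assert on spy.calls directly, without depending on any statistics the production collector would compute.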

__init__(*args, **kwargs)

clear()[source]

Clear all recorded metrics.

Used primarily for testing or resetting between game sessions.

Return type:

None

gauge(name, value)[source]

Set a gauge value.

Gauges represent point-in-time values that can go up or down.

Parameters:
  • name (str) – Gauge name.

  • value (float) – Current value.

Return type:

None

increment(name, value=1)[source]

Increment a counter.

Parameters:
  • name (str) – Counter name.

  • value (int) – Amount to increment (default 1).

Return type:

None

record(name, value, tags=None, metadata=None)[source]

Record a metric value.

Parameters:
  • name (str) – Metric name (e.g., “simulation.p_revolution”).

  • value (float) – Numeric value to record.

  • tags (dict[str, str] | None) – Key-value pairs for filtering/grouping.

  • metadata (dict[str, Any] | None) – Additional context.

Return type:

None

record_cache_event(level, hit)[source]

Record a cache hit or miss event.

Parameters:
  • level (str) – Cache level (e.g., “L1”, “L2”, “embedding”).

  • hit (bool) – Whether this was a cache hit (True) or miss (False).

Return type:

None

record_memory_usage(memory_bytes)[source]

Record memory usage.

Parameters:

memory_bytes (float) – Memory usage in bytes.

Return type:

None

record_metric(name, value, context='', object_id=None, context_level=None)[source]

Record a named metric with context.

Parameters:
  • name (str) – Metric name.

  • value (float) – Metric value.

  • context (str) – Optional context string.

  • object_id (str | None) – Optional object identifier.

  • context_level (str | None) – Optional context level.

Return type:

None

record_token_usage(tokens)[source]

Record token usage.

Parameters:

tokens (int) – Number of tokens used.

Return type:

None

summary()[source]

Get aggregated summary of all metrics.

Return type:

dict[str, Any]

Returns:

Dict containing counters, gauges, timer statistics, etc.

time(name)[source]

Context manager for timing operations.

Parameters:

name (str) – Timer name.

Return type:

AbstractContextManager[Any]

Returns:

Context manager that records duration on exit.

Example

with collector.time("embedding.generation"):
    embeddings = generate_embeddings(texts)

Modules

collector

Metrics collection system for Babylon/Babylon.

interfaces

Metrics collector interfaces and protocols.