babylon.rag.context_window

Context window management for the RAG system.

class babylon.rag.context_window.ContextWindowManager(config=None, metrics_collector=None, lifecycle_manager=None)[source]

Bases: object

Manages the token usage and content prioritization in the RAG context window.

The ContextWindowManager ensures that the total token usage stays within the limits of the AI model while prioritizing the most relevant content. It implements:

  1. Token counting and tracking

  2. Content prioritization based on relevance

  3. Automatic optimization when approaching limits

  4. Integration with metrics collection

Parameters:
config

Configuration for the context window

metrics_collector

Collector for performance metrics

lifecycle_manager

Optional manager for object lifecycles
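The behavior described above can be illustrated with a minimal, self-contained sketch. The class below is a hypothetical stand-in for `ContextWindowManager` (the real implementation lives in `babylon.rag.context_window`); it shows token tracking plus importance-based eviction, with the eviction policy chosen here purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class _Entry:
    content: object
    tokens: int
    importance: float


class MiniContextWindow:
    """Hypothetical, simplified stand-in for ContextWindowManager."""

    def __init__(self, max_tokens: int = 100):
        self.max_tokens = max_tokens
        self._entries: dict[str, _Entry] = {}

    @property
    def total_tokens(self) -> int:
        return sum(e.tokens for e in self._entries.values())

    def add_content(self, content_id, content, token_count, importance=0.5) -> bool:
        needed_optimization = False
        # Evict lowest-importance entries until the new content fits.
        while self.total_tokens + token_count > self.max_tokens and self._entries:
            victim = min(self._entries, key=lambda k: self._entries[k].importance)
            del self._entries[victim]
            needed_optimization = True
        self._entries[content_id] = _Entry(content, token_count, importance)
        # Mirrors the documented contract: True on a clean add,
        # False when optimization was required first.
        return not needed_optimization
```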

__init__(config=None, metrics_collector=None, lifecycle_manager=None)[source]

Initialize the context window manager.

Parameters:
  • config (ContextWindowConfig | None) – Configuration for token limits and optimization thresholds

  • metrics_collector (MetricsCollector | None) – Collector for performance metrics

  • lifecycle_manager (Any) – Optional manager for object lifecycles

add_content(content_id, content, token_count, importance=0.5)[source]

Add content to the context window.

Parameters:
  • content_id (str) – Unique identifier for the content

  • content (Any) – The content to add

  • token_count (int) – Number of tokens in the content

  • importance (float) – Importance score for the content (0.0 to 1.0)

Return type:

bool

Returns:

True if content was added successfully, False if optimization was required

Raises:
  • ContentInsertionError – If content could not be added

  • CapacityExceededError – If capacity is exceeded and optimization fails

property capacity_percentage: float

Get the current percentage of capacity used.

property content_count: int

Get the number of content items in the context window.

count_tokens(content)[source]

Count the number of tokens in a string.

This is a simple implementation that could be enhanced with a proper tokenizer.

Parameters:

content (str) – String content to count tokens for

Return type:

int

Returns:

Estimated token count
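Since the docstring describes this as a simple estimate rather than a real tokenizer, one common heuristic (an assumption here, not necessarily what `babylon` uses) is roughly one token per four characters of English text:

```python
def estimate_tokens(content: str) -> int:
    """Rough token estimate: about one token per 4 characters.

    A common rule of thumb for English text with GPT-style tokenizers.
    A production system would use the target model's own tokenizer
    (e.g. tiktoken for OpenAI models) instead of this heuristic.
    """
    return max(1, len(content) // 4)
```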

get_content(content_id)[source]

Get content from the context window and update its priority.

Parameters:

content_id (str) – Unique identifier for the content

Return type:

Any

Returns:

The requested content

Raises:

KeyError – If content_id is not found
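The "update its priority" behavior can be sketched as a recency-style importance bump on access. The boost amount, the cap at 1.0, and the dict layout below are illustrative assumptions; only the KeyError contract comes from the documentation.

```python
def get_with_priority_bump(entries: dict, content_id: str, boost: float = 0.1):
    """Return the stored content and nudge its importance upward.

    entries maps content_id -> {"content": ..., "importance": float}.
    The boost value and the 1.0 cap are illustrative assumptions.
    """
    entry = entries[content_id]  # raises KeyError if absent, as documented
    entry["importance"] = min(1.0, entry["importance"] + boost)
    return entry["content"]
```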

get_stats()[source]

Get statistics about the context window.

Return type:

dict[str, Any]

optimize(target_tokens=None)[source]

Optimize the context window to reduce token usage.

Parameters:

target_tokens (int | None) – Target token count to reduce to; defaults to the configured capacity threshold

Return type:

bool

Returns:

True if optimization was successful, False otherwise

Raises:

OptimizationFailedError – If optimization fails
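A plausible optimization pass, sketched below, evicts the lowest-importance entries until the window fits the target. This eviction order is one reasonable strategy; the real manager's behavior depends on its configured `prioritization_strategy` (relevance, recency, or hybrid).

```python
def optimize(entries: dict, target_tokens: int) -> bool:
    """Evict lowest-importance entries until total tokens <= target_tokens.

    entries maps content_id -> (token_count, importance). Returns True if
    the target was reached. Lowest-importance-first eviction is an
    illustrative assumption, not the library's documented policy.
    """
    def total() -> int:
        return sum(tokens for tokens, _ in entries.values())

    while total() > target_tokens and entries:
        victim = min(entries, key=lambda k: entries[k][1])
        del entries[victim]
    return total() <= target_tokens
```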

remove_content(content_id)[source]

Remove content from the context window.

Parameters:

content_id (str) – Unique identifier for the content

Return type:

bool

Returns:

True if content was removed, False if not found

Raises:

ContentRemovalError – If an error occurs during removal

property total_tokens: int

Get the total number of tokens in the context window.

class babylon.rag.context_window.ContextWindowConfig(**data)[source]

Bases: BaseModel

Configuration for the Context Window Management system.

Parameters:
  • max_token_limit (int)

  • capacity_threshold (float)

  • prioritization_strategy (str)

  • min_content_importance (float)

max_token_limit

Maximum number of tokens allowed in the context window

capacity_threshold

Percentage of capacity at which optimization should trigger

prioritization_strategy

Strategy for content prioritization (relevance, recency, hybrid)

min_content_importance

Minimum importance score for content to be kept in context
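The real config is a frozen pydantic model; the dependency-free sketch below mirrors the documented field names using a frozen stdlib dataclass instead. The default values and the validation rule are illustrative assumptions, not the library's actual defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextWindowConfigSketch:
    """Stand-in for the pydantic ContextWindowConfig (model_config frozen=True).

    Field names mirror the documented model; defaults are assumptions.
    """
    max_token_limit: int = 4096
    capacity_threshold: float = 0.9
    prioritization_strategy: str = "hybrid"
    min_content_importance: float = 0.1

    def __post_init__(self):
        # Illustrative sanity check, analogous to pydantic validation.
        if not (0.0 < self.capacity_threshold <= 1.0):
            raise ValueError("capacity_threshold must be in (0, 1]")
```

Freezing the config means any attempt to mutate a field after construction raises an error, matching the `'frozen': True` setting on the real model.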

classmethod from_base_config()[source]

Create context window config from BaseConfig.

Return type:

ContextWindowConfig

model_config: ClassVar[ConfigDict] = {'frozen': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.

max_token_limit: int

capacity_threshold: float

prioritization_strategy: str

min_content_importance: float

babylon.rag.context_window.ContextWindowError

alias of RagError

babylon.rag.context_window.TokenCountError

alias of RagError

babylon.rag.context_window.CapacityExceededError

alias of RagError

babylon.rag.context_window.OptimizationFailedError

alias of RagError

babylon.rag.context_window.ContentPriorityError

alias of RagError

babylon.rag.context_window.ContentRemovalError

alias of RagError

babylon.rag.context_window.ContentInsertionError

alias of RagError

babylon.rag.context_window.count_tokens(content)[source]

Count the number of tokens in content of various types.

This is a simple implementation that estimates token counts. For production, this should be replaced with a proper tokenizer for the target model.

Parameters:

content (str | list[Any] | dict[str, Any] | Any) – Content to count tokens for. Can be string, list, dict, or other type.

Return type:

int

Returns:

Estimated token count
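The documented behavior (strings, lists, dicts, and arbitrary objects) could be sketched recursively as below. The chars/4 heuristic and the fallback through `str()` are assumptions for illustration; the real function's per-type weighting may differ.

```python
def count_tokens(content) -> int:
    """Estimate tokens for str, list, dict, or arbitrary objects.

    Hypothetical sketch of the documented interface: strings use a
    chars/4 heuristic, containers sum over their elements, and any
    other type is counted via its str() representation.
    """
    if isinstance(content, str):
        return max(1, len(content) // 4)
    if isinstance(content, dict):
        return sum(count_tokens(k) + count_tokens(v) for k, v in content.items())
    if isinstance(content, (list, tuple)):
        return sum(count_tokens(item) for item in content)
    return count_tokens(str(content))
```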

Modules

config

Configuration model for the context window management system.

manager

Context window manager for token-aware content prioritization.

token_counter

Token counting utilities for Context Window Management.