babylon.rag.context_window

Context window management for the RAG system.

class babylon.rag.context_window.ContextWindowManager(config=None, metrics_collector=None, lifecycle_manager=None)[source]

Bases: object

Manages the token usage and content prioritization in the RAG context window.

The ContextWindowManager ensures that the total token usage stays within the limits of the AI model while prioritizing the most relevant content. It implements:

  1. Token counting and tracking

  2. Content prioritization based on relevance

  3. Automatic optimization when approaching limits

  4. Integration with metrics collection

Parameters:
config

Configuration for the context window

metrics_collector

Collector for performance metrics

lifecycle_manager

Optional manager for object lifecycles
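The behavior described above can be illustrated with a minimal, self-contained sketch. The class below is a hypothetical stand-in for `ContextWindowManager` (the real implementation lives in `babylon.rag.context_window`); it shows token tracking plus importance-based eviction, with the eviction policy chosen here purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class _Entry:
    content: object
    tokens: int
    importance: float


class MiniContextWindow:
    """Hypothetical, simplified stand-in for ContextWindowManager."""

    def __init__(self, max_tokens: int = 100):
        self.max_tokens = max_tokens
        self._entries: dict[str, _Entry] = {}

    @property
    def total_tokens(self) -> int:
        return sum(e.tokens for e in self._entries.values())

    def add_content(self, content_id, content, token_count, importance=0.5) -> bool:
        needed_optimization = False
        # Evict lowest-importance entries until the new content fits.
        while self.total_tokens + token_count > self.max_tokens and self._entries:
            victim = min(self._entries, key=lambda k: self._entries[k].importance)
            del self._entries[victim]
            needed_optimization = True
        self._entries[content_id] = _Entry(content, token_count, importance)
        # Mirrors the documented contract: True on a clean add,
        # False when optimization was required first.
        return not needed_optimization
```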

__init__(config=None, metrics_collector=None, lifecycle_manager=None)[source]

Initialize the context window manager.

Parameters:
  • config (ContextWindowConfig | None) – Configuration for token limits and optimization thresholds

  • metrics_collector (MetricsCollector | None) – Collector for performance metrics

  • lifecycle_manager (Any) – Optional manager for object lifecycles

add_content(content_id, content, token_count, importance=0.5)[source]

Add content to the context window.

Parameters:
  • content_id (str) – Unique identifier for the content

  • content (Any) – The content to add

  • token_count (int) – Number of tokens in the content

  • importance (float) – Importance score for the content (0.0 to 1.0)

Return type:

bool

Returns:

True if content was added successfully, False if optimization was required

Raises:
  • ContentInsertionError – If content could not be added

  • CapacityExceededError – If capacity is exceeded and optimization fails

property capacity_percentage: float

Get the current percentage of capacity used.

property content_count: int

Get the number of content items in the context window.

count_tokens(content)[source]

Count the number of tokens in a string.

This is a simple implementation that could be enhanced with a proper tokenizer.

Parameters:

content (str) – String content to count tokens for

Return type:

int

Returns:

Estimated token count
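Since the docstring describes this as a simple estimate rather than a real tokenizer, one common heuristic (an assumption here, not necessarily what `babylon` uses) is roughly one token per four characters of English text:

```python
def estimate_tokens(content: str) -> int:
    """Rough token estimate: about one token per 4 characters.

    A common rule of thumb for English text with GPT-style tokenizers.
    A production system would use the target model's own tokenizer
    (e.g. tiktoken for OpenAI models) instead of this heuristic.
    """
    return max(1, len(content) // 4)
```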

get_content(content_id)[source]

Get content from the context window and update its priority.

Parameters:

content_id (str) – Unique identifier for the content

Return type:

Any

Returns:

The requested content

Raises:

KeyError – If content_id is not found
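The "update its priority" behavior can be sketched as a recency-style importance bump on access. The boost amount, the cap at 1.0, and the dict layout below are illustrative assumptions; only the KeyError contract comes from the documentation.

```python
def get_with_priority_bump(entries: dict, content_id: str, boost: float = 0.1):
    """Return the stored content and nudge its importance upward.

    entries maps content_id -> {"content": ..., "importance": float}.
    The boost value and the 1.0 cap are illustrative assumptions.
    """
    entry = entries[content_id]  # raises KeyError if absent, as documented
    entry["importance"] = min(1.0, entry["importance"] + boost)
    return entry["content"]
```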

get_stats()[source]

Get statistics about the context window.

Return type:

dict[str, Any]

optimize(target_tokens=None)[source]

Optimize the context window to reduce token usage.

Parameters:

target_tokens (int | None) – Target token count to reduce to; defaults to the configured capacity threshold

Return type:

bool

Returns:

True if optimization was successful, False otherwise

Raises:

OptimizationFailedError – If optimization fails
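A plausible optimization pass, sketched below, evicts the lowest-importance entries until the window fits the target. This eviction order is one reasonable strategy; the real manager's behavior depends on its configured `prioritization_strategy` (relevance, recency, or hybrid).

```python
def optimize(entries: dict, target_tokens: int) -> bool:
    """Evict lowest-importance entries until total tokens <= target_tokens.

    entries maps content_id -> (token_count, importance). Returns True if
    the target was reached. Lowest-importance-first eviction is an
    illustrative assumption, not the library's documented policy.
    """
    def total() -> int:
        return sum(tokens for tokens, _ in entries.values())

    while total() > target_tokens and entries:
        victim = min(entries, key=lambda k: entries[k][1])
        del entries[victim]
    return total() <= target_tokens
```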

remove_content(content_id)[source]

Remove content from the context window.

Parameters:

content_id (str) – Unique identifier for the content

Return type:

bool

Returns:

True if content was removed, False if not found

Raises:

ContentRemovalError – If an error occurs during removal

property total_tokens: int

Get the total number of tokens in the context window.

class babylon.rag.context_window.ContextWindowConfig(**data)[source]

Bases: BaseModel

Configuration for the Context Window Management system.

Parameters:
  • max_token_limit (int)

  • capacity_threshold (float)

  • prioritization_strategy (str)

  • min_content_importance (float)

max_token_limit

Maximum number of tokens allowed in the context window

capacity_threshold

Percentage of capacity at which optimization should trigger

prioritization_strategy

Strategy for content prioritization (relevance, recency, hybrid)

min_content_importance

Minimum importance score for content to be kept in context
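The real config is a frozen pydantic model; the dependency-free sketch below mirrors the documented field names using a frozen stdlib dataclass instead. The default values and the validation rule are illustrative assumptions, not the library's actual defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextWindowConfigSketch:
    """Stand-in for the pydantic ContextWindowConfig (model_config frozen=True).

    Field names mirror the documented model; defaults are assumptions.
    """
    max_token_limit: int = 4096
    capacity_threshold: float = 0.9
    prioritization_strategy: str = "hybrid"
    min_content_importance: float = 0.1

    def __post_init__(self):
        # Illustrative sanity check, analogous to pydantic validation.
        if not (0.0 < self.capacity_threshold <= 1.0):
            raise ValueError("capacity_threshold must be in (0, 1]")
```

Freezing the config means any attempt to mutate a field after construction raises an error, matching the `'frozen': True` setting on the real model.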

classmethod from_base_config()[source]

Create context window config from BaseConfig.

Return type:

ContextWindowConfig

model_config: ClassVar[ConfigDict] = {'frozen': True}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.

max_token_limit: int

capacity_threshold: float

prioritization_strategy: str

min_content_importance: float

babylon.rag.context_window.ContextWindowError

alias of RagError

babylon.rag.context_window.TokenCountError

alias of RagError

babylon.rag.context_window.CapacityExceededError

alias of RagError

babylon.rag.context_window.OptimizationFailedError

alias of RagError

babylon.rag.context_window.ContentPriorityError

alias of RagError

babylon.rag.context_window.ContentRemovalError

alias of RagError

babylon.rag.context_window.ContentInsertionError

alias of RagError

babylon.rag.context_window.count_tokens(content)[source]

Count the number of tokens in content of various types.

This is a simple implementation that estimates token counts. For production, this should be replaced with a proper tokenizer for the target model.

Parameters:

content (str | list[Any] | dict[str, Any] | Any) – Content to count tokens for. Can be string, list, dict, or other type.

Return type:

int

Returns:

Estimated token count
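The documented behavior (strings, lists, dicts, and arbitrary objects) could be sketched recursively as below. The chars/4 heuristic and the fallback through `str()` are assumptions for illustration; the real function's per-type weighting may differ.

```python
def count_tokens(content) -> int:
    """Estimate tokens for str, list, dict, or arbitrary objects.

    Hypothetical sketch of the documented interface: strings use a
    chars/4 heuristic, containers sum over their elements, and any
    other type is counted via its str() representation.
    """
    if isinstance(content, str):
        return max(1, len(content) // 4)
    if isinstance(content, dict):
        return sum(count_tokens(k) + count_tokens(v) for k, v in content.items())
    if isinstance(content, (list, tuple)):
        return sum(count_tokens(item) for item in content)
    return count_tokens(str(content))
```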

Modules

config

Configuration model for the context window management system.

manager

Context window manager for token-aware content prioritization.

token_counter

Token counting utilities for Context Window Management.