babylon.rag.context_window.manager

Context window manager for token-aware content prioritization.

This module implements dynamic context window management for RAG systems, ensuring AI model context limits are respected while maximizing information density. Key features:

  • Token counting and capacity tracking

  • Priority-based content eviction (relevance, recency, hybrid strategies)

  • Automatic optimization at configurable thresholds

  • Metrics integration for monitoring and tuning

The manager is designed for high-throughput scenarios where context must be continuously optimized as new content arrives and old content ages.
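The eviction strategies listed above (relevance, recency, hybrid) could look something like the following sketch. This is an illustrative re-implementation, not the library's actual code: the `Item` dataclass, `hybrid_score`, and the 0.7 relevance weight are all assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Item:
    content_id: str
    tokens: int
    importance: float  # relevance score in [0.0, 1.0]
    last_access: float = field(default_factory=time.monotonic)


def hybrid_score(item: Item, now: float, relevance_weight: float = 0.7) -> float:
    """Blend relevance with recency: recently accessed items decay less."""
    age = now - item.last_access
    recency = 1.0 / (1.0 + age)  # ~1.0 for just-touched items, -> 0 as they age
    return relevance_weight * item.importance + (1.0 - relevance_weight) * recency


def evict_to_budget(items: list[Item], budget: int) -> list[Item]:
    """Keep the highest-scoring items that fit within the token budget."""
    now = time.monotonic()
    ranked = sorted(items, key=lambda it: hybrid_score(it, now), reverse=True)
    kept, used = [], 0
    for it in ranked:
        if used + it.tokens <= budget:
            kept.append(it)
            used += it.tokens
    return kept
```

A pure-relevance or pure-recency strategy falls out of the same sketch by setting `relevance_weight` to 1.0 or 0.0 respectively.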

Example

>>> from babylon.rag.context_window import ContextWindowManager
>>> manager = ContextWindowManager()
>>> manager.add_content("key1", "Some text content", token_count=4, importance=0.8)
True
>>> manager.total_tokens
4

Classes

ContextWindowManager([config, ...])

Manages the token usage and content prioritization in the RAG context window.

class babylon.rag.context_window.manager.ContextWindowManager(config=None, metrics_collector=None, lifecycle_manager=None)[source]

Bases: object

Manages the token usage and content prioritization in the RAG context window.

The ContextWindowManager ensures that the total token usage stays within the limits of the AI model while prioritizing the most relevant content. It implements:

  1. Token counting and tracking

  2. Content prioritization based on relevance

  3. Automatic optimization when approaching limits

  4. Integration with metrics collection

Parameters:
config

Configuration for the context window

metrics_collector

Collector for performance metrics

lifecycle_manager

Optional manager for object lifecycles

__init__(config=None, metrics_collector=None, lifecycle_manager=None)[source]

Initialize the context window manager.

Parameters:
  • config (ContextWindowConfig | None) – Configuration for token limits and optimization thresholds

  • metrics_collector (MetricsCollector | None) – Collector for performance metrics

  • lifecycle_manager (Any) – Optional manager for object lifecycles

property total_tokens: int

Get the total number of tokens in the context window.

property capacity_percentage: float

Get the current percentage of capacity used.

property content_count: int

Get the number of content items in the context window.

add_content(content_id, content, token_count, importance=0.5)[source]

Add content to the context window.

Parameters:
  • content_id (str) – Unique identifier for the content

  • content (Any) – The content to add

  • token_count (int) – Number of tokens in the content

  • importance (float) – Importance score for the content (0.0 to 1.0)

Return type:

bool

Returns:

True if the content was added without triggering optimization, False if optimization had to run to make room

Raises:
  • ContentInsertionError – If content could not be added

  • CapacityExceededError – If capacity is exceeded and optimization fails
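The documented return value and exceptions suggest an add flow roughly like this sketch. It is illustrative only: `try_add`, the `threshold` semantics, and eviction by insertion order are assumptions (the real manager evicts by priority, per the strategies described above).

```python
class CapacityExceededError(Exception):
    """Mirrors the exception named in the docs (illustrative stand-in)."""


def try_add(window: dict[str, int], max_tokens: int, threshold: float,
            content_id: str, token_count: int) -> bool:
    """Add content; return True if it fit directly, False if optimization ran."""
    used = sum(window.values())
    if used + token_count <= threshold * max_tokens:
        window[content_id] = token_count
        return True
    # Optimization pass: evict oldest entries (dicts preserve insertion order)
    # until the new content fits within the hard limit.
    for old_id in list(window):
        if used + token_count <= max_tokens:
            break
        used -= window.pop(old_id)
    if used + token_count > max_tokens:
        raise CapacityExceededError(content_id)
    window[content_id] = token_count
    return False
```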

get_content(content_id)[source]

Get content from the context window and update its priority.

Parameters:

content_id (str) – Unique identifier for the content

Return type:

Any

Returns:

The requested content

Raises:

KeyError – If content_id is not found
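The "update its priority" behavior on access can be sketched with plain dict ordering, where re-inserting an entry marks it most recently used. This is a hypothetical illustration of the documented contract (hit bumps recency, miss raises `KeyError`), not the library's implementation.

```python
def get_content(window: dict[str, str], content_id: str) -> str:
    """Fetch content and refresh its recency by moving it to the dict's end."""
    value = window.pop(content_id)  # raises KeyError if missing, as documented
    window[content_id] = value      # re-insert at the end = most recently used
    return value
```

An LRU eviction pass can then simply drop entries from the front of the dict.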

remove_content(content_id)[source]

Remove content from the context window.

Parameters:

content_id (str) – Unique identifier for the content

Return type:

bool

Returns:

True if content was removed, False if not found

Raises:

ContentRemovalError – If an error occurs during removal

optimize(target_tokens=None)[source]

Optimize the context window to reduce token usage.

Parameters:

target_tokens (int | None) – Target token count to reduce usage to; defaults to the configured optimization threshold

Return type:

bool

Returns:

True if optimization was successful, False otherwise

Raises:

OptimizationFailedError – If optimization fails

get_stats()[source]

Get statistics about the context window.

Return type:

dict[str, Any]

count_tokens(content)[source]

Count the number of tokens in a string.

This is a simple implementation that could be enhanced with a proper tokenizer.

Parameters:

content (str) – String content to count tokens for

Return type:

int

Returns:

Estimated token count
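Since the docstring describes a simple estimator rather than a real tokenizer, a common rule of thumb is roughly four characters per token for English text. The heuristic below is an assumption for illustration; the module's actual formula is not shown here.

```python
def count_tokens(content: str) -> int:
    """Rough token estimate: ~4 characters per token, a common rule of thumb.

    Non-empty strings count as at least one token; empty strings count as zero.
    """
    return max(1, len(content) // 4) if content else 0
```

A production version would swap this out for a model-specific tokenizer so that capacity tracking matches the model's real context limit.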