babylon.rag.embeddings

Embedding management for the RAG system.

Supports both local (Ollama) and cloud (OpenAI) embedding providers. Default: Ollama with embeddinggemma for fully offline operation.

Classes

Embeddable(*args, **kwargs)

Protocol for objects that can be embedded.

EmbeddingManager([embedding_dimension, ...])

Manages embeddings for RAG objects.

class babylon.rag.embeddings.Embeddable(*args, **kwargs)[source]

Bases: Protocol

Protocol for objects that can be embedded.

id: str
content: str
embedding: list[float] | None
__init__(*args, **kwargs)
class babylon.rag.embeddings.EmbeddingManager(embedding_dimension=None, batch_size=None, max_cache_size=1000, max_concurrent_requests=4)[source]

Bases: object

Manages embeddings for RAG objects.

The EmbeddingManager handles:
  • Generating embeddings via the Ollama (local) or OpenAI (cloud) API
  • Caching embeddings for reuse, with LRU eviction
  • Rate limiting and retry logic
  • Batch operations for efficiency
  • Error handling and recovery
  • Performance metrics collection
  • Concurrent operations

Default: Ollama with embeddinggemma for fully offline operation.

Parameters:
  • embedding_dimension (int | None)

  • batch_size (int | None)

  • max_cache_size (int)

  • max_concurrent_requests (int)

__init__(embedding_dimension=None, batch_size=None, max_cache_size=1000, max_concurrent_requests=4)[source]

Initialize the embedding manager.

Parameters:
  • embedding_dimension (int | None) – Size of embedding vectors (default: from LLMConfig)

  • batch_size (int | None) – Number of objects to embed in each batch (default: from LLMConfig)

  • max_cache_size (int) – Maximum number of embeddings to keep in cache (default: 1000)

  • max_concurrent_requests (int) – Maximum number of concurrent embedding requests (default: 4)

Raises:

ValueError – If embedding configuration is invalid

property cache_size: int

Get current number of embeddings in cache.

async aembed(obj)[source]

Asynchronously generate and attach embedding for a single object.

Parameters:

obj (TypeVar(E, bound=Embeddable)) – Object to embed

Return type:

TypeVar(E, bound=Embeddable)

Returns:

Object with embedding attached

Raises:

EmbeddingError – If embedding generation fails

embed(obj)[source]

Synchronously generate and attach embedding for a single object.

This is a convenience wrapper around aembed for synchronous code. For better performance in async contexts, use aembed directly.

Parameters:

obj (TypeVar(E, bound=Embeddable)) – Object to embed

Return type:

TypeVar(E, bound=Embeddable)

Returns:

Object with embedding attached

Raises:

EmbeddingError – If embedding generation fails

async aembed_batch(objects)[source]

Asynchronously generate embeddings for multiple objects efficiently.

Parameters:

objects (Sequence[TypeVar(E, bound=Embeddable)]) – List of objects to embed

Return type:

list[TypeVar(E, bound=Embeddable)]

Returns:

List of objects with embeddings attached

Raises:

EmbeddingError – If any object’s embedding generation fails

embed_batch(objects)[source]

Synchronously generate embeddings for multiple objects efficiently.

This is a convenience wrapper around aembed_batch for synchronous code. For better performance in async contexts, use aembed_batch directly.

Parameters:

objects (Sequence[TypeVar(E, bound=Embeddable)]) – List of objects to embed

Return type:

list[TypeVar(E, bound=Embeddable)]

Returns:

List of objects with embeddings attached

Raises:

EmbeddingError – If any object’s embedding generation fails
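The batch methods presumably partition their input into groups of at most batch_size objects before issuing requests. A minimal partitioning sketch (the `chunked` helper is hypothetical, not a library function):

```python
from typing import Sequence, TypeVar

E = TypeVar("E")


def chunked(objects: Sequence[E], batch_size: int) -> list[list[E]]:
    """Split a sequence into consecutive batches of at most batch_size items."""
    return [list(objects[i : i + batch_size])
            for i in range(0, len(objects), batch_size)]


print(chunked(["d1", "d2", "d3", "d4", "d5"], batch_size=2))
# [['d1', 'd2'], ['d3', 'd4'], ['d5']]
```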

debed(obj)[source]

Remove embedding from an object.

Parameters:

obj (TypeVar(E, bound=Embeddable)) – Object to remove embedding from

Return type:

TypeVar(E, bound=Embeddable)

Returns:

Object with embedding removed

debed_batch(objects)[source]

Remove embeddings from multiple objects.

Parameters:

objects (Sequence[TypeVar(E, bound=Embeddable)]) – List of objects to remove embeddings from

Return type:

list[TypeVar(E, bound=Embeddable)]

Returns:

List of objects with embeddings removed

async close()[source]

Close resources used by the embedding manager.

Return type:

None
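A max_concurrent_requests cap is commonly realized with an asyncio.Semaphore. The sketch below assumes that pattern and substitutes a dummy coroutine for the real embedding API call; it is not the manager's actual implementation:

```python
import asyncio


async def embed_all(texts: list[str],
                    max_concurrent_requests: int = 4) -> list[list[float]]:
    """Embed texts concurrently, capped at max_concurrent_requests in flight."""
    sem = asyncio.Semaphore(max_concurrent_requests)

    async def embed_one(text: str) -> list[float]:
        async with sem:  # at most max_concurrent_requests run at once
            await asyncio.sleep(0)  # stand-in for the provider API call
            return [float(len(text))]

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(embed_one(t) for t in texts))


print(asyncio.run(embed_all(["a", "bb", "ccc"])))  # [[1.0], [2.0], [3.0]]
```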