babylon.rag.embeddings
Embedding management for the RAG system.
Supports both local (Ollama) and cloud (OpenAI) embedding providers. Default: Ollama with embeddinggemma for fully offline operation.
Classes
Embeddable – Protocol for objects that can be embedded.
EmbeddingManager – Manages embeddings for RAG objects.
- class babylon.rag.embeddings.Embeddable(*args, **kwargs)[source]
Bases: Protocol
Protocol for objects that can be embedded.
- __init__(*args, **kwargs)
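The members that Embeddable requires are not listed here. As a rough illustration of how a structural protocol of this kind is typically used, the sketch below defines a hypothetical stand-in, EmbeddableSketch, with assumed attributes content and embedding. These names are inferred from the wording "object content" and "embedding attached" elsewhere in this page and may not match the real protocol.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class EmbeddableSketch(Protocol):
    """Hypothetical stand-in for Embeddable; attribute names are assumptions."""

    content: str
    embedding: Optional[list[float]]


@dataclass
class Note:
    """Example object that satisfies the sketch protocol structurally."""

    content: str
    embedding: Optional[list[float]] = None


note = Note(content="surplus value")
# No inheritance is needed: having the right attributes is enough.
is_embeddable = isinstance(note, EmbeddableSketch)
```

Because the check is structural, any existing class with matching attributes can be passed to code typed against the protocol without changing its base classes.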
- class babylon.rag.embeddings.EmbeddingManager(embedding_dimension=None, batch_size=None, max_cache_size=1000, max_concurrent_requests=4)[source]
Bases: object
Manages embeddings for RAG objects.
The EmbeddingManager handles:
- Generating embeddings via Ollama (local) or OpenAI (cloud) API
- Caching embeddings for reuse with LRU eviction
- Rate limiting and retry logic
- Batch operations for efficiency
- Error handling and recovery
- Performance metrics collection
- Concurrent operations
Default: Ollama with embeddinggemma for fully offline operation.
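The caching behaviour listed above (reuse with LRU eviction, bounded by max_cache_size) can be pictured with a small standalone cache. This is a minimal sketch built on collections.OrderedDict, not the manager's actual cache; the class name LRUEmbeddingCache and its get/put methods are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class LRUEmbeddingCache:
    """Hypothetical LRU cache keyed by content string (illustration only)."""

    def __init__(self, max_size: int = 1000) -> None:
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self._max_size = max_size
        self._items = OrderedDict()

    def get(self, key: str) -> Optional[list]:
        """Return the cached vector and mark it as most recently used."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: str, vector: list) -> None:
        """Store a vector, evicting the least recently used entries if full."""
        self._items[key] = vector
        self._items.move_to_end(key)
        while len(self._items) > self._max_size:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LRUEmbeddingCache(max_size=2)
cache.put("a", [1.0])
cache.put("b", [2.0])
cache.get("a")         # "a" becomes the most recently used entry
cache.put("c", [3.0])  # evicts "b", the least recently used entry
```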
- __init__(embedding_dimension=None, batch_size=None, max_cache_size=1000, max_concurrent_requests=4)[source]
Initialize the embedding manager.
- Parameters:
embedding_dimension (int | None) – Size of embedding vectors (default: from LLMConfig)
batch_size (int | None) – Number of objects to embed in each batch (default: from LLMConfig)
max_cache_size (int) – Maximum number of embeddings to keep in cache (default: 1000)
max_concurrent_requests (int) – Maximum number of concurrent embedding requests (default: 4)
- Raises:
ValueError – If embedding configuration is invalid
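How max_concurrent_requests is enforced internally is not documented here. One common way to cap in-flight requests is an asyncio.Semaphore, sketched below with a fake embedding call. The names run_limited and fake_embed are hypothetical, and the sleep stands in for a network round trip.

```python
import asyncio


async def run_limited(texts, max_concurrent_requests=4):
    """Embed texts with at most max_concurrent_requests calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_requests)
    in_flight = 0
    peak = 0

    async def fake_embed(text):
        nonlocal in_flight, peak
        async with semaphore:  # blocks once the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # placeholder for the provider call
            in_flight -= 1
            return [float(len(text))]

    vectors = await asyncio.gather(*(fake_embed(t) for t in texts))
    return list(vectors), peak


texts = ["a", "bb", "ccc", "dddd", "eeeee", "ffffff"]
vectors, peak = asyncio.run(run_limited(texts, max_concurrent_requests=2))
```

asyncio.gather preserves input order, so the returned vectors line up with the input texts even though the calls overlap.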
- async aembed(obj)[source]
Asynchronously generate and attach embedding for a single object.
- Parameters:
obj (TypeVar(E, bound=Embeddable)) – Object to embed
- Return type:
TypeVar(E, bound=Embeddable)
- Returns:
Object with embedding attached
- Raises:
ValueError – If object content is invalid
EmbeddingError – If embedding generation fails
- embed(obj)[source]
Synchronously generate and attach embedding for a single object.
This is a convenience wrapper around aembed for synchronous code. For better performance in async contexts, use aembed directly.
- Parameters:
obj (TypeVar(E, bound=Embeddable)) – Object to embed
- Return type:
TypeVar(E, bound=Embeddable)
- Returns:
Object with embedding attached
- Raises:
ValueError – If object content is invalid
EmbeddingError – If embedding generation fails
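The relationship between embed and aembed described above (a synchronous convenience wrapper over the async method) can be sketched as follows. This assumes the wrapper uses asyncio.run, which the source does not confirm; TinyEmbedder is a hypothetical class that works on plain strings rather than Embeddable objects.

```python
import asyncio


class TinyEmbedder:
    """Hypothetical embedder showing a sync wrapper over an async method."""

    async def aembed(self, text: str) -> list:
        await asyncio.sleep(0)  # placeholder for the provider call
        return [float(len(text))]

    def embed(self, text: str) -> list:
        # asyncio.run starts a fresh event loop and raises RuntimeError
        # if one is already running, so async callers should await
        # aembed directly instead of calling this wrapper.
        return asyncio.run(self.aembed(text))


vector = TinyEmbedder().embed("abc")
```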
- async aembed_batch(objects)[source]
Asynchronously generate embeddings for multiple objects efficiently.
- Parameters:
objects (Sequence[TypeVar(E, bound=Embeddable)]) – List of objects to embed
- Return type:
list[TypeVar(E, bound=Embeddable)]
- Returns:
List of objects with embeddings attached
- Raises:
EmbeddingError – If any object’s embedding generation fails
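The batching behind aembed_batch is not spelled out here. As a sketch of the general idea, the example below splits the input into chunks of batch_size and sends one fake request per chunk. The helpers chunked, fake_embed_many, and aembed_batch_sketch are hypothetical, and the real method may also overlap batches or consult the cache.

```python
import asyncio
from typing import Sequence


def chunked(items: Sequence[str], size: int) -> list:
    """Split items into consecutive lists of at most size elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


async def fake_embed_many(texts: Sequence[str]) -> list:
    """Stand-in for a provider call that embeds several texts at once."""
    await asyncio.sleep(0)
    return [[float(len(t))] for t in texts]


async def aembed_batch_sketch(texts: Sequence[str], batch_size: int = 2) -> list:
    """Embed texts batch by batch, preserving input order."""
    results = []
    for batch in chunked(texts, batch_size):
        results.extend(await fake_embed_many(batch))
    return results


batches = chunked(["a", "bb", "ccc"], 2)
vectors = asyncio.run(aembed_batch_sketch(["a", "bb", "ccc"]))
```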
- embed_batch(objects)[source]
Synchronously generate embeddings for multiple objects efficiently.
This is a convenience wrapper around aembed_batch for synchronous code. For better performance in async contexts, use aembed_batch directly.
- Parameters:
objects (Sequence[TypeVar(E, bound=Embeddable)]) – List of objects to embed
- Return type:
list[TypeVar(E, bound=Embeddable)]
- Returns:
List of objects with embeddings attached
- Raises:
EmbeddingError – If any object’s embedding generation fails
- debed(obj)[source]
Remove embedding from an object.
- Parameters:
obj (TypeVar(E, bound=Embeddable)) – Object to remove embedding from
- Return type:
TypeVar(E, bound=Embeddable)
- Returns:
Object with embedding removed
- debed_batch(objects)[source]
Remove embeddings from multiple objects.
- Parameters:
objects (Sequence[TypeVar(E, bound=Embeddable)]) – List of objects to remove embeddings from
- Return type:
list[TypeVar(E, bound=Embeddable)]
- Returns:
List of objects with embeddings removed
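What "removing" an embedding means for a given object is not specified here. Assuming the embedding lives in an embedding attribute that can be reset to None, debed and debed_batch might behave roughly like the hypothetical functions below. Whether the real methods also evict the vector from the cache is not stated.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Chunk:
    """Hypothetical embeddable object; attribute names are assumptions."""

    content: str
    embedding: Optional[list] = None


def debed_sketch(obj: Chunk) -> Chunk:
    """Clear the embedding in place and return the same object."""
    obj.embedding = None
    return obj


def debed_batch_sketch(objects: Sequence[Chunk]) -> list:
    """Clear embeddings on every object, preserving order."""
    return [debed_sketch(obj) for obj in objects]


chunks = [Chunk("a", [1.0]), Chunk("b", [2.0])]
cleared = debed_batch_sketch(chunks)
```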