babylon.data.chroma_manager

ChromaDB client management module.

Classes

ChromaManager()

Manages ChromaDB client lifecycle and operations.

class babylon.data.chroma_manager.ChromaManager

Bases: object

Manages ChromaDB client lifecycle and operations.

This class implements the Singleton pattern to ensure only one ChromaDB client exists throughout the application lifecycle. It provides centralized management of vector database operations for storing and retrieving entity embeddings.

Error Handling:
  • Implements retries for transient failures

  • Provides detailed error context

  • Logs performance metrics

  • Tracks operation correlation IDs
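The retry and correlation-ID behaviour listed above is not spelled out on this page. A minimal sketch of how it could look follows; `with_retries`, its parameters, and the choice of exception types treated as transient are all hypothetical names for illustration, not part of `ChromaManager`.

```python
import logging
import time
import uuid

logger = logging.getLogger("chroma_retry_sketch")


def with_retries(operation, attempts=3, base_delay=0.5,
                 transient=(ConnectionError, TimeoutError)):
    """Run operation(), retrying transient failures with exponential backoff."""
    correlation_id = uuid.uuid4().hex  # ties together all log lines for this call
    for attempt in range(1, attempts + 1):
        started = time.perf_counter()
        try:
            result = operation()
        except transient as exc:
            logger.warning("op=%s attempt=%d failed: %r", correlation_id, attempt, exc)
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * 2 ** (attempt - 1))
        else:
            elapsed = time.perf_counter() - started
            logger.info("op=%s attempt=%d ok in %.4fs", correlation_id, attempt, elapsed)
            return result


# Demo: an operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


result = with_retries(flaky, base_delay=0.01)
```

Only exception types listed in `transient` are retried; anything else propagates immediately, which keeps programming errors from being masked by the retry loop.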

Implementation Details:
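  • Uses ChromaDB's local persistent storage for efficient storage and querying (SQLite-backed since ChromaDB 0.4; the DuckDB+Parquet backend applied only to earlier releases)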

  • Implements lazy initialization to optimize resource usage

  • Provides automatic persistence and backup capabilities

  • Handles graceful cleanup during shutdown

Key Features:
  • Thread-safe singleton implementation

  • Automatic connection management

  • Collection creation and access

  • Resource cleanup and persistence

Performance Considerations:
  • Maintains connection pool for efficient queries

  • Implements caching for frequently accessed collections

  • Uses batch operations for better throughput

  • Handles memory management through lazy loading
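As an illustration of the batching point above, the sketch below splits a list of records into fixed-size chunks. The helper name `batched_slices` and the batch size are assumptions made for this example; the page does not say how `ChromaManager` batches its writes.

```python
def batched_slices(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Demo: five hypothetical entity IDs split into batches of two.
entity_ids = [f"entity-{i}" for i in range(5)]
batches = list(batched_slices(entity_ids, 2))
```

Each batch could then be passed to a single `collection.add(...)` call instead of issuing one call per record.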

Usage Example:

    manager = ChromaManager()
    collection = manager.get_or_create_collection("entities")
    collection.add(ids=[...], documents=[...], embeddings=[...])

Return type:

ChromaManager

_instance

Singleton instance of the manager

_client

The ChromaDB client instance

Note

The class uses lazy initialization: the client is created only when it is first needed, which reduces resource usage and startup time.

static __new__(cls)

Implement singleton pattern for ChromaManager.

Returns:

The singleton instance of the manager

Return type:

ChromaManager
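The page states that the singleton is thread-safe but does not show the mechanism. One common way to get that behaviour is double-checked locking inside `__new__`, sketched below. The class name `SingletonSketch` and the lock are assumptions; the actual `ChromaManager` source may differ.

```python
import threading


class SingletonSketch:
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Fast path: skip the lock once the instance exists.
        if cls._instance is None:
            with cls._lock:
                # Re-check inside the lock so two racing threads
                # cannot both create an instance.
                if cls._instance is None:
                    cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every SingletonSketch() call, so only
        # set up state the first time.
        if not hasattr(self, "_client"):
            self._client = None


first = SingletonSketch()
second = SingletonSketch()
```

Both names refer to the same object, so state stored on `first` is visible through `second`.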

__init__()

Initialize the ChromaManager instance.

Initialization is lazy: the client is created only when it is first needed. Because of the singleton pattern, this method is safe to call multiple times.

Return type:

None

property client: ClientAPI

Get the ChromaDB client instance.

This property implements lazy initialization of the client. If the client doesn’t exist, it will be created on first access.

Returns:

The initialized ChromaDB client instance

Return type:

ClientAPI

Raises:

DatabaseError – If client initialization fails

Note

This is the preferred way to access the client throughout the application as it ensures proper initialization and singleton pattern compliance.
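The lazy-initialization behaviour described for this property can be sketched as follows. `LazyClientSketch`, the `factory` argument, and the local `DatabaseError` class are stand-ins introduced for this example; in a real setup the factory might wrap something like `chromadb.PersistentClient(path=...)`.

```python
class DatabaseError(Exception):
    """Stand-in for the DatabaseError named in the Raises section."""


class LazyClientSketch:
    def __init__(self, factory):
        self._factory = factory  # hypothetical callable that builds the client
        self._client = None      # nothing is created yet

    @property
    def client(self):
        if self._client is None:
            try:
                self._client = self._factory()
            except Exception as exc:
                # Wrap any construction failure in a single error type.
                raise DatabaseError(f"client initialization failed: {exc}") from exc
        return self._client


# Demo with a fake factory that records how many clients were built.
created = []


def fake_factory():
    created.append(object())
    return created[-1]


holder = LazyClientSketch(fake_factory)
built_before_access = len(created)
client_a = holder.client
client_b = holder.client
```

The factory runs on first access only; later reads of `client` return the cached object.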

get_or_create_collection(name)

Get an existing collection or create a new one if it doesn’t exist.

This method provides a safe way to access collections, ensuring they exist before use. It is idempotent: calling it multiple times with the same name returns the same collection.

Parameters:

name (str) – Name of the collection to get or create

Returns:

The ChromaDB collection instance

Return type:

Collection

Note

Collections are the main way to organize embeddings in ChromaDB. Each collection can store documents, embeddings, and metadata.
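To make the idempotence claim concrete, the sketch below delegates to a client's `get_or_create_collection` call. `FakeClient` and `FakeCollection` are hypothetical in-memory stand-ins so the example runs without a database; `ManagerSketch` is likewise illustrative and not the actual implementation.

```python
class FakeCollection:
    def __init__(self, name):
        self.name = name


class FakeClient:
    """Hypothetical stand-in that mimics get_or_create_collection."""

    def __init__(self):
        self._collections = {}

    def get_or_create_collection(self, name):
        if name not in self._collections:
            self._collections[name] = FakeCollection(name)
        return self._collections[name]


class ManagerSketch:
    def __init__(self, client):
        self.client = client

    def get_or_create_collection(self, name):
        # Delegate to the client; repeated calls with the same name
        # return the existing collection instead of creating a new one.
        return self.client.get_or_create_collection(name=name)


manager = ManagerSketch(FakeClient())
entities_a = manager.get_or_create_collection("entities")
entities_b = manager.get_or_create_collection("entities")
relations = manager.get_or_create_collection("relations")
```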

cleanup()

Clean up ChromaDB resources.

This method performs a graceful shutdown of the ChromaDB client:

  1. Resets the client connection (if allow_reset=True in settings)

  2. Clears the client reference

Note

In ChromaDB 1.x with PersistentClient, data is persisted automatically, so no explicit persist() call is needed.

Note

This method should be called during application shutdown or when you’re done using ChromaDB to ensure proper cleanup.

Return type:

None
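The two cleanup steps described above (reset if permitted, then drop the reference) can be sketched like this. `FakeClient` is a hypothetical stand-in whose `reset()` raises `ValueError` when resets are disabled; that error type and `CleanupSketch` itself are assumptions for illustration.

```python
class FakeClient:
    """Hypothetical client whose reset() is gated by an allow_reset flag."""

    def __init__(self, allow_reset):
        self.allow_reset = allow_reset
        self.reset_calls = 0

    def reset(self):
        if not self.allow_reset:
            raise ValueError("resetting is not allowed by this configuration")
        self.reset_calls += 1


class CleanupSketch:
    def __init__(self, client):
        self._client = client

    def cleanup(self):
        if self._client is None:
            return  # already cleaned up; calling twice is harmless
        try:
            self._client.reset()
        except ValueError:
            pass  # reset disabled in settings; still drop the reference
        finally:
            self._client = None


resettable = FakeClient(allow_reset=True)
manager = CleanupSketch(resettable)
manager.cleanup()
manager.cleanup()  # second call is a no-op

locked = FakeClient(allow_reset=False)
locked_manager = CleanupSketch(locked)
locked_manager.cleanup()
```

One caveat when adapting this: in the chromadb library, `reset()` on a real client empties the database and is not reversible, so it is worth confirming that this is the intended shutdown behaviour before enabling `allow_reset`.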