babylon.rag.retrieval

Query and retrieval interface for the RAG system.

Classes

QueryResponse(**data)

Represents the complete response to a query.

QueryResult(**data)

Represents a single query result with similarity score.

Retriever(vector_store, embedding_manager)

High-level retrieval interface for RAG queries.

VectorStore([collection_name, chroma_manager])

Interface to ChromaDB for storing and retrieving document vectors.

class babylon.rag.retrieval.QueryResult(**data)

Bases: BaseModel

Represents a single query result with similarity score.

model_config: ClassVar[ConfigDict] = {'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

chunk: DocumentChunk
similarity_score: float
distance: float
metadata: dict[str, Any] | None
convert_similarity_to_distance()

Convert similarity score to distance if not provided.

Return type:

QueryResult
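The conversion formula is not stated here. The following is a minimal sketch, assuming cosine-style scores where distance = 1 - similarity. ResultSketch is a hypothetical stand-in, not the real QueryResult, and it treats a missing distance as None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultSketch:
    """Hypothetical stand-in for QueryResult (not the real class)."""

    similarity_score: float
    distance: Optional[float] = None

    def convert_similarity_to_distance(self) -> "ResultSketch":
        # Assumption: distance = 1 - similarity (cosine convention).
        # An explicitly provided distance is left untouched.
        if self.distance is None:
            self.distance = 1.0 - self.similarity_score
        return self


filled = ResultSketch(similarity_score=0.75).convert_similarity_to_distance()
kept = ResultSketch(similarity_score=0.75, distance=0.4).convert_similarity_to_distance()
```

Returning the instance mirrors the documented return type, which suggests the real method may run as a model validator.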

class babylon.rag.retrieval.QueryResponse(**data)

Bases: BaseModel

Represents the complete response to a query.

model_config: ClassVar[ConfigDict] = {'validate_assignment': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

query: str
results: list[QueryResult]
total_results: int
processing_time_ms: float
embedding_time_ms: float
search_time_ms: float
metadata: dict[str, Any] | None
get_top_k(k)

Get the top k results by similarity score.

Return type:

list[QueryResult]

Parameters:

k (int)
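A minimal sketch of top-k selection by similarity score. ScoredSketch and its chunk_id field are hypothetical stand-ins for QueryResult; the real method may handle ties or invalid k differently.

```python
from dataclasses import dataclass


@dataclass
class ScoredSketch:
    """Hypothetical stand-in for QueryResult."""

    chunk_id: str
    similarity_score: float


def get_top_k(results: list[ScoredSketch], k: int) -> list[ScoredSketch]:
    # Sort a copy by similarity, highest first, then keep the first k.
    ranked = sorted(results, key=lambda r: r.similarity_score, reverse=True)
    # Assumption: a negative k is treated as zero.
    return ranked[: max(k, 0)]


results = [ScoredSketch("a", 0.2), ScoredSketch("b", 0.9), ScoredSketch("c", 0.5)]
top_two = get_top_k(results, 2)
```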

get_combined_context(max_length=4000, separator='\n\n')

Combine result chunks into a single context string.

Return type:

str

Parameters:
  • max_length (int)

  • separator (str)
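How the method enforces max_length is not documented. The sketch below assumes chunks are appended in order and that assembly stops at the first chunk that would push the result past the limit; the real method may truncate instead. combine_context is a hypothetical free function operating on plain strings.

```python
def combine_context(
    texts: list[str], max_length: int = 4000, separator: str = "\n\n"
) -> str:
    parts: list[str] = []
    used = 0
    for text in texts:
        # Cost of this chunk, plus a separator if it is not the first one.
        cost = len(text) + (len(separator) if parts else 0)
        if used + cost > max_length:
            break  # Assumption: stop at the first chunk that does not fit.
        parts.append(text)
        used += cost
    return separator.join(parts)


context = combine_context(["alpha", "beta", "gamma"], max_length=12)
```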

class babylon.rag.retrieval.VectorStore(collection_name='documents', chroma_manager=None)

Bases: object

Interface to ChromaDB for storing and retrieving document vectors.

__init__(collection_name='documents', chroma_manager=None)

Initialize the vector store.

Parameters:
  • collection_name (str) – Name of the ChromaDB collection

  • chroma_manager (ChromaManager | None) – Optional ChromaManager instance (creates new if None)

property collection: Any

Get or create the ChromaDB collection.

add_chunks(chunks)

Add document chunks to the vector store.

Parameters:

chunks (list[DocumentChunk]) – List of DocumentChunk objects with embeddings

Raises:

RagError – If chunks are missing embeddings or storage fails

Return type:

None
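A minimal sketch of the embedding check implied by the Raises entry above. ChunkSketch, its field names, and RagErrorSketch are hypothetical stand-ins for DocumentChunk and RagError, whose real definitions are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


class RagErrorSketch(Exception):
    """Hypothetical stand-in for RagError."""


@dataclass
class ChunkSketch:
    """Hypothetical stand-in for DocumentChunk; field names are assumed."""

    id: str
    content: str
    embedding: Optional[list[float]] = None


def validate_embeddings(chunks: list[ChunkSketch]) -> None:
    # Collect the IDs of chunks whose embedding is None or empty.
    missing = [c.id for c in chunks if not c.embedding]
    if missing:
        raise RagErrorSketch(f"Chunks missing embeddings: {missing}")


ok = [ChunkSketch("a", "text", [0.1, 0.2])]
bad = [ChunkSketch("b", "text")]
validate_embeddings(ok)  # passes silently
try:
    validate_embeddings(bad)
    error_message = None
except RagErrorSketch as exc:
    error_message = str(exc)
```

Checking the whole batch before writing anything avoids leaving a partially stored batch behind when one chunk is invalid.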

query_similar(query_embedding, k=10, where=None, include=None)

Query for similar chunks using embedding.

Parameters:
  • query_embedding (list[float]) – Query vector embedding

  • k (int) – Number of results to return

  • where (dict[str, Any] | None) – Optional metadata filters

  • include (list[str] | None) – Fields to include in results

Return type:

tuple[list[str], list[str], list[list[float]], list[dict[str, Any]], list[float]]

Returns:

Tuple of (ids, documents, embeddings, metadatas, distances)

Raises:

RagError – If query fails
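The five-element return tuple can be illustrated with a sketch that flattens a ChromaDB-style query result, where each field holds one inner list per query embedding. That query_similar works this way internally is an assumption; flatten_query_result and the sample data are hypothetical.

```python
from typing import Any

FlatResult = tuple[
    list[str], list[str], list[list[float]], list[dict[str, Any]], list[float]
]


def flatten_query_result(raw: dict[str, Any]) -> FlatResult:
    def first(key: str) -> list:
        # Take the inner list for the single query; fields that were not
        # requested (None or absent) become empty lists.
        value = raw.get(key)
        return list(value[0]) if value else []

    return (
        first("ids"),
        first("documents"),
        first("embeddings"),
        first("metadatas"),
        first("distances"),
    )


raw = {
    "ids": [["c1", "c2"]],
    "documents": [["first chunk", "second chunk"]],
    "embeddings": None,  # not requested in this example
    "metadatas": [[{"source": "a"}, {"source": "b"}]],
    "distances": [[0.1, 0.4]],
}
ids, documents, embeddings, metadatas, distances = flatten_query_result(raw)
```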

delete_chunks(chunk_ids)

Delete chunks from the vector store.

Parameters:

chunk_ids (list[str]) – List of chunk IDs to delete

Raises:

RagError – If deletion fails

Return type:

None

get_collection_count()

Get the number of chunks in the collection.

Return type:

int

class babylon.rag.retrieval.Retriever(vector_store, embedding_manager)

Bases: object

High-level retrieval interface for RAG queries.

__init__(vector_store, embedding_manager)

Initialize the retriever.

Parameters:
  • vector_store (VectorStore) – VectorStore instance for similarity search

  • embedding_manager (EmbeddingManager) – EmbeddingManager for query embedding

async aquery(query, k=10, similarity_threshold=0.0, metadata_filter=None)

Asynchronously query for relevant document chunks.

Parameters:
  • query (str) – Query text to search for

  • k (int) – Number of results to return

  • similarity_threshold (float) – Minimum similarity score for results

  • metadata_filter (dict[str, Any] | None) – Optional filters for chunk metadata

Return type:

QueryResponse

Returns:

QueryResponse with results and timing information

Raises:

RagError – If query processing fails
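The sketch below outlines one plausible shape for the query flow: embed the query, search, filter by similarity_threshold, and record timings that match the QueryResponse timing fields. The embedding and search functions are toy stand-ins, the similarity = 1 - distance conversion is an assumption, and a plain dict stands in for QueryResponse.

```python
import asyncio
import time


async def embed_sketch(text: str) -> list[float]:
    # Hypothetical embedding: a toy vector derived from the text length.
    await asyncio.sleep(0)
    return [float(len(text)), 1.0]


def search_sketch(embedding: list[float], k: int) -> list[tuple[str, float]]:
    # Hypothetical search returning fixed (chunk_id, distance) pairs.
    return [("c1", 0.1), ("c2", 0.6), ("c3", 0.9)][:k]


async def aquery_sketch(
    query: str, k: int = 10, similarity_threshold: float = 0.0
) -> dict:
    start = time.perf_counter()
    embedding = await embed_sketch(query)
    embedded = time.perf_counter()
    hits = search_sketch(embedding, k)
    searched = time.perf_counter()
    # Assumption: similarity = 1 - distance; drop hits below the threshold.
    results = [
        (chunk_id, 1.0 - distance)
        for chunk_id, distance in hits
        if 1.0 - distance >= similarity_threshold
    ]
    return {
        "query": query,
        "results": results,
        "total_results": len(results),
        "embedding_time_ms": (embedded - start) * 1000,
        "search_time_ms": (searched - embedded) * 1000,
        "processing_time_ms": (time.perf_counter() - start) * 1000,
    }


response = asyncio.run(aquery_sketch("example query", k=3, similarity_threshold=0.3))
```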

query(query, k=10, similarity_threshold=0.0, metadata_filter=None)

Synchronously query for relevant document chunks.

This is a convenience wrapper around aquery for synchronous code. For better performance in async contexts, use aquery directly.

Parameters:
  • query (str) – Query text to search for

  • k (int) – Number of results to return

  • similarity_threshold (float) – Minimum similarity score for results

  • metadata_filter (dict[str, Any] | None) – Optional filters for chunk metadata

Return type:

QueryResponse

Returns:

QueryResponse with results and timing information

Raises:

RagError – If query processing fails
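How the synchronous wrapper drives aquery is not documented. One common pattern, shown here as an assumption, is to delegate through asyncio.run. Both function names are hypothetical stand-ins.

```python
import asyncio


async def aquery_sketch(query: str) -> str:
    # Hypothetical async query; yields control once, then returns.
    await asyncio.sleep(0)
    return f"results for {query!r}"


def query_sketch(query: str) -> str:
    # asyncio.run starts a fresh event loop, so this only works when no
    # loop is already running in the current thread.
    return asyncio.run(aquery_sketch(query))


answer = query_sketch("example")
```

Under this pattern, calling the synchronous form from inside a running event loop raises RuntimeError, which is one reason to await the async form directly in async code.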