babylon.rag.rag_pipeline
Main RAG pipeline service that orchestrates document ingestion and query processing.
Uses VectorStoreProtocol from babylon.persistence for backend-agnostic vector storage. ChromaDB has been removed; use PgVectorStore (Feature 037).
Classes
| IngestionResult | Result of document ingestion process. |
| RagConfig | Configuration for RAG pipeline. |
| RagPipeline | Main RAG pipeline that orchestrates ingestion and retrieval. |
- class babylon.rag.rag_pipeline.IngestionResult(**data)[source]
Bases: BaseModel
Result of document ingestion process.
- Parameters:
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class babylon.rag.rag_pipeline.RagConfig(**data)[source]
Bases: BaseModel
Configuration for RAG pipeline.
- Parameters:
- model_config: ClassVar[ConfigDict] = {'frozen': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class babylon.rag.rag_pipeline.RagPipeline(vector_store, config=None, embedding_manager=None)[source]
Bases: object
Main RAG pipeline that orchestrates ingestion and retrieval.
Requires a VectorStoreProtocol implementation (e.g. PgVectorStore).
- Parameters:
vector_store (VectorStoreProtocol)
config (RagConfig | None)
embedding_manager (EmbeddingManager | None)
- __init__(vector_store, config=None, embedding_manager=None)[source]
Initialize the RAG pipeline.
- Parameters:
vector_store (VectorStoreProtocol) – VectorStoreProtocol implementation for storage.
config (RagConfig | None) – RAG configuration (uses default if None).
embedding_manager (EmbeddingManager | None) – Embedding manager (creates new if None).
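The constructor accepts any object that satisfies a protocol rather than a concrete backend class. The sketch below illustrates that idea with Python's `typing.Protocol`. Everything in it is hypothetical: `VectorStoreLike`, its `add` and `count` methods, `InMemoryStore`, and `TinyPipeline` are stand-ins invented for illustration, not the actual `VectorStoreProtocol` or `RagPipeline` API.

```python
from __future__ import annotations

from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class VectorStoreLike(Protocol):
    """Hypothetical stand-in for VectorStoreProtocol; method names are assumed."""

    def add(self, doc_id: str, vector: list[float]) -> None: ...

    def count(self) -> int: ...


class InMemoryStore:
    """Toy backend that satisfies VectorStoreLike without inheriting from it."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def add(self, doc_id: str, vector: list[float]) -> None:
        self._vectors[doc_id] = vector

    def count(self) -> int:
        return len(self._vectors)


class TinyPipeline:
    """Hypothetical pipeline that depends only on the protocol, not a backend."""

    def __init__(self, vector_store: VectorStoreLike, config: Optional[dict] = None) -> None:
        self.vector_store = vector_store
        # Fall back to a default configuration when none is supplied.
        self.config = config if config is not None else {"top_k": 5}


pipeline = TinyPipeline(InMemoryStore())
pipeline.vector_store.add("doc-1", [0.1, 0.2])
```

Because the dependency is structural, swapping storage backends in a design like this only requires passing a different object to the constructor.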
- async aingest_text(content, source_id, metadata=None)[source]
Asynchronously ingest text content into the RAG system.
- Parameters:
- Return type:
IngestionResult
- Returns:
IngestionResult with processing statistics
- Raises:
RagError – If ingestion fails
- ingest_text(content, source_id, metadata=None)[source]
Synchronously ingest text content into the RAG system.
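Each synchronous method mirrors an `a`-prefixed coroutine. One common way to build such a pair is to have the synchronous method drive the coroutine with `asyncio.run`. The sketch below shows that pattern on a hypothetical `Ingestor` class; whether `RagPipeline` is implemented this way is an assumption, and the returned dictionary is illustrative, not the real `IngestionResult`.

```python
import asyncio


class Ingestor:
    """Hypothetical class; not the library's implementation."""

    async def aingest_text(self, content: str, source_id: str) -> dict:
        await asyncio.sleep(0)  # stands in for embedding and storage I/O
        return {"source_id": source_id, "chars": len(content)}

    def ingest_text(self, content: str, source_id: str) -> dict:
        # asyncio.run creates a fresh event loop, so this must not be
        # called from code that is already running inside one.
        return asyncio.run(self.aingest_text(content, source_id))


result = Ingestor().ingest_text("hello world", "note-1")
```

With a wrapper of this kind, code that already runs inside an event loop should await the asynchronous variant directly.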
- async aingest_file(file_path, encoding='utf-8')[source]
Asynchronously ingest a text file into the RAG system.
- Parameters:
- Return type:
IngestionResult
- Returns:
IngestionResult with processing statistics
- Raises:
FileNotFoundError – If file doesn’t exist
RagError – If ingestion fails
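To illustrate the `encoding` parameter and the `FileNotFoundError` case, here is a minimal sketch of loading a text file from asynchronous code. `read_text_file` is a hypothetical helper, not part of the library, and it only covers reading the file, not chunking or embedding.

```python
import asyncio
import tempfile
from pathlib import Path


async def read_text_file(file_path, encoding="utf-8"):
    """Hypothetical helper: load a file's text without blocking the loop."""
    path = Path(file_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    # Run the blocking read in a worker thread.
    return await asyncio.to_thread(path.read_text, encoding=encoding)


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "note.txt"
    sample.write_text("café", encoding="utf-8")
    text = asyncio.run(read_text_file(sample))

    try:
        asyncio.run(read_text_file(Path(tmp) / "missing.txt"))
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```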
- ingest_file(file_path, encoding='utf-8')[source]
Synchronously ingest a text file into the RAG system.
- Parameters:
- Return type:
IngestionResult
- Returns:
IngestionResult with processing statistics
- async aingest_files(file_paths, encoding='utf-8', max_concurrent=5)[source]
Asynchronously ingest multiple files concurrently.
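A `max_concurrent` limit is often enforced with an `asyncio.Semaphore`. The sketch below shows that technique in isolation. `ingest_many` and its inner `guarded` coroutine are hypothetical, and the source does not state how `aingest_files` actually bounds concurrency.

```python
import asyncio


async def ingest_many(paths, max_concurrent=5):
    """Hypothetical batch runner that caps the number of in-flight tasks."""
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0  # tasks currently inside the semaphore
    peak = 0     # highest value of `running` observed

    async def guarded(path):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stands in for per-file ingestion
            running -= 1
            return f"ingested:{path}"

    # gather preserves the order of the input paths in its result list.
    results = await asyncio.gather(*(guarded(p) for p in paths))
    return results, peak


results, peak = asyncio.run(
    ingest_many([f"f{i}.txt" for i in range(12)], max_concurrent=3)
)
```

All twelve tasks are scheduled at once, but at most three hold the semaphore at any moment.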
- ingest_files(file_paths, encoding='utf-8', max_concurrent=5)[source]
Synchronously ingest multiple files.
- async aquery(query, top_k=None, similarity_threshold=None, metadata_filter=None)[source]
Asynchronously query the RAG system for relevant content.
- Parameters:
- Return type:
QueryResponse
- Returns:
QueryResponse with search results
- Raises:
RagError – If query processing fails
- query(query, top_k=None, similarity_threshold=None, metadata_filter=None)[source]
Synchronously query the RAG system for relevant content.
- Parameters:
- Return type:
QueryResponse
- Returns:
QueryResponse with search results
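The `top_k` and `similarity_threshold` arguments default to `None`, which suggests they fall back to configured values. The sketch below shows one plausible reading of those two parameters using cosine similarity over an in-memory dictionary. The default values, the `search` function, and the choice of cosine similarity are all assumptions made for illustration; `metadata_filter` is not modeled.

```python
import math

DEFAULT_TOP_K = 2        # hypothetical configured default
DEFAULT_THRESHOLD = 0.5  # hypothetical configured default


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, docs, top_k=None, similarity_threshold=None):
    """Return up to top_k (doc_id, score) pairs scoring at or above the threshold."""
    k = DEFAULT_TOP_K if top_k is None else top_k
    threshold = DEFAULT_THRESHOLD if similarity_threshold is None else similarity_threshold
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in docs.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # best match first
    return kept[:k]


docs = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
hits = search([1.0, 0.0], docs)
```

In this sketch the threshold is applied before truncation, so a strict threshold can return fewer than `top_k` results.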
- clear_collection()[source]
Clear all documents from the collection.
WARNING: This will delete all ingested documents!
- Return type:
- __exit__(exc_type, exc_val, exc_tb)[source]
Context manager exit with cleanup.
- Return type:
- Parameters:
exc_type (type[BaseException] | None)
exc_val (BaseException | None)
exc_tb (TracebackType | None)
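The `__exit__` signature above is the standard context-manager hook. The sketch below shows the protocol on a hypothetical `ManagedResource` class. It assumes a matching `__enter__` exists, which this page does not list for `RagPipeline`, and the cleanup shown here is a placeholder.

```python
class ManagedResource:
    """Hypothetical stand-in showing the context-manager protocol."""

    def __init__(self) -> None:
        self.closed = False

    def __enter__(self) -> "ManagedResource":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Cleanup runs whether or not the body raised.
        self.closed = True
        # Returning None (falsy) lets any exception propagate.


with ManagedResource() as resource:
    inside = resource.closed  # still open inside the block

failed = ManagedResource()
try:
    with failed:
        raise ValueError("boom")
except ValueError:
    pass  # the error propagated, and cleanup still ran
```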