src package

Submodules

src.embed module

Embedding Pipeline

Handles vector embedding generation for document chunks and FAISS index management. Supports local embedding models and efficient similarity search.

class src.embed.EmbeddingConfig(model_name, normalize_embeddings, device, similarity_threshold, top_k)[source]

Bases: object

Configuration for embedding generation.

model_name: str
normalize_embeddings: bool
device: str
similarity_threshold: float
top_k: int
__init__(model_name, normalize_embeddings, device, similarity_threshold, top_k)
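
A minimal construction sketch. The field names come from the class above; the specific values (model identifier, device string, thresholds) are illustrative assumptions:

   from src.embed import EmbeddingConfig

   # Field names match the class above; all values are assumptions.
   config = EmbeddingConfig(
       model_name="sentence-transformers/all-MiniLM-L6-v2",  # assumed model id
       normalize_embeddings=True,  # unit-length vectors suit inner-product search
       device="cpu",               # or "cuda" if a GPU is available
       similarity_threshold=0.3,
       top_k=5,
   )
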
class src.embed.EmbeddingModel(config)[source]

Bases: object

Handles embedding model loading and text embedding generation.

__init__(config)[source]

Initialize embedding model.

Parameters:

config (EmbeddingConfig) – Embedding configuration

generate_embeddings(texts)[source]

Generate embeddings for a list of texts.

Parameters:

texts (list[str]) – List of text strings to embed

Return type:

ndarray

Returns:

numpy array of embeddings

generate_single_embedding(text)[source]

Generate embedding for a single text.

Parameters:

text (str) – Text string to embed

Return type:

ndarray

Returns:

numpy array of embedding
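
A short usage sketch for the two generation methods, reusing an EmbeddingConfig like the one shown above (model name and values remain assumptions):

   from src.embed import EmbeddingConfig, EmbeddingModel

   config = EmbeddingConfig(
       model_name="sentence-transformers/all-MiniLM-L6-v2",  # assumed
       normalize_embeddings=True, device="cpu",
       similarity_threshold=0.3, top_k=5,
   )
   model = EmbeddingModel(config)

   # Batch: one row per input text.
   vectors = model.generate_embeddings(["first chunk", "second chunk"])
   print(vectors.shape)  # (2, embedding_dimension)

   # Single text: one embedding.
   query_vector = model.generate_single_embedding("a user question")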

class src.embed.FAISSIndex(dimension, index_type='IndexFlatIP')[source]

Bases: object

Handles FAISS index creation and management.

__init__(dimension, index_type='IndexFlatIP')[source]

Initialize FAISS index.

Parameters:
  • dimension (int) – Dimension of embeddings

  • index_type (str) – Type of FAISS index to use

add_embeddings(embeddings, chunk_metadata)[source]

Add embeddings to the index.

Parameters:
  • embeddings (ndarray) – numpy array of embeddings

  • chunk_metadata (list[ChunkMetadata]) – List of chunk metadata corresponding to embeddings

Return type:

None

search(query_embedding, k)[source]

Search for similar embeddings.

Parameters:
  • query_embedding (ndarray) – Query embedding

  • k (int) – Number of results to return

Return type:

tuple[ndarray, ndarray]

Returns:

Tuple of (distances, indices)

get_chunk_by_index(index)[source]

Get chunk metadata by index.

Parameters:

index (int) – Index in the metadata list

Return type:

ChunkMetadata | None

Returns:

Chunk metadata or None if index is invalid

get_total_embeddings()[source]

Get total number of embeddings in index.

Return type:

int

save_index(index_path)[source]

Save FAISS index and metadata to disk.

Parameters:

index_path (Path) – Path to save index

Return type:

None

load_index(index_path)[source]

Load FAISS index and metadata from disk.

Parameters:

index_path (Path) – Path to load index from

Return type:

None
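
A lifecycle sketch for the index: create, add, search, persist. The dimension, metadata values, and save path are assumptions. Note that the default "IndexFlatIP" scores by inner product, so normalized embeddings make the returned scores cosine similarities:

   import numpy as np
   from pathlib import Path
   from src.embed import FAISSIndex
   from src.ingest import ChunkMetadata

   dim = 384  # assumed embedding dimension
   index = FAISSIndex(dimension=dim)  # default "IndexFlatIP": exact inner-product search

   embeddings = np.random.rand(3, dim).astype("float32")  # stand-in vectors
   metadata = [
       ChunkMetadata(file_name="doc.pdf", page_number=1, chunk_index=i,
                     chunk_start=i * 800, chunk_end=i * 800 + 1000,
                     chunk_size=1000, text_length=1000)
       for i in range(3)
   ]
   index.add_embeddings(embeddings, metadata)

   distances, indices = index.search(embeddings[0], k=2)
   best = index.get_chunk_by_index(int(indices.flat[0]))  # ChunkMetadata or None
   print(index.get_total_embeddings())                    # 3
   index.save_index(Path("data/index"))                   # assumed location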

class src.embed.EmbeddingPipeline(config)[source]

Bases: object

Main class for embedding generation and index management.

__init__(config)[source]

Initialize embedding pipeline.

Parameters:

config (dict[str, Any]) – Configuration dictionary

create_embeddings_from_chunks(chunks)[source]

Create embeddings from document chunks and build FAISS index.

Parameters:

chunks (list[DocumentChunk]) – List of document chunks

Return type:

None

save_index(index_path)[source]

Save the FAISS index and metadata.

Parameters:

index_path (Path) – Path to save index

Return type:

None

load_index(index_path)[source]

Load the FAISS index and metadata.

Parameters:

index_path (Path) – Path to load index from

Return type:

None

search_similar_chunks(query, top_k=None)[source]

Search for chunks similar to the query.

Parameters:
  • query (str) – Query text

  • top_k (int | None) – Number of results to return (uses config default if None)

Return type:

list[tuple[DocumentChunk, float]]

Returns:

List of (chunk, similarity_score) tuples

get_index_stats()[source]

Get statistics about the index.

Return type:

dict[str, Any]

Returns:

Dictionary with index statistics
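
An end-to-end sketch: build an index from ingested chunks, persist it, and query it. The documented signature only promises a plain config dict, so an empty dict stands in for the project's real configuration here (an assumption), and the paths are illustrative:

   from pathlib import Path
   from src.embed import EmbeddingPipeline
   from src.ingest import DocumentIngester

   config = {}  # stand-in for the project's configuration dict (schema not documented here)

   chunks = DocumentIngester(config).ingest_documents(Path("docs/"))  # list[DocumentChunk]

   pipeline = EmbeddingPipeline(config)
   pipeline.create_embeddings_from_chunks(chunks)
   pipeline.save_index(Path("data/index"))

   for chunk, score in pipeline.search_similar_chunks("what is covered?", top_k=3):
       print(f"{score:.3f}  {chunk.metadata.file_name} p.{chunk.metadata.page_number}")
   print(pipeline.get_index_stats())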

src.embed.create_embeddings_from_chunks_file(chunks_file, config, output_path)[source]

Create embeddings from a chunks.json file.

Parameters:
  • chunks_file (Path) – Path to chunks.json file

  • config (dict[str, Any]) – Configuration dictionary

  • output_path (Path) – Path to save index

Return type:

None

src.embed.load_embedding_pipeline(config, index_path)[source]

Load an embedding pipeline with existing index.

Parameters:
  • config (dict[str, Any]) – Configuration dictionary

  • index_path (Path) – Path to index directory

Return type:

EmbeddingPipeline

Returns:

Loaded EmbeddingPipeline
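
The two module-level helpers cover the offline/online split: build an index from a saved chunks.json once, then reload it for querying. Paths are illustrative and the config dict is a stand-in:

   from pathlib import Path
   from src.embed import create_embeddings_from_chunks_file, load_embedding_pipeline

   config = {}  # stand-in for the project's configuration dict

   # Offline: chunks.json -> FAISS index on disk.
   create_embeddings_from_chunks_file(Path("data/chunks.json"), config, Path("data/index"))

   # Online: reload and search.
   pipeline = load_embedding_pipeline(config, Path("data/index"))
   results = pipeline.search_similar_chunks("example query")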

src.ingest module

Document Ingestion Pipeline

Handles PDF text extraction, cleaning, chunking, and metadata storage. Supports multiple PDF engines and configurable chunking parameters.

class src.ingest.ChunkMetadata(file_name, page_number, chunk_index, chunk_start, chunk_end, chunk_size, text_length)[source]

Bases: object

Metadata for a text chunk.

file_name: str
page_number: int
chunk_index: int
chunk_start: int
chunk_end: int
chunk_size: int
text_length: int
__init__(file_name, page_number, chunk_index, chunk_start, chunk_end, chunk_size, text_length)
class src.ingest.DocumentChunk(text, metadata)[source]

Bases: object

A chunk of text from a document with metadata.

text: str
metadata: ChunkMetadata
__init__(text, metadata)
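
A construction sketch showing how the two classes fit together; the values are illustrative:

   from src.ingest import ChunkMetadata, DocumentChunk

   meta = ChunkMetadata(
       file_name="report.pdf",
       page_number=3,
       chunk_index=0,
       chunk_start=0,
       chunk_end=1000,
       chunk_size=1000,
       text_length=982,  # actual character count after cleaning
   )
   chunk = DocumentChunk(text="First chunk of page 3 ...", metadata=meta)
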
class src.ingest.PDFProcessor(engine='pymupdf')[source]

Bases: object

Handles PDF text extraction using different engines.

__init__(engine='pymupdf')[source]

Initialize PDF processor.

Parameters:

engine (str) – PDF processing engine (“pymupdf”, “pdfminer”, “pdfplumber”)

extract_text(pdf_path)[source]

Extract text from PDF with page numbers.

Parameters:

pdf_path (Path) – Path to PDF file

Return type:

list[tuple[str, int]]

Returns:

List of (text, page_number) tuples

Raises:

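A usage sketch; the file path is illustrative, and the engine argument selects among the three supported extractors:

   from pathlib import Path
   from src.ingest import PDFProcessor

   processor = PDFProcessor(engine="pymupdf")  # or "pdfminer" / "pdfplumber"
   for text, page_number in processor.extract_text(Path("docs/report.pdf")):
       print(f"page {page_number}: {len(text)} characters")
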
class src.ingest.TextCleaner(config)[source]

Bases: object

Handles text cleaning and normalization.

__init__(config)[source]

Initialize text cleaner.

Parameters:

config (dict[str, Any]) – Configuration dictionary with cleaning parameters

clean_text(text)[source]

Clean and normalize text.

Parameters:

text (str) – Raw text to clean

Return type:

str

Returns:

Cleaned text
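
A minimal sketch. The cleaning options carried by the config dict are not documented here, so an empty dict stands in for the project defaults (an assumption):

   from src.ingest import TextCleaner

   cleaner = TextCleaner(config={})  # assumed: defaults apply when no options are given
   print(cleaner.clean_text("Some   raw\n\nextracted   text"))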

class src.ingest.TextChunker(chunk_size=1000, chunk_overlap=200)[source]

Bases: object

Handles text chunking with sliding window.

__init__(chunk_size=1000, chunk_overlap=200)[source]

Initialize text chunker.

Parameters:
  • chunk_size (int) – Size of each chunk in characters

  • chunk_overlap (int) – Overlap between chunks in characters

chunk_text(text, file_name, page_number)[source]

Split text into overlapping chunks.

Parameters:
  • text (str) – Text to chunk

  • file_name (str) – Name of the source file

  • page_number (int) – Page number

Return type:

list[DocumentChunk]

Returns:

List of DocumentChunk objects
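
A sketch of the sliding window: with chunk_size=1000 and chunk_overlap=200, each window starts 800 characters after the previous one, so consecutive chunks share 200 characters:

   from src.ingest import TextChunker

   chunker = TextChunker(chunk_size=1000, chunk_overlap=200)
   long_text = "x" * 2500  # stand-in for cleaned page text
   chunks = chunker.chunk_text(long_text, file_name="report.pdf", page_number=1)

   # The window advances chunk_size - chunk_overlap = 800 characters per step.
   for c in chunks:
       print(c.metadata.chunk_start, c.metadata.chunk_end)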

class src.ingest.DocumentIngester(config)[source]

Bases: object

Main class for document ingestion pipeline.

__init__(config)[source]

Initialize document ingester.

Parameters:

config (dict[str, Any]) – Configuration dictionary

ingest_documents(documents_path)[source]

Ingest all PDF documents from the given path.

Parameters:

documents_path (Path) – Path to directory containing PDF files

Return type:

list[DocumentChunk]

Returns:

List of all document chunks

Raises:

ValueError – If documents_path doesn’t exist or contains no PDFs

save_chunks(chunks, output_path)[source]

Save chunks and metadata to disk.

Parameters:
  • chunks (list[DocumentChunk]) – List of document chunks to save

  • output_path (Path) – Path to save chunks and metadata

Return type:

None
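
An ingestion sketch; the config dict is a stand-in for the project's real configuration, and the paths are illustrative:

   from pathlib import Path
   from src.ingest import DocumentIngester

   config = {}  # stand-in for the project's configuration dict
   ingester = DocumentIngester(config)

   chunks = ingester.ingest_documents(Path("docs/"))  # ValueError if the path has no PDFs
   ingester.save_chunks(chunks, Path("data/chunks.json"))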

src.ingest.ingest_documents(documents_path, config, args)[source]

Main function for document ingestion.

Parameters:
  • documents_path (str) – Path to documents directory

  • config (dict[str, Any]) – Configuration dictionary

  • args (Any) – Command line arguments

Return type:

None

src.llm module

LLM Interface

Handles local LLM loading, prompt formatting, and answer generation. Supports multiple backends: transformers, llama-cpp, and OpenAI (optional).

class src.llm.LLMConfig(backend, model_path, temperature, max_tokens, top_p, repeat_penalty, context_window)[source]

Bases: object

Configuration for LLM settings.

backend: str
model_path: str
temperature: float
max_tokens: int
top_p: float
repeat_penalty: float
context_window: int
__init__(backend, model_path, temperature, max_tokens, top_p, repeat_penalty, context_window)
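
A construction sketch. The field names come from the class above; the values, including the exact backend identifier and model path, are assumptions:

   from src.llm import LLMConfig

   config = LLMConfig(
       backend="llama-cpp",             # assumed identifier; transformers/OpenAI also supported
       model_path="models/model.gguf",  # assumed local path
       temperature=0.2,
       max_tokens=512,
       top_p=0.9,
       repeat_penalty=1.1,
       context_window=4096,
   )
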
class src.llm.LLMResponse(answer, prompt_tokens, response_tokens, generation_time_ms, model_used)[source]

Bases: object

Response from LLM with metadata.

answer: str
prompt_tokens: int
response_tokens: int
generation_time_ms: float
model_used: str
__init__(answer, prompt_tokens, response_tokens, generation_time_ms, model_used)
class src.llm.BaseLLM(config)[source]

Bases: object

Base class for LLM implementations.

__init__(config)[source]

Initialize LLM with configuration.

Parameters:

config (LLMConfig) – LLM configuration

generate(prompt)[source]

Generate response from prompt.

Parameters:

prompt (str) – Input prompt

Return type:

LLMResponse

Returns:

LLMResponse with answer and metadata

class src.llm.TransformersLLM(config)[source]

Bases: BaseLLM

LLM implementation using transformers library.

generate(prompt)[source]

Generate response using transformers.

Parameters:

prompt (str) – Input prompt

Return type:

LLMResponse

Returns:

LLMResponse with answer and metadata

class src.llm.LlamaCppLLM(config)[source]

Bases: BaseLLM

LLM implementation using llama-cpp-python.

generate(prompt)[source]

Generate response using llama-cpp.

Parameters:

prompt (str) – Input prompt

Return type:

LLMResponse

Returns:

LLMResponse with answer and metadata

class src.llm.OpenAILLM(config)[source]

Bases: BaseLLM

LLM implementation using OpenAI API (optional).

generate(prompt)[source]

Generate response using OpenAI API.

Parameters:

prompt (str) – Input prompt

Return type:

LLMResponse

Returns:

LLMResponse with answer and metadata

class src.llm.LLMInterface(config)[source]

Bases: object

Main interface for LLM operations.

__init__(config)[source]

Initialize LLM interface.

Parameters:

config (dict[str, Any]) – Configuration dictionary

format_prompt(query, context)[source]

Format prompt with query and context.

Parameters:
  • query (str) – User query

  • context (str) – Retrieved document context

Return type:

str

Returns:

Formatted prompt

generate_answer(query, query_result)[source]

Generate answer from query and retrieved chunks.

Parameters:
  • query (str) – User query

  • query_result (QueryResult) – QueryResult with retrieved chunks

Return type:

LLMResponse

Returns:

LLMResponse with generated answer

get_model_info()[source]

Get information about the loaded model.

Return type:

dict[str, Any]

Returns:

Dictionary with model information
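
A usage sketch: format a prompt manually, or let generate_answer consume a QueryResult from src.query directly. The empty config dict is a stand-in (its schema is not documented here):

   from src.llm import LLMInterface
   from src.query import QueryProcessor

   config = {}  # stand-in for the project's configuration dict
   llm = LLMInterface(config)

   # Manual prompt assembly:
   prompt = llm.format_prompt(query="What is X?", context="...retrieved chunk text...")

   # Or consume a QueryResult from src.query directly:
   query_result = QueryProcessor(config).process_query("What is X?")
   response = llm.generate_answer("What is X?", query_result)
   print(response.answer, f"({response.generation_time_ms:.0f} ms)")
   print(llm.get_model_info())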

src.llm.create_llm_interface(config)[source]

Create LLM interface from configuration.

Parameters:

config (dict[str, Any]) – Configuration dictionary

Return type:

LLMInterface

Returns:

LLMInterface instance

src.llm.generate_answer_from_query(query, query_result, config)[source]

Generate answer from query and query result.

Parameters:
  • query (str) – User query

  • query_result (QueryResult) – QueryResult with retrieved chunks

  • config (dict[str, Any]) – Configuration dictionary

Return type:

str

Returns:

Generated answer string

src.llm.format_llm_response(response, verbose=False)[source]

Format LLM response for output.

Parameters:
  • response (LLMResponse) – LLMResponse to format

  • verbose (bool) – Whether to include metadata

Return type:

str

Returns:

Formatted output string
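
The module-level helpers wrap the same flow; a sketch with the same stand-in config dict:

   from src.llm import create_llm_interface, generate_answer_from_query, format_llm_response
   from src.query import QueryProcessor

   config = {}  # stand-in for the project's configuration dict
   query = "What is X?"
   query_result = QueryProcessor(config).process_query(query)

   # One-call convenience: returns just the answer string.
   answer = generate_answer_from_query(query, query_result, config)

   # Or keep the full LLMResponse and render it with metadata.
   llm = create_llm_interface(config)
   response = llm.generate_answer(query, query_result)
   print(format_llm_response(response, verbose=True))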

src.query module

Query Engine

Handles query processing, similarity search, and chunk retrieval. Loads FAISS index and embedding model for efficient query processing.

class src.query.QueryResult(query, chunks, similarities, total_chunks_searched, search_time_ms)[source]

Bases: object

Result of a query with relevant chunks and metadata.

query: str
chunks: list[DocumentChunk]
similarities: list[float]
total_chunks_searched: int
search_time_ms: float
__init__(query, chunks, similarities, total_chunks_searched, search_time_ms)
class src.query.QueryEngine(config, index_path=None)[source]

Bases: object

Main class for query processing and similarity search.

__init__(config, index_path=None)[source]

Initialize query engine.

Parameters:
  • config (dict[str, Any]) – Configuration dictionary

  • index_path (Path | None) – Path to FAISS index (if None, will use config default)

search(query, top_k=None, similarity_threshold=None)[source]

Search for chunks similar to the query.

Parameters:
  • query (str) – User query text

  • top_k (int | None) – Number of results to return (uses config default if None)

  • similarity_threshold (float | None) – Minimum similarity score (uses config default if None)

Return type:

QueryResult

Returns:

QueryResult with relevant chunks and metadata

get_index_stats()[source]

Get statistics about the loaded index.

Return type:

dict[str, Any]

Returns:

Dictionary with index statistics

validate_index()[source]

Validate that the index is properly loaded and functional.

Return type:

bool

Returns:

True if index is valid, False otherwise
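
A search sketch; the config dict is a stand-in and the index path is illustrative:

   from pathlib import Path
   from src.query import QueryEngine

   config = {}  # stand-in for the project's configuration dict
   engine = QueryEngine(config, index_path=Path("data/index"))

   if engine.validate_index():
       result = engine.search("what is X?", top_k=5, similarity_threshold=0.3)
       for chunk, score in zip(result.chunks, result.similarities):
           print(f"{score:.3f}  {chunk.metadata.file_name} p.{chunk.metadata.page_number}")
       print(f"searched {result.total_chunks_searched} chunks in {result.search_time_ms:.1f} ms")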

class src.query.QueryProcessor(config, index_path=None)[source]

Bases: object

High-level query processor with additional functionality.

__init__(config, index_path=None)[source]

Initialize query processor.

Parameters:
  • config (dict[str, Any]) – Configuration dictionary

  • index_path (Path | None) – Path to FAISS index

process_query(query, top_k=None, similarity_threshold=None)[source]

Process a user query and return relevant chunks.

Parameters:
  • query (str) – User query text

  • top_k (int | None) – Number of results to return

  • similarity_threshold (float | None) – Minimum similarity score

Return type:

QueryResult

Returns:

QueryResult with relevant chunks and metadata

format_results(result, include_metadata=True)[source]

Format query results as a readable string.

Parameters:
  • result (QueryResult) – QueryResult to format

  • include_metadata (bool) – Whether to include chunk metadata

Return type:

str

Returns:

Formatted string representation of results

get_relevant_context(result, max_chars=2000)[source]

Get relevant context from search results for LLM input.

Parameters:
  • result (QueryResult) – QueryResult from search

  • max_chars (int) – Maximum characters to include

Return type:

str

Returns:

Formatted context string for LLM
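
A sketch of the higher-level flow, including trimming retrieved text to a character budget before prompting an LLM; the config dict is again a stand-in:

   from src.query import QueryProcessor

   config = {}  # stand-in for the project's configuration dict
   processor = QueryProcessor(config)  # index path falls back to the config default

   result = processor.process_query("what is X?", top_k=5)
   print(processor.format_results(result, include_metadata=True))

   # Budgeted context string for the LLM prompt.
   context = processor.get_relevant_context(result, max_chars=2000)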

src.query.process_query(query, config, args)[source]

Main function for query processing.

Parameters:
  • query (str) – User query text

  • config (dict[str, Any]) – Configuration dictionary

  • args (Any) – Command line arguments

Return type:

QueryResult

Returns:

QueryResult with relevant chunks

src.query.format_query_output(result, verbose=False)[source]

Format query results for output.

Parameters:
  • result (QueryResult) – QueryResult to format

  • verbose (bool) – Whether to include detailed output

Return type:

str

Returns:

Formatted output string

src.utils module

Utility functions and logging configuration for the document-based question answering system.

src.utils.setup_logging(log_level='INFO', log_file=None, log_format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')[source]

Set up centralized logging configuration.

Parameters:
  • log_level (str) – Logging level (DEBUG, INFO, WARNING, ERROR)

  • log_file (str | None) – Optional log file path

  • log_format (str) – Log message format

Return type:

None

src.utils.get_logger(name)[source]

Get a logger instance with the given name.

Parameters:

name (str) – Logger name

Return type:

Logger

Returns:

Configured logger instance
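
A typical setup sketch; the log file path is illustrative:

   from src.utils import setup_logging, get_logger

   setup_logging(log_level="DEBUG", log_file="logs/app.log")  # assumed path
   logger = get_logger(__name__)
   logger.info("pipeline starting")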

src.utils.log_memory_usage(logger, context='')[source]

Log current memory usage.

Parameters:
  • logger (Logger) – Logger instance

  • context (str) – Context string for the log message

Return type:

None

src.utils.log_performance(func)[source]

Decorator to log function performance metrics.

Parameters:

func – Function to decorate

Returns:

Decorated function
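
A decorator sketch; the wrapped function here is hypothetical:

   from src.utils import log_performance

   @log_performance
   def embed_batch(texts):
       # hypothetical workload; timing is logged by the decorator
       return [t.lower() for t in texts]

   embed_batch(["Alpha", "Beta"])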

src.utils.batch_process(items, batch_size, process_func, logger, description='Processing')[source]

Process items in batches with progress logging.

Parameters:
  • items (list) – List of items to process

  • batch_size (int) – Size of each batch

  • process_func – Function to apply to each batch

  • logger (Logger) – Logger instance

  • description (str) – Description for progress logging

Return type:

list

Returns:

List of processed results
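
A sketch. Whether per-batch results are flattened or collected one entry per batch is not specified above, so treat the output shape as implementation-defined:

   from src.utils import batch_process, get_logger

   logger = get_logger(__name__)
   results = batch_process(
       items=list(range(1000)),
       batch_size=100,
       process_func=lambda batch: [x * x for x in batch],  # applied to each batch
       logger=logger,
       description="Squaring",
   )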

src.utils.optimize_memory()[source]

Perform memory optimization operations.

src.utils.create_cache_directory(cache_dir)[source]

Create and validate cache directory.

Parameters:

cache_dir (str) – Cache directory path

Return type:

Path

Returns:

Path to cache directory

src.utils.get_system_info()[source]

Get system information for logging.

Return type:

dict[str, Any]

Returns:

Dictionary with system information

src.utils.log_system_info(logger)[source]

Log system information.

Parameters:

logger (Logger) – Logger instance

Return type:

None

class src.utils.ProgressTracker(total_items, logger, description='Processing')[source]

Bases: object

Track and log progress of long-running operations.

__init__(total_items, logger, description='Processing')[source]
update(count=1)[source]

Update progress and log periodically.

Return type:

None

finish()[source]

Log completion statistics.

Return type:

None
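
A usage sketch:

   from src.utils import ProgressTracker, get_logger

   logger = get_logger(__name__)
   tracker = ProgressTracker(total_items=500, logger=logger, description="Embedding")
   for _ in range(500):
       tracker.update()  # logs progress periodically
   tracker.finish()      # logs completion statistics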

Module contents

Document-based Question Answering System

A local, modular RAG (retrieval-augmented generation) system.
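
How the four submodules compose into the RAG loop, as a sketch: the empty config dict stands in for the project's real configuration, and all paths are illustrative:

   from pathlib import Path
   from src.ingest import DocumentIngester
   from src.embed import EmbeddingPipeline
   from src.query import QueryProcessor
   from src.llm import LLMInterface

   config = {}  # stand-in for the project's configuration dict

   # 1. Ingest: PDFs -> cleaned, chunked text with metadata.
   chunks = DocumentIngester(config).ingest_documents(Path("docs/"))

   # 2. Embed: chunks -> FAISS index on disk.
   pipeline = EmbeddingPipeline(config)
   pipeline.create_embeddings_from_chunks(chunks)
   pipeline.save_index(Path("data/index"))

   # 3. Retrieve: query -> most similar chunks.
   processor = QueryProcessor(config, index_path=Path("data/index"))
   result = processor.process_query("What does the report conclude?")

   # 4. Generate: query + retrieved chunks -> grounded answer.
   response = LLMInterface(config).generate_answer("What does the report conclude?", result)
   print(response.answer)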