API Reference
=============

This section provides detailed documentation for the AI Engineer Code Challenge API.

Core Modules
------------

.. automodule:: src.ingest
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: src.embed
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: src.query
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: src.llm
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: src.utils
   :members:
   :undoc-members:
   :show-inheritance:

Data Structures
---------------

.. automodule:: src.ingest
   :members: DocumentChunk, ChunkMetadata
   :undoc-members:
   :show-inheritance:

.. automodule:: src.embed
   :members: EmbeddingConfig, FAISSIndex
   :undoc-members:
   :show-inheritance:

.. automodule:: src.query
   :members: QueryResult
   :undoc-members:
   :show-inheritance:

.. automodule:: src.llm
   :members: LLMConfig, LLMResponse
   :undoc-members:
   :show-inheritance:

Configuration
-------------

The system uses YAML configuration files for all settings. Here's the structure:

.. code-block:: yaml

   # PDF Processing Configuration
   pdf:
     engine: "pymupdf"          # Options: "pymupdf", "pdfminer", "pdfplumber"
     chunk_size: 1000
     chunk_overlap: 200

   # Embedding Configuration
   embedding:
     model: "all-MiniLM-L6-v2"
     top_k: 5
     similarity_threshold: 0.7

   # LLM Configuration
   llm:
     backend: "llama-cpp"       # Options: "transformers", "llama-cpp", "openai"
     model_path: "./models/mistral-7b-instruct-v0.2.Q4_K_M.gguf"
     temperature: 0.1
     max_tokens: 200
     top_p: 0.9
     repeat_penalty: 1.1
     context_window: 4096

   # Storage Configuration
   storage:
     index_dir: "./index"
     chunk_dir: "./index/chunks"

   # System Configuration
   system:
     log_level: "INFO"
     batch_size: 100
     max_workers: 4

Command Line Interface
----------------------

The main entry point provides a command-line interface:

.. code-block:: bash

   # Ingest documents
   python main.py --mode ingest --documents ./data/

   # Query the system
   python main.py --mode query --query "What are the key features?"

   # With verbose output
   python main.py --mode query --query "Your question" --verbose

   # Override configuration
   python main.py --mode query --query "Your question" --similarity-threshold 0.5

Available CLI options:

.. code-block:: text

   --mode: Choose between 'ingest' or 'query'
   --documents: Path to documents directory (for ingest mode)
   --query: Your question (for query mode)
   --verbose: Enable verbose output
   --similarity-threshold: Override similarity threshold
   --top-k: Override number of chunks to retrieve
   --chunk-size: Override chunk size for ingestion
   --chunk-overlap: Override chunk overlap for ingestion
   --embedding-model: Override embedding model
   --llm-backend: Override LLM backend
   --llm-model: Override LLM model path
   --temperature: Override LLM temperature
   --max-tokens: Override LLM max tokens

Examples
--------

Basic Usage
~~~~~~~~~~~

.. code-block:: python

   from src.ingest import ingest_documents
   from src.query import process_query
   from src.llm import generate_answer_from_query

   # Ingest documents
   config = load_config("config.yaml")
   ingest_documents("./data/", config, args)

   # Query the system
   result = process_query("What is this about?", config, args)
   answer = generate_answer_from_query("What is this about?", result, config)
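The example above relies on a ``load_config`` helper and an ``args`` namespace that the packaged entry point (``main.py``) normally provides. The sketch below shows one way to supply both when driving the API directly; the helper body and the argument set are assumptions that mirror the configuration file and CLI options documented above, not the packaged implementation.

.. code-block:: python

   import argparse

   import yaml  # PyYAML, assumed available since the configuration is YAML-based


   def load_config(path: str) -> dict:
       """Stand-in for the project's config loader: parse the YAML file into a dict."""
       with open(path, "r", encoding="utf-8") as f:
           return yaml.safe_load(f)


   # Stand-in for the CLI namespace passed to ingest_documents / process_query,
   # mirroring a subset of the flags listed under "Command Line Interface".
   parser = argparse.ArgumentParser()
   parser.add_argument("--mode", choices=["ingest", "query"], default="query")
   parser.add_argument("--documents", default="./data/")
   parser.add_argument("--query", default=None)
   parser.add_argument("--verbose", action="store_true")
   parser.add_argument("--top-k", type=int, default=None)
   parser.add_argument("--similarity-threshold", type=float, default=None)

   args = parser.parse_args(["--mode", "query", "--query", "What is this about?"])
   config = load_config("config.yaml")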
Advanced Usage
~~~~~~~~~~~~~~

.. code-block:: python

   from src.embed import EmbeddingPipeline
   from src.query import QueryProcessor
   from src.llm import LLMInterface

   # Custom embedding pipeline
   embedding_pipeline = EmbeddingPipeline(config)
   embedding_pipeline.create_embeddings_from_chunks(chunks)

   # Custom query processing with per-call retrieval overrides
   query_processor = QueryProcessor(config, index_path)
   result = query_processor.process_query("Your question", top_k=10, similarity_threshold=0.8)

   # Custom LLM interface
   llm_interface = LLMInterface(config)
   response = llm_interface.generate_answer("Your question", result)

Error Handling
--------------

The system provides comprehensive error handling:

.. code-block:: python

   try:
       result = process_query("Your question", config, args)
   except ValueError as e:
       print(f"Configuration error: {e}")
   except FileNotFoundError as e:
       print(f"File not found: {e}")
   except Exception as e:
       print(f"Unexpected error: {e}")

Performance Optimization
------------------------

For optimal performance:

1. **Use appropriate chunk sizes**: 1000-2000 characters work well for most documents
2. **Adjust the similarity threshold**: 0.7-0.8 provides a good balance between recall and precision
3. **Batch processing**: Use ``batch_size`` in the system config for large datasets
4. **Model selection**: Choose quantized models for faster inference
5. **Hardware utilization**: Use a GPU if available for LLM inference

Monitoring and Logging
----------------------

The system provides comprehensive logging:

.. code-block:: python

   import logging

   # Configure logging
   logging.basicConfig(level=logging.INFO)

   # Monitor performance
   from src.utils import log_performance, log_memory_usage

   @log_performance
   def your_function():
       # Your code here
       pass

Testing
-------

The system includes comprehensive tests:

.. code-block:: bash

   # Run all tests
   pytest tests/

   # Run with coverage
   pytest --cov=src tests/

   # Run specific test categories
   pytest -m unit tests/
   pytest -m integration tests/
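The ``-m unit`` and ``-m integration`` invocations select tests by pytest marker, which assumes those markers are registered in the project's pytest configuration. As an illustration (the file and test names are hypothetical), a marker-based unit test could validate the documented relationship between ``chunk_size`` and ``chunk_overlap`` in ``config.yaml``:

.. code-block:: python

   # tests/test_config.py (illustrative)
   import pytest
   import yaml


   @pytest.mark.unit
   def test_chunk_overlap_is_smaller_than_chunk_size():
       # Parse the YAML structure shown in the Configuration section
       with open("config.yaml", "r", encoding="utf-8") as f:
           config = yaml.safe_load(f)

       pdf = config["pdf"]
       # Chunking only advances through a document if the overlap is
       # strictly smaller than the chunk size
       assert 0 <= pdf["chunk_overlap"] < pdf["chunk_size"]

Markers should be declared in the pytest configuration (for example under ``markers`` in ``pytest.ini``) so that ``pytest -m unit`` selects exactly the intended tests.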