# Configuration Management

## Overview

The RAG System uses a unified configuration management system built with Pydantic BaseSettings. This provides type-safe configuration with validation, supporting multiple configuration sources with clear priority.

## Configuration Priority

Configuration values are loaded in the following order (highest to lowest priority):

1. **Environment Variables** - Highest priority
2. **`.env` File** - Medium priority
3. **Default Values** - Lowest priority

## Configuration Structure

Configuration is organized by functional domains:

- **Application Settings** - General app configuration
- **Database Settings** - MySQL/PostgreSQL configuration
- **Vector Database Settings** - Infinity and Elasticsearch configuration
- **Model Settings** - AI model configuration
- **API Settings** - API server configuration
- **Ragflow Settings** - Ragflow integration configuration
- **MinIO Settings** - Object storage configuration
- **Tag Search Settings** - Tag search configuration

## Usage

### Basic Usage

```python
from src.config import get_settings

# Get settings singleton
settings = get_settings()

# Access configuration
print(settings.app_name)
print(settings.database.host)
print(settings.vector_db_type)
print(settings.api.port)
```

### Accessing Sub-configurations

```python
# Database configuration
db_config = settings.database
print(f"Connecting to {db_config.host}:{db_config.port}")

# Vector database configuration
if settings.vector_db_type == "infinity":
    infinity_config = settings.infinity
    print(f"Using Infinity at {infinity_config.host}:{infinity_config.port}")
elif settings.vector_db_type == "elasticsearch":
    es_config = settings.elasticsearch
    print(f"Using Elasticsearch at {es_config.nodes}")

# Model configuration
model_config = settings.model
print(f"Using {model_config.provider} with {model_config.chat_model_name}")
```

### Environment-Specific Configuration

```python
settings = get_settings()

if settings.environment == "production":
    # Production-specific logic
    assert not settings.api.debug, "Debug mode should be disabled in production"
elif settings.environment == "development":
    # Development-specific logic
    pass
```

## Configuration File

### Creating .env File

1. Copy the example file:

   ```bash
   cp .env.example .env
   ```

2. Edit `.env` with your values:

   ```bash
   # Application
   APP_NAME=My RAG System
   ENVIRONMENT=production

   # Database
   DB_HOST=db.example.com
   DB_PORT=3306
   DB_USERNAME=myuser
   DB_PASSWORD=mypassword
   DB_DATABASE=rag_db

   # Vector Database
   VECTOR_DB_TYPE=elasticsearch
   ES_NODES=["http://es1.example.com:9200", "http://es2.example.com:9200"]
   ES_USERNAME=elastic
   ES_PASSWORD=changeme

   # API
   API_PORT=8000
   API_LOG_LEVEL=INFO
   API_WORKERS=4
   ```

### Environment Variables

You can override any configuration using environment variables:

```bash
# Linux/Mac
export DB_HOST=localhost
export API_PORT=9000
export VECTOR_DB_TYPE=infinity

# Windows PowerShell
$env:DB_HOST="localhost"
$env:API_PORT="9000"
$env:VECTOR_DB_TYPE="infinity"

# Windows CMD
set DB_HOST=localhost
set API_PORT=9000
set VECTOR_DB_TYPE=infinity
```

## Configuration Validation

The configuration system validates all values on load:

### Port Validation

```python
# Valid
DB_PORT=3306
API_PORT=8000

# Invalid - will raise ValidationError
DB_PORT=99999  # Port must be between 1 and 65535
```

### Log Level Validation

```python
# Valid
API_LOG_LEVEL=INFO
API_LOG_LEVEL=DEBUG

# Invalid - will raise ValidationError
API_LOG_LEVEL=INVALID  # Must be DEBUG, INFO, WARNING, ERROR, or CRITICAL
```

### Vector DB Type Validation

```python
# Valid
VECTOR_DB_TYPE=infinity
VECTOR_DB_TYPE=elasticsearch
VECTOR_DB_TYPE=es  # Normalized to 'elasticsearch'

# Invalid - will raise ValidationError
VECTOR_DB_TYPE=invalid  # Must be infinity, elasticsearch, or es
```

### Environment Validation

```python
# Valid
ENVIRONMENT=development
ENVIRONMENT=testing
ENVIRONMENT=production

# Invalid - will raise ValidationError
ENVIRONMENT=staging  # Must be development, testing, or production
```

## Testing Configuration

### Clearing Cache

In tests, you may need to reload configuration:

```python
from src.config import get_settings, clear_settings_cache

# Clear cache to reload settings
clear_settings_cache()
settings = get_settings()
```

### Testing with Different Configurations

```python
import os
from src.config import get_settings, clear_settings_cache

# Set test environment variables
os.environ['DB_HOST'] = 'test-db'
os.environ['VECTOR_DB_TYPE'] = 'elasticsearch'

# Clear cache and reload
clear_settings_cache()
settings = get_settings()

# Run tests
assert settings.database.host == 'test-db'
assert settings.vector_db_type == 'elasticsearch'
```

## Migration from Old Configuration

If you're migrating from the old configuration system (`src/conf/settings.py`), here's the mapping:

### Old Configuration

```python
from src.conf.settings import model_settings, vector_db_settings

model_name = model_settings.chat_model_name
infinity_host = vector_db_settings.infinity_host
```

### New Configuration

```python
from src.config import get_settings

settings = get_settings()
model_name = settings.model.chat_model_name
infinity_host = settings.infinity.host
```

### Environment Variable Changes

| Old Variable | New Variable |
|-------------|-------------|
| `MODEL_PROVIDER` | `MODEL_PROVIDER` (same) |
| `CHAT_MODEL_NAME` | `MODEL_CHAT_MODEL_NAME` |
| `EMBEDDING_MODEL_NAME` | `MODEL_EMBEDDING_MODEL_NAME` |
| `INFINITY_HOST` | `INFINITY_HOST` (same) |
| `MYSQL_HOST` | `DB_HOST` |
| `MYSQL_PORT` | `DB_PORT` |
| `MYSQL_USER` | `DB_USERNAME` |
| `MYSQL_PASSWORD` | `DB_PASSWORD` |
| `MYSQL_DATABASE` | `DB_DATABASE` |
| `LOG_LEVEL` | `API_LOG_LEVEL` |

## Best Practices

1. **Never commit `.env` file** - It contains sensitive information
2. **Always use `.env.example`** - Keep it updated with all configuration options
3. **Use environment variables in production** - More secure than files
4. **Validate early** - Configuration is validated on application startup
5. **Use type hints** - The configuration system provides full type safety
6. **Document changes** - Update `.env.example` when adding new configuration

## Troubleshooting

### Configuration Not Loading

1. Check `.env` file exists and is in the project root
2. Verify environment variable names match the expected format
3. Check for validation errors in the logs

### Validation Errors

```python
# Example error
ValidationError: 1 validation error for Settings
vector_db_type
  Value error, Vector DB type must be one of ['infinity', 'elasticsearch', 'es']
```

Solution: Check the value matches one of the allowed options.

### Cache Issues

If configuration changes aren't reflected:

```python
from src.config import clear_settings_cache

clear_settings_cache()
```

## Reference

For complete configuration options, see:

- `src/config/settings.py` - Configuration class definitions
- `.env.example` - All available configuration options with descriptions
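
When debugging which source a value came from, it can help to see the priority order and two of the validation rules from this document written out as plain code. The sketch below is illustrative only: it uses the standard library rather than Pydantic, it is not the code in `src/config/settings.py`, and the helper names (`parse_dotenv`, `resolve`, `validate_port`, `normalize_vector_db_type`) are hypothetical.

```python
import os


def parse_dotenv(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(name, dotenv, default, environ=None):
    """Return the value for `name`: environment first, then .env, then default."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    if name in dotenv:
        return dotenv[name]
    return default


def validate_port(value):
    """Convert to int and enforce the 1-65535 range described above."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError("Port must be between 1 and 65535")
    return port


def normalize_vector_db_type(value):
    """Accept 'infinity', 'elasticsearch', or 'es' ('es' maps to 'elasticsearch')."""
    aliases = {
        "infinity": "infinity",
        "elasticsearch": "elasticsearch",
        "es": "elasticsearch",
    }
    if value not in aliases:
        raise ValueError(
            "Vector DB type must be one of ['infinity', 'elasticsearch', 'es']"
        )
    return aliases[value]


# Usage: a stand-in for the process environment keeps the example deterministic.
dotenv_values = parse_dotenv("# Database\nDB_PORT=3306\nVECTOR_DB_TYPE=es\n")
fake_environ = {"DB_PORT": "5432"}

db_port = validate_port(resolve("DB_PORT", dotenv_values, "3306", fake_environ))
api_port = validate_port(resolve("API_PORT", dotenv_values, "8000", fake_environ))
db_type = normalize_vector_db_type(
    resolve("VECTOR_DB_TYPE", dotenv_values, "infinity", fake_environ)
)
```

Here `DB_PORT` comes from the stand-in environment, `VECTOR_DB_TYPE` from the `.env` text, and `API_PORT` falls back to its default, mirroring the three priority levels. The parser is deliberately minimal and does not handle quoting or trailing inline comments.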