# CI/CD Pipeline Guide

## Overview

The RAG System uses GitHub Actions for continuous integration and continuous deployment. This guide explains how the CI/CD pipeline works and how to use it effectively.

## Table of Contents

- [Pipeline Architecture](#pipeline-architecture)
- [Workflow Jobs](#workflow-jobs)
- [Test Execution](#test-execution)
- [Coverage Reporting](#coverage-reporting)
- [Setting Up Codecov](#setting-up-codecov)
- [Running Tests Locally](#running-tests-locally)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
- [Continuous Improvement](#continuous-improvement)
- [References](#references)

## Pipeline Architecture

The CI/CD pipeline consists of four main jobs. The test, lint, and security jobs run in parallel; the build status job then aggregates their results:

```
┌─────────────────────────────────────────────────────────┐
│                     GitHub Actions                      │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   ┌──────────┐     ┌──────────┐     ┌──────────┐        │
│   │   Test   │     │   Lint   │     │ Security │        │
│   │   Job    │     │   Job    │     │   Job    │        │
│   └────┬─────┘     └────┬─────┘     └────┬─────┘        │
│        │                │                │              │
│        └────────────────┴────────────────┘              │
│                         │                               │
│                  ┌──────▼──────┐                        │
│                  │Build Status │                        │
│                  │     Job     │                        │
│                  └─────────────┘                        │
│                                                         │
└─────────────────────────────────────────────────────────┘
```

## Workflow Jobs

### 1. Test Job

Runs the complete test suite with coverage measurement.

**Matrix Strategy**: Tests run on Python 3.11 and 3.12.

**Steps**:

1. Checkout code
2. Set up Python environment
3. Install dependencies
4. Create necessary directories
5. Run unit tests with coverage
6. Run integration tests with coverage
7. Run end-to-end tests with coverage
8. Generate coverage report
9. Upload coverage to Codecov
10. Upload artifacts (coverage reports and test logs)

**Test Execution Order**:

```
Unit Tests → Integration Tests → E2E Tests
```

**Coverage Requirements**:

- Minimum: 80%
- Domain Layer: 90%
- Application Layer: 85%
- Infrastructure Layer: 70%
- Presentation Layer: 75%

### 2. Lint Job

Performs code quality checks.

**Tools**:

- **flake8**: Python linting for code style and errors
- **black**: Code formatting verification
- **isort**: Import statement sorting verification
- **mypy**: Static type checking

**Configuration**:

- Max line length: 127 characters
- Max complexity: 10
- Type checking: Ignore missing imports

### 3. Security Job

Scans for security vulnerabilities.

**Tools**:

- **safety**: Checks dependencies for known vulnerabilities
- **bandit**: Scans code for common security issues

**Reports**:

- `bandit-report.json`: Detailed security scan results
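To see what a bandit finding looks like in practice, here is a hypothetical sketch (not taken from this codebase) of a pattern the scanner flags, next to a safer rewrite:

```python
import subprocess

# Flagged by bandit (B602): shell=True lets attacker-controlled input
# reach a shell, which enables command injection.
def risky_grep(filename: str) -> bytes:
    return subprocess.check_output(f"grep TODO {filename}", shell=True)

# Safer: pass an argument list so no shell ever parses the input.
def safe_grep(filename: str) -> bytes:
    return subprocess.check_output(["grep", "TODO", filename])
```

You can reproduce the CI report locally with `bandit -r src -f json -o bandit-report.json`.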
### 4. Build Status Job

Aggregates results from all jobs and determines overall build status.

**Behavior**:

- ✅ Passes if test job succeeds
- ❌ Fails if test job fails
- Runs regardless of lint and security job results

## Test Execution

### Test Types

#### Unit Tests

- **Purpose**: Test individual components in isolation
- **Location**: `tests/unit/`
- **Marker**: `@pytest.mark.unit`
- **Speed**: Fast (< 100ms per test)
- **Dependencies**: Mocked

```bash
pytest tests/unit -v --cov=src --cov-report=term -m unit
```

#### Integration Tests

- **Purpose**: Test component interactions
- **Location**: `tests/integration/`
- **Marker**: `@pytest.mark.integration`
- **Speed**: Medium (< 1s per test)
- **Dependencies**: Test databases/services

```bash
pytest tests/integration -v --cov=src --cov-report=term -m integration
```

#### End-to-End Tests

- **Purpose**: Test complete user workflows
- **Location**: `tests/e2e/`
- **Marker**: `@pytest.mark.e2e`
- **Speed**: Slow (< 5s per test)
- **Dependencies**: Full system

```bash
pytest tests/e2e -v --cov=src --cov-report=term -m e2e
```

### Test Markers

Use markers to organize and selectively run tests:

```python
import pytest

@pytest.mark.unit
def test_vector_creation():
    """Unit test for vector creation"""
    pass

@pytest.mark.integration
@pytest.mark.requires_db
def test_document_repository():
    """Integration test requiring database"""
    pass

@pytest.mark.e2e
@pytest.mark.slow
def test_complete_workflow():
    """End-to-end test of complete workflow"""
    pass

@pytest.mark.property
def test_vector_properties():
    """Property-based test"""
    pass
```

### Running Specific Tests

```bash
# Run only unit tests
pytest -m unit

# Run only integration tests
pytest -m integration

# Run only e2e tests
pytest -m e2e

# Exclude slow tests
pytest -m "not slow"

# Run tests requiring database
pytest -m requires_db

# Run property-based tests
pytest -m property

# Combine markers
pytest -m "unit and not slow"
```
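The `property` marker is backed by Hypothesis, which is installed with the test dependencies listed under [Running Tests Locally](#running-tests-locally). As a minimal sketch, assuming the `Vector` entity used throughout this guide lives in `src/domain/vector_search/entities.py` and exposes `dimension_count`, a property-based test might look like this:

```python
import pytest
from hypothesis import given, strategies as st

# Assumed import path; adjust to wherever Vector is actually defined.
from src.domain.vector_search.entities import Vector

@pytest.mark.property
@given(st.lists(st.floats(allow_nan=False, allow_infinity=False), min_size=1))
def test_dimension_count_matches_input_length(dimensions):
    # Hypothesis generates many lists of finite floats; the invariant
    # must hold for every one of them.
    assert Vector(dimensions).dimension_count == len(dimensions)
```

Run it with `pytest -m property`, exactly as above.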
Click "New repository secret" 4. Name: `CODECOV_TOKEN` 5. Value: Paste the token from Codecov 6. Click "Add secret" ### Step 4: Verify Integration 1. Push a commit to trigger the workflow 2. Check GitHub Actions tab for workflow run 3. Verify coverage upload in Codecov dashboard ### Step 5: Add Coverage Badge Add to your `README.md`: ```markdown [![codecov](https://codecov.io/gh/YOUR_USERNAME/YOUR_REPO/branch/main/graph/badge.svg?token=YOUR_TOKEN)](https://codecov.io/gh/YOUR_USERNAME/YOUR_REPO) ``` Replace `YOUR_USERNAME`, `YOUR_REPO`, and `YOUR_TOKEN` with your values. ## Running Tests Locally ### Prerequisites ```bash # Install test dependencies pip install pytest pytest-asyncio pytest-cov hypothesis httpx ``` ### Basic Test Execution ```bash # Run all tests pytest # Run with verbose output pytest -v # Run with coverage pytest --cov=src # Run specific test file pytest tests/unit/domain/test_entities.py # Run specific test function pytest tests/unit/domain/test_entities.py::test_vector_creation ``` ### Advanced Options ```bash # Run tests in parallel (requires pytest-xdist) pytest -n auto # Stop on first failure pytest -x # Show local variables in tracebacks pytest --showlocals # Run only failed tests from last run pytest --lf # Run failed tests first, then others pytest --ff # Generate all report types pytest --cov=src --cov-report=html --cov-report=term --cov-report=xml ``` ### Debugging Tests ```bash # Run with Python debugger pytest --pdb # Drop into debugger on failure pytest --pdb --maxfail=1 # Print output (disable capture) pytest -s # Show extra test summary info pytest -ra ``` ## Troubleshooting ### Common Issues #### 1. Tests Pass Locally but Fail in CI **Possible Causes**: - Python version differences - Missing dependencies - Environment-specific configuration - Timing issues in async tests **Solutions**: ```bash # Test with specific Python version pyenv install 3.11 pyenv local 3.11 pytest # Check for missing dependencies pip freeze > current-deps.txt diff requirements.txt current-deps.txt # Run tests with same settings as CI pytest -v --cov=src --cov-report=term ``` #### 2. Coverage Not Uploading to Codecov **Possible Causes**: - Missing or incorrect `CODECOV_TOKEN` - Coverage file not generated - Network issues **Solutions**: 1. Verify token in GitHub Secrets 2. Check if `coverage.xml` exists after test run 3. Review Codecov upload logs in workflow 4. Try manual upload: ```bash bash <(curl -s https://codecov.io/bash) -t YOUR_TOKEN ``` #### 3. Slow Test Execution **Possible Causes**: - Too many integration/e2e tests - Inefficient test setup/teardown - External service calls **Solutions**: ```bash # Identify slow tests pytest --durations=10 # Run only fast tests pytest -m "not slow" # Use parallel execution pytest -n auto # Profile test execution pytest --profile ``` #### 4. Import Errors **Possible Causes**: - Missing `__init__.py` files - Incorrect PYTHONPATH - Circular imports **Solutions**: ```bash # Check Python path python -c "import sys; print('\n'.join(sys.path))" # Run tests with explicit path PYTHONPATH=. pytest # Check for circular imports pytest --collect-only ``` ### Viewing Artifacts 1. Go to GitHub Actions tab 2. Click on a workflow run 3. Scroll to "Artifacts" section 4. Download: - Coverage reports - Test logs - Security reports ## Best Practices ### 1. 
### Viewing Artifacts

1. Go to the GitHub Actions tab
2. Click on a workflow run
3. Scroll to the "Artifacts" section
4. Download:
   - Coverage reports
   - Test logs
   - Security reports

## Best Practices

### 1. Write Fast Tests

```python
import time

# Good: Fast unit test
@pytest.mark.unit
def test_vector_dimension_count():
    vector = Vector([1.0, 2.0, 3.0])
    assert vector.dimension_count == 3

# Avoid: Slow test with unnecessary delays
def test_slow():
    time.sleep(5)  # Don't do this!
    assert True
```

### 2. Use Appropriate Markers

```python
# Mark tests appropriately
@pytest.mark.unit
@pytest.mark.fast
def test_value_object():
    pass

@pytest.mark.integration
@pytest.mark.requires_db
def test_repository():
    pass

@pytest.mark.e2e
@pytest.mark.slow
def test_workflow():
    pass
```

### 3. Mock External Dependencies

```python
import requests

# Good: Mock external service
@pytest.mark.unit
def test_document_handler(mock_repository):
    handler = CreateDocumentHandler(mock_repository)
    result = handler.handle(command)
    assert result is not None

# Avoid: Real external calls in unit tests
def test_with_real_api():
    response = requests.get("https://api.example.com")  # Don't do this!
    assert response.status_code == 200
```

### 4. Maintain High Coverage

```python
# Cover edge cases
def test_vector_empty_dimensions():
    with pytest.raises(ValueError):
        Vector([])

def test_vector_invalid_dimensions():
    with pytest.raises(ValueError):
        Vector([1, "invalid", 3])

def test_vector_normal_case():
    vector = Vector([1.0, 2.0])
    assert vector.dimension_count == 2
```

### 5. Keep Tests Independent

```python
# Good: Independent test
@pytest.fixture
def clean_database():
    db = create_test_db()
    yield db
    db.cleanup()

def test_create_document(clean_database):
    # Test uses a fresh database
    pass

# Avoid: Tests depending on each other
def test_step_1():
    global shared_state
    shared_state = "value"

def test_step_2():
    # Depends on test_step_1 running first
    assert shared_state == "value"
```

### 6. Use Descriptive Test Names

```python
# Good: Descriptive names
def test_vector_creation_with_valid_dimensions_succeeds():
    pass

def test_vector_creation_with_empty_dimensions_raises_value_error():
    pass

# Avoid: Vague names
def test_vector_1():
    pass

def test_vector_2():
    pass
```

### 7. Document Complex Tests

```python
def test_hybrid_search_score_combination():
    """
    Test that hybrid search correctly combines vector and text search scores.

    Given:
    - Vector search results with scores [0.9, 0.7, 0.5]
    - Text search results with scores [0.8, 0.6, 0.4]
    - Weight configuration: vector=0.7, text=0.3

    When:
    - Scores are combined using weighted average

    Then:
    - Combined scores should be [0.87, 0.67, 0.47]
    - Results should be sorted by combined score
    """
    # Test implementation
    pass
```

## Continuous Improvement

### Monitoring

- Review coverage trends in Codecov
- Monitor test execution time
- Track flaky tests
- Review security scan results

### Optimization

- Refactor slow tests
- Add more unit tests
- Reduce integration test dependencies
- Parallelize test execution

### Documentation

- Keep this guide updated
- Document new test patterns
- Share troubleshooting solutions
- Update best practices

## References

- [GitHub Actions Documentation](https://docs.github.com/en/actions)
- [pytest Documentation](https://docs.pytest.org/)
- [pytest-cov Documentation](https://pytest-cov.readthedocs.io/)
- [Codecov Documentation](https://docs.codecov.com/)
- [Coverage.py Documentation](https://coverage.readthedocs.io/)
- [Hypothesis Documentation](https://hypothesis.readthedocs.io/)