| 1 | +# Code Index MCP System Architecture |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Code Index MCP is a Model Context Protocol (MCP) server that provides code indexing and analysis capabilities. The system follows the SCIP (SCIP Code Intelligence Protocol) standard and uses a service-oriented architecture with clear separation of concerns. |
| 6 | + |
| 7 | +## High-Level Architecture |
| 8 | + |
| 9 | +``` |
| 10 | +┌─────────────────────────────────────────────────────────────────┐ |
| 11 | +│ MCP Interface Layer │ |
| 12 | +├─────────────────────────────────────────────────────────────────┤ |
| 13 | +│ Service Layer │ |
| 14 | +├─────────────────────────────────────────────────────────────────┤ |
| 15 | +│ SCIP Core Layer │ |
| 16 | +├─────────────────────────────────────────────────────────────────┤ |
| 17 | +│ Language Strategies │ |
| 18 | +├─────────────────────────────────────────────────────────────────┤ |
| 19 | +│ Technical Tools Layer │ |
| 20 | +└─────────────────────────────────────────────────────────────────┘ |
| 21 | +``` |
| 22 | + |
| 23 | +## Layer Responsibilities |
| 24 | + |
| 25 | +### 1. MCP Interface Layer (`server.py`) |
| 26 | +**Purpose**: Exposes MCP tools and handles protocol communication |
| 27 | + |
| 28 | +**Key Components**: |
| 29 | +- MCP tool definitions (`@mcp.tool()`) |
| 30 | +- Error handling and response formatting |
| 31 | +- User interaction and guidance |
| 32 | + |
| 33 | +**MCP Tools**: |
| 34 | +- `set_project_path` - Initialize project indexing |
| 35 | +- `find_files` - File discovery with patterns |
| 36 | +- `get_file_summary` - File analysis and metadata |
| 37 | +- `search_code_advanced` - Content search across files |
| 38 | +- `refresh_index` - Manual index rebuilding |
| 39 | +- `get_file_watcher_status` - File monitoring status |
| 40 | +- `configure_file_watcher` - File watcher settings |
| 41 | + |
| 42 | +### 2. Service Layer (`services/`) |
| 43 | +**Purpose**: Business logic orchestration and workflow management |
| 44 | + |
| 45 | +**Key Services**: |
| 46 | +- `ProjectManagementService` - Project lifecycle and initialization |
| 47 | +- `FileWatcherService` - Real-time file monitoring and auto-refresh |
| 48 | +- `IndexManagementService` - Index rebuild operations |
| 49 | +- `CodeIntelligenceService` - File analysis and symbol intelligence |
| 50 | +- `FileDiscoveryService` - File pattern matching and discovery |
| 51 | +- `SearchService` - Advanced code search capabilities |
| 52 | + |
| 53 | +**Architecture Pattern**: Service delegation with clear business boundaries |
| 54 | + |
| 55 | +### 3. SCIP Core Layer (`scip/core/`) |
| 56 | +**Purpose**: Language-agnostic SCIP protocol implementation |
| 57 | + |
| 58 | +**Core Components**: |
| 59 | +- `SCIPSymbolManager` - Standard SCIP symbol ID generation |
| 60 | +- `LocalReferenceResolver` - Cross-file reference resolution |
| 61 | +- `PositionCalculator` - Conversion of AST and Tree-sitter positions to SCIP ranges |
| 62 | +- `MonikerManager` - External package dependency handling |
| 63 | + |
| 64 | +**Standards Compliance**: Full SCIP protocol buffer implementation |
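
To make the `PositionCalculator` role concrete, here is a minimal sketch of one such conversion for Python's `ast` module. It assumes SCIP's 0-based range layout (three elements for a single-line range, four otherwise). The function name `ast_node_to_scip_range` is hypothetical and not taken from the codebase. Python's `ast` reports columns as UTF-8 byte offsets, so non-ASCII source would need extra care that this sketch does not attempt.

```python
import ast
from typing import List


def ast_node_to_scip_range(node: ast.AST) -> List[int]:
    """Convert a positioned ast node to a SCIP-style range (hypothetical helper).

    ast line numbers are 1-based and columns are 0-based;
    SCIP ranges are 0-based for both.
    """
    start_line = node.lineno - 1
    end_line = node.end_lineno - 1
    if start_line == end_line:
        # Single-line range: [startLine, startCharacter, endCharacter]
        return [start_line, node.col_offset, node.end_col_offset]
    # Multi-line range: [startLine, startCharacter, endLine, endCharacter]
    return [start_line, node.col_offset, end_line, node.end_col_offset]


source = "x = 1\ndef foo():\n    return x\n"
tree = ast.parse(source)
assign_range = ast_node_to_scip_range(tree.body[0])
function_range = ast_node_to_scip_range(tree.body[1])
```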
| 65 | + |
| 66 | +### 4. Language Strategies (`scip/strategies/`) |
| 67 | +**Purpose**: Language-specific code analysis using two-phase processing |
| 68 | + |
| 69 | +**Strategy Pattern Implementation**: |
| 70 | +- `BaseStrategy` - Abstract interface and common functionality |
| 71 | +- `PythonStrategy` - Python AST analysis |
| 72 | +- `JavaScriptStrategy` - JavaScript/TypeScript Tree-sitter analysis |
| 73 | +- `JavaStrategy` - Java Tree-sitter analysis |
| 74 | +- `ObjectiveCStrategy` - Objective-C Tree-sitter analysis |
| 75 | +- `FallbackStrategy` - Generic text-based analysis |
| 76 | + |
| 77 | +**Two-Phase Analysis**: |
| 78 | +1. **Phase 1**: Symbol definition collection |
| 79 | +2. **Phase 2**: Reference resolution and SCIP document generation |
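
The two phases above can be sketched with Python's standard `ast` module. This is an illustrative simplification, not the project's implementation: the function names `collect_definitions` and `resolve_references` are hypothetical, symbols are keyed by bare name only, and no SCIP documents are emitted.

```python
import ast
from typing import Dict, List, Tuple

Definition = Tuple[str, int]  # (file path, 1-based line of the definition)


def collect_definitions(files: Dict[str, str]) -> Dict[str, Definition]:
    """Phase 1: record where every function and class is defined."""
    table: Dict[str, Definition] = {}
    for path, source in files.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                table[node.name] = (path, node.lineno)
    return table


def resolve_references(
    files: Dict[str, str], table: Dict[str, Definition]
) -> List[Tuple[str, int, str, Definition]]:
    """Phase 2: link each name that is read to its collected definition."""
    references = []
    for path, source in files.items():
        for node in ast.walk(ast.parse(source)):
            if (
                isinstance(node, ast.Name)
                and isinstance(node.ctx, ast.Load)
                and node.id in table
            ):
                references.append((path, node.lineno, node.id, table[node.id]))
    return sorted(references)


files = {
    "a.py": "def helper():\n    return 1\n",
    "b.py": "from a import helper\nvalue = helper()\n",
}
symbol_table = collect_definitions(files)
references = resolve_references(files, symbol_table)
```

Running phase 1 over every file before phase 2 starts is what lets the call in `b.py` resolve to the definition in `a.py`, whatever order the files are visited in.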
| 80 | + |
| 81 | +### 5. Technical Tools Layer (`tools/`) |
| 82 | +**Purpose**: Low-level technical capabilities |
| 83 | + |
| 84 | +**Tool Categories**: |
| 85 | +- `filesystem/` - File system operations and pattern matching |
| 86 | +- `scip/` - SCIP index operations and symbol analysis |
| 87 | +- `config/` - Configuration and settings management |
| 88 | +- `monitoring/` - File watching and system monitoring |
| 89 | + |
| 90 | +## Data Flow Architecture |
| 91 | + |
| 92 | +### File Analysis Workflow |
| 93 | +``` |
| 94 | +User Request → Service Layer → SCIP Strategy → Core Components → SCIP Documents |
| 95 | +``` |
| 96 | + |
| 97 | +### Index Management Workflow |
| 98 | +``` |
| 99 | +File Changes → File Watcher → Index Management Service → Strategy Factory → Updated Index |
| 100 | +``` |
| 101 | + |
| 102 | +### Search Workflow |
| 103 | +``` |
| 104 | +Search Query → Search Service → Advanced Search Tools → Filtered Results |
| 105 | +``` |
| 106 | + |
| 107 | +## SCIP Implementation Details |
| 108 | + |
| 109 | +### Symbol ID Format |
| 110 | +``` |
| 111 | +scip-{language} {manager} {package} [version] {descriptors} |
| 112 | +``` |
| 113 | + |
| 114 | +**Examples**: |
| 115 | +- Local: `scip-python local myproject src/main.py/MyClass#method().` |
| 116 | +- External: `scip-python pip requests 2.31.0 sessions/Session#get().` |
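
As a small illustration of the format above, the following sketch assembles a symbol ID from its parts. The helper name `format_symbol_id` is hypothetical. It only joins the fields with spaces and omits the optional version, and it does not validate or escape descriptors.

```python
from typing import Optional


def format_symbol_id(
    language: str,
    manager: str,
    package: str,
    descriptors: str,
    version: Optional[str] = None,
) -> str:
    """Join the symbol ID fields in the documented order (hypothetical helper)."""
    parts = [f"scip-{language}", manager, package]
    if version:
        # The version field is optional and is skipped for local symbols.
        parts.append(version)
    parts.append(descriptors)
    return " ".join(parts)


local_id = format_symbol_id("python", "local", "myproject", "src/main.py/MyClass#method().")
external_id = format_symbol_id("python", "pip", "requests", "sessions/Session#get().", version="2.31.0")
```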
| 117 | + |
| 118 | +### Language Support Strategy |
| 119 | + |
| 120 | +**Parsing Approaches**: |
| 121 | +- **Python**: Native AST module |
| 122 | +- **JavaScript/TypeScript**: Tree-sitter |
| 123 | +- **Java**: Tree-sitter |
| 124 | +- **Objective-C**: Tree-sitter |
| 125 | +- **Others**: Fallback text analysis |
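
One way to picture how a file is routed to a parsing approach is an extension lookup with a fallback. The sketch below is an assumption about how such a selection could work, not the project's factory. The extension table and the function name `select_strategy` are hypothetical, and strategies are represented by their class names only.

```python
from pathlib import Path

# Hypothetical extension table; the real factory may use different rules.
STRATEGY_BY_EXTENSION = {
    ".py": "PythonStrategy",
    ".js": "JavaScriptStrategy",
    ".ts": "JavaScriptStrategy",
    ".java": "JavaStrategy",
    ".m": "ObjectiveCStrategy",
}


def select_strategy(file_path: str) -> str:
    """Return the strategy name for a file, defaulting to the fallback."""
    suffix = Path(file_path).suffix.lower()
    return STRATEGY_BY_EXTENSION.get(suffix, "FallbackStrategy")
```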
| 126 | + |
| 127 | +**Supported Code Intelligence**: |
| 128 | +- Symbol definitions (functions, classes, variables) |
| 129 | +- Import/export tracking |
| 130 | +- Cross-file reference resolution |
| 131 | +- External dependency management |
| 132 | +- Position-accurate symbol ranges |
| 133 | + |
| 134 | +## Configuration and Extensibility |
| 135 | + |
| 136 | +### Package Manager Integration |
| 137 | +- **Python**: pip, conda, poetry detection |
| 138 | +- **JavaScript**: npm, yarn package.json parsing |
| 139 | +- **Java**: Maven pom.xml, Gradle build files |
| 140 | +- **Configuration-driven**: Easy addition of new package managers |
| 141 | + |
| 142 | +### File Watcher System |
| 143 | +- **Real-time monitoring**: Watchdog-based file system events |
| 144 | +- **Debounced rebuilds**: Rapid changes are batched within a 4-6 second window before a rebuild is triggered |
| 145 | +- **Configurable patterns**: Customizable include/exclude rules |
| 146 | +- **Thread-safe**: ThreadPoolExecutor for concurrent rebuilds |
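
The debouncing behavior described above can be sketched with `threading.Timer` from the standard library. This is a minimal illustration under stated assumptions: the class name `Debouncer` is hypothetical, a plain callback stands in for the rebuild, and the example uses a very short delay rather than the 4-6 seconds mentioned above.

```python
import threading
from typing import Callable, List, Optional, Set


class Debouncer:
    """Batch rapid change notifications into one callback (hypothetical sketch)."""

    def __init__(self, delay_seconds: float, callback: Callable[[List[str]], None]):
        self._delay = delay_seconds
        self._callback = callback
        self._lock = threading.Lock()
        self._timer: Optional[threading.Timer] = None
        self._pending: Set[str] = set()

    def notify(self, path: str) -> None:
        """Record a changed path and restart the quiet-period timer."""
        with self._lock:
            self._pending.add(path)
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay, self._flush)
            self._timer.daemon = True
            self._timer.start()

    def _flush(self) -> None:
        """Hand the accumulated batch to the callback once changes go quiet."""
        with self._lock:
            batch, self._pending = self._pending, set()
        if batch:
            self._callback(sorted(batch))
```

Each new notification cancels the pending timer, so a burst of saves produces a single rebuild after the last change rather than one rebuild per event.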
| 147 | + |
| 148 | +## Performance Characteristics |
| 149 | + |
| 150 | +### Indexing Performance |
| 151 | +- **Incremental updates**: File-level granular rebuilds |
| 152 | +- **Parallel processing**: Concurrent file analysis |
| 153 | +- **Memory efficient**: Streaming SCIP document generation |
| 154 | +- **Cache optimization**: Symbol table reuse across phases |
| 155 | + |
| 156 | +### Search Performance |
| 157 | +- **Advanced tools**: ripgrep, ugrep, ag integration |
| 158 | +- **Pattern optimization**: Glob-based file filtering |
| 159 | +- **Result streaming**: Large result set handling |
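
A common way to integrate several external search tools is to probe for them on `PATH` and use the first one found. The sketch below assumes a preference order of ripgrep, ugrep, then ag, which is not stated in this document. The name `pick_search_tool` is hypothetical, and the `which` parameter exists only to make the function easy to test.

```python
import shutil
from typing import Callable, Optional

# Assumed preference order; binary names are rg (ripgrep), ugrep, and ag.
PREFERRED_TOOLS = ("rg", "ugrep", "ag")


def pick_search_tool(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first available search tool, or None if none is installed."""
    for tool in PREFERRED_TOOLS:
        if which(tool):
            return tool
    # The caller is expected to fall back to a built-in search.
    return None
```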
| 160 | + |
| 161 | +## Error Handling and Reliability |
| 162 | + |
| 163 | +### Fault Tolerance |
| 164 | +- **Graceful degradation**: Continue indexing on individual file failures |
| 165 | +- **Error isolation**: Per-file error boundaries |
| 166 | +- **Recovery mechanisms**: Automatic retry on transient failures |
| 167 | +- **Comprehensive logging**: Debug and audit trail support |
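
Per-file error boundaries can be sketched as a loop that catches and records each failure instead of aborting the run. The function name `index_files` and the `analyze` callback are hypothetical stand-ins for the real indexing entry points, and retry logic is left out.

```python
import logging
from typing import Any, Callable, Dict, Iterable, Tuple

logger = logging.getLogger(__name__)


def index_files(
    paths: Iterable[str], analyze: Callable[[str], Any]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Analyze every file; a failure in one file does not stop the others."""
    results: Dict[str, Any] = {}
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = analyze(path)
        except Exception as exc:  # error boundary around a single file
            logger.warning("Skipping %s: %s", path, exc)
            failures[path] = str(exc)
    return results, failures


def fake_analyze(path: str) -> int:
    """Stand-in analyzer that fails on one file."""
    if path == "bad.py":
        raise SyntaxError("invalid syntax")
    return len(path)


results, failures = index_files(["a.py", "bad.py", "c.py"], fake_analyze)
```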
| 168 | + |
| 169 | +### Validation |
| 170 | +- **Input sanitization**: Path traversal protection |
| 171 | +- **Range validation**: SCIP position boundary checking |
| 172 | +- **Schema validation**: Protocol buffer structure verification |
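
Path traversal protection usually comes down to resolving the requested path and checking that it still lies under the project root. The following is a minimal sketch using `pathlib`. The function name `resolve_inside_project` is hypothetical, and the project's actual checks may be stricter.

```python
from pathlib import Path


def resolve_inside_project(project_root: str, requested_path: str) -> Path:
    """Resolve a path and reject it if it escapes the project root."""
    root = Path(project_root).resolve()
    # Joining with an absolute path discards the root, so the check below
    # also rejects absolute paths outside the project.
    candidate = (root / requested_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"Path escapes project root: {requested_path}")
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept `src/../../secret` because the unresolved text still begins with the root.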
| 173 | + |
| 174 | +## Future Architecture Considerations |
| 175 | + |
| 176 | +### Planned Enhancements |
| 177 | +1. **Function Call Relationships**: Complete call graph analysis |
| 178 | +2. **Type Information**: Enhanced semantic analysis |
| 179 | +3. **Cross-repository Navigation**: Multi-project symbol resolution |
| 180 | +4. **Language Server Protocol**: LSP compatibility layer |
| 181 | +5. **Distributed Indexing**: Horizontal scaling support |
| 182 | + |
| 183 | +### Extension Points |
| 184 | +- **Custom strategies**: Plugin architecture for new languages |
| 185 | +- **Analysis plugins**: Custom symbol analyzers |
| 186 | +- **Export formats**: Multiple output format support |
| 187 | +- **Integration APIs**: External tool connectivity |
| 188 | + |
| 189 | +## Directory Structure |
| 190 | + |
| 191 | +``` |
| 192 | +src/code_index_mcp/ |
| 193 | +├── server.py # MCP interface layer |
| 194 | +├── services/ # Business logic services |
| 195 | +│ ├── project_management_service.py |
| 196 | +│ ├── file_watcher_service.py |
| 197 | +│ ├── index_management_service.py |
| 198 | +│ ├── code_intelligence_service.py |
| 199 | +│ └── ... |
| 200 | +├── scip/ # SCIP implementation |
| 201 | +│ ├── core/ # Language-agnostic core |
| 202 | +│ │ ├── symbol_manager.py |
| 203 | +│ │ ├── local_reference_resolver.py |
| 204 | +│ │ ├── position_calculator.py |
| 205 | +│ │ └── moniker_manager.py |
| 206 | +│ ├── strategies/ # Language-specific strategies |
| 207 | +│ │ ├── base_strategy.py |
| 208 | +│ │ ├── python_strategy.py |
| 209 | +│ │ ├── javascript_strategy.py |
| 210 | +│ │ └── ... |
| 211 | +│ └── factory.py # Strategy selection |
| 212 | +├── tools/ # Technical capabilities |
| 213 | +│ ├── filesystem/ |
| 214 | +│ ├── scip/ |
| 215 | +│ ├── config/ |
| 216 | +│ └── monitoring/ |
| 217 | +├── indexing/ # Index management |
| 218 | +└── utils/ # Shared utilities |
| 219 | +``` |
| 220 | + |
| 221 | +## Key Design Principles |
| 222 | + |
| 223 | +1. **Standards Compliance**: Full SCIP protocol adherence |
| 224 | +2. **Language Agnostic**: Core components independent of specific languages |
| 225 | +3. **Extensible**: Easy addition of new languages and features |
| 226 | +4. **Performance**: Efficient indexing and search operations |
| 227 | +5. **Reliability**: Fault-tolerant with comprehensive error handling |
| 228 | +6. **Maintainability**: Clear separation of concerns and modular design |
| 229 | + |
| 230 | +--- |
| 231 | + |
| 232 | +*Last updated: 2025-01-14* |
| 233 | +*Architecture version: 2.1.0* |