Commit fc58b18

refactor: remove unused analyzers system and add comprehensive documentation
Remove legacy analyzers directory that was completely unused in favor of the SCIP strategies system. Add comprehensive architecture and official SCIP standards documentation.

Changes:
- Remove entire src/code_index_mcp/analyzers/ directory (9 files)
- Update terminology in base_strategy.py and scip_symbol_analyzer.py
- Add ARCHITECTURE.md with complete system architecture overview
- Add SCIP_OFFICIAL_STANDARDS.md with pure official SCIP protocol standards
- Update CLAUDE.local.md with documentation references

The analyzers system was redundant with SCIP strategies and had no actual usage in the codebase. All file analysis now goes through the more comprehensive SCIP indexing system.
1 parent 9b19d1e commit fc58b18

13 files changed: +572 -608 lines changed

ARCHITECTURE.md

Lines changed: 233 additions & 0 deletions
@@ -0,0 +1,233 @@
# Code Index MCP System Architecture

## Overview

Code Index MCP is a Model Context Protocol (MCP) server that provides intelligent code indexing and analysis capabilities. The system follows the SCIP (SCIP Code Intelligence Protocol) standard and uses a service-oriented architecture with a clear separation of concerns.

## High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                       MCP Interface Layer                       │
├─────────────────────────────────────────────────────────────────┤
│                          Service Layer                          │
├─────────────────────────────────────────────────────────────────┤
│                         SCIP Core Layer                         │
├─────────────────────────────────────────────────────────────────┤
│                       Language Strategies                       │
├─────────────────────────────────────────────────────────────────┤
│                      Technical Tools Layer                      │
└─────────────────────────────────────────────────────────────────┘
```

## Layer Responsibilities

### 1. MCP Interface Layer (`server.py`)
**Purpose**: Exposes MCP tools and handles protocol communication

**Key Components**:
- MCP tool definitions (`@mcp.tool()`)
- Error handling and response formatting
- User interaction and guidance

**MCP Tools**:
- `set_project_path` - Initialize project indexing
- `find_files` - File discovery with patterns
- `get_file_summary` - File analysis and metadata
- `search_code_advanced` - Content search across files
- `refresh_index` - Manual index rebuilding
- `get_file_watcher_status` - File monitoring status
- `configure_file_watcher` - File watcher settings

### 2. Service Layer (`services/`)
**Purpose**: Business logic orchestration and workflow management

**Key Services**:
- `ProjectManagementService` - Project lifecycle and initialization
- `FileWatcherService` - Real-time file monitoring and auto-refresh
- `IndexManagementService` - Index rebuild operations
- `CodeIntelligenceService` - File analysis and symbol intelligence
- `FileDiscoveryService` - File pattern matching and discovery
- `SearchService` - Advanced code search capabilities

**Architecture Pattern**: Service delegation with clear business boundaries

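The delegation pattern described for the service layer could look roughly like the sketch below. It assumes a tool-level function that only validates input and formats the response, while the matching logic lives in a service. The `find` method, the constructor argument, and the explicit `service` parameter are hypothetical and chosen for illustration; in the real server the entry point would be registered with `@mcp.tool()`.

```python
from fnmatch import fnmatch
from typing import List


class FileDiscoveryService:
    """Hypothetical stand-in for the service of the same name."""

    def __init__(self, indexed_files: List[str]) -> None:
        self._indexed_files = indexed_files

    def find(self, pattern: str) -> List[str]:
        # Business logic lives in the service, not in the tool function.
        return sorted(p for p in self._indexed_files if fnmatch(p, pattern))


def find_files(pattern: str, service: FileDiscoveryService) -> dict:
    """Tool-level entry point: validate, delegate, format the response."""
    if not pattern:
        return {"error": "pattern must not be empty"}
    return {"files": service.find(pattern)}


service = FileDiscoveryService(["src/main.py", "src/util.py", "README.md"])
result = find_files("*.py", service)
```

Keeping the tool function this thin means the service can be tested without any MCP transport in place.
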
### 3. SCIP Core Layer (`scip/core/`)
**Purpose**: Language-agnostic SCIP protocol implementation

**Core Components**:
- `SCIPSymbolManager` - Standard SCIP symbol ID generation
- `LocalReferenceResolver` - Cross-file reference resolution
- `PositionCalculator` - AST/Tree-sitter position conversion
- `MonikerManager` - External package dependency handling

**Standards Compliance**: Full SCIP protocol buffer implementation

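As an illustration of the kind of conversion a position calculator performs, the sketch below maps a Python `ast` node to a 0-based SCIP-style range: three integers when the node sits on one line, four otherwise. The function name `scip_range` is hypothetical and this is not the project's `PositionCalculator`. It also assumes ASCII source, because `ast` reports columns as UTF-8 byte offsets and other position encodings would need an extra conversion step.

```python
import ast
from typing import List


def scip_range(node: ast.AST) -> List[int]:
    """Convert a Python AST node position to a 0-based SCIP-style range."""
    start_line = node.lineno - 1      # ast lines are 1-based
    end_line = node.end_lineno - 1
    start_col = node.col_offset       # ast columns are already 0-based
    end_col = node.end_col_offset
    if start_line == end_line:
        # Single-line form: [line, start_character, end_character]
        return [start_line, start_col, end_col]
    return [start_line, start_col, end_line, end_col]


tree = ast.parse("x = 1\ndef foo():\n    pass\n")
assign, func = tree.body
name_range = scip_range(assign.targets[0])   # the name `x`
func_range = scip_range(func)                # the whole `foo` definition
```
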
### 4. Language Strategies (`scip/strategies/`)
**Purpose**: Language-specific code analysis using two-phase processing

**Strategy Pattern Implementation**:
- `BaseStrategy` - Abstract interface and common functionality
- `PythonStrategy` - Python AST analysis
- `JavaScriptStrategy` - JavaScript/TypeScript Tree-sitter analysis
- `JavaStrategy` - Java Tree-sitter analysis
- `ObjectiveCStrategy` - Objective-C Tree-sitter analysis
- `FallbackStrategy` - Generic text-based analysis

**Two-Phase Analysis**:
1. **Phase 1**: Symbol definition collection
2. **Phase 2**: Reference resolution and SCIP document generation

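The two phases above can be sketched with Python's `ast` module. This is a simplified illustration, not the actual `PythonStrategy`: the function names are hypothetical, scoping is ignored, and the second phase returns plain tuples where a real strategy would emit SCIP documents.

```python
import ast
from typing import Dict, List, Tuple


def collect_definitions(source: str) -> Dict[str, int]:
    """Phase 1: map each function/class name to its definition line."""
    definitions: Dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            definitions[node.name] = node.lineno
    return definitions


def resolve_references(source: str,
                       definitions: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """Phase 2: report (name, use_line, definition_line) for known names."""
    references = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id in definitions:
                references.append((node.id, node.lineno, definitions[node.id]))
    return sorted(references, key=lambda ref: ref[1])


SOURCE = "def greet():\n    return 'hi'\n\nclass App:\n    pass\n\ngreet()\nApp()\n"
definitions = collect_definitions(SOURCE)
references = resolve_references(SOURCE, definitions)
```

Collecting every definition before resolving any reference is what lets a use that appears earlier in a file still resolve to a definition that appears later.
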
### 5. Technical Tools Layer (`tools/`)
**Purpose**: Low-level technical capabilities

**Tool Categories**:
- `filesystem/` - File system operations and pattern matching
- `scip/` - SCIP index operations and symbol analysis
- `config/` - Configuration and settings management
- `monitoring/` - File watching and system monitoring

## Data Flow Architecture

### File Analysis Workflow
```
User Request → Service Layer → SCIP Strategy → Core Components → SCIP Documents
```

### Index Management Workflow
```
File Changes → File Watcher → Index Management Service → Strategy Factory → Updated Index
```

### Search Workflow
```
Search Query → Search Service → Advanced Search Tools → Filtered Results
```

## SCIP Implementation Details

### Symbol ID Format
```
scip-{language} {manager} {package} [version] {descriptors}
```

**Examples**:
- Local: `scip-python local myproject src/main.py/MyClass#method().`
- External: `scip-python pip requests 2.31.0 sessions/Session#get().`

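A minimal sketch of assembling IDs in the format above, with the optional version omitted when it is absent. `format_symbol_id` is a hypothetical helper written for this illustration, not the API of `SCIPSymbolManager`.

```python
from typing import Optional


def format_symbol_id(language: str, manager: str, package: str,
                     descriptors: str, version: Optional[str] = None) -> str:
    """Join the parts of a symbol ID, skipping the version when absent."""
    parts = [f"scip-{language}", manager, package]
    if version:
        parts.append(version)
    parts.append(descriptors)
    return " ".join(parts)


local_id = format_symbol_id("python", "local", "myproject",
                            "src/main.py/MyClass#method().")
external_id = format_symbol_id("python", "pip", "requests",
                               "sessions/Session#get().", version="2.31.0")
```

Both calls reproduce the two example IDs listed above.
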
### Language Support Strategy

**Parsing Approaches**:
- **Python**: Native AST module
- **JavaScript/TypeScript**: Tree-sitter
- **Java**: Tree-sitter
- **Objective-C**: Tree-sitter
- **Others**: Fallback text analysis

**Supported Code Intelligence**:
- Symbol definitions (functions, classes, variables)
- Import/export tracking
- Cross-file reference resolution
- External dependency management
- Position-accurate symbol ranges

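One way to picture the parsing approaches listed above is a lookup from file extension to strategy name, with the fallback used for anything unrecognized. The extension sets and the `select_strategy` function are assumptions made for this sketch; the project's own factory may select strategies differently.

```python
from pathlib import Path

# Extension-to-strategy mapping. The extension sets are assumptions for
# illustration; only the strategy names come from the architecture document.
STRATEGY_BY_EXTENSION = {
    ".py": "PythonStrategy",
    ".js": "JavaScriptStrategy",
    ".jsx": "JavaScriptStrategy",
    ".ts": "JavaScriptStrategy",
    ".tsx": "JavaScriptStrategy",
    ".java": "JavaStrategy",
    ".m": "ObjectiveCStrategy",
    ".mm": "ObjectiveCStrategy",
}


def select_strategy(path: str) -> str:
    """Pick a strategy name by extension, defaulting to text analysis."""
    suffix = Path(path).suffix.lower()
    return STRATEGY_BY_EXTENSION.get(suffix, "FallbackStrategy")
```

For example, `select_strategy("src/App.TSX")` picks the JavaScript/TypeScript strategy, while `select_strategy("notes.txt")` falls through to `FallbackStrategy`.
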
## Configuration and Extensibility

### Package Manager Integration
- **Python**: pip, conda, poetry detection
- **JavaScript**: npm, yarn package.json parsing
- **Java**: Maven pom.xml, Gradle build files
- **Configuration-driven**: Easy addition of new package managers

### File Watcher System
- **Real-time monitoring**: Watchdog-based file system events
- **Debounced rebuilds**: 4-6 second batching of rapid changes
- **Configurable patterns**: Customizable include/exclude rules
- **Thread-safe**: ThreadPoolExecutor for concurrent rebuilds

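Debounced rebuilding can be sketched with `threading.Timer`: every file event restarts a quiet-period countdown, and a single rebuild runs once events stop arriving. The class and method names below are hypothetical, and the demo uses a 50 ms delay for brevity where the document describes a 4-6 second window. The real watcher receives its events from Watchdog, which this sketch leaves out.

```python
import threading
import time
from typing import Callable, Optional, Set


class DebouncedRebuild:
    """Batch rapid file events into one rebuild call after a quiet period."""

    def __init__(self, delay_seconds: float,
                 rebuild: Callable[[Set[str]], None]) -> None:
        self._delay = delay_seconds
        self._rebuild = rebuild
        self._pending: Set[str] = set()
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def on_change(self, path: str) -> None:
        with self._lock:
            self._pending.add(path)
            if self._timer is not None:
                self._timer.cancel()      # restart the quiet-period countdown
            self._timer = threading.Timer(self._delay, self._flush)
            self._timer.daemon = True
            self._timer.start()

    def _flush(self) -> None:
        with self._lock:
            batch, self._pending = self._pending, set()
            self._timer = None
        if batch:                         # skip if another flush took the batch
            self._rebuild(batch)


batches = []
debouncer = DebouncedRebuild(0.05, batches.append)
for changed in ("a.py", "b.py", "a.py"):
    debouncer.on_change(changed)
time.sleep(0.3)                           # wait past the quiet period
```

Three rapid events on two files produce one rebuild call covering both paths.
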
## Performance Characteristics

### Indexing Performance
- **Incremental updates**: File-level granular rebuilds
- **Parallel processing**: Concurrent file analysis
- **Memory efficient**: Streaming SCIP document generation
- **Cache optimization**: Symbol table reuse across phases

### Search Performance
- **Advanced tools**: ripgrep, ugrep, ag integration
- **Pattern optimization**: Glob-based file filtering
- **Result streaming**: Large result set handling

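Integration with external search tools usually starts by detecting which one is installed. The sketch below uses `shutil.which` for that. The preference order and the `pick_search_tool` helper are assumptions for illustration; the executable names are the usual ones (`rg` for ripgrep, `ugrep`, and `ag`).

```python
import shutil
from typing import Callable, Optional, Sequence

# Usual executable names; the preference order is an assumption.
PREFERRED_TOOLS = ("rg", "ugrep", "ag")


def pick_search_tool(
    candidates: Sequence[str] = PREFERRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for name in candidates:
        if which(name):
            return name
    return None


tool = pick_search_tool()   # result depends on what this machine has installed
```

Passing `which` as a parameter keeps the detection logic testable on machines where none of the tools are installed.
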
## Error Handling and Reliability

### Fault Tolerance
- **Graceful degradation**: Continue indexing on individual file failures
- **Error isolation**: Per-file error boundaries
- **Recovery mechanisms**: Automatic retry on transient failures
- **Comprehensive logging**: Debug and audit trail support

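Per-file error boundaries combined with concurrent analysis might look like the following sketch. Each file is indexed inside its own `try`/`except`, so one failure is recorded and the rest of the batch continues. `index_all` and `fake_indexer` are hypothetical names, retries and logging are left out, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple


def index_all(paths: Iterable[str], index_one: Callable[[str], int]
              ) -> Tuple[Dict[str, int], List[Tuple[str, str]]]:
    """Index every path; a failure in one file never aborts the others."""

    def guarded(path: str):
        try:
            return path, index_one(path), None
        except Exception as exc:          # per-file error boundary
            return path, None, f"{type(exc).__name__}: {exc}"

    indexed: Dict[str, int] = {}
    failures: List[Tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map yields results in input order.
        for path, result, error in pool.map(guarded, paths):
            if error is None:
                indexed[path] = result
            else:
                failures.append((path, error))
    return indexed, failures


def fake_indexer(path: str) -> int:
    """Hypothetical indexer: returns a symbol count, fails on 'bad' files."""
    if "bad" in path:
        raise ValueError("unparseable source")
    return len(path)


indexed, failures = index_all(["a.py", "bad.py", "main.py"], fake_indexer)
```
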
### Validation
- **Input sanitization**: Path traversal protection
- **Range validation**: SCIP position boundary checking
- **Schema validation**: Protocol buffer structure verification

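Path traversal protection typically means resolving a user-supplied path and checking that it still lies under the project root. The sketch below does this with `pathlib`. `resolve_inside` is a hypothetical helper, not the project's validator.

```python
import tempfile
from pathlib import Path


def resolve_inside(project_root: str, user_path: str) -> Path:
    """Resolve user_path under project_root, rejecting escapes such as '../'."""
    root = Path(project_root).resolve()
    candidate = (root / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes project root: {user_path}") from None
    return candidate


demo_root = tempfile.mkdtemp()
safe = resolve_inside(demo_root, "src/main.py")
```

Resolving before comparing matters: a check on the raw string would miss `src/../../etc/passwd`, which only reveals its target after normalization.
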
## Future Architecture Considerations

### Planned Enhancements
1. **Function Call Relationships**: Complete call graph analysis
2. **Type Information**: Enhanced semantic analysis
3. **Cross-repository Navigation**: Multi-project symbol resolution
4. **Language Server Protocol**: LSP compatibility layer
5. **Distributed Indexing**: Horizontal scaling support

### Extension Points
- **Custom strategies**: Plugin architecture for new languages
- **Analysis plugins**: Custom symbol analyzers
- **Export formats**: Multiple output format support
- **Integration APIs**: External tool connectivity

## Directory Structure

```
src/code_index_mcp/
├── server.py                   # MCP interface layer
├── services/                   # Business logic services
│   ├── project_management_service.py
│   ├── file_watcher_service.py
│   ├── index_management_service.py
│   ├── code_intelligence_service.py
│   └── ...
├── scip/                       # SCIP implementation
│   ├── core/                   # Language-agnostic core
│   │   ├── symbol_manager.py
│   │   ├── local_reference_resolver.py
│   │   ├── position_calculator.py
│   │   └── moniker_manager.py
│   ├── strategies/             # Language-specific strategies
│   │   ├── base_strategy.py
│   │   ├── python_strategy.py
│   │   ├── javascript_strategy.py
│   │   └── ...
│   └── factory.py              # Strategy selection
├── tools/                      # Technical capabilities
│   ├── filesystem/
│   ├── scip/
│   ├── config/
│   └── monitoring/
├── indexing/                   # Index management
└── utils/                      # Shared utilities
```

## Key Design Principles

1. **Standards Compliance**: Full SCIP protocol adherence
2. **Language Agnostic**: Core components independent of specific languages
3. **Extensible**: Easy addition of new languages and features
4. **Performance**: Efficient indexing and search operations
5. **Reliability**: Fault-tolerant with comprehensive error handling
6. **Maintainability**: Clear separation of concerns and modular design

---

*Last updated: 2025-01-14*
*Architecture version: 2.1.0*
