Source Atlas

License: MIT | Python 3.8+

Source Atlas is a multi-language code analyzer that combines Tree-sitter parsing, Language Server Protocol (LSP) integration, and a Neo4j graph database to build knowledge graphs of a codebase.

✨ Features

  • 🌍 Multi-Language Support: Analyze Java, Python, Go, and TypeScript codebases
  • 🔍 Deep Code Analysis: Extract classes, methods, dependencies, and relationships
  • 🧠 LSP Integration: Leverage Language Server Protocol for semantic analysis
  • 📊 Knowledge Graph: Build rich code graphs in Neo4j for advanced querying
  • 🎯 AST-Based: Uses Tree-sitter for accurate syntax parsing
  • ⚡ Incremental Analysis: Track code changes with AST hashing
  • 🔗 Relationship Tracking: Discover implements, extends, uses, and calls relationships
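
To illustrate the idea behind tracking changes with AST hashing, here is a minimal sketch. It is not Source Atlas's implementation: it swaps Tree-sitter for Python's built-in `ast` module so it runs without extra dependencies, and `ast_hash` and `changed_files` are hypothetical helper names. The point is that hashing the syntax tree rather than the raw text ignores formatting and comment changes but detects structural edits.

```python
import ast
import hashlib
from typing import Dict, List


def ast_hash(source: str) -> str:
    """Return a SHA-256 digest of the syntax tree parsed from `source`."""
    tree = ast.parse(source)
    # ast.dump leaves out comments, whitespace, and (with
    # include_attributes=False) line/column positions, so only
    # structural changes alter the digest.
    canonical = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_files(stored: Dict[str, str], sources: Dict[str, str]) -> List[str]:
    """Return the paths whose current AST hash differs from the stored hash."""
    return [
        path
        for path, text in sources.items()
        if stored.get(path) != ast_hash(text)
    ]


original = "def add(a, b):\n    return a + b\n"
reformatted = "def add(a,b):   # sum\n    return a+b\n"
edited = "def add(a, b):\n    return a - b\n"

stored = {"math_utils.py": ast_hash(original)}
print(changed_files(stored, {"math_utils.py": reformatted}))  # []
print(changed_files(stored, {"math_utils.py": edited}))       # ['math_utils.py']
```

Only files reported by `changed_files` would need to be re-analyzed on the next run.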

🏗️ Architecture

graph TB
    A[Source Code] --> B[Tree-sitter Parser]
    B --> C[AST Analysis]
    C --> D[LSP Service]
    D --> E[Code Analyzer]
    E --> F[Code Chunks]
    F --> G[Neo4j Knowledge Graph]

Components:

  • Analyzers: Language-specific code analyzers (Java, Python, Go, TypeScript)
  • Extractors: Extract specific code elements (classes, methods, endpoints)
  • LSP Service: Integrates with language servers for semantic information
  • Neo4j Service: Manages code graph database operations
  • Models: Domain models for code chunks, methods, and relationships

📋 Prerequisites

  • Python: 3.8 or higher
  • Neo4j: 5.x running locally or remotely
  • Language-specific tools (for the languages you want to analyze):
    • Java: JDK 11+ (for LSP server)
    • Python: Python 3.8+
    • Go: Go 1.16+
    • TypeScript: Node.js 14+

🚀 Installation

1. Clone the repository

git clone https://github.com/quyen-ngv/source-atlas.git
cd source-atlas

2. Create virtual environment

python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux/Mac
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Set up Neo4j

Download and install Neo4j Desktop or use Docker:

docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your_password \
  neo4j:5.14.0

5. Configure environment

Create a .env file from the template:

cp .env.example .env

Edit .env with your settings:

APP_NEO4J_URL=bolt://localhost:7687
APP_NEO4J_USER=neo4j
APP_NEO4J_PASSWORD=your_password
APP_NEO4J_DATABASE=neo4j

💻 Quick Start

Basic Usage

python -m source_atlas analyze \
  --project-path /path/to/your/project \
  --language java \
  --project-id my-project \
  --output ./output

Using as a Library

from pathlib import Path
from analyzers.analyzer_factory import AnalyzerFactory
from neo4jdb.neo4j_service import Neo4jService

# Create analyzer
analyzer = AnalyzerFactory.create_analyzer(
    language="java",
    root_path="/path/to/project",
    project_id="my-project",
    branch="main"
)

# Analyze project
with analyzer:
    chunks = analyzer.parse_project(Path("/path/to/project"))

# Import to Neo4j
neo4j_service = Neo4jService(
    url="bolt://localhost:7687",
    user="neo4j",
    password="your_password"
)
neo4j_service.import_code_chunks(
    chunks=chunks,
    batch_size=500,
    main_branch='main',
    base_branch='main',
    pull_request_id=None
)
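
The `batch_size=500` argument suggests that chunks are written to Neo4j in groups rather than one at a time. How `import_code_chunks` batches internally is not shown here, so the following is only a sketch of the general technique; `batches` is a hypothetical helper, not part of Source Atlas.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Stand-in data: 1200 integers instead of real code chunks.
fake_chunks = list(range(1200))
sizes = [len(batch) for batch in batches(fake_chunks, 500)]
print(sizes)  # [500, 500, 200]
```

Smaller batches keep each database transaction short; larger ones reduce the number of round trips.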

🔧 Configuration

Environment Variables

| Variable | Description | Default | Required |
|---|---|---|---|
| APP_NEO4J_URL | Neo4j connection URL | bolt://localhost:7687 | Yes |
| APP_NEO4J_USER | Neo4j username | neo4j | Yes |
| APP_NEO4J_PASSWORD | Neo4j password | - | Yes |
| APP_NEO4J_DATABASE | Neo4j database name | neo4j | Yes |
| NEO4J_MAX_CONNECTION_POOL_SIZE | Max connection pool size | 50 | No |
| NEO4J_CONNECTION_TIMEOUT | Connection timeout (seconds) | 30.0 | No |

See docs/configuration.md for detailed configuration options.
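
As a rough illustration of how the variables above could be read from the environment, here is a minimal sketch using only the standard library. `Neo4jSettings` and `load_settings` are hypothetical names, not Source Atlas's actual configuration code. The sketch assumes the listed defaults apply when a variable is unset, and that the password, which has no default, must be provided.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Neo4jSettings:
    url: str
    user: str
    password: str
    database: str
    max_connection_pool_size: int
    connection_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Neo4jSettings:
    """Build settings from environment variables, applying the documented defaults."""
    password = env.get("APP_NEO4J_PASSWORD")
    if not password:
        # No default is documented for the password, so fail early.
        raise ValueError("APP_NEO4J_PASSWORD must be set")
    return Neo4jSettings(
        url=env.get("APP_NEO4J_URL", "bolt://localhost:7687"),
        user=env.get("APP_NEO4J_USER", "neo4j"),
        password=password,
        database=env.get("APP_NEO4J_DATABASE", "neo4j"),
        max_connection_pool_size=int(env.get("NEO4J_MAX_CONNECTION_POOL_SIZE", "50")),
        connection_timeout=float(env.get("NEO4J_CONNECTION_TIMEOUT", "30.0")),
    )


# Pass a plain dict instead of os.environ to try it without touching the shell.
settings = load_settings({"APP_NEO4J_PASSWORD": "your_password"})
print(settings.url, settings.max_connection_pool_size)
```

Note that this sketch reads `os.environ` only; loading a `.env` file would need an extra step that is not shown.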

📚 Documentation

🎯 Examples

Analyze a Java Project

python -m source_atlas analyze \
  --project-path ./examples/java_project \
  --language java \
  --project-id example-java \
  --branch main

Query the Knowledge Graph

// Find all classes in a package
MATCH (c:Class {package: "com.example.service"})
RETURN c.className, c.filePath

// Find method call relationships
MATCH (m1:Method)-[:CALLS]->(m2:Method)
RETURN m1.name, m2.name

// Find implementation hierarchies
MATCH (c:Class)-[:IMPLEMENTS]->(i:Class)
RETURN c.fullClassName, i.fullClassName
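
The first query above can also be run from Python. This is a sketch, assuming the official `neo4j` driver package is installed (`pip install neo4j`) and that nodes carry the `className` and `filePath` properties used in the query. `classes_in_package` and `main` are hypothetical helpers, and the connection values are the placeholders from the installation steps. The package name is passed as a query parameter rather than formatted into the Cypher string.

```python
from typing import Any, Dict, List

CLASSES_IN_PACKAGE = (
    "MATCH (c:Class {package: $package}) "
    "RETURN c.className AS className, c.filePath AS filePath"
)


def classes_in_package(session: Any, package: str) -> List[Dict[str, Any]]:
    """Run the query on an open session and return one dict per class."""
    result = session.run(CLASSES_IN_PACKAGE, package=package)
    return [
        {"className": record["className"], "filePath": record["filePath"]}
        for record in result
    ]


def main() -> None:
    # Imported here so the helper above can be used without the driver installed.
    from neo4j import GraphDatabase

    # Placeholder connection values; replace with your own settings.
    with GraphDatabase.driver(
        "bolt://localhost:7687", auth=("neo4j", "your_password")
    ) as driver:
        with driver.session(database="neo4j") as session:
            for row in classes_in_package(session, "com.example.service"):
                print(row["className"], row["filePath"])
```

Call `main()` once Neo4j is running and a project has been imported.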

🗂️ Project Structure

source_atlas/
├── analyzers/          # Language-specific code analyzers
│   ├── base_analyzer.py
│   ├── java_analyzer.py
│   └── analyzer_factory.py
├── extractors/         # Code element extractors
│   ├── java/
│   ├── python/
│   ├── go/
│   └── typescript/
├── lsp/               # LSP service integration
│   ├── lsp_service.py
│   └── implements/
├── models/            # Domain models
│   └── domain_models.py
├── neo4jdb/           # Neo4j integration
│   ├── neo4j_service.py
│   └── neo4j_dto.py
├── utils/             # Utility functions
└── config/            # Configuration

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Tree-sitter - Incremental parsing system
  • Neo4j - Graph database platform
  • LSP - Language Server Protocol

📧 Contact

🐛 Issues & Support

If you encounter any issues or have questions:


Made with ❤️ by Nguyen Van Quyen
