RAGFS

An agentic FUSE filesystem that makes file management safe and structured for LLM agents. It provides JSON-based operations with undo support, complete audit logging, and AI-powered features such as semantic search, auto-organization, and deduplication.

Features

Agent File Operations - Structured file ops with JSON feedback via the .ragfs/.ops/ interface
Safety Layer - Soft delete, audit logging, and undo support via .ragfs/.safety/
AI-Powered Management - Auto-organization, deduplication, and cleanup via .ragfs/.semantic/
Semantic Search - Query files by meaning using vector similarity search
Local Embeddings - Runs entirely offline using the gte-small model via Candle
FUSE Integration - Mount indexed directories as a virtual filesystem
Real-time Indexing - Watch directories for changes and update the index automatically
Multimodal Support - Extract content from text, code, markdown, PDF, and images
Code-aware Chunking - Syntax-aware splitting using tree-sitter for source code
Hybrid Search - Combine vector similarity with full-text search
MCP Server - Claude Desktop integration for AI assistants
Comprehensive Testing - 270+ tests across all crates

Feature Status

| Feature | Status | Notes |
| --- | --- | --- |
| CLI (index, query, status) | Stable | Core functionality |
| FUSE mount | Stable | Linux only |
| Semantic search | Stable | Vector similarity with LanceDB |
| Hybrid search | Stable | Vector + full-text |
| Text extraction | Stable | 40+ formats |
| Code chunking | Stable | Tree-sitter based |
| PDF extraction | Stable | Text + embedded images |
| Agent operations (.ops/) | Stable | JSON feedback, batch support |
| Safety layer (.safety/) | Stable | Trash, history, undo |
| Semantic operations (.semantic/) | Beta | Organize, dedupe, cleanup |
| Python bindings | Beta | PyO3 based |
| MCP server | Beta | Claude Desktop integration |
| Image captioning | Experimental | Optional, requires `vision` feature |

Use Cases

Ideal for:

LLM agents managing files (Claude, GPT, local models)
Automated file organization and cleanup
Safe file operations with audit trail
Code repositories (1K-50K files)
Documentation collections
Research notes and papers
Local-first semantic search

Limitations:

Linux only (FUSE requirement)
Embedding model requires ~500MB disk
Large repositories (100K+ files) may need tuning (see the Performance Guide)

Requirements

Rust 1.88 or later
Linux with FUSE support (libfuse-dev on Debian/Ubuntu, fuse on Arch)
~500MB disk space for the embedding model (downloaded on first run)

Installation

# Clone the repository
git clone https://github.com/Venere-Labs/ragfs.git
cd ragfs

# Build in release mode
cargo build --release

# Install to ~/.cargo/bin
cargo install --path crates/ragfs

Quick Start

Index a directory

# Index all files in a directory
ragfs index ~/Documents

# Watch for changes (continuous indexing)
ragfs index ~/Documents --watch

Search your files

# Semantic search
ragfs query ~/Documents "machine learning implementation"

# Get more results
ragfs query ~/Documents "authentication logic" --limit 20

# JSON output for scripting
ragfs query ~/Documents "database connection" --format json
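The JSON output can also be consumed from another program. The sketch below shells out to `ragfs query` using only the flags shown above (`--limit` and `--format json`) and decodes whatever JSON is printed on stdout. The helper names `build_query_command` and `run_query` are illustrative, not part of RAGFS, and the sketch deliberately makes no assumption about the shape of the returned JSON.

```python
import json
import subprocess


def build_query_command(path, query, limit=10):
    """Build the argv list for `ragfs query` with JSON output."""
    return ["ragfs", "query", path, query,
            "--limit", str(limit), "--format", "json"]


def run_query(path, query, limit=10):
    """Run the query and return the decoded JSON document.

    Raises subprocess.CalledProcessError if ragfs exits with a non-zero status.
    """
    proc = subprocess.run(
        build_query_command(path, query, limit),
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)
```

Because no shell is involved, `~` is not expanded; pass an absolute path or wrap the argument in `os.path.expanduser` before calling `run_query`.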

Mount as a filesystem

# Create a mount point
mkdir ~/ragfs-mount

# Mount the indexed directory
ragfs mount ~/Documents ~/ragfs-mount --foreground

Check index status

ragfs status ~/Documents

Agent file operations (via FUSE mount)

# Create a file with feedback
printf 'docs/new.md\n# New Document\n' > ~/ragfs-mount/.ragfs/.ops/.create
cat ~/ragfs-mount/.ragfs/.ops/.result  # JSON with undo_id

# Delete a file (soft delete to trash)
echo "docs/old.md" > ~/ragfs-mount/.ragfs/.ops/.delete

# Find similar files
echo "src/main.rs" > ~/ragfs-mount/.ragfs/.semantic/.similar
cat ~/ragfs-mount/.ragfs/.semantic/.similar

# Undo an operation
echo "<undo_id>" > ~/ragfs-mount/.ragfs/.safety/.undo
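An agent would typically drive these control files from code rather than from a shell. The following is a minimal sketch of such a wrapper, assuming the layout shown above (`.ragfs/.ops/.create`, `.ragfs/.ops/.delete`, `.ragfs/.ops/.result`, `.ragfs/.safety/.undo`). The function names are hypothetical, and the only thing assumed about the result is that it is a JSON document.

```python
import json
from pathlib import Path


def control_dir(mount):
    """Return the .ragfs control directory under a mount point."""
    return Path(mount) / ".ragfs"


def read_result(mount):
    """Read and decode the JSON feedback from .ops/.result."""
    return json.loads((control_dir(mount) / ".ops" / ".result").read_text())


def create_file(mount, rel_path, content):
    """Create a file: the first line is the target path, the rest is the content."""
    (control_dir(mount) / ".ops" / ".create").write_text(f"{rel_path}\n{content}\n")
    return read_result(mount)


def delete_file(mount, rel_path):
    """Soft-delete a file by writing its path to .ops/.delete."""
    (control_dir(mount) / ".ops" / ".delete").write_text(f"{rel_path}\n")


def undo(mount, undo_id):
    """Revert a previous operation by writing its undo_id to .safety/.undo."""
    (control_dir(mount) / ".safety" / ".undo").write_text(f"{undo_id}\n")
```

A caller could keep the value returned by `create_file` and later pass its `undo_id` entry to `undo`. That `undo_id` sits at the top level of the JSON is an assumption here; check the actual `.result` output before relying on it.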

CLI Reference

ragfs [OPTIONS] <COMMAND>

Commands:
  mount   Mount a directory as a RAGFS filesystem
  index   Index a directory (without mounting)
  query   Query the index
  status  Show index status
  config  Manage configuration

Options:
  -c, --config <FILE>    Config file path [default: ~/.config/ragfs/config.toml]
  -v, --verbose          Enable verbose logging
  -f, --format <FORMAT>  Output format: text, json [default: text]
  -h, --help             Print help
  -V, --version          Print version

mount

ragfs mount <SOURCE> <MOUNTPOINT> [OPTIONS]

Arguments:
  <SOURCE>      Source directory to index
  <MOUNTPOINT>  Mount point

Options:
  -f, --foreground  Run in foreground (don't daemonize)
      --allow-other Allow other users to access the mount

index

ragfs index <PATH> [OPTIONS]

Arguments:
  <PATH>  Directory to index

Options:
  -f, --force  Force reindexing of all files
  -w, --watch  Watch for changes after initial indexing

query

ragfs query <PATH> <QUERY> [OPTIONS]

Arguments:
  <PATH>   Path to indexed directory
  <QUERY>  Query string

Options:
  -l, --limit <LIMIT>  Maximum results [default: 10]

status

ragfs status <PATH>

Arguments:
  <PATH>  Path to indexed directory

config

ragfs config <ACTION>

Actions:
  show  Display current configuration
  init  Print sample config file
  path  Print config file path

Architecture

RAGFS is organized as a Rust workspace with specialized crates:

| Crate | Description |
| --- | --- |
| `ragfs` | CLI application |
| `ragfs-core` | Core traits and types |
| `ragfs-fuse` | FUSE filesystem implementation |
| `ragfs-index` | File indexing engine |
| `ragfs-chunker` | Document chunking strategies |
| `ragfs-embed` | Embedding generation (Candle) |
| `ragfs-extract` | Content extraction |
| `ragfs-store` | Vector storage (LanceDB) |
| `ragfs-query` | Query execution |

See docs/ARCHITECTURE.md for detailed architecture documentation.

Documentation

Getting Started - 5-minute tutorial
User Guide - Complete CLI reference
Configuration - All config options
Performance Guide - Tuning and optimization
Troubleshooting - Common issues and solutions
Architecture - Technical deep-dive
Architecture Decisions - Why we made these choices
API Reference - Library usage and types
Python Bindings - Python SDK and framework integrations
MCP Server - Claude Desktop integration
Development Guide - Contributing to RAGFS

How It Works

1. Extraction - Content is extracted from files based on their MIME type
2. Chunking - Text is split into overlapping chunks (~512 tokens each)
3. Embedding - Each chunk is converted to a 384-dimensional vector using the gte-small model
4. Storage - Vectors are stored in LanceDB for efficient similarity search
5. Search - Queries are embedded and matched against stored vectors using cosine similarity
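The chunking and search steps can be illustrated with a small, dependency-free sketch. It is not the RAGFS implementation, which is written in Rust and delegates storage and search to LanceDB. The overlap of 64 tokens is an assumed value (only the ~512-token chunk size is stated above), tokens are treated as an opaque list, and all function names are illustrative.

```python
import math


def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=10):
    """Rank (chunk_id, vector) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In the real pipeline the vectors have 384 dimensions and come from the gte-small model. `top_k` scans every stored vector, which is fine for illustration but is exactly the linear work a vector store is meant to avoid.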

Storage Locations

Indices: ~/.local/share/ragfs/indices/{hash}/index.lance
Models: ~/.local/share/ragfs/models/

License

Licensed under either of:

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contributing

See CONTRIBUTING.md for guidelines.
