Development Guide

How the codebase is structured and how to modify it.

Project Structure

PersonalDataHub/
├── src/
│   ├── index.ts                 # Entry point — loads config, creates DB, starts server
│   ├── cli.ts                   # CLI commands (init, start, stop, status, mcp, reset)
│   ├── config/                  # Configuration system
│   │   ├── types.ts             # TypeScript interfaces (HubConfig, SourceConfig, etc.)
│   │   ├── schema.ts            # Zod schemas for config validation
│   │   └── loader.ts            # YAML loading with ${ENV_VAR} resolution
│   ├── db/                      # Database layer
│   │   ├── db.ts                # SQLite connection (WAL mode, foreign keys)
│   │   ├── schema.ts            # CREATE TABLE statements
│   │   └── encryption.ts        # AES-256-GCM encrypt/decrypt for OAuth tokens
│   ├── auth/                    # Authentication
│   │   ├── oauth-routes.ts      # OAuth flows (Gmail, GitHub)
│   │   └── token-manager.ts     # Encrypted token storage and refresh
│   ├── connectors/              # Source service integrations
│   │   ├── types.ts             # DataRow interface, SourceConnector interface
│   │   ├── gmail/
│   │   │   └── connector.ts     # Gmail fetch, executeAction
│   │   └── github/
│   │       ├── connector.ts     # GitHub access validation, fetch issues/PRs
│   │       └── setup.ts         # Grant/revoke collaborator access
│   ├── filters.ts               # Quick filter types, catalog, and apply logic
│   ├── mcp/                     # MCP server
│   │   ├── server.ts            # Stdio MCP server with source-specific tools
│   │   └── server.test.ts       # MCP server tests
│   ├── audit/
│   │   └── log.ts               # AuditLog class — typed write methods + filtered queries
│   ├── server/                  # HTTP layer
│   │   ├── server.ts            # Hono app setup, mounts API + GUI + OAuth routes
│   │   └── app-api.ts           # POST /pull, POST /propose, GET /sources
│   ├── gui/
│   │   └── routes.ts            # Self-contained HTML GUI with inline JS
│   └── test-utils.ts            # Shared test utilities
├── packages/
│   └── personaldatahub/         # OpenClaw skill
│       └── src/
│           ├── index.ts         # Plugin registration
│           ├── hub-client.ts    # HTTP client for PersonalDataHub API
│           ├── tools.ts         # Tool definitions
│           └── prompts.ts       # System prompt for teaching agents
├── tests/
│   └── e2e/                     # End-to-end integration tests
│       ├── helpers.ts           # Shared setup (mock connector, in-memory DB)
│       ├── gmail-recent-readonly.test.ts
│       ├── gmail-metadata-only.test.ts
│       ├── gmail-full-access-redacted.test.ts
│       └── gmail-staged-action.test.ts
├── systemdesigns/                        # Documentation
├── hub-config.example.yaml
├── package.json
├── tsconfig.json
├── vitest.config.ts
└── eslint.config.js

Tech Stack

Component	Choice
Language	TypeScript (strict, ESM, NodeNext modules)
Runtime	Node.js >= 22
Database	better-sqlite3 (WAL mode)
Encryption	AES-256-GCM (application-level, for OAuth tokens)
HTTP Server	Hono (bound to 127.0.0.1)
Agent Protocol	MCP via @modelcontextprotocol/sdk
Config	YAML + Zod validation
Gmail API	googleapis
GitHub API	octokit
Tests	Vitest
Package Manager	pnpm

Development Commands

# Build
pnpm build

# Watch mode for TypeScript
pnpm dev

# Run all tests
pnpm test

# Run tests in watch mode
pnpm test:watch

# Lint
pnpm lint

Database Schema

Tables in SQLite:

Table	Purpose
`oauth_tokens`	Encrypted OAuth tokens per source
`owner_auth`	Bcrypt-hashed owner password for GUI access
`filters`	Quick filter definitions per source (type, value, enabled)
`staging`	Outbound actions pending owner review
`audit_log`	Every data movement with timestamps and purpose

Key Data Types

DataRow

The normalized shape for all source data:

type DataRow = {
  source: string;           // "gmail", "github"
  source_item_id: string;   // original ID in source system
  type: string;             // "email", "issue", "pr", "commit"
  timestamp: string;        // ISO 8601
  data: Record<string, unknown>;  // all content fields
}

QuickFilter

Stored in the filters table:

type QuickFilter = {
  id: string;
  source: string;     // "gmail", "github"
  type: string;       // "time_after", "from_include", "subject_include", etc.
  value: string;      // filter value (e.g., date, sender name, field name)
  enabled: number;    // 1 = active, 0 = disabled
}

Available filter types: time_after, from_include, subject_include, exclude_sender, exclude_keyword, has_attachment, hide_field.

How to Add a New Source Connector

Create src/connectors/<source>/connector.ts implementing SourceConnector:

interface SourceConnector {
  name: string;
  fetch(boundary: SourceBoundary, params?: Record<string, unknown>): Promise<DataRow[]>;
  executeAction(actionType: string, actionData: Record<string, unknown>): Promise<ActionResult>;
}

Register the connector in src/index.ts by adding it to the connector registry.
Add MCP tools for the new source in src/mcp/server.ts — add a registerXxxTools() function and wire it to the source name in startMcpServer().
Add tests in src/connectors/<source>/<source>.test.ts.
The connector maps source API responses into DataRow[] — all content goes in data, the four fixed fields (source, source_item_id, type, timestamp) are set at the top level.

How Data Flows

Pull request (`POST /app/v1/pull`)

Fetches live data from the connector using the configured boundary
Loads enabled quick filters from the filters table for that source
Applies filters: row predicates first (time_after, from_include, etc.), then field removal (hide_field)
Logs the access to the audit log
Returns the filtered data

Request → fetch data (live from source) → apply filters → audit log → response

MCP tool call (e.g., `read_emails`)

Agent calls the MCP tool via stdio
The MCP server builds an HTTP request body from the tool arguments
Calls POST /app/v1/pull on the local HTTP server
Returns the JSON response as MCP text content

Agent → MCP stdio → fetch(hubUrl/app/v1/pull) → HTTP server → connector → filters → response

Propose action (`POST /app/v1/propose`)

Inserts the action into the staging table with status pending
Logs the proposal to the audit log
Owner reviews in the GUI and approves/rejects
On approval, the connector's executeAction is called

Testing

Tests are co-located with source files (*.test.ts) for unit tests, and in tests/e2e/ for integration tests.

The e2e tests use an in-memory SQLite database and a mock Gmail connector (defined in tests/e2e/helpers.ts) so they run without external services.

To run a specific test file:

npx vitest run tests/e2e/gmail-recent-readonly.test.ts

To run just the MCP server tests:

npx vitest run src/mcp/server.test.ts

OpenClaw Skill

The skill in packages/personaldatahub/ is a standalone package with its own tsconfig.json and test suite. It wraps the PersonalDataHub API endpoints as OpenClaw tools.

To work on it:

cd packages/personaldatahub
pnpm test

The skill has no dependency on the main PersonalDataHub source — it only talks to the Hub over HTTP.

MCP Server

The MCP server (src/mcp/server.ts) provides a stdio transport for MCP-compatible agents. It:

Reads ~/.pdh/config.json for the hub URL
Health-checks the running HTTP server
Queries GET /app/v1/sources to discover connected sources
Registers source-specific tools dynamically (only for connected sources)

To test the MCP server manually:

npx pdh mcp
# Logs registered tools to stderr, then listens on stdio for MCP protocol messages

To add tools for a new source, add a registerXxxTools() function in src/mcp/server.ts and call it conditionally based on the source's connection status.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Development Guide

Project Structure

Tech Stack

Development Commands

Database Schema

Key Data Types

DataRow

QuickFilter

How to Add a New Source Connector

How Data Flows

Pull request (`POST /app/v1/pull`)

MCP tool call (e.g., `read_emails`)

Propose action (`POST /app/v1/propose`)

Testing

OpenClaw Skill

MCP Server

FilesExpand file tree

DEVELOPMENT.md

Latest commit

History

DEVELOPMENT.md

File metadata and controls

Development Guide

Project Structure

Tech Stack

Development Commands

Database Schema

Key Data Types

DataRow

QuickFilter

How to Add a New Source Connector

How Data Flows

Pull request (POST /app/v1/pull)

MCP tool call (e.g., read_emails)

Propose action (POST /app/v1/propose)

Testing

OpenClaw Skill

MCP Server

Pull request (`POST /app/v1/pull`)

MCP tool call (e.g., `read_emails`)

Propose action (`POST /app/v1/propose`)