30 changes: 30 additions & 0 deletions docs/examples.md

**Agent Activity**: [Search for Copilot activity](https://github.com/EBISPOT/efo/issues?q=author%3Aapp%2Fcopilot)

## Agentic Infrastructure

### ai-blame
**Repository**: [ai4curation/ai-blame](https://github.com/ai4curation/ai-blame)

Provenance extraction from agent execution traces. Demonstrates line-level attribution for AI-assisted edits.

- [Docs](https://ai4curation.github.io/ai-blame) | [PyPI](https://pypi.org/project/ai-blame/)

### curation-skills
**Repository**: [ai4curation/curation-skills](https://github.com/ai4curation/curation-skills)

Reusable skill packs for ontology and biocuration tasks. Example of structuring agent behavior for consistency.

- [Skills Article](https://anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills)

### noctua-mcp
**Repository**: [geneontology/noctua-mcp](https://github.com/geneontology/noctua-mcp)

MCP server for GO-CAM editing — an example of wrapping a domain-specific API for agent access.

- [PyPI](https://pypi.org/project/noctua-mcp/)

### ICBO 2025 AI Tutorial
**Repository**: [ai4curation/icbo-ai-tutorial](https://github.com/ai4curation/icbo-ai-tutorial)

Tutorial materials for agent-assisted ontology curation workflows, with exercises and social coding patterns.

- [Tutorial Page](https://go.lbl.gov/icbo-2025-ai) | [Recording](https://youtu.be/_9Re39yB7EE) | [Zenodo](https://zenodo.org/records/18653147)

## Documentation Repositories

### AI4Curators Documentation
33 changes: 32 additions & 1 deletion docs/faq.md
GitHub Copilot's coding agent includes a firewall that restricts internet access:
- ✅ Works: `https://raw.githubusercontent.com/oborel/obo-relations/refs/heads/master/ro-base.owl`
- ❌ Blocked: `http://purl.obolibrary.org/obo/ro/ro-base.owl`

For more details, see the [GitHub Copilot](reference/clients/github-copilot.md) documentation and GitHub's guide on [customizing the agent firewall](https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent/customize-the-agent-firewall).

## Agent Harness and Infrastructure

### What is an agent harness?

An [agent harness](glossary.md#agent-harness) is the infrastructure that wraps around an AI agent to manage its execution — context management, tool orchestration, validation, provenance, and human-in-the-loop controls. It's the difference between an agent that sometimes works and one that's reliable and auditable.

The same model behaves differently depending on the harness around it. Getting the harness right matters more than picking the "best" model.

See [Build your agentic harness](how-tos/build-agentic-harness.md) for a practical guide.

### What tools are available for validating agent outputs?

Two key validators:

- **[linkml-term-validator](https://github.com/linkml/linkml-term-validator)** — checks that ontology terms referenced in agent outputs actually exist and are used correctly
- **[linkml-reference-validator](https://github.com/linkml/linkml-reference-validator)** — checks that cited references actually contain the claimed supporting text

Both can be integrated into CI pipelines to automatically validate agent-generated pull requests. See [Agentic tools](reference/agentic-tools.md) for details.

### How do I track which changes an AI agent made?

Use **[ai-blame](https://github.com/ai4curation/ai-blame)**, which extracts provenance and audit trails from agent execution traces. It provides line-level attribution, so you can see exactly which changes the agent made, when, and in what context.
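For the actual trace format and CLI, see the ai-blame docs. The core idea of line-level attribution can be sketched in plain Python (the trace structure below is hypothetical, not ai-blame's actual API):

```python
# Conceptual sketch of line-level attribution from an agent execution trace.
# The trace structure here is hypothetical, not ai-blame's actual format.

def attribute_lines(trace):
    """Map each line number to the author of its most recent edit."""
    attribution = {}
    for event in trace:  # events are assumed to be in chronological order
        for line_no in event["lines"]:
            attribution[line_no] = event["author"]
    return attribution

trace = [
    {"author": "human", "lines": [1, 2, 3]},
    {"author": "copilot", "lines": [2, 4]},  # the agent later rewrites line 2
]
print(attribute_lines(trace))  # {1: 'human', 2: 'copilot', 3: 'human', 4: 'copilot'}
```

Later edits win, so a line touched by both a human and the agent is attributed to whoever edited it last — the same question `git blame` answers, but at the level of agent trace events.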

### What MCP servers are available for ontology work?

- **[noctua-mcp](https://github.com/geneontology/noctua-mcp)** — GO-CAM editing via Noctua/Barista
- **[oak-mcp](https://github.com/monarch-initiative/oak-mcp)** — ontology search, traversal, and operations via OAK
- **[owl-mcp](https://github.com/monarch-initiative/owl-mcp)** — general OWL ontology operations

These give agents structured access to domain-specific operations instead of raw file manipulation.
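The mechanics of a real MCP server are beyond a FAQ entry, but the underlying idea — exposing named operations with validated arguments instead of raw file access — can be sketched in plain Python (tool names and signatures here are illustrative, not those of any server listed above):

```python
# Minimal sketch of "structured tool access": the agent calls named operations
# whose arguments are checked before anything runs, rather than editing
# ontology files as raw text. Illustrative only; not noctua-mcp or oak-mcp.

TOOLS = {}

def tool(name):
    """Register a function as a named tool an agent may invoke."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

ONTOLOGY = {"GO:0008150": {"label": "biological_process", "synonyms": []}}

@tool("add_synonym")
def add_synonym(term_id: str, synonym: str) -> dict:
    if term_id not in ONTOLOGY:
        raise ValueError(f"unknown term: {term_id}")  # fail fast on hallucinated IDs
    ONTOLOGY[term_id]["synonyms"].append(synonym)
    return ONTOLOGY[term_id]

# The agent invokes the tool by name; a hallucinated term ID raises instead of
# silently corrupting a file.
result = TOOLS["add_synonym"]("GO:0008150", "biological process")
print(result["synonyms"])  # ['biological process']
```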
42 changes: 40 additions & 2 deletions docs/glossary.md
# Glossary

## Agent Harness

The infrastructure and control plane that wraps around an [AI agent](#ai-agent) to manage its execution. An agent harness handles context management, [tool](#tool) orchestration, validation, provenance, error recovery, and human-in-the-loop controls. It does not replace the agent — it governs how the agent operates. Think of it as the difference between a container image and the Kubernetes cluster that runs it.

In a curation context, a harness typically includes: system instructions (e.g. `CLAUDE.md`), validators (e.g. [linkml-term-validator](#linkml-term-validator), [linkml-reference-validator](#linkml-reference-validator)), provenance tools (e.g. [ai-blame](#ai-blame)), [MCP](#model-context-protocol-mcp) servers for domain-specific tool access, and [GitHub Actions](#github-actions) for lifecycle automation.
See also: [Build your agentic harness](how-tos/build-agentic-harness.md)

## Agentic Coding Tool

A coding tool that combines AI capabilities with direct access to development tools and environments, enabling natural language interaction with code bases and automated execution of development tasks.

## AI Agent

An agent is a program that allows AI models to call [tools](#tool) to achieve some objective.

## ai-blame

A tool that extracts provenance and audit trails from agent execution traces, enabling line-level attribution and post-hoc review for AI-assisted edits. Useful for tracking which changes an AI agent made, when, and in what context.

See [GitHub](https://github.com/ai4curation/ai-blame), [Docs](https://ai4curation.github.io/ai-blame), [PyPI](https://pypi.org/project/ai-blame/)

## CBORG

An AI proxy used by members of Berkeley Lab. CBORG uses [LiteLLM](#litellm) to proxy calls to models.

## Claude

An LLM from Anthropic.

## Curation Skill

A reusable, domain-specific instruction pack that shapes [AI agent](#ai-agent) behavior for ontology and biocuration tasks. Skills make agent behavior more consistent, transparent, and domain-aware by providing structured prompts and tool configurations. See [curation-skills](https://github.com/ai4curation/curation-skills).

## Claude Code

## Ontology
A collection of concepts that represent some domain, along with the relationships between them.

## Provenance

In the context of AI-assisted curation, provenance refers to tracking the origin, authorship, and context of changes — particularly distinguishing human-authored from AI-generated edits. Tools like [ai-blame](#ai-blame) extract provenance from agent execution traces, enabling line-level attribution and post-hoc review.

## PMID (PubMed ID)
A unique numerical identifier assigned to every record in PubMed, the comprehensive database of biomedical literature.

PMIDs are usually written as CURIEs, e.g. PMID:123456
## API Key
A unique code or token that is passed to an Application Programming Interface (API) to authenticate the calling application or user. API keys are used to control access to the API, track usage, and manage permissions.

## linkml-reference-validator

A validator that checks whether supporting text in structured records is actually present in cited references, helping enforce evidence-backed curation in agent-generated outputs. See [GitHub](https://github.com/linkml/linkml-reference-validator), [Docs](https://linkml.io/linkml-reference-validator/), [PyPI](https://pypi.org/project/linkml-reference-validator/)

## linkml-term-validator

A validator that checks [LinkML](https://linkml.io/) schemas and datasets for correct use of external ontologies and controlled terms, improving consistency for agent-generated outputs. See [GitHub](https://github.com/linkml/linkml-term-validator), [Docs](https://linkml.io/linkml-term-validator/), [PyPI](https://pypi.org/project/linkml-term-validator/)

## Knowledge Base (KB)
A centralized repository for storing and managing complex information, both structured and unstructured. In the context of AI and curation, KBs often refer to curated collections of domain-specific knowledge that AI systems can utilize and help maintain.


The best way to understand the concept of MCPs is to try them out - the Goose desktop app makes this easy.

Domain-specific MCP servers relevant to curation include:

- [noctua-mcp](https://github.com/geneontology/noctua-mcp) — MCP server for GO-CAM editing via Noctua/Barista
- [oak-mcp](https://github.com/monarch-initiative/oak-mcp) — MCP server for ontology operations via [OAK](#oak-ontology-access-kit)

## OAK (Ontology Access Kit)
A unified Python and CLI toolkit for ontology search, graph operations, mapping generation, and programmatic ontology manipulation. OAK provides a common interface over multiple ontology sources including BioPortal, OLS, local files, and SPARQL endpoints. See [GitHub](https://github.com/INCATools/ontology-access-kit), [Docs](https://incatools.github.io/ontology-access-kit/), [PyPI](https://pypi.org/project/oaklib/)

## ODK (Ontology Development Kit)
A suite of tools, best practices, and standardized workflows for creating, maintaining, and quality control for ontologies, particularly those in the OBO (Open Biomedical Ontologies) ecosystem. More information can be found at the [ODK GitHub repository](https://incatools.github.io/ontology-development-kit/).

135 changes: 135 additions & 0 deletions docs/how-tos/build-agentic-harness.md
# Build Your Agentic Harness for Curation

## What is an agent harness?

An [agent harness](../glossary.md#agent-harness) is the infrastructure that wraps around an AI agent to manage its execution. It is not the agent itself — it is the system that governs how the agent operates, ensuring it remains reliable, consistent, and reviewable.

Think of it this way: the same model (e.g. Claude Opus) behaves differently in Claude Code versus Cursor versus a custom API setup. The model is the same. **The harness changes everything.**

In 2025, the focus was on building agents. In 2026, the focus has shifted to building the infrastructure that controls them — context management, tool orchestration, validation, provenance, and human-in-the-loop controls.

## Why curators need a harness

Without a harness, AI-assisted curation is fragile:

- Agents lose track of objectives mid-task
- Agent outputs contain hallucinated ontology terms or fabricated citations
- There's no record of what the agent changed or why
- Different agents behave inconsistently on the same repository
- Errors compound without automated validation gates

A harness addresses each of these failure modes with concrete infrastructure.

## Components of a curation harness

The ai4curation ecosystem already provides the building blocks. Here's how they map to harness components:

### 1. Prompt preset management — System instructions

Every repository should have a `CLAUDE.md` (for Claude Code / GitHub agents) and/or `.goosehints` (for Goose) checked into the root. These files tell the agent what the repository is, what conventions to follow, and what tools to use.

**Examples:**

- [CLAUDE.md in Mondo](https://github.com/monarch-initiative/mondo/blob/master/CLAUDE.md)
- [CLAUDE.md in Uberon](https://github.com/obophenotype/uberon/blob/master/CLAUDE.md)

**See:** [Instruct the GitHub agent](instruct-github-agent.md)

### 2. Tool access — MCP servers

MCP servers give agents structured access to domain-specific operations instead of relying on raw file manipulation:

- **[noctua-mcp](https://github.com/geneontology/noctua-mcp)** — GO-CAM editing via Noctua/Barista
- **[oak-mcp](https://github.com/monarch-initiative/oak-mcp)** — ontology operations via OAK
- **[owl-mcp](https://github.com/monarch-initiative/owl-mcp)** — OWL ontology operations

Without proper tool access, agents resort to ad-hoc text manipulation of ontology files, which is error-prone.

### 3. Validation — Automated quality gates

Validators catch errors in agent outputs before they reach human reviewers:

- **[linkml-term-validator](https://github.com/linkml/linkml-term-validator)** — checks that ontology terms referenced in agent outputs actually exist
- **[linkml-reference-validator](https://github.com/linkml/linkml-reference-validator)** — checks that cited references actually contain the claimed supporting text

These can be run as part of CI/CD pipelines or as pre-commit hooks.
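Both validators are schema-driven and far more thorough than this, but the two checks they gate on can be sketched as follows (simplified illustration, not the validators' actual implementation):

```python
# Simplified sketch of the two validation gates. Real validators resolve terms
# against live ontologies and parse full references; this only shows the idea.

KNOWN_TERMS = {"GO:0008150", "GO:0003674"}  # stand-in for a real ontology lookup

def term_exists(curie: str) -> bool:
    """Term check: does the referenced ontology term actually exist?"""
    return curie in KNOWN_TERMS

def reference_supports(claimed_text: str, reference_fulltext: str) -> bool:
    """Reference check: does the cited source contain the claimed text?"""
    return claimed_text.lower() in reference_fulltext.lower()

record = {"term": "GO:0008150", "supporting_text": "regulates growth"}
fulltext = "Our results show that this gene regulates growth in vivo."

assert term_exists(record["term"])
assert reference_supports(record["supporting_text"], fulltext)
assert not term_exists("GO:9999999")  # a hallucinated term is caught here
```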

### 4. Provenance — Audit trails

**[ai-blame](https://github.com/ai4curation/ai-blame)** extracts provenance from agent execution traces, enabling line-level attribution. This answers the question: "which changes did the AI make, and which did a human make?"

### 5. Consistent behavior — Agent skills

**[curation-skills](https://github.com/ai4curation/curation-skills)** provides reusable skill packs that make agent behavior consistent and domain-aware across different curation tasks.

### 6. Lifecycle hooks — GitHub Actions

GitHub Actions workflows (e.g. `ai-agent.yml`) automate the agent lifecycle:

- Trigger agents on issue creation or labeling
- Run validation after agent commits
- Gate merging on passing checks
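The first of these — triggering only on the right events — is mostly a filtering decision. A minimal sketch of that logic (the event fields below loosely mirror a GitHub webhook payload but are simplified, not the exact schema):

```python
# Sketch of the triggering decision an ai-agent workflow might implement:
# run the agent only when an issue is labeled with an allowlisted label.
# Field names are simplified stand-ins for the real webhook payload.

AGENT_LABELS = {"ai-agent", "curation-task"}

def should_trigger(event: dict) -> bool:
    """Return True if this event should start an agent run."""
    if event.get("action") != "labeled":
        return False
    return event.get("label", {}).get("name") in AGENT_LABELS

assert should_trigger({"action": "labeled", "label": {"name": "ai-agent"}})
assert not should_trigger({"action": "opened", "label": {"name": "ai-agent"}})
```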

**See:** [Set up GitHub Actions](set-up-github-actions.md)

### 7. Human-in-the-loop — Branch protection

GitHub branch protection rules ensure agents can't merge directly to main. Every agent-generated change goes through a pull request where a human curator reviews it.

## How the components compose

```
┌─────────────────────────────────────────────────────┐
│ Agent Harness │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ CLAUDE.md │ │ curation- │ System │
│ │ .goosehints │ │ skills │ Instructions │
│ └──────────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ noctua-mcp │ │ oak-mcp │ Tool │
│ │ owl-mcp │ │ │ Access │
│ └──────────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ term- │ │ reference- │ Validation │
│ │ validator │ │ validator │ │
│ └──────────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ ai-blame │ │ GitHub │ Provenance │
│ │ │ │ Actions │ & Lifecycle │
│ └──────────────┘ └──────────────┘ │
│ │
│ ┌─────────────────────────────────┐ │
│ │ Branch protection / PR review │ Human │
│ │ │ Oversight │
│ └─────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────┐ │
│ │ AI Agent (Claude, Goose) │ The Agent │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
```

## Getting started

A minimal harness for an ontology repository:

1. **Add system instructions**: Create a `CLAUDE.md` at the repo root ([guide](instruct-github-agent.md))
2. **Set up GitHub Actions**: Add an `ai-agent.yml` workflow ([guide](set-up-github-actions.md))
3. **Enable branch protection**: Require PR reviews before merging
4. **Install validators**: Add `linkml-term-validator` and/or `linkml-reference-validator` to your CI pipeline
5. **Configure MCP servers**: Set up `oak-mcp` or `noctua-mcp` for domain-specific tool access

As you mature your setup, add:

6. **Provenance tracking**: Integrate `ai-blame` for audit trails
7. **Curation skills**: Use `curation-skills` packs for consistent agent behavior

## Further reading

- [Agentic tools reference](../reference/agentic-tools.md) — detailed documentation for each tool
- [Example repositories](../examples.md) — see harnesses in action
- [Agentic tooling on GitHub](https://github.com/topics/ai4curation)
26 changes: 25 additions & 1 deletion docs/how-tos/integrate-ai-into-your-kb.md
As an example, this video shows how to configure:
allowfullscreen>
</iframe>

## Tip 4: Validate agent outputs automatically

Agents can hallucinate ontology terms and fabricate citations. Add automated validation to catch these before review:

- **[linkml-term-validator](https://github.com/linkml/linkml-term-validator)** — checks that ontology terms in agent outputs actually exist
- **[linkml-reference-validator](https://github.com/linkml/linkml-reference-validator)** — checks that cited references contain the claimed supporting text

These can run as CI checks on agent-generated pull requests.
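The simplest CI integration runs each validator over the changed files and fails the job if any check fails. A minimal sketch — the validator commands below are placeholders, so consult each tool's documentation for its actual CLI:

```python
# Sketch of a CI gate that runs validators and fails the build on any error.
# The echo commands are placeholders for the real validator CLIs.
import subprocess

def run_checks(commands):
    """Run each command; return True only if every one exits 0."""
    return all(subprocess.run(cmd).returncode == 0 for cmd in commands)

ok = run_checks([
    ["echo", "placeholder for linkml-term-validator"],
    ["echo", "placeholder for linkml-reference-validator"],
])
print("all checks passed" if ok else "validation failed")  # CI would exit nonzero on failure
```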

## Tip 5: Track what the agent changed

Use **[ai-blame](https://github.com/ai4curation/ai-blame)** to extract [provenance](../glossary.md#provenance) from agent execution traces. This gives you line-level attribution — essential for understanding what the agent did and what a human did.

## Tip 6: Use MCP servers for domain-specific tool access

Rather than having agents manipulate ontology files as raw text, give them structured tool access through [MCP](../glossary.md#model-context-protocol-mcp) servers:

- **[noctua-mcp](https://github.com/geneontology/noctua-mcp)** — GO-CAM editing via Noctua/Barista
- **[oak-mcp](https://github.com/monarch-initiative/oak-mcp)** — ontology operations via OAK

## Tip 7: Think in terms of a harness, not just an agent

Effective AI curation isn't about picking the right model — it's about building the right infrastructure around it. This infrastructure is called an [agent harness](../glossary.md#agent-harness). For a complete guide to assembling one, see [Build your agentic harness](build-agentic-harness.md).

## Set up [GitHub actions](../glossary.md#github-actions)

See some of the actions in this org. Again, this works best if your content is managed
according to O3 guidelines. See [Set up GitHub Actions](set-up-github-actions.md) for details.

## Document and Train
