This guide covers setup, testing, and contributing to the verifiers package.
- Setup
- Project Structure
- Prime CLI Plugin Export
- Running Tests
- Writing Tests
- Contributing
- Common Issues
- Environment Development
- Quick Reference
## Setup

Prerequisites:

- Python 3.13 (recommended for CI parity with Ty checks)
- uv package manager
```bash
# Clone and install for development
git clone https://github.com/PrimeIntellect-ai/verifiers.git
cd verifiers

# CPU-only development:
uv sync

# GPU-based trainer development:
uv sync --all-extras

# Install pre-commit hooks (including the pre-push Ty gate):
uv run pre-commit install
```

## Project Structure

```
verifiers/
├── verifiers/             # Main package
│   ├── envs/              # Environment classes
│   │   ├── integrations/  # Third-party wrappers (TextArena, ReasoningGym)
│   │   └── experimental/  # Newer environments (MCP, Harbor, etc.)
│   ├── parsers/           # Parser classes
│   ├── rubrics/           # Rubric classes
│   ├── rl/                # Training infrastructure
│   │   ├── inference/     # vLLM server utilities
│   │   └── trainer/       # Trainer implementation
│   ├── cli/               # Prime-facing CLI modules and plugin exports
│   ├── scripts/           # Compatibility wrappers around verifiers/cli commands
│   └── utils/             # Utilities
├── environments/          # Installable environment modules
├── configs/               # Example training configurations
├── tests/                 # Test suite
└── docs/                  # Documentation
```

## Prime CLI Plugin Export
Verifiers exports a plugin consumed by `prime`, so command behavior is sourced from verifiers modules.
Entry point:

```python
from verifiers.cli.plugins.prime import get_plugin

plugin = get_plugin()
```

The plugin exposes:

- `api_version` (current: `1`)
- Command modules: `eval_module` (`verifiers.cli.commands.eval`), `gepa_module` (`verifiers.cli.commands.gepa`), `install_module` (`verifiers.cli.commands.install`), `init_module` (`verifiers.cli.commands.init`), `setup_module` (`verifiers.cli.commands.setup`), `build_module` (`verifiers.cli.commands.build`)
- `build_module_command(module_name, args)` to construct a subprocess invocation for a command module
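As a rough illustration of what constructing a subprocess invocation for a command module could look like (a hypothetical sketch only; the `python -m` shape is an assumption, and the real `build_module_command` is defined in `verifiers/cli/plugins/prime.py`):

```python
import sys

def build_module_command(module_name, args):
    # Hypothetical sketch: run the command module as `python -m <module> <args...>`.
    # The actual return shape is defined by the real plugin implementation.
    return [sys.executable, "-m", module_name, *args]

cmd = build_module_command("verifiers.cli.commands.eval", ["my-env", "-n", "5"])
print(cmd[1:])  # ['-m', 'verifiers.cli.commands.eval', 'my-env', '-n', '5']
```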
Contributor guidance:

- Add new prime-facing command logic under `verifiers/cli/commands/`.
- Export new command modules through `PrimeCLIPlugin` in `verifiers/cli/plugins/prime.py`.
- Keep `verifiers/scripts/*` as thin compatibility wrappers that call into `verifiers/cli`.
## Running Tests

```bash
# Run all tests
uv run pytest tests/

# Run with coverage
uv run pytest tests/ --cov=verifiers --cov-report=html

# Run a specific test file
uv run pytest tests/test_parser.py

# Stop on first failure with verbose output
uv run pytest tests/ -xvs

# Run tests matching a pattern
uv run pytest tests/ -k "xml_parser"

# Run environment tests
uv run pytest tests/test_envs.py -vv

# Run environment tests across all CPU cores
uv run pytest -n auto tests/test_envs.py -vv

# Run specific environment tests
uv run pytest tests/test_envs.py -k math_python
```

The test suite includes 380+ tests covering parsers, rubrics, environments, and utilities.
## Writing Tests

```python
import pytest

class TestFeature:
    """Test the feature functionality."""

    def test_basic_functionality(self):
        """Test normal operation."""
        # Arrange
        feature = Feature()
        # Act
        result = feature.process("input")
        # Assert
        assert result == "expected"

    def test_error_handling(self):
        """Test error cases."""
        with pytest.raises(ValueError):
            Feature().process(invalid_input)
```

The test suite provides a `MockClient` in `conftest.py` that implements the `Client` interface:
```python
def test_with_mock(mock_client):
    mock_client.set_default_responses(chat_response="test answer")
    env = vf.SingleTurnEnv(client=mock_client, model="test", ...)
    # Test without real API calls
```

Guidelines:

- Test both success and failure cases
- Use descriptive test names that explain what's being tested
- Leverage existing fixtures from `conftest.py`
- Group related tests in test classes
- Keep tests fast: use mocks instead of real API calls
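To show the idea behind the mock, here is a minimal stand-in sketch (hypothetical; the real `MockClient` in `tests/conftest.py` implements the full `Client` interface, and the method names below beyond `set_default_responses` are illustrative):

```python
class MockClient:
    """Stand-in client: returns canned responses instead of calling an API."""

    def __init__(self):
        self._chat_response = ""

    def set_default_responses(self, chat_response=""):
        # Canned reply returned for every subsequent chat call.
        self._chat_response = chat_response

    def chat(self, messages):
        # No network call: always return the canned response,
        # regardless of the input messages.
        return self._chat_response

client = MockClient()
client.set_default_responses(chat_response="test answer")
print(client.chat([{"role": "user", "content": "anything"}]))  # test answer
```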
## Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make changes following existing patterns
4. Add tests for new functionality
5. Run tests: `uv run pytest tests/`
6. Install hooks once per clone: `uv run pre-commit install`
7. Commit and push (hooks run automatically on each commit/push)
8. Update docs if adding/changing public APIs
9. Submit a PR with a clear description
Code style:

- Strict `ruff` enforcement via pre-commit hooks
- `ty` runs in the pre-push hook via `uv run --python 3.13 ty check verifiers`
- Use type hints for function parameters and returns
- Write docstrings for public functions/classes
- Keep functions focused and modular
- Fail fast, fail loud: no defensive programming or silent fallbacks
PR checklist:

- Tests pass locally (`uv run pytest tests/`)
- Pre-commit and pre-push hooks pass on the latest commit/push
- Added tests for new functionality
- Updated documentation if needed
## Common Issues

```bash
# Ensure the package is installed in development mode
uv sync

# Install optional dependencies for specific integrations
uv sync --extra ta  # for TextArenaEnv
uv sync --extra rg  # for ReasoningGymEnv

# Debug a specific test
uv run pytest tests/test_file.py::test_name -vvs --pdb
```

## Environment Development

```bash
# Initialize template
prime env init my-environment

# Install locally for testing
prime env install my-environment

# Test your environment
prime eval run my-environment -m openai/gpt-4.1-mini -n 5
```
```python
# my_environment.py
import verifiers as vf

def load_environment(**kwargs):
    """Load the environment."""
    dataset = vf.load_example_dataset("dataset_name")
    parser = vf.XMLParser(fields=["reasoning", "answer"])

    def reward_func(parser, completion, answer, **kwargs):
        return 1.0 if parser.parse_answer(completion) == answer else 0.0

    rubric = vf.Rubric(
        funcs=[reward_func, parser.get_format_reward_func()],
        weights=[1.0, 0.2],
        parser=parser,
    )
    return vf.SingleTurnEnv(
        dataset=dataset,
        parser=parser,
        rubric=rubric,
        **kwargs,
    )
```

## Quick Reference

```bash
# Development setup
uv sync                                              # CPU-only
uv sync --all-extras                                 # With RL/training extras
uv run pre-commit install                            # One-time per clone (installs pre-commit + pre-push)

# Run tests
uv run pytest tests/                                 # All tests
uv run pytest tests/ -xvs                            # Debug mode
uv run pytest tests/ --cov=verifiers                 # With coverage

# Run environment tests
uv run pytest tests/test_envs.py -vv                 # All environments
uv run pytest tests/test_envs.py -k math_python      # Specific environment

# Linting
uv run ruff check --fix .                            # Fix lint errors
uv run ruff format --check verifiers tests           # Verify Python formatting
uv run ty check verifiers                            # Type check (matches CI Ty target)

# Environment tools
prime env init new-env                               # Create environment
prime env install new-env                            # Install environment
prime eval run new-env -m openai/gpt-4.1-mini -n 5   # Test environment
prime eval tui                                       # Browse evals in the tree browser
```

| Command | Description |
|---|---|
| `prime eval run` | Run evaluations on environments |
| `prime env init` | Initialize a new environment from a template |
| `prime env install` | Install an environment module |
| `prime lab setup` | Set up a training workspace |
| `prime eval tui` | Terminal UI for browsing evals and rollout details |
| `prime rl run` | Launch hosted training |
| `uv run prime-rl` | Launch prime-rl training |
Key concepts:

- Environments: installable modules with a `load_environment()` function
- Parsers: extract structured data from model outputs
- Rubrics: define multi-criteria evaluation functions
- Tests: comprehensive coverage with mocks for external dependencies
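As a rough sketch of how multi-criteria rubric scores can combine, assuming a simple weighted sum over the `weights=[1.0, 0.2]` pattern from the environment example above (the actual `Rubric` aggregation may differ):

```python
def combine(scores, weights):
    # Assumed aggregation: weighted sum of per-criterion rewards.
    return sum(s * w for s, w in zip(scores, weights))

# Correct answer (reward 1.0) plus proper formatting (reward 1.0),
# with weights [1.0, 0.2] as in the example environment:
print(combine([1.0, 1.0], [1.0, 0.2]))  # 1.2
```

Under this assumption the format reward acts as a small bonus on top of the main correctness reward.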