| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +This file provides guidance for AI coding agents working on the **MyST-Parser** repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +MyST-Parser is a Sphinx extension and docutils parser for the MyST (Markedly Structured Text) Markdown flavor. It provides: |
| 8 | + |
| 9 | +- An extended [CommonMark](https://commonmark.org)-compliant parser using [`markdown-it-py`](https://markdown-it-py.readthedocs.io/) |
| 10 | +- A [docutils](https://docutils.sourceforge.io/) renderer that converts markdown-it tokens to docutils nodes |
| 11 | +- A [Sphinx](https://www.sphinx-doc.org) extension for using MyST Markdown in Sphinx documentation |
| 12 | + |
| 13 | +MyST is designed for technical documentation and publishing, offering a rich and extensible flavor of Markdown with support for roles, directives, and cross-references. |
| 14 | + |
| 15 | +## Repository Structure |
| 16 | + |
| 17 | +``` |
| 18 | +pyproject.toml # Project configuration and dependencies |
| 19 | +tox.ini # Tox test environment configuration |
| 20 | +
|
| 21 | +myst_parser/ # Main source code |
| 22 | +├── __init__.py # Package init with Sphinx setup() entry point |
| 23 | +├── config/ # Configuration dataclasses |
| 24 | +│ ├── main.py # MdParserConfig dataclass |
| 25 | +│ └── dc_validators.py # Dataclass field validators |
| 26 | +├── parsers/ # Parser implementations |
| 27 | +│ ├── sphinx_.py # Sphinx parser (MystParser) |
| 28 | +│ ├── docutils_.py # Docutils parser and CLI tools |
| 29 | +│ ├── mdit.py # markdown-it-py setup and plugins |
| 30 | +│ ├── directives.py # Directive parsing utilities |
| 31 | +│ └── options.py # Option parsing for directives |
| 32 | +├── mdit_to_docutils/ # Token-to-docutils rendering |
| 33 | +│ ├── base.py # DocutilsRenderer (main renderer) |
| 34 | +│ ├── sphinx_.py # SphinxRenderer (Sphinx-specific) |
| 35 | +│ ├── transforms.py # Docutils transforms |
| 36 | +│ └── html_to_nodes.py # HTML-to-docutils conversion |
| 37 | +├── sphinx_ext/ # Sphinx extension components |
| 38 | +│ ├── main.py # setup_sphinx() and config creation |
| 39 | +│ ├── directives.py # Custom Sphinx directives |
| 40 | +│ ├── myst_refs.py # Reference resolver post-transform |
| 41 | +│ └── mathjax.py # MathJax configuration |
| 42 | +├── inventory.py # Sphinx inventory file handling |
| 43 | +├── mocking.py # Mock objects for directive/role parsing |
| 44 | +├── warnings_.py # Warning types (MystWarnings enum) |
| 45 | +├── cli.py # Command-line interface |
| 46 | +└── _compat.py # Python version compatibility |
| 47 | +
|
| 48 | +tests/ # Test suite |
| 49 | +├── test_sphinx/ # Sphinx integration tests |
| 50 | +│ ├── sourcedirs/ # Test documentation projects |
| 51 | +│ └── test_sphinx_builds.py |
| 52 | +├── test_renderers/ # Renderer unit tests |
| 53 | +├── test_commonmark/ # CommonMark compliance tests |
| 54 | +├── test_html/ # HTML output tests |
| 55 | +├── test_docutils.py # Docutils parser tests |
| 56 | +└── test_anchors.py # Heading anchor tests |
| 57 | +
|
| 58 | +docs/ # Documentation source (MyST Markdown) |
| 59 | +├── conf.py # Sphinx configuration |
| 60 | +├── index.md # Documentation index |
| 61 | +├── syntax/ # Syntax reference documentation |
| 62 | +├── develop/ # Developer documentation |
| 63 | +└── faq/ # FAQ and troubleshooting |
| 64 | +``` |
| 65 | + |
| 66 | +## Development Commands |
| 67 | + |
| 68 | +All commands should be run via [`tox`](https://tox.wiki) for consistency. The project uses `tox-uv` for faster environment creation. |
| 69 | + |
| 70 | +### Testing |
| 71 | + |
| 72 | +```bash |
| 73 | +# Run all tests |
| 74 | +tox |
| 75 | + |
| 76 | +# Run a specific test file |
| 77 | +tox -- tests/test_docutils.py |
| 78 | + |
| 79 | +# Run a specific test function |
| 80 | +tox -- tests/test_docutils.py::test_function_name |
| 81 | + |
| 82 | +# Run tests with a specific Python/Sphinx version |
| 83 | +tox -e py311-sphinx8 |
| 84 | + |
| 85 | +# Run with coverage |
| 86 | +tox -- --cov=myst_parser |
| 87 | + |
| 88 | +# Regenerate regression test fixtures (the run exits with an error code the first time if any files change). |
| 89 | +# Note: the regenerated files must pass for all supported Python/Sphinx/docutils versions. |
| 90 | +tox -- --regen-file-failure --force-regen |
| 91 | +``` |
| 92 | + |
| 93 | +### Documentation |
| 94 | + |
| 95 | +```bash |
| 96 | +# Build docs (clean) |
| 97 | +tox -e docs-clean |
| 98 | + |
| 99 | +# Build docs (incremental) |
| 100 | +tox -e docs-update |
| 101 | + |
| 102 | +# Build with a specific builder (e.g., linkcheck) |
| 103 | +BUILDER=linkcheck tox -e docs-update |
| 104 | +``` |
| 105 | + |
| 106 | +### Code Quality |
| 107 | + |
| 108 | +```bash |
| 109 | +# Type checking with mypy |
| 110 | +tox -e mypy |
| 111 | + |
| 112 | +# Linting with ruff (auto-fix enabled) |
| 113 | +tox -e ruff-check |
| 114 | + |
| 115 | +# Formatting with ruff |
| 116 | +tox -e ruff-fmt |
| 117 | + |
| 118 | +# Run pre-commit hooks on all files |
| 119 | +pre-commit run --all-files |
| 120 | +``` |
| 121 | + |
| 122 | +## Code Style Guidelines |
| 123 | + |
| 124 | +- **Formatter/Linter**: Ruff (configured in `pyproject.toml`) |
| 125 | +- **Type Checking**: Mypy with strict settings (configured in `pyproject.toml`) |
| 126 | +- **Pre-commit**: Use pre-commit hooks for consistent code style |
| 127 | + |
| 128 | +### Best Practices |
| 129 | + |
| 130 | +- **Type annotations**: Use complete type annotations for all function signatures. Use `TypedDict` for structured dictionaries and dataclasses for configuration. |
| 131 | +- **Docstrings**: Use Sphinx-style docstrings (`:param:`, `:return:`, `:raises:`). Types are not required in docstrings as they should be in type hints. |
| 132 | +- **Function Signatures**: Use `/` and `*` to enforce positional-only and keyword-only arguments where appropriate. |
| 133 | +- **Pure functions**: Where possible, write pure functions without side effects. |
| 134 | +- **Error handling**: Use `MystWarnings` enum for warning types. Use `create_warning()` for user-facing warnings. |
| 135 | +- **Testing**: Write tests for all new functionality. Use `pytest-regressions` for output comparison tests. |
| 136 | + |
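The signature and typing guidance above can be illustrated with a minimal sketch. The names `HeadingInfo` and `parse_heading` are hypothetical and exist only for this example; they are not part of `myst_parser`.

```python
from typing import Optional, TypedDict


class HeadingInfo(TypedDict):
    """Structured result (hypothetical type used only in this sketch)."""

    level: int
    text: str


def parse_heading(line: str, /, *, max_level: int = 6) -> Optional[HeadingInfo]:
    """Parse an ATX-style heading line such as ``## Title``.

    :param line: The source line (positional-only, because of ``/``).
    :param max_level: The deepest heading level to accept (keyword-only, because of ``*``).
    :return: The heading level and text, or ``None`` if the line is not a heading.
    """
    stripped = line.lstrip()
    # Count the leading "#" characters.
    level = len(stripped) - len(stripped.lstrip("#"))
    if not 1 <= level <= max_level:
        return None
    rest = stripped[level:]
    # A heading marker must be followed by a space or the end of the line.
    if rest and not rest.startswith(" "):
        return None
    return {"level": level, "text": rest.strip()}
```

The function is pure: it reads only its arguments and returns a new dictionary, so it can be tested without fixtures. Calling `parse_heading(line="# x")` raises `TypeError`, since `line` is positional-only.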
| 137 | +### Docstring Example |
| 138 | + |
| 139 | +```python |
| 140 | +def parse_directive_text( |
| 141 | + directive_class: type[Directive], |
| 142 | + first_line: str, |
| 143 | + content: str, |
| 144 | + *, |
| 145 | + validate_options: bool = True, |
| 146 | +) -> tuple[list[str], dict[str, Any], list[str], int]: |
| 147 | + """Parse directive text into its components. |
| 148 | +
|
| 149 | + :param directive_class: The directive class to parse for. |
| 150 | + :param first_line: The first line (arguments). |
| 151 | + :param content: The directive content. |
| 152 | + :param validate_options: Whether to validate options against the directive spec. |
| 153 | + :return: Tuple of (arguments, options, body_lines, body_offset). |
| 154 | + :raises MarkupError: If the directive text is malformed. |
| 155 | + """ |
| 156 | + ... |
| 157 | +``` |
| 158 | + |
| 159 | +## Testing Guidelines |
| 160 | + |
| 161 | +### Test Structure |
| 162 | + |
| 163 | +- Tests use `pytest` with fixtures from `conftest.py` files |
| 164 | +- Sphinx integration tests are in `tests/test_sphinx/` |
| 165 | +- Test source directories are in `tests/test_sphinx/sourcedirs/` |
| 166 | +- Regression testing uses `pytest-regressions` for output comparison |
| 167 | +- Use `pytest-param-files` for parameterized file-based tests |
| 168 | + |
| 169 | +### Writing Tests |
| 170 | + |
| 171 | +1. For Sphinx integration tests, create a source directory in `tests/test_sphinx/sourcedirs/` |
| 172 | +2. Use `sphinx-pytest` fixtures for Sphinx application testing |
| 173 | +3. Use `file_regression` fixture for comparing output against stored fixtures |
| 174 | + |
| 175 | +### Test Best Practices |
| 176 | + |
| 177 | +- **Test coverage**: Write tests for all new functionality and bug fixes |
| 178 | +- **Isolation**: Each test should be independent and not rely on state from other tests |
| 179 | +- **Descriptive names**: Test function names should describe what is being tested |
| 180 | +- **Regression testing**: Use `file_regression.check()` for complex output comparisons |
| 181 | +- **Parametrization**: Use `@pytest.mark.parametrize` for multiple test scenarios |
| 182 | +- **Fixtures**: Define reusable fixtures in `conftest.py` |
| 183 | + |
| 184 | +### Example Test Pattern |
| 185 | + |
| 186 | +```python |
| 187 | +import pytest |
| 188 | + |
| 189 | +@pytest.mark.sphinx( |
| 190 | + buildername="html", |
| 191 | + srcdir="path/to/sourcedir", |
| 192 | +) |
| 193 | +def test_example(app, status, warning, get_sphinx_app_output): |
| 194 | + app.build() |
| 195 | + assert "build succeeded" in status.getvalue() |
| 196 | + warnings = warning.getvalue().strip() |
| 197 | + assert warnings == "" |
| 198 | +``` |
| 199 | + |
| 200 | +## Pull Request Requirements |
| 201 | + |
| 202 | +When submitting changes: |
| 203 | + |
| 204 | +1. **Description**: Include a meaningful description or link explaining the change |
| 205 | +2. **Tests**: Include test cases for new functionality or bug fixes |
| 206 | +3. **Documentation**: Update docs if behavior changes or new features are added |
| 207 | +4. **Changelog**: Update `CHANGELOG.md` under the appropriate section |
| 208 | +5. **Code Quality**: Ensure `pre-commit run --all-files` passes |
| 209 | + |
| 210 | +## Architecture Overview |
| 211 | + |
| 212 | +### Parsing Pipeline |
| 213 | + |
| 214 | +The MyST parser follows a three-stage pipeline: |
| 215 | + |
| 216 | +``` |
| 217 | +MyST Markdown → markdown-it tokens → docutils AST → Sphinx/HTML output |
| 218 | +``` |
| 219 | + |
| 220 | +1. **Markdown Parsing** (`myst_parser/parsers/mdit.py`): Uses `markdown-it-py` with MyST plugins to parse Markdown into tokens |
| 221 | +2. **Token Rendering** (`myst_parser/mdit_to_docutils/`): Converts markdown-it tokens to docutils nodes |
| 222 | +3. **Sphinx Integration** (`myst_parser/sphinx_ext/`): Integrates with Sphinx for documentation builds |
| 223 | + |
| 224 | +### Key Components |
| 225 | + |
| 226 | +#### Configuration (`myst_parser/config/main.py`) |
| 227 | + |
| 228 | +The `MdParserConfig` dataclass centralizes all MyST configuration options: |
| 229 | + |
| 230 | +- Defines all `myst_*` configuration values with types and defaults |
| 231 | +- Uses dataclass validators for type checking |
| 232 | +- Sphinx config values are auto-registered with `myst_` prefix |
| 233 | + |
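As a rough illustration of the pattern described above, the following sketch shows a dataclass whose fields carry validators in their metadata, plus a helper that derives prefixed option names. `ExampleConfig`, `is_positive_int`, and `sphinx_option_names` are assumptions made for this example; the real `MdParserConfig` and the validators in `dc_validators.py` may be structured differently.

```python
import dataclasses as dc
from typing import Any


def is_positive_int(name: str, value: Any) -> None:
    """Hypothetical validator: raise if ``value`` is not a positive integer."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise TypeError(f"'{name}' must be a positive int, got {value!r}")


@dc.dataclass
class ExampleConfig:
    """Illustrative stand-in, not the real ``MdParserConfig``."""

    heading_depth: int = dc.field(
        default=2, metadata={"validator": is_positive_int}
    )
    title: str = dc.field(default="")

    def __post_init__(self) -> None:
        # Run every validator that is attached to a field via its metadata.
        for field in dc.fields(self):
            validator = field.metadata.get("validator")
            if validator is not None:
                validator(field.name, getattr(self, field.name))


def sphinx_option_names(config_cls: type, *, prefix: str = "myst_") -> list[str]:
    """Return the prefixed option name for each dataclass field."""
    return [f"{prefix}{field.name}" for field in dc.fields(config_cls)]
```

With this layout, adding a field to the dataclass is enough for it to appear in the list of prefixed names, which is the idea behind auto-registration.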
| 234 | +#### Parsers (`myst_parser/parsers/`) |
| 235 | + |
| 236 | +- `MystParser` (in `sphinx_.py`): The Sphinx parser class that integrates with Sphinx's build system |
| 237 | +- `Parser` (in `docutils_.py`): The standalone docutils parser for non-Sphinx use |
| 238 | +- `create_md_parser()` (in `mdit.py`): Factory function to create configured markdown-it-py instances |
| 239 | + |
| 240 | +#### Renderers (`myst_parser/mdit_to_docutils/`) |
| 241 | + |
| 242 | +- `DocutilsRenderer` (in `base.py`): Base renderer that converts tokens to docutils nodes. Contains render methods for each token type (e.g., `render_heading`, `render_paragraph`). |
| 243 | +- `SphinxRenderer` (in `sphinx_.py`): Extends `DocutilsRenderer` with Sphinx-specific functionality (e.g., cross-references, domains). |
| 244 | + |
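The per-token render methods can be pictured as a name-based dispatch. The sketch below is a simplified model under stated assumptions: `Token`, `Node`, `SketchRenderer`, and `SketchSphinxRenderer` are hypothetical stand-ins for markdown-it tokens, docutils nodes, and the two renderer classes, and the real classes carry far more state.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Simplified stand-in for a markdown-it token."""

    type: str
    content: str = ""


@dataclass
class Node:
    """Simplified stand-in for a docutils node."""

    name: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


class SketchRenderer:
    """Dispatch each token to a ``render_<token.type>`` method."""

    def render(self, tokens: list[Token]) -> Node:
        document = Node("document")
        for token in tokens:
            method = getattr(self, f"render_{token.type}", None)
            if method is None:
                # Unknown token types produce a warning node instead of failing.
                message = f"no render method for {token.type!r}"
                document.children.append(Node("system_message", message))
            else:
                document.children.append(method(token))
        return document

    def render_heading(self, token: Token) -> Node:
        return Node("title", token.content)

    def render_paragraph(self, token: Token) -> Node:
        return Node("paragraph", token.content)


class SketchSphinxRenderer(SketchRenderer):
    """Extend the base renderer with one extra token type."""

    def render_xref(self, token: Token) -> Node:
        return Node("pending_xref", token.content)
```

Under this model, supporting a new token type means adding one method whose name matches the token type, and a subclass can add Sphinx-only methods without touching the base class.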
| 245 | +#### Sphinx Extension (`myst_parser/sphinx_ext/`) |
| 246 | + |
| 247 | +- `setup_sphinx()` (in `main.py`): Registers the parser, config values, and transforms with Sphinx |
| 248 | +- `MystReferenceResolver` (in `myst_refs.py`): Post-transform that resolves MyST-style references |
| 249 | +- Custom directives like `figure-md` (in `directives.py`) |
| 250 | + |
| 251 | +### Sphinx Integration Flow |
| 252 | + |
| 253 | +```mermaid |
| 254 | +flowchart TB |
| 255 | + subgraph init["Initialization"] |
| 256 | + setup["setup() in __init__.py"] |
| 257 | + setup_sphinx["setup_sphinx()"] |
| 258 | + end |
| 259 | +
|
| 260 | + subgraph config["Configuration"] |
| 261 | + builder_init["builder-inited"] |
| 262 | + create_config["create_myst_config()"] |
| 263 | + end |
| 264 | +
|
| 265 | + subgraph parse["Parse Phase"] |
| 266 | + source["Source .md file"] |
| 267 | + mdit["markdown-it-py parser"] |
| 268 | + tokens["Token stream"] |
| 269 | + renderer["DocutilsRenderer/SphinxRenderer"] |
| 270 | + doctree["Docutils doctree"] |
| 271 | + end |
| 272 | +
|
| 273 | + subgraph resolve["Resolution Phase"] |
| 274 | + post_transform["MystReferenceResolver"] |
| 275 | + resolved["Resolved doctree"] |
| 276 | + end |
| 277 | +
|
| 278 | + setup --> setup_sphinx |
| 279 | + setup_sphinx --> builder_init |
| 280 | + builder_init --> create_config |
| 281 | +
|
| 282 | + source --> mdit --> tokens --> renderer --> doctree |
| 283 | + doctree --> post_transform --> resolved |
| 284 | +
|
| 285 | + create_config -.->|"MdParserConfig"| renderer |
| 286 | +
|
| 287 | + style create_config fill:#e1f5fe |
| 288 | + style renderer fill:#e1f5fe |
| 289 | + style post_transform fill:#e1f5fe |
| 290 | +``` |
| 291 | + |
| 292 | +## Key Files |
| 293 | + |
| 294 | +- `pyproject.toml` - Project configuration, dependencies, and tool settings |
| 295 | +- `myst_parser/__init__.py` - Package entry point with `setup()` for Sphinx |
| 296 | +- `myst_parser/config/main.py` - `MdParserConfig` dataclass with all configuration options |
| 297 | +- `myst_parser/parsers/sphinx_.py` - `MystParser` class for Sphinx integration |
| 298 | +- `myst_parser/mdit_to_docutils/base.py` - `DocutilsRenderer` with token-to-node rendering |
| 299 | +- `myst_parser/sphinx_ext/main.py` - `setup_sphinx()` function for Sphinx setup |
| 300 | +- `myst_parser/warnings_.py` - `MystWarnings` enum for warning types |
| 301 | + |
| 302 | +## Debugging |
| 303 | + |
| 304 | +- Build docs with the `-T` flag for full tracebacks: `tox -e docs-clean -- -T ...` |
| 305 | +- Use the `myst-docutils-html` CLI for standalone parsing without Sphinx |
| 306 | +- Check `docs/_build/` for build outputs |
| 307 | +- Use `tox -- -v --tb=long` for verbose test output with full tracebacks |
| 308 | +- Set `myst_debug_match = True` in Sphinx config to log matched syntax |
| 309 | + |
| 310 | +## Common Patterns |
| 311 | + |
| 312 | +### Adding a New Syntax Extension |
| 313 | + |
| 314 | +1. Create a markdown-it-py plugin or use an existing one from `mdit-py-plugins` |
| 315 | +2. Register it in `myst_parser/parsers/mdit.py` |
| 316 | +3. Add configuration option in `MdParserConfig` if needed |
| 317 | +4. Add render method in `DocutilsRenderer` (e.g., `render_my_syntax`) |
| 318 | +5. Document in `docs/syntax/` |
| 319 | +6. Add tests in `tests/` |
| 320 | + |
| 321 | +### Adding a Configuration Option |
| 322 | + |
| 323 | +1. Add field to `MdParserConfig` in `myst_parser/config/main.py` |
| 324 | +2. Add validator if needed in `myst_parser/config/dc_validators.py` |
| 325 | +3. Document in `docs/configuration.md` |
| 326 | +4. Add tests for the new option |
| 327 | + |
| 328 | +### Adding a Sphinx Directive |
| 329 | + |
| 330 | +1. Create directive class in `myst_parser/sphinx_ext/directives.py` |
| 331 | +2. Register in `setup_sphinx()` in `myst_parser/sphinx_ext/main.py` |
| 332 | +3. Document in appropriate docs section |
| 333 | +4. Add tests in `tests/test_sphinx/` |
| 334 | + |
| 335 | +## Reference Documentation |
| 336 | + |
| 337 | +- [markdown-it-py Repository](https://github.com/ExecutableBookProject/markdown-it-py) |
| 338 | +- [markdown-it-py Documentation](https://markdown-it-py.readthedocs.io/) |
| 339 | +- [Docutils Repository](https://github.com/live-clones/docutils) |
| 340 | +- [Docutils Documentation](https://docutils.sourceforge.io/) |
| 341 | +- [Docutils release log](https://docutils.sourceforge.io/RELEASE-NOTES.html) |
| 342 | +- [Sphinx Repository](https://github.com/sphinx-doc/sphinx) |
| 343 | +- [Sphinx Extension Development](https://www.sphinx-doc.org/en/master/extdev/index.html) |