Extensible (SysML2-like) Schema Proposal #1646

chrisjsewell · 2026-02-09T13:56:22Z

chrisjsewell
Feb 9, 2026
Maintainer

Summary

This document outlines a plan to make sphinx-needs schemas composable and extensible, inspired by SysML2.
The suggested implementation proceeds in phases:

migrate fields from needs.extra_options to needs.fields ✅,
migrate links from needs.extra_links to needs.links 🚧, and
enable per-type schemas with type inheritance (future).

Goal

We would like to allow for composable and extensible schemas for defining "needs" in Sphinx-needs, similar to how SysML2 allows for core definitions to be extended and specialized.

An "ideal" configuration format could look something like this:

# internally sphinx-needs defines some initial "core" fields ("title", "status", etc) and links ("links") that all needs inherit

# at the top-level the user can choose to redefine core fields/links or add new ones
# note, overriding must obey the `Liskov substitution principle <https://en.wikipedia.org/wiki/Liskov_substitution_principle>`__, by not weakening the type constraints of the original definition

[needs.fields.status]
# redefining the "status" field
# type is still string, but we restrict the allowed values
schema.enum = ["draft", "approved", "rejected"]
# setting a default value
default = "draft"

[needs.fields.custom1]
# adding a new field
description = "Custom field"
schema.type = "number"

[needs.links.links]
# redefining the "links" link
# type is still array(string), but we restrict the maximum number of items
schema.maxItems = 10

[needs.links.customLink]
# adding a new link
description = "A custom link"

# each type of need inherits from the top-level schema, but can further specialize and extend it

[needs.types.requirement]
# redefining the "custom1" field for requirements
fields.custom1.schema.minimum = 0.0
# adding a new field specific to requirements
fields.requirementType.schema.type = "string"

# need types can also inherit from other need types

[needs.types.testcase]
inherits = "requirement"
# redefining the "requirementType" field for testcases
fields.requirementType.schema.enum = ["unit", "integration", "system"]
# adding a new field specific to testcases
fields.testcaseSpecificField.schema.type = "boolean"

# we may also want to restrict link definitions for specific need types
[needs.types.testcase.links.myLink]
# this means this link can only point to needs of type "requirement", or its subtypes
linksTo = "requirement"

This would then lead to needs like:

.. requirement:: A sample requirement
   :custom1: 42.0
   :requirementType: anything

   This is a sample requirement need.

.. testcase:: A sample test case
   :custom1: 10.0
   :requirementType: unit
   :testcaseSpecificField: true
   :myLink: REQ-001

   This is a sample test case need.

Where the requirementType field is only available for needs of type requirement or its subtypes the custom1 field has different constraints depending on the need type, and the testcaseSpecificField is only available for needs of type testcase.

Inspiration from SysML2

This proposed configuration format draws significant inspiration from SysML2, the next-generation systems modeling language. SysML2 provides a powerful approach to defining and specializing system elements that we aim to adapt for sphinx-needs.

Key SysML2 Concepts Adopted

Specialization and Inheritance: In SysML2, elements can specialize other elements, inheriting their attributes and constraints while adding or refining them. Our inherits key mirrors this concept:
```
requirement def Requirement specializes CoreRequirement { ... }
```
becomes:
```
[needs.types.testcase]
inherits = "requirement"
```
Attribute Redefinition: SysML2 allows redefine to override attributes with stricter types or constraints, following the Liskov substitution principle. For example:
```
redefine attribute status : RequirementStatus [0..1] = draft;
```
Our schema allows similar redefinition by narrowing constraints:
```
[needs.fields.status]
schema.enum = ["draft", "approved", "rejected"]
```
Default Values: In SysML2, attributes can have default values specified with the = value syntax (e.g., attribute priority : Integer [0..1] = 3;). We mirror this with the default key:
```
[needs.fields.status]
default = "draft"
```
Constraints: SysML2 supports inline constraints on attributes (e.g., priority >= 1 and priority <= 5). We express these using JSON Schema validation:
```
schema.minimum = 1
schema.maximum = 5
```
Link Definitions: SysML2 defines connections between elements using link def, specifying source and target types:
```
link def RequirementRelation {
    end source : Requirement [1];
    end target : Requirement [1];
}
```
We adopt this with linksTo and linksFrom to constrain which need types a link can reference:
```
[needs.types.testcase.links.myLink]
linksTo = "requirement"
```

Additional SysML2 Concepts to Consider

The following SysML2 concepts are not yet adopted but could be valuable for future enhancements:

Concept	SysML2	Potential sphinx-needs Adoption
Multiplicity	`[0..1]`, `[1..*]` for cardinality	Already partially supported via JSON Schema `minItems`/`maxItems`; could add shorthand syntax
Abstract Types	`abstract requirement def` cannot be instantiated	Add `abstract = true` to `needs.types` to define base types that can only be specialized
Derived Attributes	`derived attribute x = ...` computed from other attributes	Relates to existing dynamic functions; could formalize as schema-level derived fields
Packages/Namespaces	`package Requirements { ... }` for organization	Could support schema namespacing for large projects with multiple schema files
Requirement Relationships	Built-in `satisfy`, `verify`, `refine`, `trace`	Already has `links`, but could add semantic meaning to link types (e.g., for traceability matrices)
Views and Viewpoints	Define different perspectives on the model	Could formalize different "views" of needs (e.g., for different stakeholders) beyond current filtering
Variability	Variation points for product lines	Already has variants feature; could align terminology/concepts

Benefits of the SysML2-Inspired Approach

Separation of concerns: Like SysML2, we separate the definition (schema) from the instantiation (the actual need directives in RST files)
Composability: Need types can be built upon each other, reducing duplication and enforcing consistency
Type safety: Constraints are validated at the schema level, catching errors early
Extensibility: Users can extend the core schema without modifying the base definitions, just as SysML2 allows extending standard libraries

Implementation

Sphinx Build Lifecycle Integration

Understanding when schema operations occur in the Sphinx build is crucial for this implementation. Based on the event flow (see the sphinx-needs AGENTS.md for details):

Build Phase	Event	Schema-Related Operations
Configuration	`config-inited`	Load and validate schema config, resolve `FieldSchema`/`LinkSchema` instances
Environment Setup	`env-before-read-docs`	Create validation schema, resolve type inheritance
Document Parsing	Directive parsing	Validate need fields/links against type-specific schema
Post-Processing	`write-started`	Run `post_process_needs_data`: extend, resolve functions, back-links, constraints, before needs become immutable
Output	`build-finished`	Export schemas to `needs.json`

Key Data Structures

The proposal builds on existing data structures:

From sphinx_needs/data.py:

NeedsCoreFields: Defines core field parameters (description, schema, allow_default, allow_df, allow_extend, etc.). Per-type schemas would extend this with type-specific field definitions.
NeedItem: The runtime representation of a need. Per-type schemas may require a more flexible type (or remain a single type with optional fields).
SphinxNeedsData: Central access point for needs data in the build environment. Provides get_schema() to access the current FieldsSchema. For per-type schemas, would need additional methods like get_schema_for_type(type_name).

From sphinx_needs/needs_schema.py:

FieldSchema: Immutable dataclass defining a single field's metadata—name, description, JSON Schema constraints, nullability, whether it allows defaults/extend/dynamic functions/variants, and default values. Provides convert_directive_option() for parsing RST option values and type_check() for validation.
LinkSchema: Similar to FieldSchema but specialized for link fields (arrays of need IDs).
FieldsSchema: Container class holding all field and link schemas. Provides methods to add fields and iterate over them.

How the schema is built (create_schema in sphinx_needs/needs.py):

During env-before-read-docs, the create_schema() function constructs the FieldsSchema,
from information obtained from the NeedsCoreFields and from user defined configuration.

For per-type schemas, this process would need to:

Build a base schema as above
For each need type in needs.types, create a specialized schema that inherits from the base (or another type)
Store schemas per-type, accessible via SphinxNeedsData.get_schema_for_type(type_name)

Current sphinx-needs v6

In the current version of sphinx-needs (v6), the schema is "monolithic", in that the same set of fields and links apply to all need types, with only limited per-type customization.

The core fields are defined in NeedsCoreFields (see sphinx_needs/data.py) with parameters controlling behavior:

add_to_field_schema: Whether the field appears in user-facing schema
allow_default: Whether custom defaults can be set
allow_df: Whether dynamic functions are allowed
allow_extend: Whether needextend can modify the field
allow_variants: Whether variant parsing applies

These parameters would need to be respected by per-type schema definitions.

User extensions are limited:

needs.extra_options: A singular list of fields that cannot override core fields
needs.extra_links: A singular list of link types that can override core link behavior, but in a limited way

There are also several scattered configuration options that affect field behavior globally, relying on the monolithic schema:

needs.global_options
needs.statuses
needs.tags
needs.variant_options

Processing and analysis of needs is also dependent on this monolithic schema, for example:

Need filtering
Need visualizations

Moving from `needs.extra_options` to `needs_fields`

The key change is migrating from needs_extra_options (a simple list of option names) to needs_fields (a dictionary-based format). This new format allows for:

More structured and extensible field definitions
Core field overriding/specialization (analogous to SysML2's redefine)
Consolidation of field-related configuration (needs_global_options, needs_statuses, needs_tags, needs_variant_options) into a single per-field definition

Old configuration (scattered across multiple options):

# Field definitions were just a list of names
needs.extra_options = ["priority", "verified"]

# Statuses and tags defined separately
needs.statuses = [{name = "draft"}, {name = "approved"}]
needs.tags = [{name = "security"}, {name = "usability"}]

# Defaults defined in a separate global_options section
[needs.global_options.priority]
default = 3

[needs.global_options.status]
default = "draft"

# Variant parsing defined globally
needs.variant_options = ["priority"]

New configuration (consolidated per-field):

[needs.fields.priority]
description = "Priority level"
schema.type = "integer"
schema.minimum = 1
schema.maximum = 5
default = 3
parse_variants = true

[needs.fields.verified]
schema.type = "boolean"
default = false

[needs.fields.status]
# Specializing the core "status" field
schema.enum = ["draft", "approved"]
default = "draft"

[needs.fields.tags]
# Specializing the core "tags" field
schema.items.enum = ["security", "usability"]

Internally, this is powered by the FieldSchema dataclass (see sphinx_needs/needs_schema.py), which serves as a single source of truth for each field's schema. Rather than scattering field configuration across multiple config options and accessing them directly during processing, all field metadata is resolved into FieldSchema instances at configuration time. These instances then drive all downstream processing—validation, filtering, visualization, and ensure consistency.

Relevant modules:

sphinx_needs/needs_schema.py — FieldSchema and LinkSchema dataclass definitions
sphinx_needs/needs.py — Configuration loading and schema resolution
sphinx_needs/schema/process.py — Schema validation processing

Moving from `needs.extra_links` to `needs.links`

Similar to the needs_fields migration, we are also moving link configuration from needs_extra_links to a new needs.links dictionary-based format. This work has begun with commit 4cf3021, which centralizes link schema definitions onto the LinkSchema class.

Key changes so far:

Added schema field to LinkSchema with validation in __post_init__
Updated schema processing to use LinkSchema.schema instead of extracting from config
Aligned the internal representation of link fields with how regular need fields are handled via FieldSchema

Planned migration:

# Old configuration
[[needs.extra_links]]
option = "tests"
incoming = "is_tested_by"

[[needs.extra_links]]
option = "implements"
incoming = "is_implemented_by"
schema.maxItems = 5

# New configuration (planned)
[needs.links.tests]
incoming = "is_tested_by"

[needs.links.implements]
incoming = "is_implemented_by"
schema.maxItems = 5

This change brings links in line with fields, using LinkSchema (see sphinx_needs/needs_schema.py) as the single source of truth for link metadata. Like FieldSchema, the LinkSchema instances will drive all downstream processing, ensuring that link validation, back-link generation, and visualization use consistent schema information.

Work to do:

Add missing fields to LinkSchema: The current LinkSchema only has schema-related fields. We need to add:
- outgoing: The label for the outgoing link (e.g., "tests")
- incoming: The label for the incoming/back link (e.g., "is_tested_by")
- copy: Whether to copy the link to child needs
- allow_dead_links: Whether to allow links to non-existent needs
- style: Styling information for visualization
- style_part: Styling for need parts
- style_start: Start arrow style
- style_end: End arrow style
Migrate codebase from needs_config.extra_links to LinkSchema: After configuration resolution, all access to link information should use LinkSchema instances instead of the raw config dictionaries. Files to update include:
- sphinx_needs/layout.py — link name lookups and iteration
- sphinx_needs/utils.py — link type iteration
- sphinx_needs/api/need.py — link validation and back-link generation
- sphinx_needs/roles/need_outgoing.py — link lookup dictionary
- sphinx_needs/directives/needtable.py — link type iteration
- sphinx_needs/directives/needflow/ — link visualization (_directive.py, _graphviz.py, _plantuml.py)
- sphinx_needs/directives/needgantt.py — link type validation
- sphinx_needs/directives/needsequence.py — link type names
- sphinx_needs/directives/need.py — dead link checking
- sphinx_needs/directives/list2need.py — link type list
- sphinx_needs/directives/needreport.py — link config reporting
Deprecate direct access to needs_config.extra_links: Once all code uses LinkSchema, mark direct config access as deprecated change needs_config.extra_links to a private attribute (e.g., _extra_links) to signal it should not be used directly anymore.

Moving from monolithic need schema to per-type schemas

This will be a larger effort, as many parts of the codebase currently assume a single schema for all needs,
and it will also be challenging to maintain backward compatibility.

Ingestion of needs directives

When ingesting needs directives from RST files, we will need to determine the need type (from the directive name) and then validate the fields and links against that need type's schema.

Storage of need data may also need to be updated to consider the need type.

Interaction with `needextend`

The needextend directive modifies needs after initial parsing (during extend_needs_data in post-processing). With per-type schemas:

Field validation: needextend should only allow fields valid for the target need's type
allow_extend flag: Core fields already have an allow_extend parameter; per-type schemas should inherit/override this

Interaction with dynamic functions

Dynamic functions (resolve_functions in post-processing) can set field values programmatically. With per-type schemas:

Return type validation: Dynamic function results should be validated against the field's schema
Type-aware functions: Functions may need access to the need's type to return appropriate values
allow_df flag: Core fields have this; per-type schemas should respect/override it

Filtering (a.k.a need querying)

One of the biggest challenges will be updating how need filtering works.

Current implementation

See sphinx_needs/filter_common.py.

The filter_needs_and_parts function evaluates Python expressions against a context built from each need's data:

filter_context: dict[str, Any] = {**need}
result = eval(filter_string, filter_context)

Users write filters like status == "approved" and "security" in tags.

Problems with per-type schemas

NameError for missing fields: If testcaseSpecificField only exists on testcase needs, evaluating testcaseSpecificField == True raises NameError for other need types. Currently avoided because all needs share the same monolithic schema.
Type guards don't help: You might think type == "testcase" and testcaseSpecificField == True would work, but Python resolves all variable names before evaluating logic—the NameError occurs during compilation, not evaluation, so short-circuit and doesn't help.
Performance limitations:
- No indexing on commonly-filtered fields
- No query optimization (e.g., filtering by indexed type first)
- eval() overhead for each need

How SysML2 handles this

SysML2 uses type-scoped, OCL-inspired queries:

Requirement.allInstances() -> select(r | r.verificationStatus == Failed)

This provides type scoping (only Requirement instances considered) and type safety (compiler validates verificationStatus exists on Requirement).

Possible approaches

Approach	Description	Pros	Cons
`get()` function	`get("field") == value`	Simple	Verbose, still uses `eval()`
Pre-populate with `None`	Add all fields to context with `None` default	Backward compatible	Hides schema errors
Catch `NameError`	Treat as "field not present" → `False`	Simple	Hides typos in field names
AST pre-parsing	Analyze expression before evaluation	Enables optimization, better errors	Complexity
Custom DSL	`testcase[field = true]`	Type-safe, optimizable	Breaking change

Recommended approach

Short term: Pre-populate the filter context with None for all fields from all schemas. This maintains backward compatibility while enabling per-type schemas.

Long term: Invest in AST-based expression analysis:

Pre-parse with Python's ast module (already partially done in filter_common.py)
Analyze to identify referenced fields, validate against schemas, detect type guards
Build optimized evaluation plan with index support
Optionally introduce a simpler DSL with a migration path (support both syntaxes, auto-detect, provide conversion tool)

Visualizations / renderings

Many visualizations (needflow, needgantt, needsequence) currently assume a monolithic schema. These will need updates to handle per-type schemas:

For renderings of actual needs.
n needflow diagrams, link definitions may need to vary by need type. We will need to update the visualization code to reference the appropriate FieldSchema and LinkSchema based on each need's type.
For needgantt and needsequence, any field-specific logic (e.g., date fields for scheduling) will need to be aware of per-type schemas.
For needtable, we will need to ensure that column definitions can reference fields that may not exist on all need types, and handle missing data gracefully.

Exchange formats (`needs.json`)

The needs.json export will need to include schema information for each need type,
so that consumers of the JSON can understand the fields and links available for each need.
This may involve adding a schemas section to the JSON output that describes each need type's schema.

{
  "versions": { ... },
  "schemas": {
    "requirement": {
      "inherits": null,
      "fields": {
        "requirementType": { "type": "string" }
      },
      "links": { ... }
    },
    "testcase": {
      "inherits": "requirement",
      "fields": {
        "testcaseSpecificField": { "type": "boolean" }
      }
    }
  },
  "needs": { ... }
}

Consideration for how needs.json are consumed in sphinx-needs (from needs.external_needs and needsimport) will also be necessary:

Import validation: Imported needs should be validated against the local schema (or the imported schema if included)
Schema merging: When importing from projects with different schemas, how to handle conflicts?
External needs: External needs may reference types not defined locally

Sealing and immutability

After post-processing, needs are "sealed" (made immutable) in the ensure_post_process_needs_data callbacke function (which calls post_process_needs_data).
Per-type schema validation should complete before sealing:

All needextend modifications applied
All dynamic functions resolved
Back-links generated (respecting per-type link constraints)
Constraints processed (including per-type JSON Schema validation)
Final per-type schema validation (before sealing)
Needs sealed — any further modification raises an error

Orthogonal Consideration: Type-Namespaced Need IDs

An orthogonal change to consider is namespacing need IDs by type—IDs unique within their type rather than globally.

Current Behavior

Currently, sphinx-needs requires all need IDs to be globally unique. With type-namespaced IDs, simple sequential IDs could be used per type:

.. canonical ID: requirement.001
.. requirement:: First requirement
   :id: 001

.. canonical ID: testcase.001
.. testcase:: First test case
   :id: 001

How SysML2 Handles This

In SysML2, elements are identified within their containing package/namespace:

requirement def Requirement specializes Core { ... }

package nc {
    requirement def TestCase specializes Core { ... }
}

Proposed Configuration

[needs]
namespaced_ids = true  # Opt-in for backward compatibility

[needs.types.requirement]
# ID "001" becomes "requirement.001"

[needs.types.testcase]
namespace = ["nc"]
# ID "001" becomes "nc.testcase.001"

need parts

Need parts could also be worked into this namespacing scheme, e.g., requirement.001.part1, for:

.. requirement:: A requirement with parts
   :id: 001

   A requirement with a part: :np:`part1`

ID Resolution (Sphinx-Inspired)

Inspired by Sphinx Python domain resolution, which searches for targets by progressively adding qualification (module, class). We could adapt this:

Syntax	Meaning
`nc.testcase.001`	Fully qualified, always unambiguous
`001`	Short ID, resolves across all namespaces and warns if multiple matches

Example usage:

.. testcase:: Test for requirement
   :id: 001
   :tests: nc.testcase.001
   :tests: 001
   
   This test verifies :need:`nc.testcase.001` (fully qualified).
   See also :need:`001` (attempt to match across all namespaces).

Trade-offs

Pros	Cons
Simpler sequential IDs per type	Added resolution complexity
Type visible in full ID	Migration for existing projects
Scales better for large projects	Potential ambiguity with short IDs
Aligns with SysML2 namespacing	-

Recommendation

This is orthogonal to the extensible schema work but becomes more natural if per-type schemas are adopted. Implement as opt-in (needs.namespaced_ids = true) to maintain backward compatibility.

chrisjsewell · 2026-02-12T21:59:01Z

chrisjsewell
Feb 12, 2026
Maintainer Author

Plan: Type-Aware `FieldsSchema` Retrieval

Related: Discussion #1646 — Extensible (SysML2-like) Schema Proposal

Goal

Enable FieldsSchema to return field data based on need type, laying the groundwork for per-type schemas (phase 3 of the extensible schema proposal).

Scope of this plan: All need types share the same field names and base value types, but other per-field configuration can differ by type (e.g. constraints, defaults, nullability, parse_dynamic_functions, parse_variants, allow_extend).

Proposed API Changes to `FieldsSchema`

Add an optional need_type parameter to retrieval methods. When None, return the base (global) schema — preserving full backward compatibility.

class FieldsSchema:
    def get_core_field(self, name: str, need_type: str | None = None) -> FieldSchema | None: ...
    def get_extra_field(self, name: str, need_type: str | None = None) -> FieldSchema | None: ...
    def get_link_field(self, name: str, need_type: str | None = None) -> LinkSchema | None: ...
    def get_any_field(self, name: str, need_type: str | None = None) -> FieldSchema | LinkSchema | None: ...
    def iter_core_fields(self, need_type: str | None = None) -> Iterable[FieldSchema]: ...
    def iter_extra_fields(self, need_type: str | None = None) -> Iterable[FieldSchema]: ...
    def iter_link_fields(self, need_type: str | None = None) -> Iterable[LinkSchema]: ...
    def iter_extra_field_names(self, need_type: str | None = None) -> Iterable[str]: ...
    def iter_link_field_names(self, need_type: str | None = None) -> Iterable[str]: ...

Internally, store per-type overrides in a dict[str, dict[str, FieldSchema | LinkSchema]] keyed by (type_name, field_name). Lookup falls back to the base schema when no type-specific override exists.

Files to Modify

Tier 1 — Schema Core (do first)

`sphinx_needs/needs_schema.py` — `FieldsSchema` class

What changes: Add per-type override storage, add need_type parameter to all get_* / iter_* methods, add add_*_override(type_name, field) methods for populating type-specific data.
Considerations:
- All existing callers pass no need_type → fully backward compatible.
- The base field set (names + base types) is shared; overrides only carry narrowed FieldSchema/LinkSchema instances produced by inherit_schema().
- Must keep FieldsSchema cheap to query — per-type dicts with fallback, no deep copies per call.

`sphinx_needs/needs.py` — `create_schema()`

What changes: After building the base schema, iterate needs_config.types and, for each type that declares field overrides, call inherit_schema() to produce specialized FieldSchema/LinkSchema instances and register them via add_*_override().
Considerations:
- Type inheritance chains (inherits = "requirement") must be resolved in topological order.
- inherit_schema() already enforces Liskov-safe narrowing; reuse it.
- Config format for per-type overrides (needs.types.<type>.fields.*) doesn't exist yet — this step is a no-op until that config is added, but the storage must be ready.

Tier 2 — Need Creation & Validation (high value, type is available)

`sphinx_needs/api/need.py` — `generate_need()`

Call site	Current method	Line(s)	Change
Validate kwargs	`iter_link_field_names()`, `iter_extra_field_names()`	~181–182	Pass `need_type`
Convert/validate extra fields	`iter_extra_fields()`	~266	Pass `need_type`
Convert/validate link fields	`iter_link_fields()`	~285	Pass `need_type`
Core field conversion	`get_core_field(name)` via `_convert_type_core()` (×11)	~221–253	Accept and forward `need_type`
Core field defaults	`get_core_field("status")`, etc. (×10)	~338–383	Pass `need_type`
Extra field defaults	`get_extra_field(k)`	~386	Pass `need_type`
Link field defaults	`get_link_field(k)`	~392	Pass `need_type`
Copy links	`iter_link_fields()` via `_copy_links()`	~394	Pass `need_type`
Final extras validation	`get_extra_field(k)`	~489	Pass `need_type`
Nullable check	`get_core_field(name)` via `_convert_to_none_str_func()`	~662	Accept and forward `need_type`

Considerations:
- need_type is already a parameter of generate_need() — no plumbing needed to get it.
- This is the highest-value change: all ingestion validation and default resolution becomes type-aware.
- Helper functions (_convert_type_core, _convert_to_none_str_func, _copy_links) need a need_type parameter added.

`sphinx_needs/directives/need.py` — `NeedDirective.run()`

Call site	Current method	Line(s)	Change
Build link/extra option sets	`iter_link_field_names()`, `iter_extra_field_names()`	~121–122	Pass `self.name` (= need type)
Dead link checking	`get_link_field()`	~462	Pass need type from need data

Considerations:
- The directive name is the need type — trivial to thread through.
- Controls which options are accepted in the RST/MyST directive.
- _check_dead_links() (called during post-processing) has access to each need's type.

`sphinx_needs/directives/list2need.py` — `List2NeedDirective`

Call site	Current method	Line(s)	Change
Build defined options set	`iter_extra_field_names()`	~95	Pass need type
Build link keys set	`iter_link_field_names()`	~96	Pass need type
Process link options	`get_link_field()`	~180–181	Pass need type

Considerations:
- Need type is available via the type option on the directive.
- Structurally identical to NeedDirective option parsing.

Tier 3 — Post-Processing (type is available via need data)

`sphinx_needs/functions/functions.py` — `_resolve_functions_fields_iteration()`

Call site	Current method	Line(s)	Change
Get field for type checking	`get_any_field(field_name)`	~287	Pass `need["type"]`

Considerations:
- Easy change; need data (including type) is available in the resolve loop.
- Enables type-specific constraint validation on dynamic function return values.

`sphinx_needs/directives/needextend.py`

Call site	Current method	Line(s)	Change
Validate extended field exists	`get_any_field(key)`	~110	See below

Considerations:
- Major design decision: needextend targets are determined by a filter that may match needs of different types. Two options:
  1. Validate per-need at apply time (recommended): when extending each matched need, validate the field against that need's type. This is the safest approach and aligns with the Liskov principle.
  2. Validate against union at parse time: check the field exists in any type's schema. Less strict but catches typos early.
- For the initial implementation (all types share all field names), this is a non-issue — the change is purely mechanical. It becomes meaningful once types can have different field sets.

Tier 4 — Layout & Rendering (type available per-need)

`sphinx_needs/layout.py` — `NeedLayout`

Call site	Current method	Line(s)	Change
Schema stored on init	—	~114	No change needed
Build link name list	`iter_link_fields()` (×2, for names and `_back` names)	~646–647	Pass `self.need["type"]`
Validate link name	`get_link_field(name)`	~679	Pass `self.need["type"]`
Iterate links for `meta_links_all`	`iter_link_fields()`	~707	Pass `self.need["type"]`

Considerations:
- self.need contains type — straightforward.
- Enables rendering only the links relevant to the need's type.

Tier 5 — Import / External Needs

`sphinx_needs/external_needs.py`

Call site	Current method	Line(s)	Change
Build known keys	`iter_extra_field_names()`, `iter_link_field_names()`	~128, ~134–136	Potentially per-need
Iterate link fields for prefix editing	`iter_link_fields()`	~142	Potentially per-need

Considerations:
- Currently builds one known_keys set before the import loop.
- With type-aware schemas: either compute per-need (slower) or use union (fast, current behavior).
- For the initial plan (all types share field names), union is equivalent — no functional change needed.

`sphinx_needs/directives/needimport.py`

Call site	Current method	Line(s)	Change
Build known keys	`iter_extra_field_names()`, `iter_link_field_names()`	~199, ~206	Same as above
Iterate link fields for prefix editing	`iter_link_fields()` (×2, for names and `_back` names)	~208–214	Same as above

Considerations: Same pattern as external_needs.py.

Tier 6 — Visualization Directives (type usually NOT available)

These directives display or connect needs of multiple types simultaneously. They should use the base schema (no need_type argument) to get the union of all fields.

File	Method(s) used	Change needed
`sphinx_needs/directives/needtable.py`	`iter_link_fields()`	None initially; later may need to handle missing fields gracefully per-type
`sphinx_needs/directives/needflow/_directive.py`	`iter_link_field_names()`	None — use base schema
`sphinx_needs/directives/needflow/_graphviz.py`	`iter_link_field_names()` (×2)	None — use base schema
`sphinx_needs/directives/needflow/_plantuml.py`	`iter_link_field_names()`, `iter_link_fields()`	None — use base schema
`sphinx_needs/directives/needgantt.py`	`get_any_field()`, `iter_link_field_names()`	None — use base schema
`sphinx_needs/directives/needsequence.py`	`iter_link_field_names()`	None — use base schema

Considerations:
- These are safe to leave unchanged for this plan since need_type=None returns the base schema.
- Future work: when types can have different field sets, needtable must handle None/missing values gracefully for columns referencing type-specific fields.

Tier 7 — Schema Export, Validation & Resolution

`sphinx_needs/schema/process.py`

Call site	Current method	Line(s)	Change
Build JSON Schema	`iter_extra_fields()`, `iter_core_fields()`, `iter_link_fields()`	~42–65	Could produce per-type schemas

Considerations:
- For the initial plan (shared field names), the base schema suffices.
- Future: emit per-type schemas in the JSON Schema output for stricter per-type validation.

`sphinx_needs/schema/resolve.py`

Call site	Current method	Line(s)	Change
Validate link references in schemas	`get_link_field()`	~186	No change (config-level validation)
Inject field types into schema definitions	`iter_extra_fields()`	~424	No change
Inject link types into schema definitions	`iter_link_fields()`	~433	No change
Inject core types into schema definitions	`iter_core_fields()`	~443	No change

Considerations:
- Operates at env-before-read-docs time, validating/resolving schema configuration.
- Config validation is inherently type-agnostic (validates base definitions).
- When per-type overrides are added, this file would need to validate that type-specific overrides are Liskov-compatible with the base schema (handled by inherit_schema()).

`sphinx_needs/needsfile.py`

Call site	Current method	Line(s)	Change
Build `needs.json` schema section	`iter_extra_fields()`, `iter_link_fields()`	~43–65	Could include per-type schema info

Considerations:
- Future: add a "schemas" section to needs.json keyed by type name.
- Not needed for the initial plan.

Tier 8 — Roles & Miscellaneous

File	Method(s) used	Type available?	Change
`sphinx_needs/roles/need_outgoing.py`	`get_link_field()`	Yes (`ref_need` available)	Pass need type
`sphinx_needs/directives/needreport.py`	`iter_extra_field_names()`, `iter_link_field_names()`	No (reporting all)	None — use base schema
`sphinx_needs/utils.py`	`iter_link_fields()`	Yes (need available)	Pass need type

Recommended Implementation Order

Phase A — Foundation (no behavioral change)
  1. Add per-type storage to FieldsSchema + need_type parameter to all methods
  2. Update create_schema() to populate type overrides (no-op until config exists)
  3. All tests pass with no changes — pure additive

Phase B — Thread need_type through creation path
  4. api/need.py — generate_need() and helpers
  5. directives/need.py — NeedDirective.run() and _check_dead_links()
  6. directives/list2need.py — List2NeedDirective
  7. All tests pass — behavior unchanged since no type overrides configured

Phase C — Thread need_type through post-processing
  8. functions/functions.py — resolve_functions
  9. directives/needextend.py — extend_needs_data (per-need validation)

Phase D — Thread need_type through rendering
  10. layout.py, roles/need_outgoing.py, utils.py

Phase E — Config & export (enables actual per-type config)
  11. Add needs.types.<type>.fields.* config parsing
  12. Update schema/resolve.py validation
  13. Update needsfile.py / schema/process.py for per-type export

Key Considerations

Backward Compatibility

need_type=None always returns the base schema → all existing code works unchanged.
Per-type overrides are purely opt-in via new config keys.
No existing config format changes required.

`needextend` Challenge

Filters can match needs of different types.
Recommendation: validate fields per-need at apply time (in extend_needs_data), not at directive parse time. This is already the natural flow since extensions are applied during post-processing when the target need's type is known.

Filtering / `NameError` Problem (future)

When types have different field sets, eval()-based filters will raise NameError for fields not on a given type.
Short-term mitigation: pre-populate filter context with None for all fields from all types.
Long-term: AST-based expression analysis.
Not in scope for this plan (all types share field names).

Visualization Directives

These always show mixed types → use base schema (union).
Future: needtable needs graceful handling of None for type-specific columns.

Performance

Per-type dict lookup with fallback is O(1) — negligible overhead.
No deep copies of FieldSchema per call; overrides are pre-built at schema creation time.

Quick Wins

FieldsSchema API addition (Phase A) — zero risk, purely additive.
api/need.py (Phase B) — highest value; need_type already in scope; enables type-specific defaults and validation immediately once config exists.
directives/list2need.py (Phase B) — same pattern as NeedDirective; need type available via directive option.
functions/functions.py (Phase C) — single call site, need type readily available.

Pain Points

needextend — filter-to-type ambiguity requires per-need validation; adds complexity to extend_needs_data.
Import paths (external_needs.py, needimport.py) — currently build a single known_keys set before the loop; may need restructuring for per-type validation.
Visualization directives — when field sets diverge across types, tables/flows must handle missing fields without crashing. See appendix below for detailed analysis.

Appendix: Visualization Directives and Missing Fields

Context: Today every NeedItem is initialized with every extra field and link field (set to None / []), so need["some_field"] never raises KeyError. When types can have different field sets, a NeedItem of type requirement would not have a key for a field defined only on type testcase. This section catalogues where that would break and how to fix it.

Current field-access patterns

Component	Access pattern	Behavior if key missing	Accesses extra fields?
`NeedItem.__getitem__`	`need[key]`	Raises `KeyError`	N/A (foundation)
`NeedItem.get`	`need.get(key)`	Returns `None`	N/A (foundation)
needtable cell rendering (`row_col_maker`)	`key in need_info and need_info[key] is not None`	Empty cell (silent)	Yes — any column name
needtable sorting	`need[key]`	Crashes (`KeyError`)	Yes — sort key
needgantt duration/completion	`need[option]`	Crashes (`KeyError`)	Yes — configurable fields
needflow graphviz (nodes + edges)	`need_info[field]`	Crashes (`KeyError`)	No — core + links only
needflow plantuml (Jinja template)	`template.render(**need_info, ...)`	Jinja `undefined` → empty string	Yes — template-dependent
needflow plantuml (connections)	`need_info[link_type.name]`	Crashes (`KeyError`)	No — link fields only
needsequence	`sender[link_type]`, `msg_need["id"]`, etc.	Crashes (`KeyError`)	No — core + links only
layout `meta()`	`try: self.need[name] except KeyError: ""`	Graceful (returns `""`)	Yes — any field
layout `meta_all()`	Iterates all keys → delegates to `meta()`	Graceful	Yes — all fields
layout `meta_links_all()`	`self.need[type_key]` (truthy check)	Crashes (`KeyError`) if link missing	Links only
layout string replacement	`if item not in self.need: raise`	Raises `SphinxNeedLayoutException`	Yes — any `[[field]]`
filter context	`{**need}` → `eval(expr, ctx)`	`NameError` in eval	All fields in context

Risk tiers

Already safe (no changes needed):

layout.meta() and layout.meta_all() — already catch KeyError and handle None/empty.
row_col_maker in needtable — uses key in need_info guard; missing fields produce empty cells.
Plantuml Jinja templates — Jinja's default undefined renders as empty string.

Core-only access (safe as long as core fields remain universal):

needflow graphviz, needsequence — only access id, title, type, status, style, etc. and link fields. These would remain on all types even with per-type schemas, so no risk under the current plan (shared field names). Only becomes a concern if link sets diverge across types.

Needs fixing (would crash on missing extra fields):

needtable sorting — need[key] with no guard.
needgantt duration/completion — need[option] with no guard.
layout.meta_links_all() — self.need[type_key] with no guard if link sets diverge.
layout string replacement — raises on unknown [[field]].

Proposed fixes

Strategy 1: `NeedItem.get()` everywhere (minimal, targeted)

Change the ~4 crash-prone sites to use need.get(key, default) instead of need[key]:

# needtable sorting — return a sort-neutral default for missing fields
def sort(need):
    value = need.get(key, "")   # was: need[key]
    ...

# needgantt — treat missing duration/completion as "not applicable"
duration = need.get(duration_option, None)
if not duration:
    ...  # existing fallback logic already handles None/falsy

# layout.meta_links_all — skip links not on this need's type
for link in self.needs_schema.iter_link_fields():
    type_key = link.name
    if type_key in self.need and self.need[type_key] and type_key not in exclude:
        ...

Pros: Minimal diff, no architectural change, immediately safe.
Cons: Every new visualization site must remember to use .get() — easy to regress.

Strategy 2: `NeedItem.getitem` returns a sentinel for missing fields

Make NeedItem.__getitem__ return a typed sentinel (e.g. FIELD_NOT_PRESENT) instead of raising KeyError for fields that don't exist on this need's type. Callers can then check explicitly:

value = need[key]
if value is FIELD_NOT_PRESENT:
    ...  # skip or render placeholder

Pros: Centralised — no crash regardless of access pattern.
Cons: Every need[key] comparison (== None, truthiness, string ops) must be audited to handle the sentinel. High risk of subtle bugs. Changes the contract of __getitem__.

Strategy 3: Keep `KeyError` but add a `has_field()` method

Add NeedItem.has_field(key) -> bool that checks whether the field is defined for this need's type schema. Callers that operate on mixed types guard with it:

if need.has_field(key):
    value = need[key]
    ...

Pros: Explicit opt-in, clear semantics, no change to __getitem__ contract.
Cons: Still requires callers to remember to check — but the crash on missing field acts as a loud reminder.

Recommended approach

Strategy 1 for the initial plan (all types share field names — this is a no-op in practice, but hardens the code). Strategy 3 for the future when field sets genuinely diverge: add has_field() to NeedItem and audit all visualization sites listed above.

The layout.meta() pattern (try/except KeyError → "") is the gold standard. For needtable and needgantt, switching to .get() with sensible defaults is a ~10-line change per file and eliminates the crash risk entirely.

Filter context (`eval`-based filtering)

The filter system unpacks all need fields into a dict and calls eval(). A field absent from a need type would not appear in the eval namespace, causing NameError if the filter expression references it.

Short-term: Pre-populate the filter context with None for all fields from all types (union). This is cheap (one dict update per need) and fully backward-compatible — filters like custom_field == "x" simply evaluate to False for needs that don't have custom_field (since None == "x" is False).

Long-term: AST-based expression pre-analysis that can detect which fields are referenced, validate them against the need's type schema, and short-circuit evaluation for needs where a referenced field doesn't exist. This is a larger project and out of scope for the initial plan.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extensible (SysML2-like) Schema Proposal #1646

Uh oh!

{{title}}

Uh oh!

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Extensible (SysML2-like) Schema Proposal #1646

Uh oh!

chrisjsewell Feb 9, 2026 Maintainer

Summary

Goal

Inspiration from SysML2

Key SysML2 Concepts Adopted

Additional SysML2 Concepts to Consider

Benefits of the SysML2-Inspired Approach

Implementation

Sphinx Build Lifecycle Integration

Key Data Structures

Current sphinx-needs v6

Moving from needs.extra_options to needs_fields

Moving from needs.extra_links to needs.links

Moving from monolithic need schema to per-type schemas

Ingestion of needs directives

Interaction with needextend

Interaction with dynamic functions

Filtering (a.k.a need querying)

Current implementation

Problems with per-type schemas

How SysML2 handles this

Possible approaches

Recommended approach

Visualizations / renderings

Exchange formats (needs.json)

Sealing and immutability

Orthogonal Consideration: Type-Namespaced Need IDs

Current Behavior

How SysML2 Handles This

Proposed Configuration

need parts

ID Resolution (Sphinx-Inspired)

Trade-offs

Recommendation

Replies: 1 comment

Uh oh!

chrisjsewell Feb 12, 2026 Maintainer Author

Plan: Type-Aware FieldsSchema Retrieval

Goal

Proposed API Changes to FieldsSchema

Files to Modify

Tier 1 — Schema Core (do first)

sphinx_needs/needs_schema.py — FieldsSchema class

sphinx_needs/needs.py — create_schema()

Tier 2 — Need Creation & Validation (high value, type is available)

sphinx_needs/api/need.py — generate_need()

sphinx_needs/directives/need.py — NeedDirective.run()

sphinx_needs/directives/list2need.py — List2NeedDirective

Tier 3 — Post-Processing (type is available via need data)

sphinx_needs/functions/functions.py — _resolve_functions_fields_iteration()

sphinx_needs/directives/needextend.py

Tier 4 — Layout & Rendering (type available per-need)

sphinx_needs/layout.py — NeedLayout

Tier 5 — Import / External Needs

sphinx_needs/external_needs.py

sphinx_needs/directives/needimport.py

Tier 6 — Visualization Directives (type usually NOT available)

Tier 7 — Schema Export, Validation & Resolution

sphinx_needs/schema/process.py

sphinx_needs/schema/resolve.py

sphinx_needs/needsfile.py

Tier 8 — Roles & Miscellaneous

Recommended Implementation Order

Key Considerations

Backward Compatibility

needextend Challenge

Filtering / NameError Problem (future)

Visualization Directives

Performance

Quick Wins

Pain Points

Appendix: Visualization Directives and Missing Fields

Current field-access patterns

Risk tiers

Proposed fixes

Strategy 1: NeedItem.get() everywhere (minimal, targeted)

Strategy 2: NeedItem.__getitem__ returns a sentinel for missing fields

Strategy 3: Keep KeyError but add a has_field() method

chrisjsewell
Feb 9, 2026
Maintainer

Moving from `needs.extra_options` to `needs_fields`

Moving from `needs.extra_links` to `needs.links`

Interaction with `needextend`

Exchange formats (`needs.json`)

chrisjsewell
Feb 12, 2026
Maintainer Author

Plan: Type-Aware `FieldsSchema` Retrieval

Proposed API Changes to `FieldsSchema`

`sphinx_needs/needs_schema.py` — `FieldsSchema` class

`sphinx_needs/needs.py` — `create_schema()`

`sphinx_needs/api/need.py` — `generate_need()`

`sphinx_needs/directives/need.py` — `NeedDirective.run()`

`sphinx_needs/directives/list2need.py` — `List2NeedDirective`

`sphinx_needs/functions/functions.py` — `_resolve_functions_fields_iteration()`

`sphinx_needs/directives/needextend.py`

`sphinx_needs/layout.py` — `NeedLayout`

`sphinx_needs/external_needs.py`

`sphinx_needs/directives/needimport.py`

`sphinx_needs/schema/process.py`

`sphinx_needs/schema/resolve.py`

`sphinx_needs/needsfile.py`

`needextend` Challenge

Filtering / `NameError` Problem (future)

Strategy 1: `NeedItem.get()` everywhere (minimal, targeted)

Strategy 2: `NeedItem.getitem` returns a sentinel for missing fields

Strategy 3: Keep `KeyError` but add a `has_field()` method

Filter context (`eval`-based filtering)